Linking Chips With Light For Faster AI
Stephen Cass: Hi, I’m Stephen Cass, for IEEE Spectrum’s Fixing the Future. This episode is brought to you by IEEE Xplore, the digital library with over 6 million pieces of the world’s best technical content. Today I have with me our own Samuel K. Moore, who has been covering the semiconductor beat pretty intensely for Spectrum for— well, how many years has it been, Sam?
Sam Moore: Seven years, I would say.
Cass: So Sam knows computers down at the level most of us like to ignore, hidden underneath all kinds of digital abstractions. This is down where all the physics and material science that make the magic possible lurk. And recently, you wrote an article about the race to replace electricity with light inside computers, which is letting chips talk to each other with fiber optics rather than just using fiber optics to communicate between computers. I guess my first question is, what’s wrong with electricity, Sam?
Moore: I have nothing against electricity, Stephen. Wow… It knows what it did. But really, this all comes down to inputs and outputs. There just aren’t enough coming off of processors for what they want to do in the future. And electronics can only push signals so far before they kind of melt away, and they consume quite a bit of power. So the hope is that you will have better bandwidth between computer chips, consuming less power.
Cass: So it’s not just a question of raw speed, though, when you talk about these signals and melting away, because I think the signal speed of copper is about, what, two-thirds the speed of light in a vacuum. But then I was kind of surprised to see that, in a fiber optic cable, the speed of light is about two-thirds of that in a vacuum. So what’s going on? What’s kind of the limitations of pushing a signal down a wire?
Moore: Sure. A wire is not an ideal conductor. It really has resistance, inductance, and capacitance, all of which will reduce the size and speed of a signal. And this is particularly a problem at high frequencies, which are especially susceptible to the capacitance side of things. So you might start with a beautiful 20 GHz square wave at the edge of the chip, and by the time it gets to the end of the board, it will be an imperceptible bump. Light doesn’t work like that. There are things that mess with signals in optical fibers, but they work at much, much longer length scales.
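To put rough numbers on that “imperceptible bump,” here is a minimal Python sketch comparing an RC-dominated copper trace with an optical fiber. All of the parameter values are illustrative assumptions, not figures from the interview, and real board losses also involve skin effect and dielectric loss, which this RC-only toy model ignores.

```python
# Toy model of why a fast electrical signal "melts away" on copper
# while light in a fiber barely notices the distance.
import math

f = 20e9        # signal frequency, Hz (the "20 GHz square wave")
R = 5.0         # assumed trace resistance per meter, ohm/m
C = 100e-12     # assumed trace capacitance per meter, F/m
length = 0.5    # half a meter of board trace

# For an RC-dominated line, the attenuation constant is
# alpha = sqrt(omega * R * C / 2) nepers per meter.
alpha = math.sqrt(2 * math.pi * f * R * C / 2)
copper_amplitude = math.exp(-alpha * length)   # fraction surviving

# Standard single-mode fiber loses roughly 0.2 dB per kilometer at
# telecom wavelengths, so over the same half meter:
fiber_db = 0.2 * (length / 1000)
fiber_amplitude = 10 ** (-fiber_db / 20)

print(f"copper: {copper_amplitude:.1%} of the signal survives 0.5 m")
print(f"fiber:  {fiber_amplitude:.4%} survives the same distance")

# Propagation speed is similar in both media: light in glass
# (n ~ 1.47) travels at about 0.68c, close to the ~2/3 c quoted
# for copper. The win is attenuation and bandwidth, not speed.
print(f"speed of light in fiber: {1 / 1.47:.2f} c")
```

At these assumed values, only about 6 percent of the signal survives half a meter of trace, while the fiber loss is unmeasurably small.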
Cass: Okay, great. So you talked about two companies that are in this sort of race to put light inside computers. Can we talk a little bit about them? Who are they, and what are their different approaches?
Moore: Sure, these are two startups, and they’re not alone. There are very likely other startups in stealth mode, and there are giants like Intel in this race as well. But the two startups I profiled in the January issue are Ayar Labs—that’s A-Y-A-R, and I’m probably pronouncing it a little weird—and Avicena. And they’re representative of two very different takes on the same idea. Let me start with Ayar, which is really sort of what we’re using right now but on steroids. Like the links you already find in data centers, it uses infrared laser light and breaks it into several bands (I can’t remember if it’s 8 or 16), so they’ve got multiple channels in each fiber. And it uses silicon photonics to modulate and detect the signals. What they bring to the table is, one, a really good laser that can sit on a board next to the chip, and two, they’ve managed to shrink down the silicon photonics, meaning the modulation and the detection and the associated electronics that make that actually happen, quite radically compared to what’s out there right now. I mean, it’s weird to call them a conservative play because they really do have great technology, but it is just sort of taking what we’ve got and making it work a lot better.
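For a sense of what multiple wavelength channels per fiber buys you, here is a quick back-of-the-envelope in Python. The 8-channel figure comes from the conversation; the per-channel data rate and the fiber count are assumptions for illustration, not Ayar’s published specs.

```python
# Aggregate bandwidth of a wavelength-division-multiplexed link:
# several "colors" of infrared light sharing one fiber, each
# carrying its own data stream.
channels_per_fiber = 8    # could be 8 or 16, per the conversation
gbps_per_channel = 32     # assumed modulation rate per wavelength
fibers = 8                # assumed fibers per optical chiplet

per_fiber = channels_per_fiber * gbps_per_channel
total = per_fiber * fibers
print(f"{per_fiber} Gb/s per fiber, {total / 1000:.1f} Tb/s per chiplet")
# 8 wavelengths x 32 Gb/s = 256 Gb/s per fiber; 8 fibers ~ 2 Tb/s.
```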
Avicena is doing something completely different. They aren’t using lasers at all. They’re using microLEDs, and they’re blue. These are made of gallium nitride. And the reason this might work is that there is a rapidly growing microLED display industry with big backers like Meta and Apple, so the problems you might find with a new industry are getting solved by other people. What Avicena does is basically make a little microLED display on a chiplet and stick a particular kind of fiber on top of it. It’s sort of an imaging fiber; if you’ve ever had an endoscopy exam, you’ve had a close encounter with one of these. Basically, it has a bunch of fiber channels in it. The one they use has about 300 in a half-millimeter-wide bundle. They stick the end of that fiber on top of the display so that each microLED in the display has its own channel, and you have this parallel path for light to come off of the chip. And they modulate the microLEDs, just flicker them, and they’ve found a way to do that a lot faster than other people. People thought there were going to be real hard limits to this, but they’ve gotten as high as 10 gigabits per second. Their first product will probably be in the 3-gigabit-per-second kind of area, but it’s really surprisingly rapid. People weren’t thinking that microLEDs could do this, but they can. And so that should provide a very powerful pathway between microprocessors.
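The appeal of the parallel approach is easy to see with the numbers quoted above (about 300 fiber cores per bundle, 3 to 10 gigabits per second per microLED); the rest is arithmetic.

```python
# The microLED approach trades per-channel speed for sheer
# parallelism: many slow-ish channels in one imaging-fiber bundle.
cores_in_bundle = 300    # fiber cores in the ~0.5 mm bundle

for gbps_per_led in (3, 10):   # first-product vs. demonstrated rates
    total_gbps = cores_in_bundle * gbps_per_led
    print(f"{gbps_per_led} Gb/s per LED -> {total_gbps / 1000:.1f} Tb/s "
          f"through one half-millimeter bundle")
# 300 x 3 Gb/s ~ 0.9 Tb/s; 300 x 10 Gb/s = 3 Tb/s.
```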
Cass: So what’s the market for this technology? I mean, I presume we’re not looking to see it in our phones anytime soon. So who really is spending the money for this?
Moore: It’s funny you should mention phones—and I’ll get back to it—because it’s definitely not the first adopter, but there may actually be a role for it in there. Your likely first adopters are actually companies like Nvidia, which I know are very interested in this sort of thing. They are trying to tie together their really super powerful GPUs as tightly as possible. In the end, ideally, they want something that will bind their chips together so tightly that it’s as if it were one gigantic chip, even though it’s physically spread across eight racks with each server having four or eight of these chips. So that’s what they’re looking for. They need to reduce the distance, both in energy and in time, to their other processor units and to and from memory, so that they wind up with this really tightly bound computing machine. And when I say tightly bound, the ideal is to bind them all together as one. But the truth is, the way people use computing resources, what you want to do is just pull together what you need. And so this is a technology that will allow them to do that.
So it’s really the big iron people that are going to be the early adopters for this sort of thing. But in your phone, there’s actually a sort of bandwidth-limited pathway between your camera and the processor. And Avicena in particular is actually kind of interested in putting these together, which would mean that your camera can be in a different place than it is right now with regard to the processor. Or you could come up with completely different configurations of a mobile device.
Cass: Well, when you were talking about this idea of building essentially one computer, even kind of a CPU with many cores, at the scale of racks, it reminded me of the ENIAC days, or the IBM 360s, where the computer would take up several racks. And then we invented this cool microprocessor technology. So I guess it’s one of these great technological cycles. But you mentioned the idea of giant chips there. That is an approach some people are trying: these massive chips to solve this bandwidth communication problem.
Moore: That’s right. Cerebras is trying to solve the exact same problem. I shouldn’t say trying; they have their solution. Their solution is to never go off the chip. They made the biggest chip you could possibly make by making it all on one wafer, so the signals never have to leave the chip. You get to keep that really broad pathway all the way along, and then your limit is just that a chip can only be, oh, the size of a wafer.
Cass: How big is a wafer?
Moore: Oh man, it’s 300 millimeters across, but then they have to cut off the edges so you get a square. So a dinner plate, your face if you have a big head.
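As a rough geometric check on the dinner-plate comparison, here is the largest square you can cut from a 300-millimeter wafer. This is just an upper bound from geometry, not a vendor’s spec.

```python
# The largest square inscribed in a circle has a side of
# diameter / sqrt(2).
import math

wafer_diameter_mm = 300
side = wafer_diameter_mm / math.sqrt(2)
print(f"largest inscribed square: {side:.0f} mm on a side")
print(f"area: {side * side / 100:.0f} cm^2")
# ~212 mm on a side, ~450 cm^2 of silicon: dinner-plate territory.
```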
Cass: So what are some of the other approaches out there to solving this issue?
Moore: Sure. Well, Ayar and Intel are actually a good contrast in that they’re really doing kind of the same thing. They’ve got silicon photonics designed to modulate and detect infrared laser light, and each of their lasers has 8 channels, or colors rather, and sometimes 16, I think, is where they’re moving to. The difference is that Ayar keeps its laser outside of the package with the GPU. And I should explain something else that’s indicative of why this is the right time for it; I’ll get back to that. But my point is, Ayar keeps its laser separate. It’s almost like a utility. You wouldn’t think of putting your power converter in the same package with your GPU. Electricity is sort of like a utility, and they use laser light like a utility. Intel, on the other hand, is really gung ho on integrating the laser with their silicon photonics chips, and they have their own reasons for doing that. They’ve been working on this for a while. So you wind up with slightly different-looking configurations: Intel’s is just one connection, while Ayar will always have a connection from the laser to the chip and then out again once it’s been modulated. And they each have their own reasons. It’s sometimes hard, for instance, to keep the laser stable if you don’t tightly control its temperature, and if you’re in the package with the GPU, do you really have control over the temperature? The GPU is doing its own thing, though Intel clearly feels fine about this. And Ayar is just a startup; they’re just trying to get in with somebody who wants to integrate their technology into their own stuff. Other—
Cass: Because you’ve reported before on the challenge of integrating photonics with silicon so you don’t have to go off-chip. But there’s been a long and somewhat—I don’t want to say troubled—but challenging history there.
Moore: Yeah, and the reason it’s suddenly become less challenging, actually, is that the world is moving toward chiplets, as opposed to monolithic silicon systems-on-chip. Even just a few years ago, everybody was making the biggest chip they could and filling it up. But Moore’s Law hasn’t been delivering quite as much as it has in the past.
And so there’s a new solution: you can add silicon by finding a way to bind two separate pieces of silicon together almost as tightly as if they were one chip. This is a packaging technology. Packaging is something people didn’t really care about so much 10 years ago, but now it’s actually super important. There are 3D-packaging-type situations where you’ve got chips stacked on chips. And you’ve got what’s called 2.5D, which is really 2D, but the chips are within less than a millimeter of each other, and the number of connections you can make at that scale is much closer to what you have on the chip. So you put these chiplets of silicon together and package them all as one, and that is the way advanced processors are being made right now. One of those chiplets can then be silicon photonics, which is a completely different manufacturing process than you would have for your main processor. Because of these packaging technologies, you can put chips made with different technologies together, bind them electrically, and they will work just fine. So because there is this sort of chiplet landing pad now, companies like Avicena and Ayar have a place to go that’s kind of easy to get to.
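A hedged sketch of why that sub-millimeter spacing matters: connection density scales with the inverse square of the bump pitch. The pitch values below are typical published ballparks, assumed for illustration rather than taken from any vendor’s spec or from the interview.

```python
# Connections per square millimeter at various packaging pitches.
pitches_um = {
    "package ball (BGA)": 400,   # chip-to-board
    "flip-chip bump": 100,       # chip-to-package substrate
    "2.5D microbump": 40,        # chiplet-to-interposer
    "hybrid bonding (3D)": 10,   # die stacked on die
}

for name, pitch in pitches_um.items():
    per_mm2 = (1000 / pitch) ** 2   # connections per mm^2
    print(f"{name:22s} {pitch:4d} um pitch -> {per_mm2:7.0f} / mm^2")
# Going from 400 um to 10 um pitch is a 1,600x jump in wiring
# density, which is what lets a photonics chiplet sit "almost as
# tightly as if they were one chip" next to the processor.
```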
Cass: So you mentioned Nvidia and GPUs there, which are really now associated with machine learning. So is that what’s driving a lot of this: these machine learning, deep learning things that are just chewing through enormous amounts of data?
Moore: Yeah, the real driver is things like ChatGPT and all of these natural language processors, which are a class called transformer neural networks. I’m a little unclear as to why. They are just huge. They have ridiculous numbers of parameters, trillions of the weights and activations that actually make up the guts of a neural network. And there’s, unfortunately, no end in sight. It seems like if you just make it bigger, you can make it better. And the problem is not so much the running of the inferencing, the getting of your answer; it’s the training that is really the problem. In order to train something that big and have it done this year, you really need a lot of computing power. That was the reason for companies like Cerebras: instead of something taking weeks, it takes hours; instead of something taking months and months, it takes a couple of days. That means you can actually learn to use and train one of these giant neural networks in a reasonable amount of time and, frankly, do experiments so that you can make better ones. I mean, if your experiment takes four months, it really slows down the pace of development. So that’s the real driver: training these gigantic transformer models.
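To see why training time is the bottleneck, here is a rough sizing using the common ~6 × parameters × tokens estimate for training FLOPs. That is a scaling-law rule of thumb, not anything stated in the interview, and all of the cluster numbers below are assumptions for illustration.

```python
# Rough calendar-time estimate for training a giant transformer.
params = 1e12             # a trillion-parameter model
tokens = 1e12             # trained on a trillion tokens
train_flops = 6 * params * tokens        # ~6e24 FLOPs total

gpus = 1024               # assumed cluster size
flops_per_gpu = 1e15      # assumed sustained FLOP/s per accelerator
utilization = 0.4         # assumed fraction of peak actually achieved

seconds = train_flops / (gpus * flops_per_gpu * utilization)
print(f"~{seconds / 86400:.0f} days of training")
# ~170 days at these numbers. Interconnect bandwidth is a big part
# of pushing the utilization figure up and the calendar time down.
```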
Cass: So what kind of time frame are we talking about in terms of when we might see these kinds of things popping up in data centers? And then, I guess, when might we see them coming to our phones?
Moore: Okay, so I know that Ayar Labs, that’s the startup that uses the infrared lasers, is actually working on prototype computers with partners this year. It’s unlikely that we will actually see the results of those; they’re just not likely to be made public. But when pressed, the CEO of Ayar thought a 2025–26 kind of time frame was an okay estimate. It might take a little longer for others. Obviously, their first product is actually going to be a low-power replacement for the between-the-racks kind of connections, but they promised a chiplet for in-package use with the processor hot on its heels. But again, the customers are gigantic, and they really have to feel that this is a technology that is going to be good for them in the long term. So there aren’t that many: there’s Nvidia, some of the giant AI computer makers, and some supercomputer makers, I imagine. So the customer list is not enormous, but it has deep pockets, and it’s probably kind of conservative. So it may be a little bit–
Cass: Cool, and so to the phone? Ten years?
Moore: Oh, yeah. I don’t actually know. Right now, I think that’s just sort of an idea. But we’ll see. Things could develop faster in that field than others. Who knows?
Cass: So is there anything else you’d like to add?
Moore: Yeah, I just want to bring back that those two startups are indicative of what’s likely a larger group, some of which are probably in stealth mode. And there’s plenty of academic research on doing this in totally different ways, like using surface plasmons, which are sort of waves of electrons that occur when light strikes a metal surface, with the idea of being able to use smaller, less fiddly components to get the same thing done, because you’re using the waves of electrons rather than the light itself. But yeah, I honestly look forward to seeing what else people come up with, because there’s clearly more than one way to skin this cat.
Cass: And listeners can follow your coverage in the pages of Spectrum or online.
Moore: Yes, indeed.
Cass: So that was great, Sam. Thank you. So today on Fixing the Future, we were talking with Sam Moore about the competition to build the next generation of high-speed interconnects. I’m Stephen Cass for IEEE Spectrum, and I hope you’ll join us next time.