When I first saw the die of Intel's Silverthorne (now part of the Atom family), my initial reaction was: "It's the same shape as an old-style DRAM. I wonder why that is". However, that's not the curious thing about Silverthorne, once I worked out – with the help of the paper presented at the International Solid-State Circuits Conference (ISSCC) – that the shape is pretty much governed by the memory bus logic.
The curious thing about Silverthorne is that it is just a processor – in a market where everything it will compete with will be a system-on-chip (SoC). OK, SoC is a bit of a misnomer in the context of a portable computer as you still need a stack of chips around the main one to make anything usable. The iPhone, for example, has them stacked and squeezed together to get everything it needs into a phone-sized package. However, Intel is looking at markets where devices such as Texas Instruments' OMAP rule.
It is doubly curious when you consider Silverthorne's die size: 25mm2 is tiny. It is a little less than a quarter the size of the dual-core Penryn, which clocks in at 107mm2. For a desktop Intel processor, the Penryn is surprisingly small. Intel has been known to go double that size for the first iteration of a processor.
By going with something that is much closer to the die size of a 'mature' Intel processor, Penryn should yield very nicely. The yield of a chip is, roughly, inversely proportional to the square of its area. Intel can only make so-called reticle busters such as the recent Itaniums by employing redundancy – each one is bound to have a defect on it. Something like a Penryn is far less likely to suffer a manufacturing defect. When you consider that you can get hundreds of them on a 300mm wafer – which is going to cost a couple of grand to make – it is easy to see how Intel could do very nicely with Penryn.
Given that, Silverthorne should do way better. However, chip economics do not work quite like that – once you get below 50mm2 or so, you don't typically achieve that much of a benefit in cost, not compared with shaving 25mm2 off a design that needs 150mm2 in total. It may be that 45nm wafer processing is a little bit different to previous technology nodes. But given that Intel spent a lot of time on the manufacturability of Penryn, even to the extent of changing the shape of the contacts used to connect transistors to the wiring, it seems likely that the same approach has translated to Silverthorne.
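To put rough numbers on "hundreds of them on a 300mm wafer", here is a back-of-envelope sketch in Python. It uses a common first-order approximation for gross dice per wafer (wafer area divided by die area, minus an edge-loss term), not anything Intel has published. The wafer cost of 2000 is my own stand-in for "a couple of grand", the function name is made up for illustration, and the result ignores defects, scribe lanes and test costs entirely.

```python
import math

def gross_dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """First-order estimate of whole dice on a round wafer.

    Wafer area / die area, minus a term for partial dice lost at the edge.
    Ignores scribe lanes, edge exclusion and defects.
    """
    radius_mm = wafer_diameter_mm / 2
    by_area = math.pi * radius_mm ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(by_area - edge_loss)

WAFER_DIAMETER_MM = 300
WAFER_COST = 2000.0  # assumption: "a couple of grand", currency unspecified

# Die areas as quoted in the text.
penryn_dies = gross_dies_per_wafer(WAFER_DIAMETER_MM, 107)
silverthorne_dies = gross_dies_per_wafer(WAFER_DIAMETER_MM, 25)

for name, dies in [("Penryn", penryn_dies), ("Silverthorne", silverthorne_dies)]:
    # Cost per gross die, before any yield loss is taken into account.
    print(f"{name}: about {dies} gross dice, {WAFER_COST / dies:.2f} per die before yield")
```

On these assumptions Penryn lands in the high hundreds of dice per wafer and Silverthorne in the thousands, which is the scale the argument relies on. The cost per good die would be higher once yield is factored in.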
Yes, there are microcontrollers that are smaller than 25mm2 but they are intended for systems where the microcontroller and a ragbag of jellybean chips are all that are needed. Silverthorne is just a processor, so it needs a chunk of memory and a pile of peripherals. Intel could easily have doubled the die size to get a pretty good stack of memory controllers and other goodies onboard. But it didn't: it just shrank the processor.
You can see the effect in the practical transistor density of Silverthorne. It contains 47 million transistors. Penryn contains 420 million on a die that is just over four times bigger. That means the effective transistor density of Silverthorne is roughly half that of Penryn. How so? The area needed for I/O – the 533Mtransfer/s bus – is actually larger than the processor core itself. And that's including the level-one caches. I/O cells are big, even on a device that is flip-chipped into place in its package. So, the key to cost is making sure that you have a good ratio of core transistors to I/O cells. Going the SoC route helps a lot with that ratio.
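The density comparison is simple enough to check by hand, but here it is as a few lines of Python for anyone who wants to plug in other chips. The transistor counts and die areas are the ones quoted above; the dictionary layout and the function name are just for illustration.

```python
# Transistor counts (millions) and die areas (mm2) as quoted in the text.
silverthorne = {"transistors_m": 47, "area_mm2": 25}
penryn = {"transistors_m": 420, "area_mm2": 107}

def density_m_per_mm2(chip: dict) -> float:
    """Effective density in millions of transistors per square millimetre."""
    return chip["transistors_m"] / chip["area_mm2"]

area_ratio = penryn["area_mm2"] / silverthorne["area_mm2"]
density_ratio = density_m_per_mm2(silverthorne) / density_m_per_mm2(penryn)

print(f"Penryn die is {area_ratio:.1f}x the area of Silverthorne")
print(f"Silverthorne: {density_m_per_mm2(silverthorne):.2f}M transistors/mm2")
print(f"Penryn: {density_m_per_mm2(penryn):.2f}M transistors/mm2")
print(f"Silverthorne density is {density_ratio:.0%} of Penryn's")
```

Bear in mind that much of Penryn's count is dense level-two cache, so a whole-die figure like this flatters the bigger chip before you even get to the I/O overhead.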
I disagree with Kenton Williston's comparison of processor area in EETimes. He claims that the Silverthorne processor core is much larger than that of ARM's Cortex-A8. However, ARM quoted an area of "less than" 4mm2 for the dual-pipeline A8 on 65nm but that is without the Neon SIMD unit. Silverthorne has some SIMD units, such as an integer multiplier and a built-in floating-point unit. The A8 needs Neon to run floating-point instructions. The floating-point unit on Silverthorne is around 1.5mm2. However, ARM quoted its area based on a slightly larger level-one cache, although we are talking way less than 1mm2 difference.
The Silverthorne processor has to carry a penalty in the form of its "front-end controller", needed to work out how the x86 instructions can be packed off to the execution units together. A RISC architecture such as ARMv7 needs an instruction decoder that is not nearly so complicated. Call it the CISC tax. However, the differential is not nearly as big as Williston's estimate – he reckons the Silverthorne core needs 9mm2. I think a fairer estimate is somewhere between 6 and 7mm2. And don't forget that this thing is meant to run at up to 1.8GHz, not the 1GHz that ARM was targeting with the A8. That was on a 65nm process but, when you're talking low-power consumption, the difference between 65nm and 45nm performance is not going to be that great.
By the time you've lobbed on the bus-interface unit – which is bigger than the floating-point unit and the level-two cache – the difference between the size of the A8 and Silverthorne's processor core is not all that much. Intel has a lot to learn in the MID space but it's far from a smackdown for ARM.
But the troubling part about Silverthorne is the decision to make just a processor. It feels like the whole thing is an experiment to see what the MID builders really want – then Intel can do a quick rev of the design and deliver that. One notable aspect of Silverthorne is how it was designed. Gone are the days when Intel processor engineers lovingly hand-crafted big chunks of the chip. With Silverthorne, engineers worked at a relatively high level and left much of the layout to automated tools. More than 90 per cent of the core was generated using logic synthesis. It means that, for derivatives, Intel can easily optimise the shape of the processor to fit around the caches and peripherals.
As a result, first-generation Atom-based devices could be clunky and surprisingly expensive: having separate Northbridge and Southbridge controllers hardly helps with size. But the second generation will probably iron out a lot of the kinks by going for more of an all-in-one design. Diamondville could well benefit from that – the features needed in an OLPC are pretty well-known so, rather than cut the core down any further, the cost reductions will come through taking what might be three or four parts in a regular PC and mushing them all together on one die. The only thing that works against this is that Intel has traditionally made its support chipsets on relatively old processes – it's a big porting process trying to get that stuff down onto 45nm.
However, the big problem for Intel remains: do you really want Windows (or OS X) on a MID? Because, if you don't, there is not that much point in selecting an x86 over ARM. System builders know that, if they pick the x86, they run the risk of increasing Intel's strength, with probable bad results for their own margins, whereas there is plenty of competition among chipmakers who are offering ARM-based products. Let's face it, Intel has found the going tough outside its normal, PC-oriented customer base.
You would want a compelling reason to go with the x86 in any sub-notebook market. And Microsoft has not made Intel's job any easier by focusing on high-end machines with Vista and making sure that Windows CE runs well on ARM. Similarly, Apple has spent a lot of time and effort to get a version of OS X to run on ARM for the iPhone – something that gives the Mac maker a lot of clout over suppliers.