My PhD is about operating systems on Network-on-Chip (NoC) computers. I have not actually done any research yet, so don’t expect anything here – but I have been playing with some existing data and I think it gives some interesting results.
NoCs are part of the semiconductor industry’s response to the failure of “Dennard scaling”: Moore’s law says we can double the number of transistors on a chip every 18–24 months, and we are still able to do that. Dennard scaling was what made that so useful, because it meant a processor’s power requirements stayed constant even as it acquired more transistors. Now that it has broken down, building faster chips becomes much more difficult because, bluntly, they would burn up unless we limited the power.
NoCs aim to get round this by replacing one fast, power-hungry processor on a chip with several less powerful processors on the same chip – the idea being that if we can attack the problem with several slower processors working in parallel, we can get the job done more quickly than with one faster processor.
But there’s a catch, a big one, as captured in Seymour Cray’s question:
would you rather plough a field with two strong bulls or a thousand chickens?
NoCs do not replace one fast chip with a dozen chips that are almost as fast – they parcel out the power eaten by that one chip across the whole network on the chip. It is not quite as bad as dividing the computing power by the number of processors (if that were the case there would be no advantage at all), but it is not far above that.
Using work published by Hadi Esmaeilzadeh of the University of Washington, with colleagues from the University of Wisconsin–Madison, the University of Texas at Austin and Microsoft Research, my projection is that, if we took one of today’s fast chips and parcelled up the power, we would see computing power decline like this:
- One processor: 100% computing power
- Two processors: 65% computing power each
- Four processors: 38% computing power each
- Eight processors: 21% computing power each
- Sixteen processors: 11% computing power each
- Thirty-two processors: 6% computing power each
- Sixty-four processors: 3% computing power each
Now, 64 × 3 = 192, so that might look like quite a good deal – a 92% speed-up. But it is not that simple, because some part of the code – even if just the bit that starts and ends your process – can only run on one processor, no matter how neatly the rest can be split into 64 equal parts. And the pieces that will only run on one processor are now roughly 33 times slower than they were before. The key balance is this: how much of your code can run at nearly twice the speed (the 92% speed-up), and how much must run 33 times slower than before?
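The arithmetic above can be checked with a few lines of Python. The per-core figures are simply the ones listed in this post (they are projections, not measurements), and for each configuration we compute the aggregate throughput if code is perfectly parallel, and how much slower purely serial code runs on a single core:

```python
# Per-core figures from the post: (number of cores -> per-core speed
# as a fraction of the original single fast chip). Projections only.
cores = {1: 1.00, 2: 0.65, 4: 0.38, 8: 0.21, 16: 0.11, 32: 0.06, 64: 0.03}

for n, per_core in cores.items():
    total = n * per_core       # aggregate throughput, fully parallel code
    slowdown = 1.0 / per_core  # how much slower serial code runs on one core
    print(f"{n:2d} cores: {per_core:.0%} each, "
          f"total {total:.2f}x, serial code {slowdown:.0f}x slower")
```

For the 64-core case this reproduces the numbers in the text: 1.92× total throughput, and serial code about 33× slower.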
The answer is that you have to run a lot of code in the fast zone before you really see a big advantage.
As the graph suggests, you would need about 99.9% of your code to be capable of running in parallel before you saw a guaranteed speed-up with 64 processors in your NoC. Plenty of such code exists – in image handling, for instance – but you are not likely to be running much of it on your desktop computer at any given time (except perhaps when you are running a streaming video application), and the big disadvantage is that when you are not running the parallel code you are stuck with the 3% performance.
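This is a back-of-envelope Amdahl’s-law calculation, and it can be sketched in a few lines of Python. The assumptions are the post’s own figures: serial code runs at 0.03× the original chip’s speed, and fully parallel code at 64 × 0.03 = 1.92×. Break-even (merely matching the original single fast chip) already needs roughly 98.5% parallel code, and even at 99.9% the overall speed-up is still under 2×:

```python
# Illustrative assumptions from the post: one core of the 64-core NoC
# runs at 0.03x the original chip; all 64 cores together run at 1.92x.
SERIAL_SPEED = 0.03
PARALLEL_SPEED = 64 * SERIAL_SPEED  # 1.92

def speedup(p):
    """Overall speed-up when a fraction p of the work is parallelisable."""
    return 1.0 / ((1.0 - p) / SERIAL_SPEED + p / PARALLEL_SPEED)

# Parallel fraction at which the NoC merely matches the original chip
# (i.e. speedup(p) == 1), from solving (1-p)/s + p/P = 1 for p.
break_even = (1.0 / SERIAL_SPEED - 1.0) / (1.0 / SERIAL_SPEED - 1.0 / PARALLEL_SPEED)

print(f"break-even parallel fraction: {break_even:.3f}")      # ~0.985
print(f"speed-up at 99.9% parallel:   {speedup(0.999):.2f}")  # ~1.81
```

Even a perfectly parallel program (p = 1) tops out at the 1.92× aggregate figure, which is why so much of the code has to live in the “fast zone” before the NoC pays off.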
(Actually, it’s not quite as simple as that, as you may also have a faster processor dedicated to the single-threaded work – but then your computer starts to become more expensive.)
In the future chips will get faster – but maybe not that much faster. In a decade’s time they could be anywhere between 34% and 400% faster than they are today, depending on whether you are pessimistic (realistic?) or optimistic about processor technologies. That will help, but still not enough to put this technology in your desktop – as opposed to your games console, your television set, or perhaps a specialised processor in a top-of-the-range machine.
So don’t expect your personal supercomputer too soon.