Update: I have truncated this article for now (20 December) as there was an error in my LRU software that made LRU look like a much better performer than it really was. I’ll update this with the correct data shortly….
In 1966 László Bélády published “A study of replacement algorithms for virtual storage computers”, one of the truly epoch-making papers of operating system science – the first comprehensive examination of page replacement strategies for virtual memory computers.
These days all but the simplest embedded computing devices use some sort of virtual memory system, because it allows them to (relatively) seamlessly load bits of computer programs in and out of memory as needed – the programs see a faked – virtual – address, so the chunks can be loaded in and out of whatever piece of memory is available without having to land in exactly the same place every time.
But in 1966 virtual memory was a new and essentially experimental technology, and so Bélády’s examination of the different strategies for deciding which chunk (page) of memory was kept or replaced when new pages had to be loaded is the foundation stone of all the approaches that followed.
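For concreteness, here is a minimal sketch (in Python, my choice – the original work does not use it) of the kind of replacement policy Bélády was comparing: a least-recently-used (LRU) simulator that counts page faults over a reference string.

```python
from collections import OrderedDict

def lru_faults(trace, frames):
    """Count page faults under an LRU policy.

    trace:  iterable of page numbers
    frames: number of physical page frames available
    """
    resident = OrderedDict()  # pages in memory, least-recently-used first
    faults = 0
    for page in trace:
        if page in resident:
            resident.move_to_end(page)        # mark as most recently used
        else:
            faults += 1
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict the LRU page
            resident[page] = True
    return faults

# The reference string from Bélády's famous FIFO anomaly; LRU, being a
# stack algorithm, never does worse with more frames:
trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(trace, 3))  # 10 faults with 3 frames
print(lru_faults(trace, 4))  # 8 faults with 4 frames
```

Swapping the eviction rule (FIFO, random, clock) into the same harness is how different policies get compared against a common trace.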
This last couple of weeks I have found myself walking in the steps of Bélády as I built software to examine the different performance characteristics of potential page replacement policies in a network-on-chip computer.
I have about 220GB of XML data which represents a record of the memory accesses of an 18-threaded video processing application – and using that data I can test system performance under various page replacement policies.
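A trace that size cannot be loaded into memory at once, so a streaming parse is the natural approach. A hypothetical sketch – the `access` element name and `type` attribute are my assumptions for illustration, not the actual trace format:

```python
import xml.etree.ElementTree as ET

def count_accesses(path):
    """Stream a (possibly huge) XML trace without holding it all in memory."""
    counts = {}
    # iterparse yields elements as they complete, rather than building
    # the whole document tree up front
    for event, elem in ET.iterparse(path, events=("end",)):
        if elem.tag == "access":             # hypothetical element name
            kind = elem.get("type", "?")     # e.g. load / store / modify
            counts[kind] = counts.get(kind, 0) + 1
        elem.clear()  # drop processed elements so memory use stays flat
    return counts
```

The `elem.clear()` call is the important part: without it, a 220GB file would still end up resident in RAM by the time parsing finished.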
Two big thoughts strike me as a result of the literature review I have just completed for my PhD:
- Linux is not the centre of the universe, in fact it is a bit of an intellectual backwater;
- The UK may have played as big a role in the invention of the electronic computer as the US, but these days it is hardly even in the game in many areas of computing research.
On the first point I am in danger of sounding like Andy “Linux is obsolete” Tanenbaum – but it is certainly the case that Linux is far from the cutting edge in operating system research. If massively parallel systems do break through to the desktop it is difficult to imagine they will be running Linux (or any monolithic operating system).
In fact the first generation may do – because nobody has anything else right now – but Linux will be a kludge in that case.
When I was doing my MSc, which focused on a Linux-related problem, it seemed to me that we had come close to “the end of history” in operating system research – ie that the remaining task was fine-tuning the models we had settled on. The big issues had been dealt with in the late 60s, the 70s and the 80s.
Now I know different. Operating systems research is very vibrant and there are lots of new models competing for attention.
But along the way the UK dropped out of the picture. Read some papers on the development of virtual memory and it will not be long before you come across the seminal experiment conducted on EMAS – the Edinburgh University Multiple Access System – which was still in use when I was there in 1984. Now you will struggle to find any UK university – with the limited exceptions of Cambridge and York (for real-time) – making any impact (at least that’s how it seems to me).
It’s not that the Americans run the show either – German, Swiss and Italian universities are leading centres of research into systems software.
I am not sure how or why the UK slipped behind, but it feels like a mistake to me – especially as I think new hardware models are going to drive a lot of software innovation in the next decade (well, I would say that, wouldn’t I?)
By 2020 home computing devices in the UK will be consuming around 7 terawatt-hours (TWh) of electricity every year: it was just 1 TWh in 1990.
Consumer electronic devices, all of which will be running some software and many of which will have what can loosely be described as an operating system, will be eating a massive 22 TWh, almost double where they were in 1990.
Essentially this rise of the computing machines more than matches the falls in electricity use that come from technological improvements in domestic lighting and refrigeration over this time.
Hardware improvements – more MIPS per Watt – plainly are not enough to control the growth of computing-driven power consumption.
Operating systems research has been seriously neglected in our universities in recent years (and I do not just mean in the UK): maybe that ought to be reconsidered and urgently.
How systems order their storage accesses, how they handle virtual memory, sequence their access to the network, and many more questions besides have a big impact on computing power use. And, at 29 TWh, just a 1% saving would lighten domestic bills by about £30 million. And that excludes the positive impact on greenhouse gas emissions.
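A quick sanity check of that arithmetic, assuming a domestic electricity price of roughly 10p per kWh (my assumption, not a figure from the article):

```python
# 7 TWh (home computing) + 22 TWh (consumer electronics) = 29 TWh
total_twh = 7 + 22

# 1% of that, converted to kWh (1 TWh = 1e9 kWh)
saving_kwh = total_twh * 0.01 * 1e9

# valued at an assumed 10p per kWh
saving_pounds = saving_kwh * 0.10

print(f"£{saving_pounds / 1e6:.0f} million")  # → £29 million
```

£29 million a year, which rounds comfortably to the "about £30 million" quoted.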
(There is a Guardian article about this but I cannot see it on their website yet – when I can I’ll link to it.)
The graph above shows the number of (4k) pages required by xterm when running on a 32 bit Linux system – the x axis measures the lifetime of the process (as determined by instructions executed), the y axis the number of pages.
So, I fixed the problem I was having with ranges by cutting the number of times the range was accessed by a factor of something like 1000: I pushed the calculation to much later in the program (ie instead of checking the range on every calculation, check it only at the end, when some other filtering has already happened).
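A hypothetical sketch of that refactoring – the record and filter names are invented for illustration, but the shape of the change is the same: run the cheap filter first and do the expensive range test once, on the survivors.

```python
def process_eagerly(records, lo, hi, keep):
    # original shape: the range is tested on every single record
    return [r for r in records if lo <= r < hi and keep(r)]

def process_lazily(records, lo, hi, keep):
    # refactored shape: filter first, then range-test only what remains
    survivors = [r for r in records if keep(r)]
    return [r for r in survivors if lo <= r < hi]
```

Because the two tests are independent, reordering them changes the cost but not the result – if the filter discards, say, 999 in 1000 records, the range logic runs about 1000 times less often.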
And so I can now show the world a graph that illustrates how xterm‘s initialisation does indeed show different phases of locality – ie that within a small enough time frame memory accesses show spatial locality, but over a bigger time span that locality will shift. This is important because it indicates that a good virtual memory manager would be able to flush the pages of the old locality out of the system:
Plainly I need to do a bit more work on the graphing software – but the yellow lines are the axes and the green dots indicate where memory is being accessed – time (or rather its analogue in terms of instructions executed) runs from left to right and (process) memory address increases from the bottom up.
In fact this graph only looks at the bottom 10% of the range of memory accessed by the application (and we are ignoring instruction fetches in favour of explicit heap loads, stores and modifications) – but the problem is that for most of that range nothing is accessed at all – again I need to adjust the Groovy script to do better.
None of this is new – this experiment was first conducted in the early 1970s – though I am not aware of anyone having done it for Linux before. I suspect someone has.
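The phase behaviour described above can be made concrete with a working-set measure: count the distinct pages touched in each window of the trace. A sketch, with a synthetic two-phase trace standing in for the real xterm data:

```python
def working_set_sizes(trace, window):
    """Distinct pages touched in each consecutive window of accesses.

    A drop to a new set of pages between windows marks a phase change:
    the old locality's pages are candidates for flushing.
    """
    sizes = []
    for i in range(0, len(trace) - window + 1, window):
        sizes.append(len(set(trace[i:i + window])))
    return sizes

# Synthetic trace: 40 accesses cycling over pages 0-3, then a phase
# change to 40 accesses cycling over pages 100-103
trace = [i % 4 for i in range(40)] + [100 + i % 4 for i in range(40)]
print(working_set_sizes(trace, 10))  # → [4, 4, 4, 4, 4, 4, 4, 4]
```

The working-set size stays flat at 4 pages throughout, even though the identity of those pages changes completely halfway through – which is exactly why a manager that tracks recency, not just set size, can evict the stale phase.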
- Introducing the lackeyml format (cartesianproduct.wordpress.com)
Took another day’s worth of work to get this right … last night’s effort had a few mistakes in it as well as being not such a great piece of Metapost. So here’s the correct version:
As before, the Metapost of this image is available at http://github.com/mcmenaminadrian
Art was the only subject in which I failed a school exam – getting a low 30-something in 1980’s end-of-year tests. Not that I cared much. But as the years have gone by I have on more than one occasion wished I was rather better at it. Even now, trying to write a scientific paper – and for me, at least back then, art was always the opposite pole to science – my art skills are rather letting me down.
Virtual memory and paging
This graphic shows you why – now I have reduced it in size it looks passable (as a hugely simplified explanation of paging and virtual memory), but the arrows are still very ragged. Still, this is better than I would have managed even a year ago.