Help! Computing power or better algorithm required

Peter Denning
Peter Denning, Image via Wikipedia

I have a serious problem with my MSc project.

For the first part of this I am seeking to demonstrate/investigate the underlying assumptions about locality, phases of locality and so on that underlie the working set method of VM management.

The graphs on other blogs here are some of the output of the test cases I have been using and they seem to work well.

In addition, though, I want to show what Peter Denning refers to as a “lifetime curve” for a process – essentially showing how increasing the size of the resident set (or in my case increasing the time for which pages remain resident) changes the time between page faults.

This should not (and early results of mine show) it isn’t linear (though it is monotonic). But to calculate this seems, under the algorithm I am using, to require vast amounts of computing power.

My algorithm (in Groovy) is essentially this:

1. Set time (if we are plotting 600 points then first time would be 1/600th of the process lifetime) for page expiry

2. Reading the lackeyml file and using a groovy/java map, store the page number and the reference time -if the page wasn’t referred to in the map, increase the fault count

3. Look through the map to see if there are any pages which are older than the page expiry times, and evict them from the map

4. Move to the next record in the lackeyml file and return to 2

5. Get to the end of the file and count the total faults.

The problem with all this is (2) above – typically it is being called millions of times and iterating through a map in this way is extremely slow. I think on a dual AMD64 box here with the 48 million instruction count for xterm (pretty much a ‘toy’ application in this context) – this will take 20 days to run, even with a decent bit of parallelization in the code.

Hence I need a better algorithm or massive (and affordable) computing power. Anyone have a cluster they could lend me for a few hours?


Patterns of locality in xterm

So, I fixed the problem I was having with ranges by reducing the number of times the range was being accessed by something like 1000 orders of magnitude by pushing the calculation to much later on in the program (ie instead of checking the range on every calculation, check it only at the end when some other filtering has already happened).

And so I can now show the world a graph that illustrates how xterm‘s initialisation does indeed show different phases of locality – ie that in a small enough time frame spacial locality is shown in memory accesses, but over a bigger time space that locality will shift – this is important because it indicates that a good virtual memory manager would be able to flush the pages from the old locality out of the system:

locality in xterm memory accessesPlainly I need to do a bit more work on the graphing software – but the yellow lines are the axis and the green dots indicate where the memory is being accessed – time (or rather its analogue in terms of instructions executed) runs from left to right and (process) memory address gets greater as you move from the bottom up.

In fact this graph is only looking at the bottom 10% of the range of memory accessed by the application (and we are ignoring the instructions in favour of explicit heap loads, stores and modifications) – but the problem is that for most of that range nothing is accessed – again I need to adjust the Groovy script to make it better.

None of this is new – this experiment was first conducted in the early 1970s – though I am not aware of anyone having done it for Linux – but I suspect they have.

The price of syntactical sugar?

A selection of programming language textbooks ...
Image via Wikipedia

The programming language Groovy contains lots of “syntactical sugar” designed to make the programmer’s task easier and the code more expressive and so, presumably, easier to maintain.

But my experience seems to suggest it also comes with a heavy performance price. I have updated some code I am working on to create XML files and now check some values against a range – of the form max .. min and performance has fallen through several floors: a script that was running in 20 minutes has now been cranking away for over 145 minutes.

Introducing the lackeyml format

Example of a XML file
Image via Wikipedia

The Valgrind simulation program has a lot of options and for many of them it supplies an XML output option, though not so for the basic “lackey” option that allow one to look at basic memory reads and writes.

I need some XML output from that – it’s simply easier to manipulate and interpret the output (eg to transform into a graph) when you can use a known and well supported format. So I wrote some code to do that and defined a new format – lackeyml.

You can get the Groovy code here – and use it like this:

First run Valgrind and output the raw text to a file (using xterm here as the targeted application):

valgrind --tool=lackey --trace-mem=yes --log-file=xtlackey.txt xterm

Then run the groovy script against that file:

groovy lackey_xml.groovy xtlackey.txt

If, and when, I get the next stage right – producing an SVG from the output, I’ll post that on the blog also.

Eighteen million lines of XML!

Image via Wikipedia

I have just finished a Groovy (the language) helper application to process the output of Valgrind‘s “lackey” package into an XML file for further analysis. More proof of the power of Groovy (a lanuage I am coming to like) – but also a sign that maybe I should get a more powerful computer – the XML file that came from the loading of xterm ran to over 18 million lines.

Bridge working

Adapter-Card for Notebook PCMCIA/PC-Cards in a...
Image via Wikipedia

My software ethernet to 802.11b bridge has been up and running for a couple of days now – I had to change some of the BIOS settings to get it to work reliably (it seemed the PCI <—> PCMCIA bridge would not be properly configured if I had “plug and play” settings on), I had to use the right drivers – the newer 80211 driver did not operate with the card, but the older hostap_cs one did – and of course I had to set it to master mode.

Beyond that there were still a few routing and IP address issues and I have yet to nail them down and ensure that the bridge will come back up automatically – when I do that I will give all this a full and proper write up.

Using old Prism 2 card for the wireless bridge

So, I have taken the Belkin RT2500 card out and replaced it with a decade old DLink DWL-650 (a locked down PCMCIA <–> PCI bridge) – this has a Prism chipset and so should work. But right now I cannot even get the wireless card to come up, never mind bridge it…

adrian@dragoneye:~$ sudo hostapd -ddd /etc/hostapd/hostapd.conf
[sudo] password for adrian:
Configuration file: /etc/hostapd/hostapd.conf
Failed to create interface mon.wlan1.
nl80211 driver initialization failed.
wlan1: Unable to setup interface.


More as I get it…