Help! Computing power or better algorithm required

Peter Denning
Peter Denning, Image via Wikipedia

I have a serious problem with my MSc project.

For the first part of this I am seeking to demonstrate/investigate the underlying assumptions about locality, phases of locality and so on that underlie the working set method of VM management.

The graphs on other blogs here are some of the output of the test cases I have been using and they seem to work well.

In addition, though, I want to show what Peter Denning refers to as a “lifetime curve” for a process – essentially showing how increasing the size of the resident set (or in my case increasing the time for which pages remain resident) changes the time between page faults.

This should not (and early results of mine show) it isn’t linear (though it is monotonic). But to calculate this seems, under the algorithm I am using, to require vast amounts of computing power.

My algorithm (in Groovy) is essentially this:

1. Set time (if we are plotting 600 points then first time would be 1/600th of the process lifetime) for page expiry

2. Reading the lackeyml file and using a groovy/java map, store the page number and the reference time -if the page wasn’t referred to in the map, increase the fault count

3. Look through the map to see if there are any pages which are older than the page expiry times, and evict them from the map

4. Move to the next record in the lackeyml file and return to 2

5. Get to the end of the file and count the total faults.

The problem with all this is (2) above – typically it is being called millions of times and iterating through a map in this way is extremely slow. I think on a dual AMD64 box here with the 48 million instruction count for xterm (pretty much a ‘toy’ application in this context) – this will take 20 days to run, even with a decent bit of parallelization in the code.

Hence I need a better algorithm or massive (and affordable) computing power. Anyone have a cluster they could lend me for a few hours?

The price of syntactical sugar?

A selection of programming language textbooks ...
Image via Wikipedia

The programming language Groovy contains lots of “syntactical sugar” designed to make the programmer’s task easier and the code more expressive and so, presumably, easier to maintain.

But my experience seems to suggest it also comes with a heavy performance price. I have updated some code I am working on to create XML files and now check some values against a range – of the form max .. min and performance has fallen through several floors: a script that was running in 20 minutes has now been cranking away for over 145 minutes.

More work on Life.groovy

According to “Groovy in Action” lists of lists are the correct way to do multidimensional arrays in Groovy, so I reckon I will try that now – as well as making some better use of some of the other language features, such as better use of ranges, that I noticed could be done.

Life.groovy now available

It’s far from sophisticated, and it will offend Groovy and agile programmers generally for not having a test suite, but a version of John H. Conway’s Game of Life is now available for anyone interested – licensed under the General Public Licence, so free to use and modify within those terms – it is at:

It is defiantly Old Skool in it’s approach to such fripperies as graphics and user interaction…

Game of Life in action

Eating humble pie

Groovy (programming language)
Image via Wikipedia

Maybe this should be titled … why you should check your examples thoroughly.

A few days ago I posted a Groovy code fragment here and said I was having trouble with the same code on an application.

My problem was that the application was a piece of coursework and I really did not want to post that here in case there was some sort of plagiarism issue later on. So I wrote a code fragment that, I thought, encapsulated the problem.

I then also took the issue to the groovy-user mailing list – see here.

The problem, though, was that there was a subtle difference between the two examples and so I was not asking people to test the same thing.

My problem was that I wanted people to enter a string of two integers separated by a comma but that the String.tokenize() method was failing to parse the input string correctly (or so I thought).

In reality the core issue was that the Scanner object (in the real code but not in the test example) was already tokenizing the input string.

To make matters worse, though, my code example was failing, but in a different way, on my Ubuntu machine – though it does not any more now I have upgraded Groovy from 1.7.0 to 1.7.6 – as the screenshot below shows – this may or may not be a bug in the code installed by default with Ubuntu so beware:

Crash in groovyConsole

Anyway, the fundamental issue was the Scanner and not the tokenize message.

Issue with String.tokenize() in Groovy

Groovy (programming language)
Image via Wikipedia

Here is some Groovy code:

class Test {
String zx = new String("4, 5")
void toy() {
char xx =','
def myBits = zx.tokenize(xx)
println "first bit is ${myBits[0]}"
println "second bit is ${myBits[1]}"

Test ff = new Test()

Apologies for the fomatting but hopefully you can see what this trying to do: tokenise a String with comma as the delimiter.

Run this code in the Groovy web console and it gives the output you would expect –
first bit is 4
second bit is 5

But when I run it at home it will collapse with a null pointer exception if I try to read the second object. Essentially it falls over if there is any whitespace after the comma.

This appears to be a problem with my configuration as asking on IRC only got me a reply that the person concerned could not replicate the problem and clearly it runs in the web console too.

Anyone else seen something similar/willing to test the above code fragment?

Update: I thought that maybe this was because I was using the openjdk (installed by default on amd64 boxes by Ubuntu). But the error persists with the Sun/Oracle JDK. Very strange.

Redirecting stdOut in Groovy

Groovy (programming language)
Image via Wikipedia

I am adding this because – while it is out there on the web – it took me some time to find it and I suppose this helps make it more visible in search engines.

I was writing a Groovy unit test for a void function that normally would print to the screen. The only simple way to test would be to redirect stdOut to a string and then test the string.

This is the (or one) way to do it:

public void testPrint() {
//based on solution on
//redirect stdOut and check string is formatted
def bufStr = new ByteArrayOutputStream()
def oldStdOut = System.out;
def newStdOut = new PrintStream(bufStr)
System.out = newStdOut
System.out = oldStdOut
String prtTestStr = bufStr.toString()