Is the time pages are in the working set stochastic?

Reading about the Monte Carlo method has set me thinking about this and how, if at all, it might be applied to page reclaim in the Linux kernel.

My MSc report shows that working set size is not normally distributed – despite occasional claims to the contrary in computer science textbooks. But it is possible that a series of normal distributions is overlaid – see the graphic below:

Working set size for MySQL daemon

The first question is: how do I design an experiment to verify that these are, indeed, a series of normal distributions?
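One possible starting point, sketched here rather than taken from the report, is to fit mixtures of normal distributions with different numbers of components and compare them with an information criterion such as BIC. The sketch below runs on synthetic data (two overlaid normals that I made up), not on the MySQL measurements, and the names `fit_mixture` and `bic` are my own. A lower BIC for the mixture only says it describes the sample better than a single normal; it does not by itself prove the components are normal.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def fit_mixture(data, k, iterations=100):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (log_likelihood, [(weight, mean, sd), ...]).
    """
    ordered = sorted(data)
    n = len(ordered)
    # Start the means at evenly spaced quantiles, with a shared spread.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    centre = sum(ordered) / n
    spread = math.sqrt(sum((x - centre) ** 2 for x in ordered) / n) or 1.0
    sds = [spread] * k
    weights = [1.0 / k] * k

    def mixture_density(x):
        return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds))

    for _ in range(iterations):
        # E-step: how responsible is each component for each point?
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and spread of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            sds[j] = max(math.sqrt(var), 1e-6)

    loglik = sum(math.log(mixture_density(x) or 1e-300) for x in data)
    return loglik, list(zip(weights, means, sds))

def bic(loglik, k, n):
    """Bayesian information criterion; k components have 3k - 1 free parameters."""
    return (3 * k - 1) * math.log(n) - 2.0 * loglik

# Synthetic stand-in for working set sizes: two overlaid normals (an assumption).
random.seed(1)
data = ([random.gauss(100, 10) for _ in range(300)]
        + [random.gauss(200, 10) for _ in range(300)])

bic_by_k = {}
params_by_k = {}
for k in (1, 2):
    loglik, params = fit_mixture(data, k)
    bic_by_k[k] = bic(loglik, k, len(data))
    params_by_k[k] = params
    print(k, "component(s): BIC =", round(bic_by_k[k], 1))
```

With real data one would try more values of k and also check the residuals of each component, since a mixture with enough components can approximate almost any shape.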

(I may find out how I have done in the degree in the next week or so – wish me luck)

Microsoft are not the enemy

For the last six years my job situation has made me wary of commenting on the politics of the free software movement and its enemies, but I have just changed jobs (been a busy week round here) and now I feel I can comment freely on what every free software advocate has always known as Public Enemy Number One: Microsoft.

Except, now the time has come, I don’t really feel they are any more. Indeed I feel free software is more threatened by one of the teams that is meant to be on our side – Google, which has embraced the world of software patents and user abandonment with enthusiasm and by another, Apple, that uses Unix but seems oblivious to the idea of user freedom.

Now, I am not saying that, because Microsoft have found themselves in recent months as one of the biggest contributors to the Linux kernel, and marked the 20th birthday of Linus’s famous usenet announcement of a new Unix-like kernel with an appeal for more co-operation, they really are our friends. But we should recognise that the change – even if it did come because they were dragged through the US and EU courts – is a real one. They are at least recognising that we are here to stay and that the server room of the future will be running multiple OSes on virtualised machines of different flavours.

On the desktop our side is still nowhere – perhaps still under 1% globally – and the boys and girls in Seattle might quickly turn nasty again if we ever did start to crack that nut, but in the meantime we maybe should be testing just how sincere that offer of co-operation really is… after all, they are not offering to work with us because they think we are weak!

Update: Thanks to the retweeters. I think I should point out that my proposal of a compromiso historico with Microsoft is a minority view – as the links below probably suggest.

Done and dusted

I submitted my MSc project report yesterday, so that is it, at least for now, as a computer science student.

The report was on “applying working set heuristics to the Linux kernel”: essentially testing to see if there were ways to overlay some elements of local page replacement onto the kernel’s global page replacement policy that would speed turnaround times.

The answer to that appears to be ‘no’ – at least not in the ways I attempted, though I think there may be some ways to improve performance if some serious studies of phases of locality in programs gave us a better understanding of ways to spot the end of one phase and the beginning of another.

But, generally speaking, my work showed the global LRU policy of the kernel was pretty robust.
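For anyone unfamiliar with the terms, here is a toy sketch of the two ideas being compared. It is not the kernel's code nor the patch from my report: the reference string is made up, and `working_set_sizes` and `lru_faults` are names I chose for illustration. The first function measures the working set as the number of distinct pages touched in the last `tau` references; the second counts faults under plain LRU with a fixed number of frames.

```python
from collections import OrderedDict

def working_set_sizes(refs, tau):
    """Working set size at each step: distinct pages in the last tau references."""
    return [len(set(refs[max(0, t - tau + 1): t + 1])) for t in range(len(refs))]

def lru_faults(refs, frames):
    """Count page faults under LRU replacement with a fixed number of frames."""
    resident = OrderedDict()  # oldest (least recently used) page first
    faults = 0
    for page in refs:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
        else:
            faults += 1
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict least recently used
            resident[page] = True
    return faults

# Hypothetical reference string with two phases of locality.
refs = [1, 2, 3] * 4 + [7, 8] * 4

sizes = working_set_sizes(refs, tau=4)
print(sizes)
print(lru_faults(refs, frames=3), lru_faults(refs, frames=2))
```

The working set briefly grows at the phase change (old and new pages overlap in the window) and then shrinks. That transition is the kind of signal that spotting the end of one phase and the start of another would depend on.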

I find out how I did in November.

Need to find some other programming task now. Mad bit of me suggests getting engaged with GNU Hurd. Though mucking about with Android also has an appeal.

Problems with oprofile

As one last thing for my report I want to profile the kernel in a specific configuration, and so I thought I would try oprofile instead of the cruder profile=X command-line option.

Big mistake.

Essentially I could not get it to run under KVM at all. KVM hides many hardware details from the profiler and setup is notoriously difficult (I now know). Apparently this is fixable, but not simply, and there is very little information out there about how to do it. If I had days to learn maybe I would persevere, but I don’t.

But I did come across one feature/bug in oprofile that I will document a fix for in the hope it proves useful to someone.

To start oprofile off (to profile the kernel), one has to specify where a vmlinux file (note, not a compressed vmlinuz or bzImage etc) or similar is.

Mine were of the format vmlinux-3.0.0-sched+ but oprofile consistently failed to let me specify that. Again, I did not have time to go into the details, but it was clear it was the + that was the issue. I renamed the file and all was fine.

Best book on Linux kernel internals

Writing an MSc project report means having to read a lot of source code and constantly referring to texts in the hope that they will make things clearer.

I have three books on the kernel – there are obviously others, but I think two of these three will be familiar to most kernel hackers – but it is the third that I rate most highly.

Understanding the Linux Kernel: for many this must feel like the standard text on the kernel – it’s published by O’Reilly, so (my experience with their XML Pocket Reference notwithstanding) it will be good, it’s well printed and readable, and it is, after all, the third edition of a book your forefathers used. But the problem is, it is also now six years old and a lot has happened since then. I well remember going into Foyles in the autumn of 2005 and seeing it newly minted on the shelves. For a long time it could hide behind the fact that the kernel was still in the 2.6 series, but even that protection is gone. Verdict: Venerable but getting close to past it.




Linux Kernel Development: the previous, second, edition of this book was a fantastic introduction to how the kernel worked and was written in a slightly whimsical tone which made it easier to read. It is rare that one can read a computer book like a novel, starting from the first page and going on to the end, but you could with that. Sadly someone seems to have got to Robert Love (who, from personal experience, I know to be a great guy) and presumably told him he had to be more serious if he wanted his book to be a set text for CS courses. The problem is that the book now falls between two stools – somewhere between a solid but broad introduction to how the kernel works and a guide to writing kernel code. Unfortunately, it does not quite hit either target. That said, it is still worth having. Verdict: Good, but where did the magic go?






Professional Linux Kernel Architecture: unfortunately, everything about the Wrox brand suggests “cheap and nasty”, which is a real pity, as this book is the best of the bunch. Admittedly it would not, unlike Robert Love’s book, be at all suitable as a primer for operating system study – it is far too big and detailed for that. But if you are looking for an up-to-date (or reasonably so, anyway) helpmate for actually patching the kernel, then this seems to be the best choice. Sadly the cheap and nasty side does creep through on occasion with bad editing/translation, but it’s not enough to stop me from recommending it. Verdict: this one’s a keeper, for now anyway.

My first R program

Having used Groovy (which makes the scripting environment feel familiar) and some Scheme (via Structure and Interpretation of Computer Programs), R does not feel completely alien, but it still feels like a steep learning curve.

But here’s my short script –

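unpatched <- read.csv("~/unpatched.txt")
# Convert minutes (realm) plus seconds (reals) into total seconds
unpatchcons <- transform(unpatched, realm = realm * 60 + reals)
# linelog is not defined in the snippet as posted; a plain linear fit is assumed here
linelog <- lm(realm ~ size, data = unpatchcons)
plot(unpatchcons$size, unpatchcons$realm, log = "y")
abline(reg = linelog, untf = TRUE, col = "blue", lty = 3)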

And here’s the graph (of Linux kernel compile times) it generates – the blue line is obviously a very bad fit!

Linux kernel compile times

Testing OpenGrok

Hacking at the kernel means using a Linux Cross Reference (LXR) is pretty much essential.

I have set one up on my own servers before, but it was difficult to maintain and the performance was poor.

But I am trying out the OpenGrok tool now – this was quite easy to install once I realised the thing to do was not to read the various online descriptions of what to do, but to look at the README file that came with the binaries.

First impressions … it looks nice but I am not sure it is really up to it.

You can try mine at