Still out there

Surprised and pleased to find that, a quarter of a century after I released it to a distinctly unmoved world – and a decade after I first mentioned it on this blog – the first piece of software I published, a not particularly brilliant program that allowed you to predict the result in a given UK constituency from a national opinion poll, is still available on an FTP server –

I can’t actually run it on a 64-bit Windows system, and the source (in Borland C++) is long gone…

The missing link and closing schools

London, where I am writing this, is now perhaps the global centre of the covid19 pandemic, thanks to a mutation of the virus that has allowed it to spread more easily. This mutation may not have come into existence in the South East of England but it has certainly taken hold here, and about 2% of London’s population currently have symptomatic covid.

In response all primary and secondary schools, which were due to open tomorrow, will be effectively closed and teaching will go online.

Suddenly the availability of computing resources has become very important – because unlike the Spring lockdown, where online teaching was (generally) pretty limited, this time around the clear intention is to deliver a full curriculum – and that means one terminal per pupil. But even now how many homes have multiple computers capable of handling this? With two children between the ages of 5 and 18, and two adults working from home, it is going to be a struggle for many households.

Thus this could have been the moment that low cost diskless client devices came into their own – but (unless we classify mobile phones as such) they essentially don’t exist. The conditions for their use have never been better – wireless connections are the default means of connecting to the internet and connections are fast (those of us who used to use X/Windows over 28kbit dial-up think so anyway).

Why did it not happen? Perhaps because of the fall in storage costs? If screen and processor costs haven’t fallen as fast as RAM and disk costs then thin clients get proportionally more expensive. Or perhaps it’s that even the fat clients are thin these days? If you have a £115 Chromebook then it’s probably not able to act realistically as a server in the way a laptop costing six times as much might.

But it’s also down to software choices and legacies. We live in the Unix age now – Android mobile phones and Mac OSX machines as well as Linux devices are all running some version of an operating system that was born out of an explicit desire to create an effective means to share time and resources across multiple users. But we are also still living in the Microsoft Windows era – and although Windows has been able to support multiple simultaneous users for many years now, few people recognise that, and even fewer know how to activate it (especially as it has been marketed as an add-on and not the built-in feature we see with Unix). We (as in the public at large) just don’t think in terms of getting a single, more powerful, device and running client machines on top of it – indeed most users run away at the very idea of even invoking a command line terminal, so encouraging experimentation is also difficult.

Could this ever be fixed? Well, of course, the Chromebooks are sort of thin clients but they tie us to the external provider and don’t liberate us to use our own resources (well, not easily – there is a Linux under the covers though). Given the low cost of the cheapest Chromebooks it’s hard to see how a challenger could make a true thin-client model work – though maybe a few educational establishments could lead the way, giving pupils/students thin clients that connect to both local and central resources from the moment they are switched on?

Why do people hate Apple redux


The Guardian has an interesting piece called “why do people hate Apple”, which makes some telling points but which I do not think really gets there.

I suppose I should begin by saying I don’t “hate Apple” – in fact I think their products are quite nice if hideously over-priced. But I do have a bit of contempt, I must admit, for those people who tell me they like Apple because “it’s so easy” – somewhat like the fashion, sadly continuing, of boasting by arts and humanities graduates of how little they know of maths or science (scientists do not do the same back).

Apple is “easy” because you are restricted to buying their over-priced hardware, it is as simple as that. If they control the hardware they control the drivers and so you have to do what they say.

OK, you may feel that the time it would take you to learn how to install your own hardware is worth every penny. But your lack of knowledge is hardly something to boast of as though you were superior to those of us who do know how to do it, is it?

(Though the problems people sometimes have with hardware on their Windows machines are because of the broken business model that too many hardware manufacturers operate under. The real choices are – grip the operating system and the hardware so the two move in absolute tandem, as with Apple – or free the hardware specification so that FOSS drivers appear, for free. Being a hardware manufacturer but being forced to chase after Microsoft’s Windows ABI is surely the worst of both worlds. On this point I do think Eric S. Raymond is right.)

Rediscovering enthusiasm

This is the first “normal” – not abroad or just back, not jet lagged and so on – weekend I’ve been able to have at home in a month, and it has also been the first time in that period that I have been able to spend some time looking further at my proposed MSc project – on extending working set heuristics in the Linux kernel.

The good news is that I am once more convinced of the utility of, and enthusiastic about the implementation of, the idea. At the risk of looking very naive in six months (or six weeks) time even in my own eyes – here is the core idea:

Peter Denning’s 1968 and 1970 papers on the working set and virtual memory made some bold claims – calling global page replacement algorithms “in general sub-optimal” and asserting that the working set method is the best practical guarantee against thrashing.
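To make the term concrete: the working set W(t, τ) is the set of distinct pages a process has referenced in its last τ units of virtual time. What follows is only a toy sketch of that definition over a made-up reference string; the function name and the list representation are my own assumptions for illustration, not anything taken from the papers or the kernel.

```python
def working_set(references, t, tau):
    """Return the distinct pages referenced in the window (t - tau, t].

    `references` is a list of page numbers, one per unit of virtual time,
    with references[0] being the reference made at time 1 (an assumed,
    purely illustrative representation).
    """
    start = max(0, t - tau)
    return set(references[start:t])


refs = [1, 2, 1, 3, 4, 4, 4, 2]
# Pages touched at times 3, 4 and 5
print(working_set(refs, t=5, tau=3))
```

A process whose working set does not fit in the memory available to it will fault repeatedly, which is the thrashing the working set method is meant to guard against.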

Windows NT and its derivatives (XP, Vista, 7 etc) reflect their heritage from VMS in using a working set based replacement policy.

In contrast Linux (and the Unix family generally) uses global replacement policies: indeed a fairly simple clock algorithm stands at the centre of Linux’s page replacement policies. Kernel developers say the policy works well in practice and that, in effect, the active “least recently used” list of cached pages – against which the clock algorithm runs – is a list of pages in the working sets of running processes.
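For anyone who has not met it, a clock (or “second chance”) algorithm can be sketched in a few lines. This is a toy model of the general technique over a fixed set of frames, assuming one reference bit per frame; it is not the kernel’s code, and the class and attribute names are mine.

```python
class ClockCache:
    """Toy clock (second-chance) page replacement over a fixed set of frames."""

    def __init__(self, nframes):
        self.frames = [None] * nframes        # page held in each frame
        self.referenced = [False] * nframes   # one reference bit per frame
        self.hand = 0                         # position of the clock hand
        self.faults = 0

    def access(self, page):
        """Touch a page; return True if the access caused a page fault."""
        if page in self.frames:
            self.referenced[self.frames.index(page)] = True
            return False
        self.faults += 1
        # Sweep the hand forward, clearing reference bits, until it reaches
        # a frame whose bit is already clear: that frame is the victim.
        while self.referenced[self.hand]:
            self.referenced[self.hand] = False
            self.hand = (self.hand + 1) % len(self.frames)
        self.frames[self.hand] = page
        self.referenced[self.hand] = True
        self.hand = (self.hand + 1) % len(self.frames)
        return True


cache = ClockCache(3)
for page in [1, 2, 3, 4, 2, 5]:
    cache.access(page)
print(cache.frames, cache.faults)
```

In this run page 2 is touched again after the first sweep, so when page 5 arrives the hand clears 2’s bit, passes over it and evicts page 3 instead. Note that the policy is global: nothing in it knows or cares which process owns a page.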

My essential idea is to seek to trim the active list on a process-by-process basis when the system is under high load (the long delay in execution caused by a page fault hopefully making it efficient to execute the extra code in the hope of reducing the number of page faults). Pages from the active list that are owned by the processes with the biggest memory footprint will be dropped into the inactive list, so making it more likely they will eventually be swapped out.
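As a rough user-space sketch of that trimming step (nothing like kernel code), assume each page on the lists is a hypothetical (owner_pid, page_id) tuple and that “footprint” simply means the number of pages a process has on the active list. Both are simplifications of mine, as is the function name.

```python
from collections import Counter


def trim_active_list(active, inactive, n_processes=1):
    """Demote pages owned by the largest-footprint processes.

    `active` and `inactive` are lists of (owner_pid, page_id) tuples, an
    assumed representation. Footprint is taken to be the number of pages a
    process holds on the active list. Returns the new (active, inactive).
    """
    footprint = Counter(pid for pid, _ in active)
    biggest = {pid for pid, _ in footprint.most_common(n_processes)}
    kept = [page for page in active if page[0] not in biggest]
    demoted = [page for page in active if page[0] in biggest]
    return kept, inactive + demoted


active = [(1, "a"), (2, "b"), (2, "c"), (3, "d"), (2, "e")]
kept, inactive = trim_active_list(active, [(9, "z")])
print(kept)
print(inactive)
```

Here process 2 owns the most active pages, so all three of them are moved to the tail of the inactive list. A real implementation would presumably demote only a proportion of them, and only when the system is under load.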

The second aspect of the application of a working set heuristic will be to alter the scheduling priorities of processes depending on their memory footprint. There are a few options here and I have not looked at this closely enough yet, but things to test could include:

  • Increasing the priority of the smallest processes – on the basis these might reach the end of execution more quickly and so release memory back to the pool
  • Radically lowering the priorities of the processes whose pages are being swapped out – on the basis that they do not have a working set of resources available and so, as Denning argued forty years ago, should not be able to run
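To pin down what those two options might mean, here is a hedged sketch in terms of Unix nice values (where a lower value means a higher priority and the range is -20 to 19). The dictionary keys and the boost and penalty figures are invented for illustration, and nothing here touches the real scheduler.

```python
def adjust_nice(processes, boost=5, penalty=15):
    """Return {pid: new_nice} under the two heuristics listed above.

    Each process is a dict with the hypothetical keys "pid", "nice",
    "resident_pages" and "swapped_out_pages". The boost and penalty
    values are arbitrary.
    """
    smallest = min(processes, key=lambda p: p["resident_pages"])
    result = {}
    for p in processes:
        nice = p["nice"]
        if p["swapped_out_pages"] > 0:
            nice += penalty      # radically lower priority: pages being swapped out
        elif p is smallest:
            nice -= boost        # raise priority of the smallest process
        result[p["pid"]] = max(-20, min(19, nice))  # clamp to the valid nice range
    return result


procs = [
    {"pid": 1, "nice": 0, "resident_pages": 100, "swapped_out_pages": 0},
    {"pid": 2, "nice": 0, "resident_pages": 5000, "swapped_out_pages": 300},
    {"pid": 3, "nice": 10, "resident_pages": 800, "swapped_out_pages": 20},
]
print(adjust_nice(procs))
```

Whether either heuristic helps in practice is exactly what would need testing.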

In practical terms I am still some way off writing any kernel code. I have, though, written some user tools (still need polishing) to display the memory footprint of Linux processes in a red-black tree (the representation used internally by the kernel). Following Eric S Raymond (on Unix programming not politics!), the tools are partitioned into single applications that do different things – but run together they can generate graphics such as the one below:

Processes on a Linux box
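For the curious, the raw data for this sort of tool is easy to get at. The sketch below is not my actual tool (it makes no attempt at the red-black tree or the graphics); it assumes the figures come from /proc/<pid>/statm, whose seven fields are counts of pages: size, resident, shared, text, lib, data and dt. The function names are made up for this example.

```python
import os


def parse_statm(text, page_size=4096):
    """Convert the contents of /proc/<pid>/statm into sizes in bytes."""
    names = ("size", "resident", "shared", "text", "lib", "data", "dt")
    fields = [int(field) for field in text.split()]
    return {name: pages * page_size for name, pages in zip(names, fields)}


def footprints(proc_root="/proc"):
    """Yield (pid, resident_bytes) for each process visible under proc_root."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "statm")) as f:
                yield int(entry), parse_statm(f.read(), page_size)["resident"]
        except OSError:
            continue  # the process exited or is not readable


sample = parse_statm("100 50 10 5 0 20 0")
print(sample["resident"])
```

On a Linux box, sorted(footprints(), key=lambda item: item[1]) would list the processes from smallest to largest resident size.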

So, on we go…