A timerless CLOCK algorithm


I am writing this down partly as a way of seeing if it makes sense…

Minor faults we can live with: if we have to reaffix a mapping back into the TLB it takes a few cycles but that is not a heavy burden. Major faults, however, are a different matter: they take many thousands of processor cycles to fix and have to be avoided at all costs.

Yet we have a basic problem in that we cannot mark each and every page access with a timestamp, so we have to guess at which page was least recently used – and so typically we use the sweeping hand of a CLOCK algorithm to do this.

But I have an even bigger problem – I have no time source, and hence no timer interrupts, to use.

So instead I want to try this:

  • All entries in the page table (PT) will have two validity bits, one we call VALID (V) and another AVAILABLE (A);
  • When a page is first mapped in the PT, both V and A will be marked as on.
  • When we first run out of space in the PT we will start to advance through the PT on each access turning off a V bit. So, if we pin the first three entries in the PT (eg for ‘kernel’ code, the PT itself and a stack), the first time we go round we might mark V off for the 16 entries in the PT from 2 – 18
  • Then, when we are mapping a page for which no mapping exists, we take the first entry where V is off
  • But before that we check whether we have any entries where A is on that map our page, in which case we switch V on for that mapping and we are off

 

Having written that out I can see we only need the A bit if we think that a 0x00 <–> 0x00 mapping might be a valid one, but we don’t have to make that assumption and so we can restrict this to just the V bit.

Picking which page to remove


I have now written some very basic but functional paging code for the Microblaze but have realised I am, essentially, going to have to start a lot of it again.

The Microblaze puts page tables firmly under software control and its memory management unit (MMU) concentrates on managing page references through its translation lookaside buffers (TLBs) instead. So a reference to a virtual address that is mapped in a TLB is handled invisibly to the user or programmer, but a reference to a memory address that may be mapped in a page table but not in the TLB generates an exception and everything is dumped into the lap of the programmer – this is what is often known as a “soft fault” (a “hard fault” being when there is no version of the page being referenced available in physical memory at all).

The number of TLBs is quite limited – just 64 – so to handle (or in my case, simulate), say, 512kB of physical memory through 4kB page frames you need to expect to handle quite a few soft as well as hard faults and so you need to take a view on which mappings get evicted from the TLB and – if you are referencing more than 512kB of program and data – which pages get evicted from the page table and, hence, physical memory.

This is where the difficulties start, because there is no simple way to know when a read through a TLB mapping takes place – the very point is that it is invisible (and hence quick) to the programmer or any software she or he may have written. The problem with that is that – for most situations – the only guide you have to future behaviour in terms of memory references is past behaviour: so it would be very helpful to know whether a page has been recently accessed (on the basis that if it was accessed recently then it will be accessed again soon).

The standard response to this sort of problem is some form of “CLOCK” algorithm – which was first implemented, I believe, for the Multics operating system.

Multics can be thought of as the estranged (and now late) father of Unix – the “Uni” being a play on the “Multi” of the earlier system – and both direction and through its offspring its influence on computers has been profound and one of our inheritances is CLOCK, some version of which is almost certainly running in the computer, phone or tablet on which you are reading this.

The principle of CLOCK is simple. A “clock hand” sweeps regularly through the page mappings marking them invalid, then on a subsequent attempt to reuse the page mapping the valid bit has to be reset (meaning the page has been used recently) or alternatively if a new mapping is needed then we can through out the first page in the list of mappings where the mapping is marked as invalid.

And this, or some version of it is what I am now going to have to implement for the Microblaze. The obvious thing to do is to have some sort of timer interrupt drive the time clock – though I am not even sure the Microblaze has a timer interrupt available – I’m guessing it doesn’t – as it would expect those to come from the board, so this could be tricky!

Why I owe @slugonamission a (or several) large drink(s)


gdb icon, created for the Open Icon Library
gdb icon, created for the Open Icon Library (Photo credit: Wikipedia)

I have been struggling with a problem with my Microblaze simulation on OVPsim all week.

I have been trying to make a start on implementing a simple demand paging system and to trigger the hardware exception that the Microblaze CPU manual says should happen when you either try to read from a mapped page that is not in the TLB (ie, a “minor” page fault) or has not been mapped at all (a “major” page fault).

My problem was that – despite having, I thought, specified all the correct parameters for the Microblaze and knowing that I had virtual memory mappings work well, it simply was not happening for me. Instead of the expected exception handling, the simulation reported an error and exited.

But Jamie Garside from the University of York’s computer science department saved my life by (correctly) pointing out that what I also needed to do was turn on ICM_ATTR_SIMEX. Otherwise the simulation will always exit on an exception (or halt in GDB).

I can see why that might make sense – at least the halting in the debugger bit: if an exception is about to be fired you can halt the machine while it is still executing in virtual, as opposed to real, mode and see what has caused it.

It was also a reminder to RTFM – in this case not just the Microblaze manual, but also the OVPsim manual. I cannot rely on Jamie doing that for me everytime.

So to fix my problem I ORed in the missing attribute:

#define SIM_ATTRS (ICM_ATTR_DEFAULT|ICM_ATTR_SIMEX)