The delights and frustrations of assembly programming


English: 3 binding of manual "Z80 CPU Pro...
English: 3 binding of manual “Z80 CPU Programming Reference Card” (Photo credit: Wikipedia)

They say of C it has all the speed of assembly programming and all the maintainability of … assembly programming. What it lacks is the basic sense of complete mastery over the machine that a successful piece of assembly delivers but, actually, it is a bit easier to maintain than assembly.

These past few weeks, for the first time in more than thirty years, I have spent a lot of time writing assembly – and for a 32 bit RISC processor, quite different from the 8 bit CISC world of the Z80, where I last did this.

Assembly gives you complete control, but it is also very frustrating – debugging iteration after debugging iteration.

How sad is it to admit, though, that I am enjoying it greatly?

Advertisements

A timerless CLOCK algorithm


I am writing this down partly as a way of seeing if it makes sense…

Minor faults we can live with: if we have to reaffix a mapping back into the TLB it takes a few cycles but that is not a heavy burden. Major faults, however, are a different matter: they take many thousands of processor cycles to fix and have to be avoided at all costs.

Yet we have a basic problem in that we cannot mark each and every page access with a timestamp, so we have to guess at which page was least recently used – and so typically we use the sweeping hand of a CLOCK algorithm to do this.

But I have an even bigger problem – I have no time source, and hence no timer interrupts, to use.

So instead I want to try this:

  • All entries in the page table (PT) will have two validity bits, one we call VALID (V) and another AVAILABLE (A);
  • When a page is first mapped in the PT, both V and A will be marked as on.
  • When we first run out of space in the PT we will start to advance through the PT on each access turning off a V bit. So, if we pin the first three entries in the PT (eg for ‘kernel’ code, the PT itself and a stack), the first time we go round we might mark V off for the 16 entries in the PT from 2 – 18
  • Then, when we are mapping a page for which no mapping exists, we take the first entry where V is off
  • But before that we check whether we have any entries where A is on that map our page, in which case we switch V on for that mapping and we are off

 

Having written that out I can see we only need the A bit if we think that a 0x00 <–> 0x00 mapping might be a valid one, but we don’t have to make that assumption and so we can restrict this to just the V bit.

Picking which page to remove


I have now written some very basic but functional paging code for the Microblaze but have realised I am, essentially, going to have to start a lot of it again.

The Microblaze puts page tables firmly under software control and its memory management unit (MMU) concentrates on managing page references through its translation lookaside buffers (TLBs) instead. So a reference to a virtual address that is mapped in a TLB is handled invisibly to the user or programmer, but a reference to a memory address that may be mapped in a page table but not in the TLB generates an exception and everything is dumped into the lap of the programmer – this is what is often known as a “soft fault” (a “hard fault” being when there is no version of the page being referenced available in physical memory at all).

The number of TLBs is quite limited – just 64 – so to handle (or in my case, simulate), say, 512kB of physical memory through 4kB page frames you need to expect to handle quite a few soft as well as hard faults and so you need to take a view on which mappings get evicted from the TLB and – if you are referencing more than 512kB of program and data – which pages get evicted from the page table and, hence, physical memory.

This is where the difficulties start, because there is no simple way to know when a read through a TLB mapping takes place – the very point is that it is invisible (and hence quick) to the programmer or any software she or he may have written. The problem with that is that – for most situations – the only guide you have to future behaviour in terms of memory references is past behaviour: so it would be very helpful to know whether a page has been recently accessed (on the basis that if it was accessed recently then it will be accessed again soon).

The standard response to this sort of problem is some form of “CLOCK” algorithm – which was first implemented, I believe, for the Multics operating system.

Multics can be thought of as the estranged (and now late) father of Unix – the “Uni” being a play on the “Multi” of the earlier system – and both direction and through its offspring its influence on computers has been profound and one of our inheritances is CLOCK, some version of which is almost certainly running in the computer, phone or tablet on which you are reading this.

The principle of CLOCK is simple. A “clock hand” sweeps regularly through the page mappings marking them invalid, then on a subsequent attempt to reuse the page mapping the valid bit has to be reset (meaning the page has been used recently) or alternatively if a new mapping is needed then we can through out the first page in the list of mappings where the mapping is marked as invalid.

And this, or some version of it is what I am now going to have to implement for the Microblaze. The obvious thing to do is to have some sort of timer interrupt drive the time clock – though I am not even sure the Microblaze has a timer interrupt available – I’m guessing it doesn’t – as it would expect those to come from the board, so this could be tricky!

Why I owe @slugonamission a (or several) large drink(s)


gdb icon, created for the Open Icon Library
gdb icon, created for the Open Icon Library (Photo credit: Wikipedia)

I have been struggling with a problem with my Microblaze simulation on OVPsim all week.

I have been trying to make a start on implementing a simple demand paging system and to trigger the hardware exception that the Microblaze CPU manual says should happen when you either try to read from a mapped page that is not in the TLB (ie, a “minor” page fault) or has not been mapped at all (a “major” page fault).

My problem was that – despite having, I thought, specified all the correct parameters for the Microblaze and knowing that I had virtual memory mappings work well, it simply was not happening for me. Instead of the expected exception handling, the simulation reported an error and exited.

But Jamie Garside from the University of York’s computer science department saved my life by (correctly) pointing out that what I also needed to do was turn on ICM_ATTR_SIMEX. Otherwise the simulation will always exit on an exception (or halt in GDB).

I can see why that might make sense – at least the halting in the debugger bit: if an exception is about to be fired you can halt the machine while it is still executing in virtual, as opposed to real, mode and see what has caused it.

It was also a reminder to RTFM – in this case not just the Microblaze manual, but also the OVPsim manual. I cannot rely on Jamie doing that for me everytime.

So to fix my problem I ORed in the missing attribute:

#define SIM_ATTRS (ICM_ATTR_DEFAULT|ICM_ATTR_SIMEX)

Referenced in an academic text book


My MSc project on memory management in the Linux kernel has indeed been referenced in an academic text book – Computing Handbook, Third Edition: Computer Science and Software Engineering – you can look me up in the preview there. The article that references my work is by the great Peter J. Denning, the “father” of the working set.

The book (one volume of a set) costs £157 so I am not sure if I am going to buy it just yet.

Latest Microblaze puzzle


Having successfully written a couple of simple virtual memory mappings I now want to write the exception handling code that will allow me to swap mappings in and out – so I did this:

	ori	r20, r0, 0
	ori	r12, r0, 0x1000000
	/* next line should break it */
	sw	r20, r12, r0

I expected that the marked line would raise a hardware exception as a TLB miss and I’d be taken to the appropriate vector etc. Instead it just tells me I’ve had a SIGSEGV.

Hmmm.

Computer programming centre-right lovers of Swedish meatballs


English: YouGov logo
English: YouGov logo (Photo credit: Wikipedia)

According to YouGov (the UK’s largest polling company) that is what typical lovers of Linux are – though it’s based on just 272 individual profiles (out of 200,000 or so members of YouGov’s panel). Oh, and they are blokes. More at https://yougov.co.uk/profiler#/Linux/demographics

(YouGov made at least some of their profiling data available online this morning and it has kept British internet users amused all day.)

Windows lovers are, apparently, somewhat more numerous – there are 744 of them – but also typically younger, less well off and even more right wing. And are also men. Apple pie is their favourite dish and perhaps unsurprisingly they are not as keen on programming. Yes, it’s true: Windows lovers are lusers through and through. See https://yougov.co.uk/profiler#/Microsoft_Windows/demographics

Admirers of the Microsoft brand, though, tend to be older (still male) and rather more centrist – and numerous. Perhaps this is the Bill Gates effect? People admire his creation in the abstract but there is little concrete love. See https://yougov.co.uk/profiler#/Microsoft/demographics

But what of your favourite hipster computer brand – Apple? Turns out they are centrist, female and middle class and like grilled halloumi cheese. It’s harder to make a direct comparison though as (surprise, surprise) Apple users don’t seem to identify their operating system. See https://yougov.co.uk/profiler#/Apple/demographics

There is lots more to look at – for instance Android users are seemingly very left wing while computer scientists are middle aged men who eat a lot of chicken.

The Imitation Game


The first things to say about this film is that it is well worth seeing, profoundly moving and (in general) very well acted.

The second is that it gets to be this way by rather playing with the facts.

I am no Turing expert – I’ve read On Computable Numbers (via the quite brilliant The Annotated Turing), Computing Machinery and Intelligence and listened to the audio book of Alan Turing: The Enigma (now subtitled “the book that inspired the film The Imitation Game”) but I know enough to doubt that there really was a late-night post-boozing moment when the bombe machine started to work (this appears to be an attempt to lump the success of the bombe and Turing’s insight into German naval codes into one gloriously cinematic moment) and I certainly know that Turing did not spend – as the film implies if not explicitly states – the whole of the war at Bletchley Park.

 

And the film does Turing’s co-workers a great dis-service when it implies that Turing alone wrote to Churchill and that Turing then used the letter’s success to dominate his co-workers. The letter was written collectively and,  what is more, after the first bombes were working, not as a device to get a bombe built.

 

Nor is the film fair on Turing in the sense that he is portrayed as – and the title implies he was adept at – hiding his sexuality. If anything it was Turing’s unwillingness to hide that caused him so much trouble. He was not ashamed to be gay (a word he used) even if some simulation just might have helped him dodge his indecency conviction.

 

No matter, though, the spirit of the film is correct and Cumberbatch is excellent in the lead role. Though it was Keira Knightly playing Joan Clarke who, if anything, impressed me more (though her accent seemed to swing back and forth between very posh and modern classlessness). Perhaps that is because Joan is now rather more of an enigma than Alan.

 

Reading this morning’s papers about the film it was implied that Turing’s work was not given due credit until recently because he was gay. I am not sure that is true.

 

We should not really expect the majority of the public to have heard of the “Church-Turing thesis” or to have grasped the basics of a Turing Machine, though the increased pervasiveness of computing devices does mean that Turing’s name as a key founder of the theoretical basis of electronic computing has become more widely known regardless of attitudes towards homosexuality. The Ultra decryption effort was kept hidden until the late 70s and the full details took some time to come out, but the ACM‘s Turing Award – the highest that a computer scientist could hope for, has been in existence since 1966: computer science did not disavow him.

 

But his story is a reminder of how bigotry damaged so many lives – even those to whom we owe so much.

 

 

 

 

Of course turning on the MMU breaks your code!


Prinzipdarstellung der Arbeitsweise einer MMU
(Photo credit: Wikipedia)

Tonight I finally managed to get the code and the sequencing right to not just boot the Microblaze simulation, but to turn on the MMU.

But I was puzzled as to why, as soon as I had done that, the execution breaks with a segmentation fault and appeared to be executing code I had not written. And then it dawned on me – all my addresses had immediately become virtual addresses.

Two steps forward, one step back…

First working assembly code in 33 years


When I say working, I mean boots into an endless loop with no ouput to the user – but then, that’s what it’s meant to do:

/* 	Microblaze start up code
	Copyright Adrian McMenamin <acm538@york.ac.uk>, 2014
	Licensed under GPL v3
*/


	.global _start
	.section .vectors.reset, "ax"
	.align 2
	.ent _start
	.type _start, @function

_start:
	brai	_actualstart
	.end _start

	.section .vectors.sw_exception, "ax"
	.align 2
_vector_sw_exception:
	brai	_handle_exception

	.section .vectors.interrupt, "ax"
	.align 2
_vector_interrupt:
	brai	_interrupt_handler

	.section .vectors.breakpoint, "ax"
	.align 2
_vector_breakpoint:
	brai _handle_breakpoint

	.section .vectors.hw_exception, "ax"
	.align 2
_vector_hw_exception:
	brai	_handle_hwexception

	.section .text
	.global _actualstart
	.align 4
	.ent _actualstart
	.type _actualstart, @function

_actualstart:
	mts	rmsr, r0
	mts	rslr, r0
	addi	r8, r0, 0xFFFFFFFF
	mts	rshr, r8

	nop
	addik	r3, r0, 0x3F
	mts	rtlbx, r3



_handle_exception:
_interrupt_handler:
_handle_hwexception:
_handle_breakpoint:
	nop
	bri 0 /*endless loop if it ever works */

Only have to get it to do something useful now, how hard could that be? (There is a linker script to get this to work also – as it has to be loaded to address 0x00000000. Have a look at my github (mcmenaminadrian) repos.