What computer output is supposed to look like


Conv net attempting to classify chess pieces

This month is the 41st anniversary of my coming face-to-face with a “micro-computer” for the first time – in WH Smith’s in Brent Cross. I am not truly sure how I knew what I was looking at (beyond, I suppose, the shop’s own signage) – because at that time not even “The Mighty Micro” – ITV’s groundbreaking (and exceptionally far-sighted) TV series – had yet been broadcast. But I was instantly smitten.

If you remember the time, then you’ll recall computers were very basic and only ran BASIC (but you could still do a lot with that). Black and white (or green and white) graphics were the standard (unless you were a rich kid and owned an Apple II).

But that didn’t stop us – my brother and I got a Sinclair ZX80 in 1980 (even if you ordered early the wait was long) and started writing code straight away (there wasn’t much choice if you wanted to get some use from the device).

The best code was mathematical and computationally intensive (as far as 1KB of RAM on a board with an 8 bit 3.25MHz CPU would allow that is) yet managed to combine that with rapid screen updates – something that was difficult on a ZX80 because computation blanked the screen (a ROM update and an interrupt driver – we copied the machine code bytes into every program – later fixed that.)

So 41 years later the code I am now running – shown above – perfectly fits the bill for “proper computing”. It is a computationally intensive – essentially multiple matrix multiplications – convolutional neural network that is attempting to classify images of chess pieces of the sort commonly seen with published chess puzzles. But what I love most of all is the fast flickering digits (the nine classes) and small images (the output of the first two layers of the 50 filters that are at the heart of the network).

This is the second time I’ve had a go at this personal project and I’ve made some progress – but it’s been hard going. Most conv net users seem to have long moved on from C++ (which I am using) to Python libraries like TensorFlow – so it’s not even that I feel I am part of a strong community here.

Lots of subtle (that’s my story and I’m sticking to it) programming traps – like the fact that the STL std::map container orders its entries by key (sounds obvious when you say it like that – why would it not have such a lexical order?) – I had simply assumed that entries kept the order they were added in. (This was today’s discovery.)

But if it was easy to write these things then it would be no fun.

Flirting with disaster


A week ago, in the middle of the Christmas holidays, I contemplated how I could tie all the bits I had written for my PhD thesis into a coherently and seamlessly argued complete first draft.

I even bought a book about completing a PhD. In the end I decided I needed to add in one more set of experimental data – this time examining how highly parallel code might work in a many-core environment.

The results I had were over two years old and the software simulation I used didn’t quite match the code used later (the later code was a development of the earlier stuff) but I thought if I made that clear, there wouldn’t be a problem.

But a quick test of the model showed that what I had produced before was just too flakey, especially when I decided what I really had to do was compare the potential results of a system with 256 cores with a system with, say, 64 cores: the system with a lower number of cores cannot tackle the whole problem at once, but maybe the combination of faster cores, shorter queues for memory and less idle time (as processors will have to be set to tackle the unsolved problems) might still deliver better performance (I don’t know the answer – yet – to this “Amdahl’s Law” related question).

So work was needed on the code but try as I might I couldn’t get it to work.

Eventually I tracked the issue down to pages of memory being incorrectly marked as “read only” when they should be “read-write”. Read only pages stored locally don’t get written back to main memory and when you are using shared memory to communicate that’s a problem – you write your messages locally but they never get delivered.

I thought I had found the bad code – a wayward #define – and a small test seemed to show that this did indeed fix the problem.

Except it didn’t, and I realised last night that another piece of code was not creating “read only” page table entries in error (as the first piece of bad code did) but was converting “read write” page table entries to read only. It was meant to do the opposite – i.e., if you wrote to a read-only page the page table entry was updated to read-write – but sloppy coding meant it was routinely converting “read write” entries to “read only”.

And the really awful thing was that – unlike the bad #define – this was a piece of code that had been inherited by every subsequent iteration of my simulator.

In other words – an awful lot of results I thought were definitive were wrong. Not actually wrong at such a fundamental level that I had just discovered I’d wasted several years of my life (I recently read a horror story about how a chemistry PhD wrote a whole thesis only to discover his or her “interesting result” was caused by accessing unallocated memory in Fortran 77) – but enough to mean the numbers are wrong.

My initial reaction was panic. Today it’s more stoical – I now have 11 simulations running to get new data, and I probably need about the same number again to fill in all the gaps.

The only positive is that it all has come so late in the process that I am confident in handling and interpreting the data that will come out of the fixed experiments.

But the downside is that I am running out of time and these experiments quite literally take weeks to run – and if there is a hitch with the university IT setup I go straight back to square one.

Wish me luck.

Getting a job


I have, essentially, two sets of skills and experience.

One is as a political campaigner and communicator. I did well out of that for a while and more than that, did some things I am proud of and feel really privileged to have had a chance to be part of.

But it’s fair to say that road seems to have hit a dead end.  If you want to run a serious, progressive, campaign then I am certainly still interested, but I am not sure there is much of that out there today.

So then there are the other skills – ones that I am told are in high demand.

Namely as a software designer/writer/developer.

I can do this and I am much better these days than I used to be: unlike, say, running I am still getting faster and sharper. C/C++/Embedded/Perl/Linux/Groovy/DSLs/R/Deep Learning – I can tick all those boxes.

But where to begin? The reputation of IT recruitment agencies is pretty grim, though I have no direct experience. I have registered with one, but I am being sent invitations to be a senior C++ engineer in Berlin on a salary of €150,000 per annum which even I think is probably a bit over-ambitious for someone with no commercial experience.

(NB: If you want to see what I have done have a look at https://github.com/mcmenaminadrian).

Strange reviews on Amazon


Messing about with convolutional neural networks (CNNs) continues to take up some of my time (in case my supervisor reads this – I also have a simulation of a many core system running on the university’s computer atm).

I started my research here with a much-cited, but really well out-of-date, book – Practical Neural Network Recipes in C++. What’s nice about that book is that it is orientated towards getting things done, though the C++ is really from the “C with classes” era.

Another book – Guide to Convolutional Neural Networks: A Practical Application to Traffic-Sign Detection and Classification – which I can access through the University of York’s library, helped fill in some of the theoretical and other gaps and also showed me why I needed to move away from the perceptron model promoted in the earlier book and move towards a CNN. But like many of Springer’s books it is poorly edited and not properly converted for online use.

So I’m still on the lookout for the perfect match – a book with practical coding examples that clearly explains the theory and, bluntly, is written in good English with all the maths actually reproduced in the online format (as I am just not going to be able to afford a printed copy).

In particular I want a clear explanation of how to do back propagation in a CNN – as it’s plain that the general method outlined in “Practical Neural Network Recipes” doesn’t work beyond a fully connected layer, while the explanation in “Guide to…” is impenetrable and, actually, rather odd (as it seems to imply that we use a fixed weight for every neuron in a filter as opposed to sharing fixed weights across each filter – if I have explained that properly).

So, I’ve just had another look and came across this book: “Convolutional Neural Networks in Python: Introduction to Convolutional Neural Networks”.

This book has managed (at the time of writing) to collect two one-star reviews on the Amazon UK website:

Amazon reviews

I have no idea how fair those reviews are, but this passage from the preview version available on the website doesn’t suggest the author is yet rivalling Alan Turing:

But, here’s the odd thing. It would appear a number of “purchasers” of this book through Amazon.com are very enthusiastic about it and all felt the need to say so this very day (9 August):

odd reviews

Even more oddly, the reviews all read like spam comments I get on this blog. But I have no evidence to suggest these are anything other than genuine, if odd, comments on the book…

Free software to chop up your JPEGs


As a public service announcement: if you need some software (on Linux, though it may well compile and run on other systems that support Qt) to chop a big JPEG up into smaller files, I have written this (in my case to support building a test set for a neural network).

It’s at https://github.com/mcmenaminadrian/TestSetCreator and it’s GPL licensed (v3).

Back to neural networks


Neural networks have fascinated me for a long time, though I’ve never had much luck applying them.

Back in the early and mid 1990s the UK’s trade and industry department ran a public promotional programme to industry about NNs, and I signed up. I even bought a book about how to build them in C++ (and I read it, though I have to confess my understanding was partial).

My dream then was to apply the idea to politics, as a more effective way of concentrating resources on key voters. My insight was that, when out doing what political parties then called “canvassing” and now – largely for less than fully honest legal reasons as far as I can see – call “voter ID” (in electoral law “canvassing”, i.e. seeking to persuade someone to vote one way, is more highly regulated than simply “identifying” how they intend to vote), you could quite often tell how someone would respond to you even before they opened the door. There was some quality that told you you were about to knock on a Labour voter’s door.

I still don’t know what that quality is – and after the recent election I’m not sure the insight applies any more any way – but the point was that if you could take a mass of data and have the NN find the function for you, then you could improve your chances of getting to your potential support and so on…

But I never wrote any code and NNs seemed to go out of fashion.

Now, renamed “machine learning”, they are back in fashion in a big way and my interest has been revived. But I am not trying to write any code that will work for politics.

Instead I am exploring whether I can write any code that will look at a music score and play it back. (This is probably a silly idea for a starter NN project, as music is not easy to decipher in any way as far as I can see.)

I have read another book on C++ and neural networks (and even understood it): an ancient tome from the previous time NNs were in fashion.

The first task is to actually identify which bits of scribbling on the page are musical notation at all. And I have written some code to build a training set for that – here. It might well be of use to you if you need to mark bits of a JPEG as “good” or “bad” for some other purpose, so please do get in touch if you need some help with that.

So, dear reader, I wrote a Gauss-Jordan solver


Number of steps in the Euclidean algorithm for GCD(x,y). Red points indicate relatively few steps (quick), whereas yellow, green and blue points indicate successively more steps (slow). The largest blue area follows the line y = Φx, where Φ represents the Golden ratio. (Photo credit: Wikipedia)

Been a while…

…Anyway: I am now designing (PhD related) a simulation of a NoC system and I need to have it run some code – so I thought I’d look at code to solve a system of linear equations and sought to build the same using Gauss-Jordan elimination.

It proved to be a lot more difficult than I had hoped: not that the maths was too complex, but that producing something I could later simplify from C++ into something that looked like synthetic assembly has proved, and is proving, very difficult indeed.

My system needs to execute a simple synthetic assembly because what I am modelling is the way the memory interactions work – so I have to get close to the metal and not hide behind a high-level language (no, not even C will do). Obviously – or at least it feels like that to me – I don’t want to have to mess about with floating point implementations, so I wanted integer arithmetic.

And that is where the first problem comes – to implement a Gauss-Jordan solver (or even to solve linear equations in the way you used to at school: the Gauss-Jordan algorithm is merely a scaling up of that), you have to divide numbers and sadly (for this purpose) not all numbers are multiples of one another – so the only way to preserve integer maths is to use rationals.

That, of course, is not such a problem. The C++ pair utility even makes it easy to model a rational out of two integers (writing a separate rational class would be overkill for something as small as this.)

But then, quickly, another problem becomes apparent. Even when using long long types to represent the numerator and denominator of our rational fraction, the repeated multiplications needed for fractional subtraction break the integer bounds after little more than a handful of row operations (and I want to model a large system – with literally thousands of equations in each system).

Even when adding a repeatedly applied gcd function to apply Euclid’s algorithm to the rational fractions, the numbers go out of range quickly.

So I have been forced to rely on an arbitrary precision library – GNU’s GMP. The code now works well (you can see it in its GitHub repo) and the solver code is below:

#include <iostream>
#include <sstream>
#include <fstream>
#include <string>
#include <vector>
#include <cstdlib>
#include <utility>
#include <gmpxx.h>



//Gauss-Jordan elimination

using namespace std;

void gcd(pair<mpz_class, mpz_class>& input)
{
	mpz_class first = abs(input.first);
	mpz_class second = abs(input.second);
	if (first > second) {
		first = second;
		second = abs(input.first);
	}
	while (second != 0) {
		mpz_class temp = second;
		second = first%second;
		first = temp;
	}
	input.first /= first;
	input.second /= first;
	if (input.second < 0) {
		input.first *= -1;
		input.second *= -1;
	}
}

int main()
{
	string path("./variables.csv");
	ifstream inputFile(path);

	vector<mpz_class> answers;
	//answer line
	string rawAnswer;
	getline(inputFile, rawAnswer);
	istringstream stringy(rawAnswer);
	string number;
	while(getline(stringy, number, ',')) {
		answers.push_back(atol(number.c_str()));
	}

	//now read in the system
	vector<vector<pair<mpz_class, mpz_class> > > lines;
	while(getline(inputFile, rawAnswer)) {
		istringstream stringy(rawAnswer);
		vector<pair<mpz_class, mpz_class> > innerLine;
		while (getline(stringy, number, ',')) {
			pair<mpz_class, mpz_class>
				addPair(atol(number.c_str()), 1);
			innerLine.push_back(addPair);
		}
		lines.push_back(innerLine);
	}

	for (int i = 0; i < lines.size(); i++) {
		pair<mpz_class, mpz_class> pivot(lines[i][i].second,
			lines[i][i].first);
		if (lines[i][i].first != 0) {
			lines[i][i].first = 1;
			lines[i][i].second = 1;
		} else {
			continue;
		}
		for (int j = i + 1; j <= lines.size(); j++) {
			lines[i][j].first *= pivot.first;
			lines[i][j].second *= pivot.second;
			gcd(lines[i][j]);
		}
		for (int j = i + 1; j < lines.size(); j++) {
			pair<mpz_class, mpz_class> multiple(lines[j][i].first,
				lines[j][i].second);	
			lines[j][i] = pair<mpz_class, mpz_class>(0, 1);
			for (int k = i + 1; k <= lines.size(); k++) {
				pair<mpz_class, mpz_class>
					factor(
					multiple.first * lines[i][k].first,
					multiple.second * lines[i][k].second);
				gcd(factor);
				lines[j][k] = pair<mpz_class, mpz_class>
					(lines[j][k].first * factor.second -
					factor.first * lines[j][k].second,
					lines[j][k].second * factor.second);
				gcd(lines[j][k]);
			}
			
		}
	}
	// now eliminate upper half
	for (int i = lines.size() - 1; i > 0; i--) {
		if (lines[i][i].first == 0) {
			continue;
		}
		pair<mpz_class, mpz_class> answer = lines[i][lines.size()];
		for (int j = i - 1; j >= 0; j--) {
			pair<mpz_class, mpz_class> multiple = lines[j][i];
			if (multiple.first == 0) {
				continue;
			}
			lines[j][i] = pair<mpz_class, mpz_class>(0, 1);
			// row_j answer -= multiple * row_i answer, as exact
			// rationals: both terms go over the common denominator
			lines[j][lines.size()].first =
				lines[j][lines.size()].first * multiple.second *
				answer.second -
				answer.first * multiple.first *
				lines[j][lines.size()].second;
			lines[j][lines.size()].second =
				lines[j][lines.size()].second *
				answer.second * multiple.second;
			gcd(lines[j][lines.size()]);
		}
	}
	cout << "DIAGONAL FORM" << endl;
	for (int i = 0; i < lines.size(); i++) {
		for (int j = 0; j < lines.size(); j++) {
			if (lines[i][j].first == 0) {
				cout << "0 , ";
			} else {
				cout << lines[i][j].first << "/" << lines[i][j].second << " , ";
			}
		}
		cout << " == " << lines[i][lines.size()].first << " / " << lines[i][lines.size()].second << endl;
	}

}

(NB: I know that GMP supports rationals “out of the box” but I am trying to model something that I can translate to my synthetic assembly and so hiding the details of the implementation is not what I want to do.)

How on Earth, though, am I going to model arbitrary precision code in my “assembly” code?

Struggling


Die of an Intel 80486DX2 microprocessor (actual size: 12×6.75 mm) in its packaging. (Photo credit: Wikipedia)

Been a while since I’ve written here – been avoiding writing about politics, which has obviously not been so great for me in the last couple of weeks… but now I have something else to ruminate on.

I have reached a milestone, or perhaps basecamp, in my PhD research: having a model for memory management that needs further exploration. (Hopefully there may even be a paper on this soon.)

Some of that exploration will have to be in hardware, and that’s really beyond me but I can and should build a software model to test how a many core system built using this new model might operate.

So far I have been testing or proving concepts with OVPSim but it does not allow me to build a true asynchronous multi-core model, so I need to do that in software myself.

But where to begin – I have a list of classes that you might want to have in C++:

  • Computer – which would aggregate…
    • DRAM
    • Storage
    • NoC – which would aggregate…
      • Mesh
      • Tiles – which would aggregate…
        • CPU
        • Cache
        • Ports (to the Mesh)

I hope you can see how quickly this becomes complex – and all we are talking about here is a simple software framework to allow me to do the research (ie., delivering the software, complex as it is, is only the very start.)

I am struggling to know where to begin – C++ seems like the logical choice for this, but it’s not proving to be much fun. Particularly because my CPU class has to be able to “execute” some code – I thought about using a DSL but may stick to the XML output I got from my hacked Valgrind Lackey – as at least I can then use existing XML engines.

Should I build from the XML up – eg., get a CPU class that can hack the XML and pass the requests up the chain (eg via the cache up to the Mesh and up to the DRAM etc), or what?

Pointers versus references


English: Image for teaching pointers in comput...
English: Image for teaching pointers in computer programming Česky: Obrázek pro výuku ukazatelů v programování (Photo credit: Wikipedia)

Some people don’t like pointers – and for that reason, I think, we have references in C++. But as a confirmed pointer person, I find references very hard going.

I had a piece of C++ code that did this:

PartialPage& DoubleTree::oldestPage()
{
	PartialPage& pageToKill = pageTree.begin()->second;
	long timeToKill = pageTree.begin()->second.getTime();
	map<long, PartialPage>::iterator itOld;
	for (itOld = pageTree.begin(); itOld != pageTree.end(); itOld++) {
		if (itOld->second.getTime() < timeToKill) {
			timeToKill = itOld->second.getTime();
			pageToKill = itOld->second;
		}
	}
	return pageToKill;
}

This produced rubbish results – because assigning to the reference didn’t re-seat it to refer to a different element of the map; it assigned through it, overwriting the element it already referred to. You cannot re-seat a reference in C++ at all.

Switching to pointers fixed the problem though.

PartialPage* DoubleTree::oldestPage()
{
	PartialPage* pageToKill = &(pageTree.begin()->second);
	long timeToKill = pageTree.begin()->second.getTime();
	map<long, PartialPage>::iterator itOld;
	for (itOld = pageTree.begin(); itOld != pageTree.end(); itOld++) {
		if (itOld->second.getTime() < timeToKill) {
			timeToKill = itOld->second.getTime();
			pageToKill = &(itOld->second);
		}
	}
	return pageToKill;
}


Racey code can damage your (mental) health


Running the Hackney half marathon

I’ve had a tough week. Ran a half marathon (quite badly, but I finished – see the picture) on Sunday, hobbled around York University (blister on my foot) on Monday, before returning to London and coping with the added stress of a new job.

All the while the failure of my latest piece of code to work was annoying me too – eating away at me and making me wonder if the whole PhD thing was a bit of Quixotic nonsense or even Panzan stupidity.

Anyone who has chased down a bug for days will know the feeling – the root cause must be out there somewhere, even if you have been through your code dozens of times and can see nothing wrong.

Sitting on a long commute to my place of (temporary) work this morning I realised that my problem could only be down to one thing – a race condition.

I am emulating an LRU/2 type page replacement policy – with two page queues – a “high” and a “low”: pages are initially assigned to low, but can be pushed into high on a second reference. Pages only leave via low (they can be pushed out of high into low if we run out of space in high and so on).

With one thread running there was no problem, but when a second thread came on board the application fell over – and I realised it was because the manipulation of the LRU queues was not atomic.

And, a few dozen code changes later (done on the return journey back to London) – and some accounting for C++ template strangeness (it seems that holding a non-const iterator into a map means the map itself cannot be const – that took me a while to work out) – the problem has gone away and suddenly academic endeavour feels worth it again.