What computer output is supposed to look like


Image of conv net learning to classify images of chess pieces
Conv net attempting to classify chess pieces

This month is the 41st anniversary of my coming face-to-face with a “micro-computer” for the first time – in WH Smith’s in Brent Cross. I am not truly sure how I knew what I was looking at (beyond, I suppose, the shop’s own signage) – because at that time not even “The Mighty Micro” – ITV’s groundbreaking (and exceptionally far-sighted) TV series – had yet been broadcast, but I was instantly smitten.

If you remember the time, then you’ll recall computers were very basic and only ran BASIC (but you could still do a lot with that). Black and white (or green and white) graphics were the standard (unless you were a rich kid and owned an Apple II).

But that didn’t stop us – my brother and I got a Sinclair ZX80 in 1980 (even if you ordered early the wait was long) and started writing code straight away (there wasn’t much choice if you wanted to get some use from the device).

The best code was mathematical and computationally intensive (as far as 1KB of RAM on a board with an 8 bit 3.25MHz CPU would allow that is) yet managed to combine that with rapid screen updates – something that was difficult on a ZX80 because computation blanked the screen (a ROM update and an interrupt driver – we copied the machine code bytes into every program – later fixed that.)

So 41 years later the code I am now running – shown above – perfectly fits the bill for “proper computing”. It is a computationally intensive – essentially multiple matrix multiplications – convolutional neural network that is attempting to classify images of chess pieces of the sort commonly seen with published chess puzzles. But what I love most of all is the fast flickering digits (the nine classes) and small images (the output of the first two layers of the 50 filters that are at the heart of the network).

This is the second time I’ve had a go at this personal project and I’ve made some progress – but it’s been hard going. Most conv net users seem to have long since moved on from C++ (which I am using) to Python libraries like TensorFlow – so it’s not as if I feel part of a strong community here.

Lots of subtle (that’s my story and I’m sticking to it) programming traps – like the fact that the STL std::map class orders its entries by key (sounds obvious when you say it like that – why would it not have such a lexical order?) – I had simply assumed that the entries kept the order they were added in. (This was today’s discovery.)

But if it was easy to write these things then it would be no fun.

Convolutional network (again)


Black king on black square with random filters

With time on my hands I have returned to working on an old project – attempting to build a convolutional network that will solve chess puzzles.

(A convolutional network is a type of neural network – a modelled ‘artificial intelligence’ that can be used to detect patterns or undertake similar tasks.)

Here I am not using ‘AI’ to solve the chess puzzle itself (though there are very large libraries of chess endings and positions available, so I suppose that would be possible), but to read the chess position in the puzzle.

Thus the task is to classify squares on the board.

I tried this a couple of years ago and got nowhere, but reading this book – “Machine Learning: An Applied Mathematics Introduction” – has persuaded me to have another go, reducing the dimensions of the answer I am seeking from 25 to 9 (without any loss of information).

At the moment I am just in the process of building the “feed forward” network – i.e. the neural network that, once trained, will take an image as input and then give a nine-dimensional answer.

These answers can be thought of, perhaps not too accurately but not totally unreasonably, as a measure of likelihood that the input picture falls into a given category (e.g. by giving a number between 0 and 1 under the category of white square, or pawn, or black piece etc.).

The input picture is passed through a series of filters that are designed to extract features of the image and then, at the end, the AI considers all the results and gives its view as to the classification of the image.

In my AI there are 50 fibres (i.e. 50 chains of filters) and the image at the top of the page shows the results of passing the image – a black king on a black square – through the top two layers. So the top 50 images are from the first rank of filters and the bottom 50 from the second rank. I plan to implement another three layers of filters (though of smaller dimensions – the idea being they can concentrate their information) before the final “fully connected” layer (where all 50 fibres exchange information) that delivers the result.

The images here are produced from randomly assigned filters so essentially contain no real “intelligence” at all – but if you magnify the image you’ll see that even these random filters produce interesting results.

Training the network is vital of course – and that’s where it all failed last time. I’m back to reading Springer’s “Guide to Convolutional Neural Networks” – which is one of their better books but still full of shoddy editing (though I’d recommend persisting with it.)

The training is through ‘back propagation’ – essentially adjusting the network to minimise errors by testing it against a set of known results. Getting a large set of pictures to do the training against is maybe even more difficult than getting the maths of the training right. Even if I recycle the images from last time I will need a lot more.

Getting a job


I have, essentially, two sets of skills and experience.

One is as a political campaigner and communicator. I did well out of that for a while and more than that, did some things I am proud of and feel really privileged to have had a chance to be part of.

But it’s fair to say that road seems to have hit a dead end.  If you want to run a serious, progressive, campaign then I am certainly still interested, but I am not sure there is much of that out there today.

So then there are the other skills – ones that I am told are in high demand.

Namely as a software designer/writer/developer.

I can do this and I am much better these days than I used to be: unlike, say, running, I am still getting faster and sharper. C/C++/Embedded/Perl/Linux/Groovy/DSLs/R/Deep Learning – I can tick all those boxes.

But where to begin? The reputation of IT recruitment agencies is pretty grim, though I have no direct experience. I have registered with one, but I am being sent invitations to be a senior C++ engineer in Berlin on a salary of €150,000 per annum which even I think is probably a bit over-ambitious for someone with no commercial experience.

(NB: If you want to see what I have done have a look at https://github.com/mcmenaminadrian).

Giving up on the convolutional network?


For almost three months now I have been trying to build and train a convolutional network that will recognise chess puzzles: but I don’t feel I am any closer to succeeding with it than I was at the start of September and so I wonder if I should just give up.

The network itself is built, and as far as I can see, works except for the fact that I just cannot get it to converge on the training set.

The (learning) code is here: https://github.com/mcmenaminadrian/ChessNet/tree/learning

The training set is here: https://github.com/mcmenaminadrian/ChessNet/tree/learning/images

There are 25 possible classes of outcome – from an empty white square to a black square hosting a black king, and the network outputs a value between -1 (no match) and 1 (perfect match).

There are 25 convolutional fibres each with seven layers, going from a 100 x 100 input layer to a final filter (feature map) of 88 x 88 which are then fully connected to 25 output neurons (there is no pooling layer): as you can see that means there are 88 x 88 x 25 x 25  + 25 (4.84 million, plus 25 for bias) connections at the final, fully connected, layer (or alternatively each output neuron has 193601 input connections).

Perhaps the issue is that the scale of the fully connected layer dwarfs the output and influence of the feature maps? I don’t know, but what I do know is that, as training goes along, the output neurons generally begin in a low (i.e., close to -1) state and then edge towards a high state, but as they do they are suddenly overwhelmed and everything returns to an even lower state than before.

Envisaging this as a three dimensional surface, we creep up a steep hillside and then fall down an even deeper hole just as we appear to be getting towards a summit: the problem seems to be that training doesn’t really teach the network to differentiate between any of the training images, it just pushes the network towards a high value. Then, suddenly, images which should be reported as low are reported as high and the error values flood the network on back propagation.

To explain further: in the training set any given image X will always be relatively infrequently seen, so most results should be low and are low, with small error values (deltas, as they are usually called) – so small that they are generally ignored. When X does appear, though, its deltas are large and they feed into the network, dragging our response towards high. Eventually we cross a threshold and all the results – for good and bad images alike – are reported as high, so there are lots of big deltas which overwhelm the small number of correct positives. At least that is what I think is happening.

Of course what really should happen is that the network learns to discriminate between the ‘good’ and ‘bad’ images, but that just seems as far away as ever.

Any tips, beyond giving up, gratefully received.

Conv-nets are hard


Months ago I started work on a convolutional neural network to recognise chess puzzles. This evening after mucking about with the learning phase for weeks I thought I had scored a breakthrough – that magic moment when, during learning, a tracked value suddenly flips from the wrong result to the right one.

Brilliant, I thought – this is about to actually work, and I started tracking another value. Only to come back to the original and see that it had all gone wrong again.

False minima abound in training – which is essentially about finding the right coefficients for a large set of non-linear equations, each with many thousands of parameters. Or maybe it wasn’t a false minimum at all but the real one – it’s just operating over an extremely small range of parameter values.

Will I ever find it again? And if I do can I find it for the other 24 classification results too?

(As an aside: I made the code parallel to speed it up, but it’s a classic example of Amdahl’s law – even on a machine with many more processors than the 26 threads I need and with no shortage of memory, the speed-up is between 3 and 4 even with the most heavy-duty calculations run in parallel.)

More and more spam reviews on Amazon


Earlier this month I highlighted how a book that claims to be about using Python to build convolutional neural networks – and yet, readers say, contains not a single line of Python – was garnering rave reviews on Amazon.

The trend hasn’t stopped and it is pretty clear to me that these are, in fact, spam.

Plainly Amazon’s review system is broken.

 

Can you get a useful result with a random convolution filter?


In a number of places I’ve seen it remarked that a random convolution filter makes for a reasonably efficient edge detector for images, so I thought I’d test this.

The answer, perhaps surprisingly, seems to be yes.

With 25 input filters in an untrained convolutional neural net (where kernel values were pseudo-randomly distributed between -0.5 and 0.5), all but three of the first level filters returned something that suggested edge detection (though given the original image was a collection of edges this is not much of a claim.) Some of the second or even third level filters also showed patterns, but most delivered something like universal blackness.

Admittedly this is a small sample size, with just one test image.

Here is the original (100 x 100) image:

orig

Here are some of the useful or at least interesting (98 x 98) filtered images:

I cannot really think of a useful application of this finding, but it does interest me none the less.

Strange reviews on Amazon


Messing about with convolutional neural networks (CNNs) continues to take up some of my time (in case my supervisor reads this – I also have a simulation of a many core system running on the university’s computer atm).

I started my research here with a much cited, but really well out-of-date book – Practical Neural Network Recipes in C++. What’s nice about that book is that it is orientated towards getting things done, though the C++ is really from the “C with classes” era.

Another book – Guide to Convolutional Neural Networks: A Practical Application to Traffic-Sign Detection and Classification – which I can access through the University of York’s library, helped fill in some of the theoretical and other gaps and also showed me why I needed to move away from the perceptron model promoted in the earlier book and move towards a CNN. But like many of Springer’s books it is poorly edited and not fully and properly converted for online use.

So I’m still on the look out for the perfect match – a book with practical coding examples that clearly explains the theory and, bluntly, is written in good English with all the maths actually reproduced in the online format (as I am just not going to be able to afford a printed copy.)

In particular I want a clear explanation of how to do back propagation in a CNN – as it’s plain that the general method outlined in “Practical Neural Network Recipes” doesn’t work beyond a fully connected layer, while the explanation in “Guide to…” is impenetrable and, actually, rather odd (as it seems to imply that we use a fixed weight for every neuron in a filter as opposed to using fixed weights across each filter – if I have explained that properly).

So, I’ve just had another look and came across this book … “Convolutional Neural Networks in Python: Introduction to Convolutional Neural Networks”.

This book has managed (at the time of writing) to collect two one-star reviews on the Amazon UK website:

Amazon reviews

I have no idea how fair those reviews are, but this passage from the preview version available on the website doesn’t suggest the author is yet rivalling Alan Turing:

But, here’s the odd thing. It would appear a number of “purchasers” of this book through Amazon.com are very enthusiastic about it and all felt the need to say so this very day (9 August):

odd reviews

Even more oddly, the reviews all read like spam comments I get on this blog. But I have no evidence to suggest these are anything other than genuine, if odd, comments on the book…

First results from the “musical” neural network


I am working on a project to see whether, using various “deep learning” methods, it is possible to take a photograph of some musical notation and play it back. (I was inspired to do this by having a copy of 1955’s Labour Party Songbook and wondering what many of the songs sounded like.)

The first task is to identify which parts of the page contain musical notation and I have been working with a training set built from pictures of music chopped into 100 x 100 pixel blocks – each is labelled as containing or not containing musical notation and the network is trained, using back propagation, to attempt to recognise these segments automatically.

Now I have tested it for the first time and the results are interesting – but a bit disappointing. In this image all that is plotted is the neural net’s output: the redder the image, the higher the output from the net’s single output neuron:

Neural network output
The brighter the image the more likely there is music

It’s a bit of a mystery to me why you can see the staves and notes in this sort of shadowy form: that means the network is rejecting them as musical notation even as it highlights the regions where they are found as the best places to look.

To make it all a bit clearer, here are the results with the blue/green pixels of the original image unchanged and the red pixels set on the strength of the network’s output:

Blaydon Races filtered by neural net

It seems clear the network is, more or less, detecting where there is writing on the page – though with some bias towards musical staves.

I’m not too disappointed. My approach – based on stuff I read in a book almost 25 years old – was probably a bit naïve in any case. I came across a much more recent and what looks to be much more relevant text yesterday and that’s what I will be reading in the next few days.

(You can see the code behind all of this at my Github: https://github.com/mcmenaminadrian)