Failing software – again


The line above is from a real (and current at time of posting) job advertisement for a software developer. I’m not posting it because I think it is bad, shocking or dangerous, but mainly because it is illustrative of the real world: developers are expected to be “pragmatic” when it comes to testing the software they make for correctness.

So when, as happened today with British Airways, we see a major software failure (or at least what looks like a major software failure), maybe we should not be too surprised.

There is more to it, of course, than the fact that developers are expected to compromise on testing their code: there is the simple fact that most functions are not computable at all, and that many others we don’t know how to compute efficiently.

For an example of something uncomputable there is Hilbert’s tenth problem:

Given a Diophantine equation: is there a general algorithm that tells us whether or not it has a solution with the unknowns taking integer values?

To explain:

  • A Diophantine equation (named after Diophantus of Alexandria) is a polynomial equation (e.g., x = y, y^2 + 5x + z^{67} = 0, 3x + 2y + 5x^2y^3 + 90 = 7z and so on) in a finite number of unknowns, where all the coefficients of the unknowns are integer (whole number) values.
  • An algorithm is a finite set of rules or simple mechanical procedures for solving a problem.

So the task is to take a Diophantine equation and not to find a solution but to determine whether such a solution exists at all. Matiyasevich showed in 1970 that no such algorithm exists, and therefore we cannot write any sort of computer program that would do this for us, no matter how powerful a computer we were using.
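The asymmetry here is worth dwelling on: searching for a solution is mechanical; it is deciding that none exists that is impossible in general. A minimal Python sketch (the equations below are just arbitrary examples of mine) shows the half we can do – enumerate integer tuples in ever-growing boxes and halt if a zero turns up. If no solution exists the search simply runs forever, which is why the cap on the search radius tells us nothing:

```python
from itertools import count, product

def search_solution(poly, n_vars, max_radius=None):
    """Enumerate integer tuples in ever-growing boxes, halting if a zero
    of `poly` is found. This is semi-decidable: a solution, if one
    exists, will eventually be found, but "no" can never be returned
    with certainty -- max_radius exists only to stop the demo."""
    for r in count(0):
        if max_radius is not None and r > max_radius:
            return None  # gave up; in general this tells us nothing
        for xs in product(range(-r, r + 1), repeat=n_vars):
            # only test tuples on the boundary of the box (the new ones)
            if max(map(abs, xs)) == r and poly(*xs) == 0:
                return xs

# x^2 - 2y^2 = 7 has integer solutions...
print(search_solution(lambda x, y: x*x - 2*y*y - 7, 2))  # → (-3, -1)
# ...but x^2 + y^2 = -1 obviously has none: without the cap on the
# radius this call would never return
print(search_solution(lambda x, y: x*x + y*y + 1, 2, max_radius=5))  # → None
```

Matiyasevich’s theorem says there is no clever general replacement for that cap: no program can always tell us in advance that the search is futile.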

And for an example of something for which a solution does exist but which we will struggle to find, there is the famous “travelling salesman problem” – namely, given a set of cities all connected by roads (or railways, or air routes), what is the shortest route that visits all the cities?

This problem is an example of what is known as an “NP-hard” problem – in other words one for which (we believe) no general algorithm exists to produce a solution in “polynomial time” (meaning in a time bounded by a polynomial in the length of the input – in this case the number of cities). Strictly speaking, NP is the class of problems whose proposed solutions can be checked in polynomial time; the decision version of the travelling salesman problem sits among the hardest of those.

Plainly a solution does exist – there is a quickest way to travel to all the cities – but the only immediately available way to be certain of finding it is to try every possible route.

But that will take time: for even moderately sized problems quite possibly more time than we have before the Sun swallows the Earth. To get round this we have to use heuristics (essentially educated guesswork) or approximations.
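To make the trade-off concrete, here is a small Python sketch (the five city coordinates are made up purely for illustration) comparing an exact brute-force search, which tries every ordering, with the classic nearest-neighbour heuristic, which greedily visits the closest unvisited city:

```python
from itertools import permutations
from math import dist

cities = [(0, 0), (3, 0), (3, 4), (0, 4), (1, 2)]  # made-up coordinates

def tour_length(order):
    """Total length of the closed tour visiting cities in this order."""
    return sum(dist(cities[a], cities[b])
               for a, b in zip(order, list(order[1:]) + [order[0]]))

def brute_force():
    """Exact answer by trying all orderings -- O(n!), hopeless beyond
    a couple of dozen cities."""
    return min(permutations(range(len(cities))), key=tour_length)

def nearest_neighbour(start=0):
    """Heuristic: always go to the closest unvisited city next. Fast,
    but with no guarantee of finding the shortest tour."""
    todo = set(range(len(cities))) - {start}
    order = [start]
    while todo:
        nxt = min(todo, key=lambda c: dist(cities[order[-1]], cities[c]))
        todo.remove(nxt)
        order.append(nxt)
    return order

print(round(tour_length(brute_force()), 3),
      round(tour_length(nearest_neighbour()), 3))
```

On a toy instance like this the heuristic may even match the optimum, but the essential point is the scaling: the exact search becomes unusable somewhere around twenty cities, while the greedy pass stays fast at the price of any optimality guarantee.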

In the travelling salesman case those approximations are powerful and efficient – that’s why modern logistics works – but the general point remains the same: in many cases we are building software that approximates the solution to our problem but may not answer it fully correctly and, unless there is a breakthrough – a proof that P = NP, say – we are always going to be stuck with that.

Switching to Opera


When I first started using it, Google’s Chrome browser seemed like a huge leap forward: a separate process for each open tab, what’s not to like about that?

Latterly, though, it has felt more and more like a drag: run it for any length of time and your computer will thrash or even freeze, as it is such a poor manager of resources and/or a major leaker of memory.

The answer, so I have read, is to use Opera – so that is what I am doing. Will report on progress in a few weeks.

Let’s talk about sex (dolls)


Alan Turing

I’m not all that interested in sex dolls, actually. But what I am interested in is the reactions they provoke from people when they consider the nature of intelligence.

My view – pretty much that followed by Alan Turing in his pioneering paper “Computing Machinery and Intelligence” – from which we get the “imitation game” aka the Turing Test – is that intelligence is whatever looks like intelligence.

The relevance of this to sex dolls is that the BBC’s technology correspondent Jane Wakefield has put together a series of reports on the subject – the first was on “From Our Own Correspondent” last Saturday, there’s a web piece – here – and there is a report for the BBC’s World Service yet to come.

Wakefield argues that while the sex doll “Harmony” can say things which sound like intimate small talk, the doll can never know the feelings behind the words.

But what does that mean? At the most basic level none of us can live inside the head of another – we can never “feel” what it’s like to be that other person, because we cannot be them.

Or, to paraphrase Turing, you might like strawberry ice cream and I might hate it: but we are both tasting the same thing, so what does this feeling of “love” or “hate” correspond to? How could you know how I “feel” about the ice cream, when you “feel” differently?

It’s an entirely subjective thing, so how can you assert that the machine “feels” nothing?

Of course, the human brain and human experience generally appear to be massively parallel, and we simply cannot, yet, replicate that in a machine. But if we could, are we seriously suggesting that human consciousness transcends the material? That simply doesn’t make any sense to me.

Proprietary software as a false economy


By Eraserhead1, Infinity0, Sav_vas - Levenez Unix History Diagram, Information on the history of IBM's AIX on ibm.com, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=1801948

I recently had to fill in a form for the Computer Science Department at the University of York.

Like, I am sure, any computer science department in any major world university, York is a “Unix shop”: research servers all run Linux and I guess the academics who aren’t using that are – as I am now – running the modified/derived BSD that is Mac OS X.

But the form was “optimised” for (i.e., only able to operate properly in) Microsoft Word – not a piece of software found on many ‘nix machines.

Why? Because the rest of the University – like almost all of Britain’s public sector – is totally reliant on Microsoft’s proprietary offerings.

Thirty years ago I worked in a public sector organisation that used a mixture of proprietary software for “mission critical” work – NetWare, WordPerfect and MS-DOS. But even that mixture has gone: it’s Microsoft for everything (on the desktop) these days.

And now the price of that false economy – because so often this reliance on Microsoft has been justified because it keeps training costs low (“everybody knows how to use it”) – has been revealed by a massive global ransomware attack.

If free/open source software (FOSS) had been more widely used then, of course, the risk would not have disappeared – not least because the crackers would have turned their attention to FOSS and left Windows behind – but there are two pretty obvious advantages to FOSS in terms of security:

  • You can see how it works – you wouldn’t walk across a bridge with no visible means of support, yet every time you use proprietary closed-source software you do just that: the fact it hasn’t fallen down yet seems like a poor justification.
  • Everybody can fix it: if Microsoft’s software breaks or is seen to have a vulnerability you are essentially reliant on them to fix it. And if you are using an unsupported piece of software you may not even have that. Again there are no guarantees of invulnerability with FOSS – software is hard – but there is a guarantee that you or anyone you ask/pay can attempt to fix your problem.

It’s time we ended this dependency on proprietary software and invested in a FOSS future.

Open source alternatives to “Junior Librarian v3”?


My partner is a teacher in a primary school and has special responsibility for teaching English – which also means she’s in charge of the school library.

I don’t have any personal experience of the library and Lorraine is not an IT or database expert, so what follows may be a bit sketchy…

…anyway, the library is managed using a piece of proprietary software called “Junior Librarian” (version 3). The company that makes/distributes this now says it is “no longer supported by Microsoft” (I don’t know whether that means it’s targeted at an older version of Windows or the underlying database is out of date, or whatever) and so needs to be “upgraded” to another piece of proprietary software that costs over £1000.

Given the state of school budgets in the UK that essentially means buying no new books next year. The vendor’s claim that the new software will support e-books is also of little to zero appeal to a teacher who wants children to spend less time at a screen.

So, my question(s) is/are this:

  • Is there a free software alternative to Junior Librarian out there that will allow the existing data to be imported?
  • If not, does anyone know anything more about this and would they be willing to at least explore developing such a thing? (I have some free time at the moment – unless you want to give me a job, that is.)

Reviving this blog … with a question about binary trees


Work changes and a determination to actually finish my PhD mean I really should make a bit of an effort here and so I will.

Here is a puzzle that has been bothering me about binary trees which has come from my PhD research…

In that research I am investigating how to implement virtual memory on a many-core Network-on-Chip (NoC) system. Essentially I have been building and running emulators.

The core of the emulation is the “memory tree” – the connection between the cores of the NoC and the global memory controller. The “Bluetree” memory tree is a binary tree arrangement, with the root node (level 0) connecting to the memory controller and the leaves at the NoC tiles (with each tile having a single processor).

At every non-leaf level of the tree there are multiplexors (mux) with a buffer that can hold a single memory packet – so if two requests for global memory arrive at once and the buffer is free there needs to be an arbitration process to decide which packet gets to occupy the buffer and which has to wait.
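A sketch of how I think of one of these muxes (this is my simplified Python model for exposition, not the Bluetree code itself): a one-packet buffer, two inputs, and a priority flag that flips after every contested arbitration:

```python
class Mux:
    """My simplified model of one memory-tree mux (not the Bluetree
    code): a single-packet buffer fed by a left and a right input,
    with priority alternating after each contested round."""

    def __init__(self):
        self.buffer = None      # holds at most one memory packet
        self.left_next = True   # who wins the next simultaneous arrival

    def arbitrate(self, left_pkt, right_pkt):
        """Offer up to two packets; returns (accepted, blocked_list)."""
        if self.buffer is not None:              # buffer occupied: all wait
            return None, [p for p in (left_pkt, right_pkt) if p]
        if left_pkt and right_pkt:               # contention: apply policy
            winner, loser = ((left_pkt, right_pkt) if self.left_next
                             else (right_pkt, left_pkt))
            self.left_next = not self.left_next  # alternate next time
            self.buffer = winner
            return winner, [loser]
        self.buffer = left_pkt or right_pkt      # at most one arrival
        return self.buffer, []

m = Mux()
print(m.arbitrate("L1", "R1"))  # → ('L1', ['R1']): left wins first
```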

We have 128 leaf nodes – as in this tree…

Binary tree with 128 leaf nodes

With this many leaf nodes, the question of the arbitration policy of the muxes is important. A naïve approach would be to say: if two packets contest for the buffer, let the packet on the left win the first time and the packet on the right the next – after all, this guarantees no starvation.

But it is actually a hopeless policy … the leftmost leaf node (we shall number from zero) faces no opposition, while the rightmost (number 127) loses out to every other node.

But the leaf node’s number alone is not a sufficient way of describing its relative number of blocks – for instance leaf node 16 appears to be in a better place than leaf node 15 – as going up the tree leaf node 15 can be blocked at level 6 (the root node is at level 0), at level 5, level 4, level 3 and level 2, while leaf node 16 is only blocked at level 1.
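One way to make that structure explicit (this is my own sketch, and the level numbering in it is an assumption that may be off by one from the figures above) is to note that a leaf’s packet arrives on the losing, right-hand input of a mux exactly where its index has a set bit – so the blocking pattern is governed by the binary representation of the index, not its magnitude:

```python
def right_input_levels(leaf, depth=7):
    """Levels at which this leaf's packet enters a mux on the right-hand
    input, assuming the root is level 0 and the muxes just above the
    2**depth leaves sit at level depth - 1. Bit b of the leaf index
    being set puts the packet on the right at the level-(depth-1-b) mux."""
    return sorted(depth - 1 - b for b in range(depth) if (leaf >> b) & 1)

for leaf in (0, 15, 16, 127):
    print(leaf, right_input_levels(leaf))
```

Leaf 0 never arrives on the losing side, leaf 127 does so at every level, and leaf 16 (one set bit, high up the tree) is in a far better place than leaf 15 (four set bits, all near the leaves) – which suggests ordering the leaves by reading their indices as bit patterns rather than as plain numbers.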

Practical testing of this naïvely fair approach in the emulator gives results like the following… as can be seen, the right-hand part of the tree performs appallingly, though even there a lot of variation is visible:

some test results

My question is this: what is the correct (i.e., correct mathematically) way of describing the order of the leaf nodes?

Your assembly is my domain-specific language


A few years ago I set out to write a clone of Sinclair BASIC as a domain-specific language (DSL) in Groovy. The end result was BINSIC, but it was much closer to an interpreter than a DSL: it turned out that the strange way Groovy handles capitalised function/closure names meant that I could not match keywords at all, and so interpretation became the only option (though there were DSL-like features lurking under the covers).

Now I have set out to write an interpreter for RISCV assembly and have written something which is much more like a DSL. My aim was to track the memory reference patterns of various real-time benchmarks running on RISCV systems – based on the disassembly generated by Spike, the RISCV emulator.

Getting that done requires tracking register state – because in true RISC fashion reads and writes are done not to immediates, but to addresses held in registers with immediates used as offsets.

To make this happen every (or at least every used) RISCV instruction is mapped to a closure and the operands treated as closure parameters. The closures are all inside the RegisterFile class so all can access the registers to keep the state updated. This makes it quite like a DSL but I make no claims for purity: every statement is passed through a regular expression based interpreter to separate out the parameters: if nothing else that eliminates a lot of boilerplate code that would have to be stuck inside the closures.
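In Python rather than Groovy, and with only a single instruction, an invented regex and invented register values (all purely for illustration), the shape of that design looks something like this:

```python
import re

class RegisterFile:
    """A Python analogue of the design (names, the single instruction
    and the register values here are all mine, for illustration): each
    mnemonic maps to a closure over the register state, and a regex
    splits out the operands so the closures carry no parsing code."""

    # matches e.g. "ld a0, 8(sp)" -- mnemonic, dest, offset, base
    INSN = re.compile(r'(\w+)\s+(\w+),\s*(-?\d+)\((\w+)\)')

    def __init__(self):
        self.regs = {'sp': 0x7fff0000, 'a0': 0}   # invented initial state
        self.loads = []                            # addresses referenced
        self.ops = {'ld': self._load}              # one closure per mnemonic

    def _load(self, rd, off, base):
        addr = self.regs[base] + off
        self.loads.append(addr)   # track the memory reference pattern
        self.regs[rd] = 0         # actual value unknown from a disassembly

    def execute(self, line):
        op, rd, off, base = self.INSN.match(line).groups()
        self.ops[op](rd, int(off), base)

rf = RegisterFile()
rf.execute('ld a0, 8(sp)')
print(hex(rf.loads[0]))  # → 0x7fff0008
```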

Memory is treated as a sparse array through a Groovy Map instance – with an address (64-bit Long) mapped to an 8-bit Byte.

The process isn’t perfect – a disassembly doesn’t give you the contents of initialised data, so an attempt to access those addresses is quite likely to raise an exception, which needs to be trapped and the memory accounted for – by simply assigning zero to the address: it’s a kludge but it seems to work.
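A Python equivalent of that arrangement (the Groovy original uses a Map of Long to Byte; the class and method names here are mine) might look like:

```python
class SparseMemory:
    """Sketch of the sparse-array idea (names are mine): a dict from
    64-bit address to byte value, with unseen addresses defaulting to
    zero -- the same kludge used for data the disassembly never shows
    being initialised."""

    def __init__(self):
        self.mem = {}   # address -> byte, only for touched addresses

    def read(self, addr):
        if addr not in self.mem:
            self.mem[addr] = 0   # account for unknown initialised data
        return self.mem[addr]

    def write(self, addr, value):
        self.mem[addr] = value & 0xff   # keep only the low byte

mem = SparseMemory()
mem.write(0xdeadbeef, 0x1ff)                   # truncated to one byte
print(mem.read(0xdeadbeef), mem.read(0x1000))  # → 255 0
```

Because the map only ever holds addresses the program has actually touched, even a 64-bit address space costs memory proportional to the working set, not the address range.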