Failing software – again

The line above is from a real (and current at time-of-posting) job advertisement for a software developer. I’m not positing it because I think it is bad, shocking or dangerous, but mainly because it is illustrative of the real world: developers are expected to be “pragmatic” when it comes to testing the software they make for correctness.

So when, as happened today with British Airways, we see a major software failure (or at least what looks like a major software failure), maybe we should not be too surprised.

There is more to it, of course, than the fact developers are expected to compromise on testing their code: there is the simple fact that most functions are not computable at all and that for many others we don’t know how to compute them.

For an example of something uncomputable there is Hilbert’s tenth problem:

For any given a Diophantine equation is there a general algorithm that tells us whether or not there is a solution with the unknowns taking integer values…

To explain:

  • A Diophantine equation (named after Diophantus of Alexandria) is a polynomial (e.g., x = y , y^2 + 5x + z^{67} = 0, 3x + 2y +5x^2y^3 + 90 = 7z and so on) equation where all the coefficients of the unknowns are integer (counting number) values and where there are a finite number of unknowns.
  • An algorithm is a set of rules of simple mathematical procedures to solve a problem

So the task is to take a Diophantine equation and not to find a solution but to determine whether such a solution exists – but no such algorithm exists and therefore we cannot write any sort of computer program that would do this for us, no matter how powerful a computer we were using.

And for an example of something for which a solution does exist but which we will struggle to find, there is the famoustravelling salesman problem” – namely given a set of cities all connected by roads (or railways, or air routes) what is the most efficient way of visiting all the cities.

This problem is an example of what is known as an “NP” problem – in other words one for which (we believe) no general algorithm exists to produce a solution in “polynomial time” (meaning in a time related to the length of the input – in this case the number of cities).

Plainly a solution does exist – there is a quickest way to travel to all the cities – but the only immediately available way to be certain of finding this is to try all the solutions.

But that will take time: for even moderately sized problems quite possibly more time than we have before the Sun swallows the Earth. To get round this we have to use heuristics (essentially educated guesswork) or approximations.

In the travelling salesman case those approximations are powerful and efficient – that’s why modern logistics works – but the general point remains the same: in many cases we are building software that approximates the solution to our problem but may not answer it fully correctly and, unless there is a breakthrough which eliminates NP problems as a class, we are always going to be stuck with that.