Incompleteness in the natural world


Gödel Incompleteness Theorem (Photo credit: janoma.cl)

A post inspired by Gödel, Escher, Bach; Complexity: A Guided Tour; an article in this week’s New Scientist about the clash between general relativity and quantum mechanics; and personal humiliation.

The everyday incompleteness: this is the personal humiliation bit. For the first time ever I went on a “Parkrun” today – the 5km Finsbury Park run – but I dropped out after 2.5km, at the top of a hill and about 250 metres from my front door. I simply thought: this is meant to be a leisure activity and I am not enjoying it one little bit. I can offer some excuses – it was really the first time I had ever run outdoors, so it was a bit silly to try a semi-competitive environment, and I had not warmed up properly, so the first 500 metres were about simply getting breathing and limbs into co-ordination – mais qui s’excuse, s’accuse: he who excuses himself, accuses himself.

But the sense of incompleteness I want to write about here is not that everyday incompleteness, but a more fundamental one – our inability to fully describe the universe, or rather, a necessary fuzziness in our description.

Let’s begin with three great mathematical or scientific discoveries:

The diagonalisation method and the “incompleteness” of the real numbers: in 1891 Georg Cantor published one of the most beautiful, important and accessible arguments in set theory – the diagonalisation argument – which proved that the infinity of the real numbers is qualitatively different from, and greater than, the infinity of the counting numbers.

The infinity of the counting numbers is just what it sounds like – start at one and keep going and you go on infinitely. This is the smallest infinity – called aleph null (\aleph_0 ).

Real numbers include the irrationals – those which cannot be expressed as fractions of counting numbers (the Pythagoreans shocked themselves by discovering that \sqrt 2 was such a number). So the reals are all the numbers along a number line – every single point along that line.

Few would disagree that there are, say, an infinite number of points between 0 and 1 on such a line. But Cantor showed that the number was uncountably infinite – i.e., we cannot just start counting from the first point and keep going. Here’s a brief proof…

Imagine we start to list all the points between 0 and 1 (in binary) – and we number each point, so…

1 is 0.00000000…..
2 is 0.100000000…..
3 is 0.010000000……
4 is 0.0010000000….
n is 0.{n – 2 0s}1{000……}

You can see this list can go on for a countably infinite number of steps….

and so on. Now we decide to ‘flip’ the 0 or 1 at the index position – that is, for the nth number in the list we flip its nth digit – and we get:

1 is 0.1000000….
2 is 0.1100000….
3 is 0.0110000….
4 is 0.00110000….

And so on. But although we have already used up all the counting numbers, we are now generating new numbers which we have not been able to count. (The classic form of the argument is sharper still: string the flipped digits together into a single number – here 0.111111… – which differs from the nth listed number at its nth digit and so cannot appear anywhere in the list.) This means we have more than \aleph_0 numbers in the reals, surely? But, you argue, let’s just interleave these new numbers into our list like so….

1 is 0.0000000….
2 is 0.1000000…..
3 is 0.0100000….
4 is 0.1100000….
5 is 0.0010000….
6 is 0.0110000….

And so on. This is just another countably infinite set, you argue. But, Cantor responds, do the ‘diagonalisation’ trick again and you get…

1 is 0.100000…..
2 is 0.110000….
3 is 0.0110000….
4 is 0.1101000…
5 is 0.00101000…
6 is 0.0110010….

And again we have new numbers, busting the countability of the set. And the point is this: no matter how many times you add the newly produced numbers into your counting list, diagonalisation will produce numbers you have not yet accounted for. From set theory one can show that while the counting numbers have cardinality (a measure of a set’s size) \aleph_0 , the reals have cardinality 2^{\aleph_0} – literally an uncountably bigger number.
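
For the programmatically minded, here is a minimal sketch of the trick in Python – the function names are mine, and the enumeration is just the toy listing from above:

# A claimed enumeration of reals in (0, 1): entry n is all zeros
# except for a single 1 at position n - 1, exactly as listed above
# (entries and digit positions are 0-indexed here).
def enumeration(n, i):
    # the i-th binary digit of the n-th listed number
    return 1 if i == n - 1 else 0

def diagonal_digit(n):
    # flip the n-th digit of the n-th number: the resulting number
    # differs from every listed number in at least one place
    return 1 - enumeration(n, n)

# The first 8 digits of the 'missing' number: 0.11111111...
print("0." + "".join(str(diagonal_digit(n)) for n in range(8)))

However you extend or interleave the list, running the same construction over the new enumeration manufactures another number the list has missed.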

Gödel’s Incompleteness Theorems: these are not amenable to a blog-post-length demonstration, but they amount to this – any consistent, self-contained proof system rich enough to express ordinary arithmetic will contain true statements it cannot prove: we can state mathematical truths but we cannot build a complete system that proves them all. The analogy with diagonalisation is that we know how to write out any real number between 0 and 1, but we cannot design a system (such as a computer program) that will write them all out – we have to keep ‘breaking’ the system by diagonalising it to find the missing numbers our rules will not generate for us. Gödel’s demonstration of this in 1931 was profoundly shocking to mathematicians, as it appeared to many of them to completely undermine the very purpose of maths.
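
The theorems themselves are beyond a blog post, but the mechanical trick at their heart – encoding statements about numbers as numbers, so that a formal system can end up making claims about itself – is easy to sketch. Here is a toy stand-in in Python for Gödel’s actual prime-power encoding (the scheme below is mine, chosen only because it is short):

def godel_number(statement: str) -> int:
    # encode a statement as a single integer via its ASCII bytes
    return int.from_bytes(statement.encode("ascii"), "big")

def decode(n: int) -> str:
    # recover the statement from its 'Goedel number'
    return n.to_bytes((n.bit_length() + 7) // 8, "big").decode("ascii")

g = godel_number("0=0")
print(g)          # 3161392 - every statement gets a unique number...
print(decode(g))  # '0=0' - ...and the number can be decoded again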

Turing’s Halting Problem: Very closely related to both Gödel’s incompleteness theorems and Cantor’s diagonalisation proof is Alan Turing’s formulation of the ‘halting problem’. Turing proposed a basic model of a computer – what we now refer to as a Turing machine – as an infinite paper tape and a reader (of the tape) and writer (to the tape). The tape’s contents can be interpreted as instructions to move, to write to the tape or to change the machine’s internal state (and that state can determine how the instructions are interpreted).

Now such a machine can easily be made to go into an infinite loop, e.g.:

  • The machine begins in the ‘start’ state and reads the tape. If it reads a 0 or 1 it moves to the right and changes its state to ‘even’.
  • If the machine is in the state ‘even’ it reads the tape. If it reads a 0 or 1 it moves to the left and changes its state to ‘start’.

You can see that if the tape is marked with two 0s or two 1s or any combination of 0 or 1 in the first two places the machine will loop for ever.
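
Here is that machine as a quick Python doodle (my own sketch, with a step cap so that it does not literally run forever):

# The two-state machine described above: 'start' moves right on a
# 0 or 1 and becomes 'even'; 'even' moves left and becomes 'start'.
tape = [0, 1, 0]  # the first cells of the (notionally infinite) tape
rules = {
    ("start", 0): ("even", +1), ("start", 1): ("even", +1),
    ("even", 0): ("start", -1), ("even", 1): ("start", -1),
}

state, pos = "start", 0
for step in range(10):  # cap the simulation at ten steps
    state, move = rules[(state, tape[pos])]
    pos += move
    print(f"step {step}: state={state}, head at cell {pos}")
# the trace shuttles between cells 0 and 1 for ever - an infinite loop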

The halting problem is this: can we design a Turing machine that will tell us whether a given machine with given instructions will fall into an infinite loop? Turing proved we cannot – without needing to discuss any particular method by which such a machine might try to do it. Here’s my attempt to recreate his proof:

We can model any other Turing machine through a set of instructions on the tape, so if we have machine T we can have it model machine M with instructions I : i.e., T(M, I).

Let us say T can tell whether M will halt or loop forever with instructions I – we don’t need to understand how it does it, just suppose that it does. So if (M, I) will halt T writes ‘yes’, otherwise it writes ‘no’.

Now let us design another machine T^\prime that takes a machine M as its input and runs T(M, M) – that is, it asks T whether M will halt when fed its own description as its instructions. But T^\prime is built to do the opposite of whatever T predicts: T^\prime loops forever if T writes ‘yes’ and halts if T writes ‘no’.

Then we have:

M halts on input M – so T(M, M) writes ‘yes’ – so T^\prime(M) loops forever. M loops forever on input M – so T(M, M) writes ‘no’ – so T^\prime(M) halts.

But what if we feed T^\prime its own description – T^\prime(T^\prime)?

If T^\prime(T^\prime) halts, then T(T^\prime, T^\prime) must have written ‘no’ – which means T^\prime(T^\prime) loops forever. And if T^\prime(T^\prime) loops forever, then T must have written ‘yes’ – which means T^\prime(T^\prime) halts. Whatever T answers, it is wrong.

As with Gödel we have reached a contradiction and so we cannot go further and must conclude that we cannot build a Turing machine (computer) that can solve the halting problem.
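
The same contradiction can be written down in a few lines of Python – with the crucial cheat that halts() below is hypothetical, since the whole point of the proof is that it cannot actually be written:

def halts(program, data) -> bool:
    # HYPOTHETICAL: True if program(data) halts, False if it loops
    # forever. The proof shows no such function can exist.
    ...

def t_prime(program):
    # Turing's troublesome machine: do the opposite of whatever
    # halts() predicts the program will do when fed itself.
    if halts(program, program):
        while True:  # halts() said 'halts', so loop forever
            pass
    return           # halts() said 'loops', so halt at once

# Does t_prime(t_prime) halt? If halts() answers yes, it loops;
# if halts() answers no, it halts. Either answer is wrong.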

Quantum mechanics: the classic Copenhagen formulation of quantum mechanics states that the uncertainty of the theory collapses when we observe the world, but the “many worlds” interpretation suggests that the various outcomes all actually take place and we are just experiencing one of them at any given time. The experimental backup claimed for the many worlds theory comes from quantum ‘double-slit’ experiments, which are said to suggest particles leave traces of their multiple states in every ‘world’.

What intrigues me: what if our limiting theories – the halting problem, Gödel’s incompleteness theorems, the uncountability of the reals – were actually the equivalents of the Copenhagen formulation, and maths was in fact also a “many worlds” domain where the incompleteness of the theories was the deeper reality – in other words, the Turing machine can both loop forever and halt? This is probably, almost certainly, a very naïve analogy between the different theories, but, lying in the bath and contemplating my humiliation via incompleteness this morning, it struck me as worth exploring at least.


Turing, Hofstadter, Bach – with some Cherla thrown in


The imitation game, as described by Alan Turing in “Computing Machinery and Intelligence”. Player C, through a series of written questions, attempts to determine which of the two other players is a man and which is a woman. Player A – the man – tries to trick player C into making the wrong decision, while player B tries to help player C. Turing uses this game as the basis for his test for intelligence. (Photo credit: Wikipedia)

A few weeks ago I attended the morning (I had to go back to work in the afternoon) of the BCS doctoral consortium in Covent Garden in London – watching various PhD students present their work to an audience of peers.

The presentation which most interested me was that of Srikanth Cherla, who is researching connectionist models for music analysis and prediction, and the use of generative models to produce short passages of music in a style similar to the passages his systems learn from.

It’s not a field that I have any expertise in, or indeed much knowledge of, but in essence (I hope I get this right): a specialised form of neural network is used to analyse musical passages (Bach’s chorale works were highlighted) and from there it is possible to get the computer to play passages it has composed in the style it has learnt.

Srikanth emphasised that it was not a case of applying a rigid rule that guessed or picked the next note – there is a semi-random/stochastic element to the musical patterns in the works of the great composers, and capturing that is important.

And the music he played at the end – while plainly not matching Bach – did certainly sound like Bach.

As I am (still! It’s a big book) reading Gödel, Escher, Bach: An Eternal Golden Braid by Douglas Hofstadter, my thoughts turned immediately to the idea of artificial intelligence and then on to Alan Turing’s “Imitation Game” and the concept that anything that looks like it is thinking should, in fact, be thought of as thinking.

Today, prior to writing this blog, I read through Turing’s October 1950 paper “Computing Machinery and Intelligence”, from which we get the idea of a “Turing Test” (though obviously he doesn’t call it that).

The paper begins:

I propose to consider the question, ‘Can machines think?’

And goes on to discuss ways in which it might be possible “by the end of the century” to have machines which could fool a remote observer – able only to read typed answers to questions – into believing that a digital computer was in fact a person.

The paper is not, for Turing at least, in a completely different field from “On Computable Numbers”: Turing’s essential point is that anything a human computer can do, a digital computer can do too – and he goes on to explicitly call humans machines.

The idea that great works of art, such as the “next” set of Bach chorales, might in the future be composed by computer no doubt horrifies many readers, as it plainly did in Turing’s day too – he deals specifically with what he calls “the theological objection”, an extreme objection based on the idea that “God gives an immortal soul to every man and woman, but not to any other animal or machine”:

I am unable to accept any part of this… I am not very impressed with theological arguments whatever they may be used to support

But in any case, arguing from within the theological paradigm, he dismisses it as a human imposition on what is meant to be an unlimited Godly power:

It appears to me the argument quoted above implies a serious restriction on the omnipotence of the Almighty

…before going on to swat aside Biblical literalism as an argument by citing how it was used against Galileo (maybe there are still fundamentalists out there who believe in the literal truth of Psalm 104 and an unmoving Earth but if so they keep quiet about it).

Then he deals with the argument that machines could not appear human because they have no consciousness – essentially by asking what consciousness is anyway, and how we can prove others have it – and then goes on to deal with “various disabilities”, such as computers being unable to appreciate the taste of strawberries and cream:

The inability to enjoy strawberries and cream may have struck the reader as frivolous. Possibly a machine might be made to enjoy this delicious dish, but any attempt to make one do so would be idiotic. What is important about this disability is that it contributes to some of the other disabilities, e.g. to the difficulty of the same kind of friendliness occurring between man and machine as between white man and white man, or between black man and black man.

This passage is worth quoting both because it suggests that Turing was far from the 100% progressive superhero later admirers are tempted to paint him as – he beat the Nazis and was persecuted as a gay man and therefore can do no wrong; in fact he was a man of his times, with all that implies – and because I find it a less than fully satisfying answer.

In context, I think the point he is seeking to make is that we could make a machine that liked “eating” strawberries and could be friends with its fellows (so long as they had the same skin colour – let’s not get too radical!), but why would we bother? It is not totally clear, though.

Similarly he, like Hofstadter, deals less than satisfactorily with the so-called Gödelisation argument: this states that we humans can make true statements about numbers that machines cannot decide (i.e. we know they are true but the machine cannot determine whether they are true or false). Hence we could, in the imitation game, pose a Gödel-number-type puzzle that the computer could never answer.

Actually, of course, the computer could guess, as humans often do! But the more general point – that humans can do something machines cannot, and so we are not truly Turing machines – seems to me not fully answered by the argument both Turing and Hofstadter offer: that we can also find questions humans cannot decide if we make them complex enough.

Perhaps an expert would care to comment?

Update: following some feedback from Srikanth I have edited the passages referring to his work slightly – I haven’t changed the sense, I think, just made it a bit clearer. I also updated the Psalm number, as I had misread Turing’s reference to line 5 of the Psalm as a reference to the Psalm itself.

Stephen Fry is wrong about Alan Turing


Stephen Fry argues that Alan Turing’s “Universal Machine” is the greatest ever British innovation.

Turing at the time of his election to Fellowship of the Royal Society

The case for Turing is, of course, a strong one – not just because of the computer you are reading this blog on, but because, probably even more importantly, Turing’s concept was at the heart of the British “Station X” codebreaking effort in the Second World War, shortening the war by years and perhaps even proving essential to national survival. Defeating Hitler – something it is hard to believe could have been done without Britain, even if others paid a higher price in money and lives to finally deliver the triumph – remains the great British gift to humanity.

But it is utterly wrong.

Because the Universal Machine is not an invention or innovation at all. It is a discovery – of a mathematical idea – and it has existed (at least) since the beginning of our universe and will still exist long after we (humanity) are all gone.

Admittedly my formulation of mathematics – essentially Platonic – is not uncontroversial, and I have strong “political” motivations for making it (of which I will be writing more shortly): seeing mathematical ideas as discoveries rather than inventions is core to keeping patent lawyers’ hands off them (in this jurisdiction at least).

But why treat maths as any different from physics? We are surely not about to grant Peter Higgs (lovely man that he is) a patent over mass, even though we are on the verge of confirming that his theoretical paper of 1964 was essentially correct in its formulation of the “Higgs field” that gives particles mass.

Update: Stephen Fry is of course describing the Universal Machine as an “innovation” and not an “invention”, as I originally described it at the top of this article (changed now). The argument doesn’t change though – I don’t think it is an innovation, because it has always existed. It is a great discovery, arguably the greatest, but that’s a different matter.

My (computer science) book of the year


The Annotated Turing
Image via Wikipedia

It wasn’t published in 2011 but The Annotated Turing: A Guided Tour Through Alan Turing’s Historic Paper on Computability and the Turing Machine is without doubt my computer science book of the year.

If books can change your life then The Annotated Turing changed mine – because it showed me just how strong the link between maths and computer science is and how fundamental maths is to an understanding of the nature of reality and truth. The world, in my eyes, has not been the same since I read this book last January.

If you are a computer science student then you must read this book!

And finally, Happy New Year to all.

Very small Turing machines


State diagram of a Turing Machine
Image by Center for Image in Science and Art _ UL via Flickr

I am pretty busy with work now, so one of the things I had planned to do – write a simple Turing Machine in Groovy – will have to wait.

In the meantime here are some very small Turing machines to wonder over.

The Great Alan Turing


Alan Turing Statue, on display at Bletchley Park
Image via Wikipedia

Slashdot have a story to the effect that Leonardo DiCaprio is to play Alan Turing in a film that will mark the mathematician’s centenary next year.

Great news – the man’s memory deserves nothing less than an actor who has proved himself to be both great and edgy in recent work (he’s certainly not the milquetoast figure the start of his career briefly suggested).

As a geek, of course, I hope that the film will try to explain, just a little, his achievements.

But how can you explain the ideas of computability and the Church-Turing thesis in a popular film? A tough one, but I suppose you could do something.

The Bletchley Park “bombe” exploited the key weakness of the German Enigma machine: it would never map a letter to itself (e.g., in any message “e” would never be encrypted as “e”). If a guessed plain text (usually a weather report) at the start of the message, combined with candidate key settings, produced code that mapped letters to themselves, then those settings were wrong and could be eliminated. That idea is probably easier to explain.
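
A sketch of that elimination test in Python – the names are mine and the cipher text is made up for illustration, but the ‘no letter maps to itself’ check is the kernel:

def crib_can_align(crib: str, cipher: str, offset: int) -> bool:
    # Enigma never encrypts a letter as itself, so any alignment
    # where the guessed plain text matches the cipher text at some
    # position is impossible and can be discarded.
    segment = cipher[offset:offset + len(crib)]
    return all(p != c for p, c in zip(crib, segment))

cipher = "QFZWRWIVTYRESXBFOGKUHQBAISE"   # invented cipher text
crib = "WETTERVORHERSAGE"                # 'weather forecast'

# slide the crib along the cipher text, discarding impossible offsets
for offset in range(len(cipher) - len(crib) + 1):
    if crib_can_align(crib, cipher, offset):
        print(f"offset {offset} is possible")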

And don’t forget about SIGSALY, the voice encryption system Turing worked on with Bell Labs. As a piece of engineering its importance is probably impossible to overestimate: as the first practical pulse code modulation system it could even be said to be the mother of the mobile phone, or at least its grand-aunt.

And, of course, let me again plug my book of the year: The Annotated Turing: A Guided Tour Through Alan Turing’s Historic Paper on Computability and the Turing Machine

Cardinality of the set of all strings


Oded Goldreich
Image via Wikipedia

I finished John Naughton’s A Brief History of the Future: Origins of the Internet – an interesting diversion, to be sure, and a book worth reading (not least because it reminds you of how rapidly internet adoption has accelerated in the last decade).

By now I should be on to proper revision, but I indulged myself last week and bought a copy of P, NP, and NP-Completeness: The Basics of Computational Complexity, which I am now attempting to read between dozes in the garden (the weather in London is fantastic). The book is written in a somewhat sparse style but even so is more approachable than many texts (the author, Professor Oded Goldreich, rules out using “non-deterministic Turing machines”, for instance, saying they cloud the explanation).

But in his discussion of (deterministic) Turing machines he states:

the set of all strings is in 1–1 correspondence to the natural numbers

Is that right? (Serious question, please answer if you can).

Surely the set of all strings has the cardinality of the reals?

If we had a set of strings (0,1)^* like this:

1000000....
1100000....
1110000....
1111000....

and so on…

Then surely the standard diagonalisation argument applies? I.e. take the diagonal 1111111.... and switch the state of each digit to get 0000000.... – this string cannot be in the original set, as it is guaranteed to differ from the i^{th} member of the set, with digits \alpha_{1 ... \infty} , at digit \alpha_{i} . (See my earlier post on diagonalisation.)

In Naughton’s book he makes (the very valid) point that students of the sciences are generally taught that when their results disagree with the paradigm, then their results are wrong and not the paradigm: so what have I got wrong here?
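
For what it’s worth, here is a sketch of the 1–1 correspondence I assume Goldreich has in mind, on the reading that ‘strings’ means finite strings – in which case every string appears at some finite index; the strings in my list above are infinite sequences, which may be where the disagreement lies:

from itertools import count, product

def all_finite_binary_strings():
    # enumerate every finite binary string in shortlex order:
    # '', '0', '1', '00', '01', '10', '11', '000', ...
    # so each string gets exactly one natural-number index
    yield ""
    for length in count(1):
        for bits in product("01", repeat=length):
            yield "".join(bits)

for index, s in zip(range(10), all_finite_binary_strings()):
    print(index, repr(s))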

P = NP once again


Abacus
Image via Wikipedia

Been an interesting week here on the blog … as I said before, getting on Slashdot is great as you get access to the intellect of a lot of people, and that taught me quite a bit.

It’s not relevant to my course but I went to the (Maths shelves of the) Birkbeck library and withdrew The Language of Machines – one of the few books on computability in the library.

But before I get down to reading much of that, I think I need to reply to some of the harsher remarks that were made about this blog on Slashdot – in particular about the piece I wrote at the end of December called “What if P=NP?”.

The opening line of that article is “I admit I now going slightly out of my depth” and it does contain a few errors – though nothing, I think, that justifies some of the really nasty things a couple of people wrote about it. But then again some people also insisted that proving P=NP would not break public key encryption, so plainly there are people on /. whose ignorance is matched by the vehemence of their opinion, and maybe I shouldn’t let it get to me – especially as one of the critics insisted there were no algorithms to solve NP problems (so how are they solved? Fairy dust inside your computer?).

But I thought I’d try to rewrite my blog post in a better way….

I admit I am now going slightly out of my depth, but I will try to explain what this is about and why it is interesting.

It is known that computers can solve some problems in what is called “polynomial time”: that is to say, in a finite time bounded by a polynomial in the length of the input. The key thing is that these problems are computable using known mechanical steps (i.e. algorithms). Often that means they are solvable in a way that is (sometimes euphemistically) described as “quickly”.

These can be simple problems – like what is the sum of the first ten integers – or more complex ones, such as creating self-balancing trees, sorting database records alphabetically and so on.

These are the so-called “P” (for polynomial time class) problems. Here’s a definition:

A problem is assigned to the P (polynomial time) class if there exists at least one algorithm to solve that problem, such that the number of steps of the algorithm is bounded by a polynomial in n, where n is the length of the input.

Then there is another class of problems which seem fiendishly difficult to solve, but for which it is relatively simple to prove the correctness of any solution offered. These problems can also be solved (computed) in a finite time – and they can also be computed by a Turing machine (a simple model of a computer), so an algorithmic solution exists. It is just that no efficient (polynomial time) algorithm is known. In the worst case the only way – it is thought – that a solution can be found is through an exhaustive search of the candidate solutions – in other words a form of “brute force”. These are the NP (nondeterministic polynomial time) problems – the “polynomial” here referring to the time needed to check a proposed solution.
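
A concrete example of that asymmetry, using subset sum (a classic NP problem): given a set of integers, is there a subset adding up to a target? Checking a proposed answer is quick, but the only known general way to find one is to grind through exponentially many candidates. A minimal sketch:

from itertools import chain, combinations

def verify(numbers, target, candidate) -> bool:
    # checking a proposed solution: easy, polynomial time
    return sum(candidate) == target and all(x in numbers for x in candidate)

def solve_by_brute_force(numbers, target):
    # finding a solution: try all 2^n subsets of n numbers
    subsets = chain.from_iterable(
        combinations(numbers, r) for r in range(len(numbers) + 1))
    for subset in subsets:
        if sum(subset) == target:
            return subset
    return None

numbers = [3, 34, 4, 12, 5, 2]
print(solve_by_brute_force(numbers, 9))  # (4, 5) - found the slow way
print(verify(numbers, 9, (4, 5)))        # True - checked in a flash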

Now most mathematicians think that NP does not equal P – i.e. there are no “simple” solutions to the hardest NP problems – and that may or may not be a good thing, as much of our internet commerce relies on encryption which is thought to rest on such a problem: effectively impervious to brute force attacks, if properly done, because of the time it would take to factor the product of the two very large prime numbers at the heart of the process.

(In Afghanistan in 2001 US cryptanalysts seemingly brute forced a Taliban Windows NT box but it used much weaker encryption than most electronic commerce.)

But what if it were the case that all seemingly NP problems were actually P problems? There are a lot of people studying this issue – but according to the New Scientist (their Christmas edition, the first in my subscription and delivered this morning, inspired me to write this) we should expect to wait at least until 2024 for an answer (by which point the problem – first formulated in 1971 – will have reached the age at which 50% of major mathematical problems have been solved).

Some problems once suspected to lie outside P have already been shown to be in P – primality testing, for instance. But there was also a big fuss in 2010 when a draft proof of P != NP was published (the proof turned out to be flawed).

This all matters: unlike, say, Fermat’s last theorem, proving P = NP would be likely to have more or less immediate effects on the lives of many of us (how would you feel if you knew it was possible, even if not likely, that someone had developed a way to quickly crack open all your internet credit card transactions?).

Of course, proving that P = NP does not necessarily mean we would immediately have polynomial-time solutions for all the current NP problems out there. But we should not expect that state to last for long – though it may be that the algorithms are exceptionally complex, rendering them very slow even on the fastest computing equipment. If history teaches us anything, though, it is that computers get faster (yes, I know there are physical bounds on that).

And, actually, I think the benefits to humanity would be enormous. The most likely immediate effects would be improvements in computing and operating system efficiency: faster computers for less money and/or less energy consumption would be a huge benefit. From that alone many other benefits would flow in terms of information availability and universality.

But there could be many, many others in medicine and linguistics and even in getting the trains to run on time!

Coming up… the lambda calculus


An example of a DFA state diagram
Image via Wikipedia

Another thing that The Annotated Turing taught me is what all those lambdas I have seen over the last 25 years were about – or at least it introduced me to what they were about.

So I have just ordered a copy of Structure and Interpretation of Computer Programs – and I can guess that some blog posts will eventually follow.
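
As a taster of what those lambdas can do, here is the standard Church numeral party trick rendered in Python – my own doodle, not anything from SICP itself:

# Church numerals: represent the number n as 'apply a function f
# to an argument n times', using nothing but lambdas.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_int(n):
    # convert a Church numeral back to an ordinary integer
    return n(lambda x: x + 1)(0)

two = succ(succ(zero))
print(to_int(add(two)(two)))  # 4 - arithmetic built from lambdas alone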

Oh no. Maybe I am turning into a LISP hacker – having finally reached the age that all these guys were when I first became aware of them (older, actually).

For all you first order/predicate logic fans out there


From The Annotated Turing: now reached page 226 and it is still good.

I think Charles Petzold has made a mistake – has he? Please read on and let me know.

Three axioms from Hilbert and Bernays

Petzold says these three axioms come from Hilbert and Bernays and they mean:

  1. Every member has a successor
  2. There exists a number that does not have a successor
  3. That if r is a successor to x and y, and x is the successor to s, then y is also the successor to s.

But surely the second axiom actually means there exists an x which is not a successor of any y?
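
To make the two readings concrete, here is my reconstruction in modern notation – writing Succ(x, y) for “y is the successor of x”; this is a guess at the formulas, not Petzold’s actual symbols:

\forall x \, \exists y \; Succ(x, y) – every number has a successor

\exists x \, \forall y \; \lnot Succ(y, x) – there is an x that is not the successor of any y

\forall r \, \forall x \, \forall y \, \forall s \; (Succ(x, r) \land Succ(y, r) \land Succ(s, x)) \rightarrow Succ(s, y) – successors behave consistently

On that rendering the second formula says exactly what I suggest above: there exists an x which is no y’s successor, i.e. a number with no predecessor (zero). Indeed, a number with literally no successor would flatly contradict the first axiom.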

Comments please.