Not one for @johnrentoul – but could 1 in 6 of us be dead this time next year?


It sounds like an outrageous idea, fit only for conspiracy theorists and the otherwise unhinged. But, at the risk of attracting the interest of John Rentoul’s “Questions To Which The Answer Is No” (QTWTAIN) list or sounding like I want a job on the Daily Express, the serious point is this: it is not impossible, if not yet likely, that as many as one in six of the planet’s human population could be felled inside a year by a (mutated) H5N1 flu virus.

I have just read a really very scary article by Debora MacKenzie to this effect in the current edition of the New Scientist:

Two research teams have found that a handful of mutations allow H5N1 to spread like ordinary flu while staying just as deadly, at least in ferrets.

Given that ordinary flu can infect a third of humanity in a season and that half the people who catch H5N1 die, the implications are not hard to fathom.

Sounds like the gravest of public health emergencies to me, and indeed the point of the article is not to scare people but to insist that governments and scientists get their act together in tackling the emergency.

So far this recent work has reached limited public consciousness mainly through the issue of censorship of the experimental results – the US authorities in particular are worried that describing the genetic modifications required to allow H5N1 to spread widely (currently it is widespread in birds but only very rarely transmitted to humans, and does not transmit from human to human at all) could be used by terrorists to make a bioweapon. Given the basic maths, the fear is not difficult to understand.

A multiple choice quiz on the Newton-Raphson method


Maybe I am not very good, but it took me two goes to get up to 5/6 correct answers on this multiple choice quiz on the Newton-Raphson method. The one I still have not got right is the first one.

Newton-Raphson method

This is another post inspired by the New Turing Omnibus, which is a great book, even if its author believes in some very stupid things.

The Newton-Raphson method is a way to find the roots of a polynomial equation. It does not always work, but it is a powerful tool when it does.

E.g., let us say we had y=3x^4+2x^2-x-15; a root of this is a value of x that causes the expression to evaluate to zero.

To find one, we first estimate a root, then evaluate the function there; if the result is not zero, we use as the next guess the point where the tangent to the function at the current guess evaluates to zero. We can compute the tangent from the derivative.

E.g., in the case here f(x) = 3x^4+2x^2-x-15, so f^\prime(x) = 12x^3 + 4x -1.

How do we find the point where the tangent evaluates to zero? Well, the formula for a tangent (or any straight line) is of the form y=kx + c and in this case, with our guess x_0 , that means y = f^\prime(x_0)x + c and, at the guessed point, f^\prime(x_0)x_0 + c = f(x_0).

Clearly the tangent line is zero when f^\prime(x_0)x = -c and we know that -c = f^\prime(x_0)x_0-f(x_0).

So f^\prime(x_0)x = f^\prime(x_0)x_0 - f(x_0)

Therefore our new guess, x , should be x_0 -\frac{f(x_0)}{f^\prime(x_0)}

If we try the example here, we can start with a guess of 2 – which evaluates to 39 and has a derivative of 103, giving a new guess of 1.62 (to two decimal places).

In turn this evaluates to 9.29 and so on…

Update: I tested this out on a spreadsheet and it took just seven iterations to reach a root at which the function evaluated to zero to nine decimal places.
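The iteration described above can also be sketched in a few lines of Python. This is a minimal version for the example function used in this post; a production version would guard against a zero derivative and a failure to converge:

```python
# Newton-Raphson for the example in the post:
# f(x) = 3x^4 + 2x^2 - x - 15, with f'(x) = 12x^3 + 4x - 1.
def f(x):
    return 3 * x**4 + 2 * x**2 - x - 15

def f_prime(x):
    return 12 * x**3 + 4 * x - 1

def newton_raphson(x, tolerance=1e-9):
    """Iterate x -> x - f(x)/f'(x) until |f(x)| is within tolerance."""
    iterations = 0
    while abs(f(x)) > tolerance:
        x -= f(x) / f_prime(x)  # the update rule derived above
        iterations += 1
    return x, iterations

root, steps = newton_raphson(2.0)
print(root, steps)
```

Starting from the guess of 2 used above, this homes in on a root near 1.42 in a handful of steps.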


A better calculator


Only a few days ago I was musing to myself that, back in the days when BASIC was all I had, I could easily see what a function plot looked like by writing a dozen or so lines of code, while today I have to load libraries and make specialised calls to their APIs.

Then along comes “A Better Calculator” and it is even easier!

How fair is my dime, part 2

Continuing my Bayesian experiment with a dime coin. Part one is here.

So, let’s look at the posterior (outcome) probability density function after the next two tosses. First a head:

[Graph: pdf after two heads, two tails]

Now we can see the results near the extreme possibilities (i.e. H = 1, H = 0) being eliminated completely. Interestingly, although in reality the evidence (two heads, two tails) points towards a fair coin, i.e. H = 0.5, the pdf here peaks with H below 0.5, reflecting our earlier ‘estimate’ of a coin biased towards tails.

Now, another tail:

[Graph: pdf after 2 heads, 3 tails]

The peak is becoming narrower now, with some suggestion of a bias to tails, but of course with the coin having been tossed only 5 times some such “bias” is unavoidable.

Now a head:

[Graph: pdf after six tosses, three heads, three tails]

And again…

[Graph: pdf after 7 tosses]

And again…

[Graph: pdf after 8 tosses]

And the ninth toss:

[Graph: pdf after 9 tosses]

Here the peak is becoming pronounced (and it also looks like the splines used to create the graph are showing signs of Runge’s phenomenon).

And the final, tenth, toss:

[Graph: pdf after 10 tosses]

This does suggest a bias but the pdf is still quite broad – so there is lots of room for further correction. Indeed the thing that strikes me most is how little the suggested bias is after five heads in a row.

Is my dime fair?

I made my one trip to the United States in November 2009 – four days in Washington DC (three working, one extra day’s holiday). It was a fantastic experience and I hope to go again, soon.

I brought back a dime (10 cent coin) from that trip and tonight decided to test it for its fairness as a tossing coin (really, I am working through an example from Data Analysis: A Bayesian Tutorial, and as I write this I do not know what the answer will be.)

I tossed the coin ten times and (taking FDR’s bust as the head, h, and liberty, peace and victory as the tails, t), I got this sequence thththhhhh. So 70% of the tosses resulted in a head – too many to suggest the coin is fair? Let’s see.

I am going to use this formula, based on Bayes’s theorem:

pdf(H \vert \{data\},I) \propto pdf(\{data\} \vert H,I) \times pdf(H \vert I)

In other words, the probability density function (pdf) for bias (H = 0 is a double-tailed coin, H = 0.5 a perfectly fair coin, H = 1 a double-headed coin) is proportional to the product of the likelihood of the observed data and the initially guessed (prior) pdf for H (in all cases I means the other, initial, conditions in the universe).

I begin by stipulating that I know nothing about the coin’s bias, in other words setting the prior pdf to 1 for all possible values of H. After two tosses we know that the probability of H = 1 is zero and, similarly, that the probability of H = 0 is also zero. But what of the other values of H?

Well, a properly normalised pdf must satisfy \int_0^1 pdf(H \vert \{data\},I)\,dH = 1, but we ignore that here and just look at the relative values.

The data pdf is governed by the binomial distribution \propto H^R(1-H)^{N-R} where N is the total number of tosses and R is the number of heads returned.

So for H = 0.1 we get 0.1^1 \times 0.9^1 = 0.09; for H = 0.2, 0.2^1 \times 0.8^1 = 0.16; for H = 0.3, 0.3 \times 0.7 = 0.21; for H = 0.4, 0.4 \times 0.6 = 0.24; and for H = 0.5, 0.5 \times 0.5 = 0.25. The results are obviously symmetrical about 0.5 and, as the prior pdf is 1 for all values of H, we can say that the most likely view is that the coin is fair, but with a high degree of uncertainty (see graph).
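These values are easy to check with a couple of lines of Python – this is nothing more than the unnormalised binomial shape H^R(1-H)^{N-R} under the flat prior:

```python
# Unnormalised posterior shape for bias H after N tosses with R heads,
# assuming a flat prior (pdf = 1 for every H in [0, 1]).
def posterior_shape(H, R, N):
    return H**R * (1 - H)**(N - R)

# After one head and one tail (N = 2, R = 1):
for H in (0.1, 0.2, 0.3, 0.4, 0.5):
    print(H, posterior_shape(H, R=1, N=2))
```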

[Graph: pdf after one head, one tail]

So, now we toss an additional tail and take the new pdf (as in the graph) as our prior – what do we get now?

Well, for H = 0.1 we now get for the data (0.1)^1 \times (0.9)^2 = 0.081, which we multiply by our new prior of 0.09 to get 0.00729. For H = 0.2 this becomes (0.2)^1 \times (0.8)^2 \times 0.16 = 0.02048 (of course, the actual numbers mean little here; we are merely tracing the shape of the graph). In contrast, H = 0.8 would now give (0.8)^1 \times (0.2)^2 \times 0.16 = 0.00512. The new graph has the shape shown below:

[Graph: pdf shape after two tails and one head]

This coin may be biased but there is still a lot of uncertainty here…

As it’s now after 12.30 in the morning here the rest will have to wait…


Richard Stallman plainly lacks any sense of irony


Richard Stallman (aka “RMS” or the “last of the true hackers” according to Steven Levy’s classic book) is a great, but very difficult, man.

His strops and sulks about Linus Torvalds should not blind us to the fact that he does have a serious and valid point when he insists that what most people call “Linux” should really be called “GNU/Linux”.

But he would also seem to be a man without much of a sense of irony because he has used the propaganda channel funded and founded by the Kremlin, Russia Today (RT), to launch an attack on Facebook as an enemy of Freedom.

RT has marketed itself, in Britain at least, as the channel of choice for conspiracy nuts, the enemies of science and “truthers”. Hardly a surprise when its paymasters in the Kremlin are desperate to hold off any advances towards democracy in Russia and remain the last line of defence of the Syrian and Iranian dictatorships in the international arena.

Shame on you Richard Stallman.

(The interview with RT was conducted over a month ago but as you can guess I am not a regular viewer and it just turned up in the “CodeProject” email this morning.)

Better algorithms, not better hardware, are what make your computer faster, faster


Many people have heard of Moore’s Law – which states that the number of transistors that can be packed into a piece of silicon doubles every two years (or every 18 months in another formulation). More transistors mean greater speed, more and cheaper memory and so on … so in the thirty-two years since I first saw a real “micro-computer”, raw processing power has increased by over 30,000 times on even the conservative formulation of the law.

The contrast with other engineering improvements is enormous: typical lightbulbs have become six or seven times more energy efficient in that period – something that LED technologies might double again in the next few years. Great, but nothing like computing.

But this contribution from the electronics engineers, while of course impressive, is minuscule in comparison to some of the improvements won by the mathematicians and computer scientists who have perfected better, faster methods to solve common problems. These improvements in algorithms are not universal, unlike Moore’s Law, but they can be much, much bigger – often a factor of millions of times faster.

We are on the verge of having another such improvement revealed – to the “fast Fourier transform” (FFT) algorithm, which allows signals to be broken down into their components for compression and other purposes.
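For anyone who has not met it, the transform itself is easy to sketch. Below is the plain discrete Fourier transform in pure Python – the FFT is simply a fast algorithm for computing this same sum, and this little sketch has nothing to do with the new, faster variant being reported:

```python
import cmath
import math

# The discrete Fourier transform: decompose a sampled signal into
# its frequency components. (An FFT computes exactly this, faster.)
def dft(signal):
    N = len(signal)
    return [sum(signal[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N))
            for k in range(N)]

# A signal made of one strong component (1 cycle per window) and a
# weaker one (3 cycles per window):
N = 32
signal = [math.sin(2 * math.pi * n / N)
          + 0.5 * math.sin(2 * math.pi * 3 * n / N)
          for n in range(N)]
spectrum = [abs(c) for c in dft(signal)]
# The spectrum peaks at bins 1 and 3 (plus their mirror images).
```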

The FFT improvement is said to be (the paper is yet to be published and only the abstract is in the public domain) around a factor of 10 – which could have big implications for any application where fat data has to be pushed over a thin line and some loss of quality is permissible (mobile telephony is one obvious example but home sensors could be another).

The FFT image compression algorithm was patented some years ago, but as a pure mathematical procedure such a patent has no application in UK (or EU) law (and surely that is right – how can it be possible to patent something which already exists?) – so it is not clear what the legal position of this innovation will be.

Once again, though, all this shows the fundamental importance of maths and science education to us all – and the importance of pure mathematical and scientific research. A little more emphasis on this and a little less on the “needs of industry” (very few employers are interested in research as opposed to getting their products out of the door a bit faster or cheaper) would be a welcome step forward for policy makers.

Simple point I had not previously realised about Boolean functions

Here is a typical Boolean truth table:

A    B    S        O

0    0    0        0
0    0    1        0
0    1    0        0
0    1    1        1
1    0    0        1
1    0    1        0
1    1    0        1
1    1    1        1

By now I am sure more than a few will have recognised this as a simple multiplexer circuit, with S as the select line – set S to 1 and the output O is the value of input B, otherwise O is the value of A.

Like me you were probably taught that this could be simplified by ORing together all the rows with a 1 output, e.g.:

(\neg A \wedge B \wedge S) \vee (A \wedge \neg B \wedge \neg S) \vee (A \wedge B \wedge \neg S) \vee (A \wedge B \wedge S)

… which can be further simplified to …

(A \wedge \neg S) \vee (B \wedge S)

Well, I now know this is the disjunctive normal form and that there is also a conjunctive normal form, which is, of course, created by ANDing together a term for each row that gives a 0 output:

(A \vee B \vee S) \wedge (A \vee B \vee \neg S) \wedge (A \vee \neg B \vee S) \wedge (\neg A \vee B \vee \neg S)

which may also be slightly simplified to (A \vee B) \wedge (A \vee \neg B \vee S) \wedge (\neg A \vee B \vee \neg S) .
I know this is a simple observation, but it was new to me! Another thing I picked up from A. K. Dewdney’s book.
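As a sanity check, the truth table, the disjunctive normal form and the conjunctive normal form can all be compared exhaustively in a few lines of Python (using the table’s variable names, with S as the select line):

```python
from itertools import product

def mux(a, b, s):
    # The truth table: output is B when the select line S is set, else A.
    return b if s else a

def dnf(a, b, s):
    # Disjunctive normal form: OR together a term for each 1-output row.
    return ((not a and b and s) or (a and not b and not s)
            or (a and b and not s) or (a and b and s))

def cnf(a, b, s):
    # Conjunctive normal form: AND together a term for each 0-output row.
    return ((a or b or s) and (a or b or not s)
            and (a or not b or s) and (not a or b or not s))

# All eight rows of the truth table agree across the three forms:
for a, b, s in product([False, True], repeat=3):
    assert mux(a, b, s) == dnf(a, b, s) == cnf(a, b, s)
```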

Calculating the Kronecker product


I realise that in my previous blog post I showed a Kronecker product but did not explain how it was created, so here goes.

Let \ K_1 = \left( {\begin{array}{cc} a & b \\ c & d \\ \end{array}} \right)

Let \ K_2 = \left( {\begin{array}{ccc} z & y & x \\ w & v & u \\ t & s & r \\ \end{array}} \right)

Then K_2 \otimes K_1 = \left( {\begin{array}{cccccc} za & zb & ya & yb & xa & xb \\  zc & zd & yc & yd & xc & xd \\  wa & wb & va & vb & ua & ub \\  wc & wd & vc & vd & uc & ud \\  ta & tb & sa & sb & ra & rb \\  tc & td & sc & sd & rc & rd \\  \end{array}} \right)
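For completeness, here is a small pure-Python sketch of the product (plain lists of lists; NumPy users would just call numpy.kron): each entry of K_2 scales a full copy of K_1, giving the block layout above.

```python
def kron(k2, k1):
    # Kronecker product: entry (i, j) of the result is
    # k2[i // rows1][j // cols1] * k1[i % rows1][j % cols1].
    r1, c1 = len(k1), len(k1[0])
    r2, c2 = len(k2), len(k2[0])
    return [[k2[i // r1][j // c1] * k1[i % r1][j % c1]
             for j in range(c2 * c1)]
            for i in range(r2 * r1)]

K1 = [[1, 2],
      [3, 4]]
K2 = [[0, 5],
      [6, 7]]
result = kron(K2, K1)
for row in result:
    print(row)
```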