Poisson distribution puzzle

I asked this on math.stackexchange but have yet to get an answer, so will try here too…

In their monograph “Queues”, Cox and Smith state (paraphrased – this is p. 5):

In the interval (t, t+Δt) the probability of no arrivals in a completely random process is 1 − αΔt + o(Δt), of exactly one arrival αΔt + o(Δt), and of more than one arrival o(Δt), where α is the mean rate of arrival.

I cannot follow this… here is my thinking – we take N to be the probability of no arrivals, W to be the probability of one arrival, Z to be the probability of more than one arrival, and A to be the probability of any arrivals.

So 1=N+W+Z=N+A

By my understanding of Cox and Smith:

1 = 1 − αΔt + o(Δt) + αΔt + o(Δt) + o(Δt) = 1 + 3o(Δt), which is surely nonsense.

So, what have I got wrong here?
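For what it's worth, here is a quick numerical check in Python (assuming, as Cox and Smith do, that "completely random" means a Poisson process of rate α, and picking α = 2 and a few small Δt values arbitrarily) that the stated probabilities do at least hold to first order – each residual, divided by Δt, shrinks as Δt does, which is all that o(Δt) claims:

import math

alpha = 2.0  # arbitrary mean arrival rate, for illustration only

for dt in [0.1, 0.01, 0.001]:
    mu = alpha * dt
    p0 = math.exp(-mu)       # exact Poisson P(no arrivals)
    p1 = mu * math.exp(-mu)  # exact Poisson P(exactly one arrival)
    p2plus = 1 - p0 - p1     # exact Poisson P(more than one arrival)
    # residuals against 1 - alpha*dt, alpha*dt and 0, scaled by dt:
    # all three ratios tend to zero as dt -> 0, i.e. each residual is o(dt)
    print(dt, (p0 - (1 - mu)) / dt, (p1 - mu) / dt, p2plus / dt)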

So, how good are the English premiership top seven?

oops: got the maths wrong first time – this is the corrected version

Last weekend was said to be one of the worst in the history of English bookmakers (I know, your heart bleeds) as for the first time ever all seven top teams in the English premiership won – so ensuring a lot of fixed odds accumulator payouts.

But just how big an upset was this “for the books”?

Well, the English premier league has been going since the summer of 1992, so this is its 22nd season. In the first three seasons there were 22 teams, and since then there have been 20.

So the total game “weeks” have been:

1992/93 – 1994/95: 42 x  3 = 126
1995/96 – 2012/13: 38 x 18 = 684
2013/14 (to last week): 21

Giving a total of 831. So just one week out of 831 makes it sound like the top seven are not very reliable.

But, of course, this ignores the fact that – on most weeks – at least one of the top seven is playing another team in the top seven (as is certainly happening this week, with Chelsea playing Manchester United).

On any given week the chance of this NOT happening is (for a 20-team league):

\frac{13}{19} \times \frac{12}{17} \times \frac{11}{15} \times \frac{10}{13} \times \frac{9}{11} \times \frac{8}{9} \times \frac{7}{7} = 0.198

While for a 22-team league the probability is:

\frac{15}{21} \times \frac{14}{19} \times \frac{13}{17} \times \frac{12}{15} \times \frac{11}{13} \times \frac{10}{11} \times \frac{9}{9} = 0.248
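These products drop straight out of treating each round as a uniformly random pairing of the teams – a rough sketch (real fixture lists are not random, of course):

from fractions import Fraction

def no_clash_probability(n_teams, n_top=7):
    # probability that none of the n_top teams are drawn
    # against each other in a random round of n_teams
    p = Fraction(1)
    remaining, others = n_teams, n_teams - n_top
    for _ in range(n_top):
        # the next top team must be paired with an unmatched non-top team
        p *= Fraction(others, remaining - 1)
        remaining -= 2  # that pair leaves the pool
        others -= 1
    return p

print(float(no_clash_probability(20)))  # 0.198...
print(float(no_clash_probability(22)))  # 0.247...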

So, in other words, there have been only 126 x 0.248 + 705 x 0.198 ≈ 171 weeks (on average – I'd have to look at the precise run of games each season to be sure) where this was even possible.

And how often should the top teams win? If games were decided by tossing coins (and we ignore draws, as they are 'wins' for the bookies) then the top seven would all win in only 1 out of every 128 such weeks. Having managed it just once in roughly 171 qualifying weeks – when about 1.3 clean sweeps would be expected – might suggest they are a little worse than that, but we cannot be sure: it probably needs a bigger sample size.
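Putting those back-of-envelope numbers together in one place (using the coin-toss model and week counts above):

weeks_22 = 42 * 3        # 1992/93 - 1994/95, the 22-team seasons
weeks_20 = 38 * 18 + 21  # 1995/96 - 2012/13, plus this season so far
no_clash_weeks = weeks_22 * 0.248 + weeks_20 * 0.198
print(round(no_clash_weeks))      # ~171 weeks where a clean sweep was possible
print(no_clash_weeks * 0.5 ** 7)  # ~1.3 clean sweeps expected from coin tossing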

A further suggestion that P!=NP

There is a class of mathematical problems known as "P": these can be solved in "polynomial time" or, to cut to the chase, they are problems we know how to solve (algorithmically) when we see them. A trivial example: if you are given a number X, how do you calculate Y, which is X times 10? (If you don't know the algorithm to multiply a base 10 number by 10 then stop reading this now!)

There is another class of problems – the "NP" (for "nondeterministic polynomial") problems – for which we know how to check whether a proposed solution is correct, but not how to solve the problem efficiently in the first place. A typical example is producing the factors of a very large number created by multiplying two very large prime numbers together. If I gave you the factors you could quickly verify that their product was the larger number, but if I gave you only the larger number you might be here until the end of the universe trying to figure out what the factors were, as the only straightforward way we know of (today) is to test candidate factors one by one.

But it may just be the case that "NP" problems are actually "P" problems after all, and it's just that we have yet to discover the algorithms to solve them. If we could show that NP=P in this way then we could do things like draw up train and class timetables directly, instead of fiddling with expensive and approximate software tools to get them right.

But we'd also be able to crack internet encryption, which essentially relies on large numbers produced by multiplying two very large primes together (the basis of public and private keys). These "one way functions" – bits of maths that are easy to do one way (multiply the two primes) but essentially impossible the other way (factor their product) – are at the core of internet encryption.
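A toy illustration of the asymmetry (nothing like real cryptography – real keys use primes hundreds of digits long, and real factoring algorithms are far cleverer than this brute-force search – but the shape of the problem is the same):

import time

p, q = 1_000_003, 1_000_033  # two modest primes, for illustration
n = p * q                    # the "easy" direction: effectively instant

def smallest_factor(n):
    # the "hard" direction: test candidate factors one by one
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n

start = time.perf_counter()
print(smallest_factor(n))                # recovers p, eventually
print(time.perf_counter() - start, "s")  # ~a million trial divisions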

Encryption and mathematics are deeply entwined and so communications intelligence agencies like the NSA in the United States and GCHQ in the UK are also centres of mathematical excellence – trying to break codes and, we must assume, test whether P=NP after all.

So this fascinating story about the state of encryption software in the US might suggest to us that the NSA have not been able to prove P=NP (most mathematicians think P!=NP but that is not proved either).

(The story essentially suggests that the NSA paid to have a widely used cryptographic random number generator – Dual_EC_DRBG – operate like an Enigma machine whose starting positions are always known. As with Enigma, once you had that, everything else would fall into place and what looks like a random sequence of letters could quickly be converted to plain text.)

Of course it could all be paranoid nonsense, or even a cover to make us think that they haven't cracked the P=NP problem (after all, you'd guard that secret with everything you'd got – except internet encryption!) – paying out a few million dollars to convince people you had to doctor one way functions because you could not crack them any other way would be money very well spent!

A further point about Scottish referendum polls

Free Celtic Nationalist Jesus Surfers Sunshine Vote Yes Van (Photo credit: GerardFerry)

One (smallish) point I left out of my discussion of opinion polling a few days ago was that the “margin of error” for a 95% confidence interval varies according to the reported score.

This range of possible error is actually highest for parties or opinions that score 50% – which is more or less where the "no" vote in the Scottish referendum is polling now. For a 1000-sample poll at a 95% confidence interval, the margin at 50% is:

\pm 2 \times \sqrt{\frac{0.5 \times 0.5}{1000}} = \pm 3.2\%

Good news for the yes camp? Not really: apart from the obvious point that the reported figure is mathematically equally likely to be an over-estimate as an under-estimate, the corollary is that the error on smaller figures is smaller. For 30%, roughly where the yes campaign is, the error is:

\pm 2 \times \sqrt{\frac{0.3 \times 0.7}{1000}} = \pm 2.9\%
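Both figures come from the same standard formula – a minimal sketch, assuming a simple random sample and the usual normal approximation:

from math import sqrt

def margin_of_error(p, n=1000):
    # approximate 95% margin of error for a reported share p
    # from a simple random sample of size n
    return 2 * sqrt(p * (1 - p) / n)

print(margin_of_error(0.5))  # 0.0316... i.e. +/- 3.2 points
print(margin_of_error(0.3))  # 0.0289... i.e. +/- 2.9 points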

More on the Riemann sphere

In my previous discussion on the Riemann Sphere I talked of it as a closed sphere of infinite extent, but it can also be represented as a sphere of finite extent.

Display of a complex number on the Riemann sphere (Photo credit: Wikipedia)

In this case imagine a sphere centred on the origin and of unit radius. Then any complex number can be represented as a point on the sphere's surface simply by drawing a line from the north pole to the number's position on the (complex) plane – where the line intersects the sphere is the point that represents the number (numbers of magnitude less than one are represented by the underside of the sphere).
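In coordinates, this is the standard stereographic picture – a small sketch, assuming the unit sphere centred on the origin with the projection taken from the north pole (0, 0, 1):

def to_sphere(z: complex):
    # map a complex number to the point where the line from the
    # north pole (0, 0, 1) to z in the equatorial plane meets the sphere
    m = abs(z) ** 2
    return (2 * z.real / (1 + m), 2 * z.imag / (1 + m), (m - 1) / (1 + m))

print(to_sphere(0j))        # (0, 0, -1): the origin maps to the south pole
print(to_sphere(0.5 + 0j))  # third coordinate negative: the "underside"
print(to_sphere(100 + 0j))  # large numbers crowd towards the north pole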

This is, in one sense, a consequence of the infinity of the continuum – we are projecting a plane of infinite extent onto a surface of finite extent, but as the points of both sets have the same cardinality there can be a one-to-one correspondence – it's another thing of beauty, I think.

y^2 + x^2 = 1

This entry is based on the prologue for the book Elliptic Tales: Curves, Counting, and Number Theory (challenging but quite fun reading on the morning Tube commute!):
y^2 + x^2 = 1 is the familiar equation of the unit circle, and in the prologue the authors show that if a straight line with a rational slope intersects the circle at two points, one of which is rational – i.e. of the form (x, y) = (\frac{a}{b},\frac{c}{d}) – then the second point is also rational, and that such lines trace out the full set of rational points on the circle.

But then the book goes further –

We say that circle C has a “rational parametrization”. This means that the coordinates of every point on the circle are given by a couple of functions that are themselves quotients of polynomials in one variable. The rational parametrization for C we have found is the following:

(x,y) = (\frac{1 - m^2}{1+m^2},\frac{2m}{1+m^2})

So this is what I wanted to check… it surely isn't claiming that all the points on the circle are rational, is it? Merely that the formula above, as m (which corresponds to the slope of our line through the two points) ranges over the rationals, generates the full set of rational points on the circle – and if m is not rational then the second point will not be rational either? Is that right?
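A quick sanity check of the rational-in, rational-out direction, using Python's exact rational arithmetic (the slope value here is arbitrary):

from fractions import Fraction

def point_on_circle(m):
    # the book's parametrization of the unit circle
    return (1 - m * m) / (1 + m * m), 2 * m / (1 + m * m)

x, y = point_on_circle(Fraction(3, 5))  # any rational slope
print(x, y)                # 8/17 15/17 - an exact rational point
print(x * x + y * y == 1)  # True: it really is on the circle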

Maths or physics ‘A’ level student? Then read this book…

Differences between Euclidean and non-Eucl. geometries (Photo credit: Wikipedia)

The Language of Mathematics: making the invisible visible

Although I want to recommend this book warmly, it is not what I expected when I started reading it: just another popular explanation of maths that might contain an insight or two.

Instead it is much more like a tour d'horizon of the first year of a degree course in mathematics – but without much of the maths itself.

We get a quick tour of number theory, logic, infinitesimals, Euclidean and non-Euclidean geometry, group theory, topology, Gaussian distributions, Bayesian probability, relativistic space-time and super-symmetry.

The absence of detailed mathematical explanations for many of the ideas presented here (and the occasionally sloppy writing) can be a bit frustrating, but generally Keith Devlin gets away with it because he paces the book well and sticks to his underlying theme – that mathematics is the science of patterns and that patterns underlie much of reality.

As such the book taught me a lot and reminded me of quite a bit too – I wish I had read it when I was an ‘A’ level student or even when I was an undergraduate.

Well worth buying and reading – though strangely it seems to be about to go out of print in Britain. My copy suggests it is the 13th impression of the 2000 edition, so clearly demand is high, at least in the United States. Perhaps readers on this side of the pond are put off by two of the least important but most egregious errors: the description of the Scot James Clerk Maxwell and the Scots-Irish William Thomson (Lord Kelvin) as "English" within three pages of each other.