The binomial distribution, part 2

Figure: binomial probability mass function (image via Wikipedia).

(Part 1 is here – these notes are to assist me, rather than contain any real news!)

So, if the probability that an event will happen to a single entity in a unit of time is p, and the probability that it will not happen is q = 1 - p, what is the probability that a large number of events, k, will take place?

Considering the Uranium-235 example again, let’s say there are a very large number, N, of atoms: how many will decay and emit an \alpha-particle in a given second?

Well, we know the mean number of decays we should expect: Np. But this is a random process and so will not always deliver that result; instead there will be a distribution of results around this mean.
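To see that scatter around Np for myself, here is a rough simulation sketch. The numbers (1000 entities, p = 0.01, 500 runs) are toy values I have made up so that it runs quickly; they have nothing to do with real uranium, and the helper name `simulate_counts` is just my own.

```python
import random

def simulate_counts(n_entities, p, n_runs, seed=0):
    """Return, for each run, how many of n_entities had the event happen."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        sum(1 for _ in range(n_entities) if rng.random() < p)
        for _ in range(n_runs)
    ]

# Hypothetical toy numbers, far smaller than any real sample of atoms.
N_TOY, P_TOY, RUNS = 1000, 0.01, 500
counts = simulate_counts(N_TOY, P_TOY, RUNS)

print("expected mean Np:", N_TOY * P_TOY)
print("observed mean   :", sum(counts) / RUNS)
print("smallest/largest:", min(counts), max(counts))
```

The observed mean should sit close to Np, while individual runs land on either side of it.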

What does this distribution look like – i.e., what is its probability mass function (pmf)?

For exactly k decays let’s call this f(k; N,p) .

To show where the pmf comes from, let’s look at a much simpler example – tossing a coin four times and seeing if we get exactly one head, i.e. k=1, N=4.

One way we could get this is HTTT. For a fair coin (p = q = \frac{1}{2}) the probability of that particular sequence is p^kq^{N-k} = pq^3 = \frac{1}{16}. But that is not the only way exactly one head could be delivered: there are four ways (HTTT, THTT, TTHT, TTTH), and so the probability of exactly one head is \frac{4}{16}.

(For two heads we have six ways of arranging the outcome: HHTT, HTHT, HTTH, THHT, THTH, TTHH, and so the probability is \frac{6}{16}. For three heads the probability is the same as for three tails (i.e. for one head), namely \frac{4}{16}, and the probabilities for all heads and all tails are both \frac{1}{16}. Together these cover all the possibilities and add up to \frac{1 + 4 + 6 + 4 + 1}{16} = 1.)
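The counting above is easy to check by brute force. This is only a sketch for the four-toss case: it lists all 16 possible sequences and tallies how many contain each number of heads (the variable names are my own).

```python
from collections import Counter
from itertools import product

N_TOSSES = 4

# Every possible sequence of four tosses, e.g. "HTTT".
sequences = ["".join(s) for s in product("HT", repeat=N_TOSSES)]

# Tally how many sequences contain exactly k heads.
ways = Counter(seq.count("H") for seq in sequences)

for k in range(N_TOSSES + 1):
    print(f"{k} heads: {ways[k]} of {len(sequences)} sequences")
```

For a fair coin every sequence is equally likely, so dividing each tally by 16 gives the probabilities quoted above.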

The generalisation of this gives us a pmf thus: f(k; N,p) = {}_NC_k\, p^k q^{N-k}, where {}_NC_k is the binomial coefficient, read as “N choose k”: the number of ways of placing k successes among N trials.

{}_NC_k = \frac{N!}{k!(N-k)!}
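As a check that the formula reproduces the coin numbers, here is a minimal sketch using Python's `math.comb` for the binomial coefficient (available from Python 3.8). The function name `binomial_pmf` is my own, and I use `Fraction` purely so the results come out as exact fractions.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """f(k; N, p) = C(N, k) * p**k * q**(N - k), with q = 1 - p."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

# Fair coin, four tosses: the example worked through by hand above.
p_fair = Fraction(1, 2)
pmf_values = [binomial_pmf(k, 4, p_fair) for k in range(5)]

print([str(v) for v in pmf_values])
print("total:", sum(pmf_values))
```

Python prints the fractions in lowest terms, so \frac{4}{16} appears as 1/4 and \frac{6}{16} as 3/8.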

There are roughly 2.6 \times 10^{21} atoms in a gramme of Uranium-235, and calculating factorials of such large numbers directly requires an awful lot of computing power – my GNU calculator has been at it for some time now, maxing out one CPU on this box for the last 14 minutes, so I guess I am going to have to pass on my hopes of showing you some of the odds.
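For what it is worth, the giant factorials can be sidestepped. Cancelling (N-k)! from the top and bottom of the binomial coefficient leaves a product of only k terms, {}_NC_k = \prod_{i=0}^{k-1} \frac{N-i}{i+1}, and summing logarithms of those terms avoids any overflow. The sketch below does that. Two loud assumptions: `N_ATOMS` is Avogadro's number divided by 235, and `P_ASSUMED` is an illustrative per-second decay probability of the size a half-life of about 7 \times 10^8 years would imply; neither figure is taken from Part 1, and the function name is my own.

```python
import math

def log_binomial_pmf(k, n, p):
    """Natural log of f(k; N, p), computed without forming any factorial.

    Uses C(N, k) = prod_{i=0}^{k-1} (N - i) / (i + 1), which is
    N! / (k! (N-k)!) with the (N-k)! cancelled, and sums logarithms
    instead of multiplying. Assumes 0 < p < 1.
    """
    log_coeff = math.fsum(math.log((n - i) / (i + 1)) for i in range(k))
    # log1p(-p) is log(q) = log(1 - p), accurate even when p is tiny.
    return log_coeff + k * math.log(p) + (n - k) * math.log1p(-p)

# ASSUMPTION: atoms in one gramme of U-235 (Avogadro's number / 235).
N_ATOMS = 2_560_000_000_000_000_000_000
# ASSUMPTION: illustrative chance that one atom decays in one second.
P_ASSUMED = 3.1e-17

k_mean = round(N_ATOMS * P_ASSUMED)  # the expected count, Np
prob = math.exp(log_binomial_pmf(k_mean, N_ATOMS, P_ASSUMED))

print("mean number of decays per second:", k_mean)
print("probability of exactly that many:", prob)
```

With these assumed inputs the mean is about 79,000 decays a second, and the chance of hitting exactly that count is only of the order of one in a thousand, because the probability is spread over many neighbouring values of k. The loop runs k times rather than N times, which is why it finishes quickly.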