# The binomial distribution, part 2

(Part 1 is here – these notes are to assist me, rather than contain any real news!)

So, if the probability that an event will happen to a single entity in a unit of time is $p$ and the probability it will not happen is $1 - p = q$, what is the probability that a large number of events, $k$, will take place?

Considering the Uranium-235 example again, let’s say there is a very large number, $N$, of atoms: how many will decay and emit an $\alpha$-particle in a given second?

Well, we know the mean number of decays we should expect: $Np$. But this is a random process and so will not always deliver that result; instead there will be a distribution of results around this mean.

What does this distribution look like – i.e., what is its probability mass function (pmf)?

For exactly $k$ decays let’s call this $f(k; N,p)$.

To show where the pmf comes from, let’s look at a much simpler example – tossing a coin four times and seeing if we get exactly one head, i.e. $k=1, N=4$.

One way we could get this is like this: HTTT. For a fair coin, $p = q = \frac{1}{2}$, so the probability of that particular sequence is $p^kq^{N-k} = pq^3 = \frac{1}{16}$. But that is not the only way exactly one head could be delivered: there are four ways, HTTT, THTT, TTHT and TTTH, each with probability $\frac{1}{16}$, and so the probability of exactly one head is $\frac{4}{16}$.

(For two heads we have six ways of arranging the outcome: HHTT, HTHT, HTTH, THHT, THTH, TTHH and so the probability is $\frac{6}{16}$. For three heads the probability is, by symmetry, the same as for three tails (i.e. for one head), namely $\frac{4}{16}$, and the probabilities for all heads and all tails are both $\frac{1}{16}$. Cumulatively this covers all the possibilities and adds up to $\frac{1 + 4 + 6 + 4 + 1}{16} = 1$.)
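The counting above can be checked by brute force. This is just a minimal sketch that lists every possible sequence of four tosses and tallies the heads; the function name `head_counts` is my own invention for illustration.

```python
from itertools import product
from collections import Counter


def head_counts(n_tosses):
    """Return a list whose k-th entry is the number of toss sequences with k heads."""
    # product("HT", repeat=n) yields every sequence of n tosses as a tuple
    counts = Counter(seq.count("H") for seq in product("HT", repeat=n_tosses))
    return [counts[k] for k in range(n_tosses + 1)]


ways = head_counts(4)
total = sum(ways)  # number of equally likely sequences, 2**4 for a fair coin
for k, w in enumerate(ways):
    print(f"{k} heads: {w}/{total}")
```

For a fair coin every sequence is equally likely, so dividing each count by the total gives the probabilities quoted above.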

The generalisation of this gives us a pmf thus: $f(k; N,p) = {}_NC_k\ p^kq^{N-k}$, where ${}_NC_k$ is the binomial coefficient, spoken as “N choose k”: the number of ways of distributing $k$ successes among $N$ trials.

${}_NC_k = \frac{N!}{k!(N-k)!}$
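For small $N$ the formula can be evaluated directly. Here is a short sketch, assuming Python 3.8 or later so that `math.comb` is available for the binomial coefficient; `binomial_pmf` is simply a name I have picked for the illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """f(k; N, p) = C(N, k) * p**k * q**(N - k), with q = 1 - p."""
    q = 1.0 - p
    return comb(n, k) * p**k * q**(n - k)


# The four-toss fair coin example: k = 0..4 heads
probs = [binomial_pmf(k, 4, 0.5) for k in range(5)]
print(probs)       # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(sum(probs))  # 1.0
```

These are the same $\frac{1}{16}, \frac{4}{16}, \frac{6}{16}, \frac{4}{16}, \frac{1}{16}$ found by listing the outcomes by hand.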

There are approximately $10^{21}$ Uranium atoms in a gramme of the substance, and calculating factorials of such large numbers directly requires an awful lot of computing power. My GNU calculator has been at it for some time now, maxing out one CPU on this box for the last 14 minutes, so I guess I am going to have to pass on my hopes of showing you some of the odds.
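One common way around the giant factorials is to work with logarithms instead, since $\ln(x!)$ is available as `lgamma(x + 1)` in Python's `math` module. The sketch below does that for a moderately large $N$; the values of `n` and `p` are hypothetical ones chosen only for illustration, not the real Uranium figures.

```python
from math import exp, lgamma, log, log1p


def log_binomial_pmf(k, n, p):
    """Natural log of f(k; N, p), using lgamma(x + 1) = ln(x!). Needs 0 < p < 1."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    # log1p(-p) is ln(q) = ln(1 - p), computed accurately when p is small
    return log_coeff + k * log(p) + (n - k) * log1p(-p)


# Hypothetical numbers for illustration only: a million atoms, each with a
# one-in-a-thousand chance of decaying in the time unit (mean Np = 1000)
n, p = 10**6, 1e-3
for k in (900, 1000, 1100):
    print(k, exp(log_binomial_pmf(k, n, p)))
```

This is not a complete answer for a gramme of Uranium: with $N$ around $10^{21}$ the log-factorials are themselves of order $10^{22}$, and subtracting them in double precision would lose the result, so a more careful approach would be needed there.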