Let us now praise insufficiently famous men: Claude Shannon


Claude Elwood Shannon (1916-2001)

Claude Elwood Shannon (1916-2001) (Photo credit: Wikipedia)lies absolutely all the communication on the internet.

Claude Shannon, who was born exactly a century ago, is the reason you are reading this – his work on signal processing lies behind all the communication on the internet.

Shannon’s master work was his MSc thesis – the unreachable target which is dangled in front of every computer science masters student.

I refute it thus?


English: Schrödinger's Cat, many worlds interp...

English: Schrödinger’s Cat, many worlds interpretation, with universe branching (Photo credit: Wikipedia)

A while ago I read Max Tegmark‘s “Our Mathematical Universe” (Amazon link) – which introduced me to the concept of “Quantum Suicide” and the idea that if the “many worlds” interpretation of quantum physics is correct then, if death is a result of quantum processes (e.g., does this particular atomic nucleus decay, releasing radiation, causing a mutation, leading to cancer and so on), then, actually, we can expect to live forever – in the sense that our consciousness would continue on in that universe where all the quantum randomness was for the best.

It’s a powerful, if quite mind-bending idea, and it had quite a profound effect on me.

Until, that is, at the end of January, when I slipped on a London street, smashed my face on the pavement and swallowed the broken piece of tooth. Three months later the pain in my upper left arm – with which I tried to break my fall, is a constant reminder that maybe Niels Bohr and the Copenhagen Interpretation was right after all.


Love and Math: a review


Dansk: Matematikeren Edward Frenkel under et f...

Dansk: Matematikeren Edward Frenkel under et foredrag om geometrisering af sporformler på University of California, Berkeley, 16. september 2010. (Photo credit: Wikipedia)

When I started reading this book by Edward Frenkel (Amazon link here) I became so engrossed in it on my morning commute that I missed my Tube stop – and the next one. I got an insight into life in the Soviet Union on the cusp of perestroika from a contemporary (if somewhat higher achieving student), including into how academic (and anti-scientific in the sense that some were desperate to discredit Einstein) anti-Semitism was on the increase from the 1970s onwards, as well as a new take on group theory in geometry and an introduction to braids.

It really was great – the main text skated over the maths while the footnotes explained it in some detail. Not all of it was perfect – the attempt to explain symmetry at the start left me confused about something I thought I understood – but it seemed to all hit the right note.

But it seems my Leicester Square moment was the pinnacle. Even by the time I had retreated back to Holborn I was starting to struggle as the maths just went off the deep end and the explanations offered no quarter.

It’s a pity, because I do think that just some small additional efforts to explain what some of the concepts meant could have gone a long way – for instance we just get Riemann surfaces dumped on us as though they were something different from manifolds (I am sure they are, but a little more effort at explaining why would have helped). While at the end we get a long, and dry, description of branes and A-models and B-models which we are told are potentially important in quantum physics, but we never quite are told why they are important.

My overall impression was the maths has run away with the science a bit – but I am not really in any position to judge.

This could have been a great book, but unless you really are well read on your complex topologies then I’d have to warn you to stay clear.

The maths and physics of walking in the sand


I love this – which I picked up from Ian Stewart’s now slightly out-of-date (e.g., pre-proof of Fermat’s Last Theorem) and out-of-print The Problems of Mathematics (but a good read and on sale very cheaply at Amazon) – because it demonstrates the harmony of physics with maths, is based on a common experience and is also quite counter-

English: Foot stepping on wet sand causing san...

English: Foot stepping on wet sand causing sand to appear dry around the foot. Sand dilates due to the pressure of the foot nearby and draws water into pores so it appears to be dryer. (Photo credit: Wikipedia)


Most of us are familiar with the experience – if you walk on damp sand two things happen: firstly the area around our foot becomes suddenly dry and secondly, as we lift our foot off, the footprint fills with water. What is happening here?

Well, it turns out that the sand, before we stand on it, is in a locally optimised packing state – in other words, although the grains of sand are essentially randomly distributed they are packed together in a way that minimises (locally) the space between the grains. If they weren’t then even the smallest disturbance would force them into a better packed state and release the potential energy they store in their less efficiently packed state.

This doesn’t mean, of course, that they are packed in the most efficient way possible – just as they are randomly thrown together they fall into the locally available lowest energy state (this is the physics) which is the locally available best packing (this is the maths).

But this also means that when we stand on the sand we cannot actually be compressing it – because that would actually imply a form of perpetual motion as we created an ever lower energy state/even more efficient packing out of nothing. In fact we make the sand less efficiently compressed – the energy of our foot strike allowing the grains to reach a less compressed  packing – and, as a result, create more space for the water in the surrounding sand to rush into: hence the sand surrounding our foot becomes drier as the water drains out of it and into where we are standing.

Then, as we lift our foot, we take away the energy that was sustaining the less efficient packing and the grains of sand rearrange themselves into a more efficient packing (or – to look at it in the physical sense – release the energy stored when we stand on the sand). This more efficient packing mean less room for the water in the sand and so the space left by our foot fills with water expelled from the sand.


Struggling with coin tossing paradox


The method of Eratosthenes used to sieve out p...

The method of Eratosthenes used to sieve out prime numbers is employed in this proof. (Photo credit: Wikipedia)

I was led to this by the discussion on the non-random nature of prime numbers – as apparently it inspired one of the authors of the paper that noted this. I am struggling a bit with the maths of this, so hopefully writing it out might either help me grasp what is wrong with my exposition or else somebody will explain to me what I am doing wrong.

The paradox, taking H for heads and T for tails, is this – if you have a coin and toss it twice there is an equal probability (0.25) that you will see the sequence HH or HT.

But if you have two coins and toss them then, on average, it will take six tosses before you see HH but ocannly four to see HT.

I have no problem with the HT sequence – inside four tosses you can reach this via:


which a simple sum of probabilities shows comes to: \frac{4}{16} + \frac{2}{16} + \frac{1}{16} + \frac{2}{16} + \frac{1}{16} + \frac{1}{16} = \frac{11}{16}, greater than half, while just three tosses would give a probability of 0.5.

So, the issue is with the HH combination. With just five tosses we can get:


Which sums to: \frac{8}{32} + \frac{2}{32} + \frac{1}{32} + \frac{4}{32} + \frac{2}{32} + \frac{1}{32} + \frac{1}{32} = \frac{19}{32} – i.e., more than half (four tosses give us odds of 0.5).

So where is the flaw in my reasoning?

Is dark matter locked up in primordial black holes?


Dark matter pie

Dark matter pie (Photo credit: Wikipedia)

To be honest, I have an issue with both “dark matter” and “dark energy” – they both look like close-to-metaphysical constructs to me: we have a hole where theory and observation do not match so we’ll invent this latter-day phlogiston and call it “dark”.

Then again, I’m not really qualified to comment and it is pretty clear that observations point to missing mass and energy.

I have another issue – if most of the mass of the universe is in the “dark matter” why is there no obvious evidence of it nearby? I don’t mean why can’t we see it – as obviously that is the point – but even though we sit in an area (Earth) of local gravitational field maximum we are struggling to see any local mass effects. (For instance this paper talks about how “local” conditions should impact any dark matter wind but, as far as I can see at least, it’s all entirely theoretical: we haven’t seen any dark matter wind at all.)

So the suggestion – reported in the New Scientist – and outlined in this paper – that actually dark matter is locked up in black holes caused by sound wave compression in the earliest moments of the universe has an appeal. It also potentially fits with the observational evidence that dark matter appears to be knocking around in the halos of galaxies.

These primordial black holes are not a new concept – Bernard Carr and Stephen Hawking wrote about them in 1974 (in this paper). The new evidence for their potential as stores of dark matter comes from the already famous Laser Interferometer Gravitational-Wave Observatory (LIGO) experiment – as that leaves open the prospect that the two black holes that generated the detected gravitational waves could both be in a galactic halo and of the right mass spectrum to suggest they and similar bodies accounted for the gravitational pull we see from dark matter.

All this is quite speculative – the paper points the way to a new generation of experiments rather than proclaims an epoch making discovery, but it’s obviously very interesting and also suggests that the long search for WIMPS – the hypothesised weakly interacting particles that have previously been the favourites as an explanation for dark matter – has essentially been in vain.

Primes are not random


An article in the New Scientist – here – discusses this paper on arxiv on the non-random distribution of the prime numbers.

So what? After all, we know that primes are not randomly distributed and become increasingly less common as we progress up the “number line” – with the Prime Number Theorem telling us that the number of primes up to or less than x, being \pi(x) \sim \frac{x}{ln(x)}.

(This is also takes us to perhaps the best known paradoxes of infinity – even though we know that primes become increasingly rare as we count higher, we also know there are an infinite number of primes: if we had a maximum prime P then if we multiplied all the primes up to an including P and added 1, we would have a number that was not divisible by any prime: a contradiction which demonstrates there can be no maximum prime and hence there must be an infinite number of primes.)

But the lack of randomness here is something deeper – there is a connection between successive primes, something rather more unexpected.

For any prime greater than 5 counted in base 10 we know it must end in either 1, 3, 7 or 9. The even endings (0, 2, 4, 6, 8) are all divisible by 2, while those that end in 5 are also divisible by it (this is what made learning the five times tables so easy for me in Mrs McManus’s P3 class forty-three years ago – the girls on the street loved a skipping rhyme and all I had to do was sing it in my head: five, ten, fifteen, twenty…).

The thinking was that these endings would be randomly distributed, but they are not (a simple result to demonstrate, so it is a sign that not all interesting maths research is yet beyond us mere mortals.)

To simplify things a bit, though, I am (as does the arxiv paper) going to look at the residues the primes leave in modulo 3 counting. Essentially this just means the remainder we get when divide the prime by three (there has to be a remainder, of course, because otherwise the number would not be prime). As is obvious these have to be 1s or 2s – e.g., 5 has a residue of 2, 7 a residue of 1 and so on.

What we are really interested in is the sequence of these residues. If successive primes were wholly independent then we would expect these sequences – i.e., (1, 1) or (1, 2) or (2, 1) or (2, 2) to all be equally likely. Using mathematical notation we describe the frequency of these sequences like this: \pi(x_0; 3, (1, 1)) for the (1, 1) sequence and so on.

For the first million primes we would expect each sequence to have 250,000 occurrences or something very close to that. But we don’t see that at all:

\pi(x_0; 3, (1,1)) = 215,837
\pi(x_0; 3, (1,2)) = 283,957
\pi(x_0; 3, (2,1)) = 283,957
\pi(x_0; 3, (2,2)) = 216,213

Using a list of primes I found online and some C++ and R (listed at the end if you want to mess about with it yourself), I decided to using a little bit of Bayesian inference to see how the non-random the primes are.

I treated the sequences as though they were coin tosses – if we get a switch (1, 2) or (2, 1) then that is the equivalent of a head, while if we get a recurrence (1, 1) or (2, 2) then we have had the equivalent of flipping a tail. If the coin was “fair” then for a 1000 flips we’d get 500 of each type. The result for the first 1000000 primes, though, suggests we’d actually get 568 heads.

(I discussed Bayesian testing for coin fairness some time ago – here.)

With Bayesian inference we can posit a “prior” –   a guess, educated or otherwise – as to how fair the coin we are testing might be. If we know nothing we might start with a prior that assigned equal likelihoods to all results, or we might, in this case, go for a (in retrospect naive) view that the most likely outcome is fairness and that subsequent primes are independent of those that went before.

The first graph – below – is based on saying we are ignorant of the likely result and so assigns an equal likelihood to each result. (A probability of 1 indicates absolute certainty that this is the only possible outcome, while a probability of 0 indicates there is literally zero chance of this happening – it should be noted that using Bayesian inference means we are looking at a prediction for all primes, not simply the results of the first 1000 we test.)

First thousand primes

This looks at the first thousand primes and predicts we’d get something like 650 – 660 heads (if you look closely you’ll see that even as the probability density graph becomes narrower it also moves slowly to the left.)

The second graph does the same but posits a different prior – one where the initial guess is that, quite strongly, the “coin” is a fair one. That makes a big difference for the first few primes but it does not take long before the two graphs look exactly the same.

with strongly fair prior

So does this lack of randomness indicate a shocking and previously undetected “hidden hand” in number theory? It seems not (I don’t claim to follow the maths of the paper at this point): the authors explain can explain the pattern using what we already know about primes and also point out that, as we further extend our examination we find the result is increasingly “fair” – in other words, the probability density spike would creep ever closer to 500 in the graphs above. (This also shows that even quite large sample sizes are relatively poor guides to the final result if the sampling is not truly random, but that is another matter entirely).

This is a reworked and consolidated post from what were the two previous posts on this subject.

Here’s the code

#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <vector>
#include <cmath>

using namespace std;

void normalise(vector<long double>& input)
	long double sum = 0;		
	for (auto x: input) {
		sum += abs(x/sizeof(input));
	long double factor = 1/sum;
	for (auto& x: input) {
		x = x * factor;
	auto biggest = max_element(input.begin(), input.end());
	if (*biggest > 1) {
		factor = 1/(*biggest);	
		for (auto& x: input) {
			x = x * factor;

int main()
	vector<int> primes;
	//read in the primes file
	ifstream inputPrimes("/Users/adrian/Downloads/P-1000000.txt");
	//get the primes
	string rawPrimeLine;
	while (getline(inputPrimes, rawPrimeLine)) {
		string ordinal;
		string prime;
		istringstream stringy(rawPrimeLine);
		getline(stringy, ordinal, ',');
		getline(stringy, prime, ',');

	//calculate bayes inference
	ofstream bayesOut("/Users/adrian/bayesout.csv");
	vector<long double> distribution;
	for (int i = 0; i <= 1000; i++) { 
                distribution.push_back(0.0001 + pow(500 - abs(i - 500), 4));
                // distribution.push_back(0.001);
        int last = 1; //7%3 = 1 
        int count = 2;
        int heads = 1;
        //heads = switch
        //tails = no switch
        //0 = all tails, 1 = all heads 
        for (auto p: primes) {
                if (p > 7 && count < 1010) {
                        int residue = p%3;
                        long double hVal = 0.0;
                        if (residue != last) {
                        for (auto& x: distribution) {
                                x *= pow(hVal, heads) * pow(1 - hVal, count - heads);
                                hVal += 0.001;
                        last = residue;
                        int numb = 0;
                        for (auto x: distribution) {
                               if (numb++ > 0) {
					bayesOut << ",";
				bayesOut << x;
			bayesOut << endl;
#!/usr/bin/env Rscript

optos = ani.options(nmax=1000)
for (i in 1:ani.options("nmax")) {
	matplot(t(w2)[,i], ylim=c(0, 1), axes=FALSE, ylab="Probability", xlab=c("Outcome: ", i), type="l")
	axis(side = 1, at = c(0, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000))
	axis(side = 2, at = c(0, 1))
}}, interval=0.15, movie.name="primes.gif", ani.width=400, ani.height=400)