# BBC Sports Personality of the Year votes modelled as a Zipf distribution

One of the biggest annual sporting occasions in the UK is the award, each December, of the BBC’s Sports Personality of the Year. Last night’s award, coming after the Olympics and much else, was in a real sense a national event: workplace chit-chat had focused for weeks on who might win, there was lots of newspaper interest and, of course, a healthy betting market.

In days of yore any reader of the Radio Times could fill out the form there and nominate anyone they liked. But after several attempts to rig it (in favour of an angler whom no one but anglers would have heard of, if I remember correctly), it is now a phone vote from among a BBC-chosen panel.

This year there were 12 people to choose from, but the overall winner, by a large plurality, was Bradley Wiggins – pretty much the favourite from the moment he won the Tour de France back in the summer.

I thought it would be interesting to model the votes as a Zipf distribution.

(As a reminder, a Zipf distribution is of the form $V = \frac{k}{R^n}$, where $V$ is the votes won, $R$ the rank achieved, and $k$ and $n$ are constants.)

Does it work? Sort of, though there are plainly two slightly different series inside the overall total – the elite 6 and the not-so-elite 6 (NB I am talking about their vote-pulling power here, rather than passing comment on their spectacular sporting success).

Here’s the graph with $k = 1.5$ and $n = 1.1$.

(At $R = 1$ the model simply gives $V = k$, so Bradley Wiggins’s vote in the estimate will always be an “over-estimate” unless $k = 1$.)
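For anyone who wants to reproduce the fitted curve, here is a minimal Python sketch of the model with the parameter values quoted above. The function name `zipf_votes` is my own invention, and I am assuming votes are expressed as a multiple of the winner's actual total (so the winner's real figure is 1); the formula itself does not fix the units.

```python
def zipf_votes(rank, k=1.5, n=1.1):
    """Predicted votes for a given rank under the model V = k / R**n."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return k / rank ** n


# Predicted values for the 12 shortlisted candidates, ranks 1 to 12.
predictions = [zipf_votes(r) for r in range(1, 13)]

for r, v in enumerate(predictions, start=1):
    print(f"rank {r:2d}: predicted {v:.3f}")
```

The first line printed is just $k$, which is why the top-ranked candidate is over-estimated whenever $k$ is larger than 1.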

Looks like a decent match, but when we shift to a logarithmic scale then the flaws are very apparent:
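Why a logarithmic scale is so unforgiving can be seen by taking logs of both sides of the Zipf formula (a short derivation, using the same symbols as before):

$$\log V = \log\left(\frac{k}{R^n}\right) = \log k - n \log R$$

So if the votes really followed a single Zipf law, plotting $\log V$ against $\log R$ would give a straight line with slope $-n$ and intercept $\log k$. Any bend or step in the log-log plot means that no single pair of $k$ and $n$ describes every rank.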

And again, without wanting to diminish anyone’s sporting achievement, the graph suggests that Bradley Wiggins, Jessica Ennis, Andy Murray, Mo Farah, David Weir and Ellie Simmonds were all contenders for the award – but Chris Hoy, Nicola Adams, Ben Ainslie, Rory McIlroy, Katherine Grainger and Sarah Storey were not.
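One way to test the "two series" idea would be to fit $k$ and $n$ separately to the top six and the bottom six and compare the exponents. The sketch below does this with an ordinary least-squares line through the points $(\log R, \log V)$. To be clear, the `votes` list is synthetic data I have made up purely to show the mechanics (two Zipf curves with different exponents stitched together); it is not the real vote count, and `fit_zipf` is a hypothetical helper name.

```python
import math


def fit_zipf(ranks, votes):
    """Least-squares fit of log V = log k - n log R; returns (k, n)."""
    xs = [math.log(r) for r in ranks]
    ys = [math.log(v) for v in votes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# SYNTHETIC, made-up data (not the real votes): ranks 1-6 follow 1 / R,
# ranks 7-12 follow 0.5 / R**2, to mimic two separate groups.
ranks = list(range(1, 13))
votes = [1.0 / r ** 1.0 for r in ranks[:6]] + [0.5 / r ** 2.0 for r in ranks[6:]]

k_top, n_top = fit_zipf(ranks[:6], votes[:6])
k_bottom, n_bottom = fit_zipf(ranks[6:], votes[6:])
k_all, n_all = fit_zipf(ranks, votes)

print(f"top six:    k = {k_top:.3f}, n = {n_top:.3f}")
print(f"bottom six: k = {k_bottom:.3f}, n = {n_bottom:.3f}")
print(f"all twelve: k = {k_all:.3f}, n = {n_all:.3f}")
```

If the two groups come back with clearly different exponents, as they do by construction here, a single Zipf curve over all twelve is a compromise that fits neither group well. Swapping the real vote totals into `votes` would show whether the same holds for the actual result.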