The Labour Party leadership nomination process is now at a mature stage – 485 local Labour parties have made nominations, so there are probably fewer than 100 left to go.

The pattern of those nominations is pretty clear: a big lead for Keir Starmer (currently backed by 280 local parties), with Rebecca Long-Bailey on just under half his nominations (131) and Lisa Nandy just under half as many again (56). Emily Thornberry is in a poor fourth position with 18 and is generally thought unlikely to meet the threshold of 33 to progress into the main ballot.

The discussion here will be about what mathematical modelling based on the nominations might be able to tell us about the outcome of that ballot rather than the politics of the contest.

My initial thought was that a Zipf model might fit the nominations – here outcomes are proportional to the inverse of rank raised to a power. So the first-placed candidate’s results are proportional to:

$\frac{N}{1^R}$

And the second-placed candidate’s results are proportional to:

$\frac{N}{2^R}$

And so on, where N is a constant and R a coefficient. In fact this model worked well for a while, but Nandy’s (and particularly Thornberry’s) under-performance has seen it break down: while R has tended towards just under 1.1 for Long-Bailey, it’s about 1.45 for Nandy and close to 2 for Thornberry.
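Those per-candidate coefficients can be recovered directly from the nomination counts quoted above (a quick sketch):

```r
# Nomination counts quoted above; Starmer (rank 1) fixes the constant N
noms <- c(Starmer = 280, LongBailey = 131, Nandy = 56, Thornberry = 18)
ranks <- 2:4
# From noms = N / rank^R it follows that R = log(N / noms) / log(rank)
R <- log(noms[1] / noms[ranks]) / log(ranks)
round(R, 2)  # roughly 1.10, 1.46 and 1.98
```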

With no obvious rule, and with Thornberry in particular falling further behind, I have relied on heuristics to get what looks like a good model for the outcome if all 645 CLPs in Britain nominated: currently that is Starmer 372, Long-Bailey 175, Nandy 75 and Thornberry 24.

From that I try to find a share of the vote that would match this outcome – and here I have to assume that every CLP ballot is decided on the first round of voting. (That obviously isn’t true, but I don’t have any substantial data to account for preference voting and so have no choice.)

To do this I assume that supporters of the different candidates are randomly distributed around the country but that only a smallish number (a mean of 6%) of them come to meetings. The candidate with the most support is then likely to win more meetings, but the luck of the draw means other candidates can win too. The closer the support for the candidates, the more likely it is that underdogs win meetings – so the point is to find a share that, when tested against the randomly generated outcomes of 645 meetings, gives a result closest to my projection.
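A single nomination meeting under these assumptions can be sketched in a few lines (illustrative figures only: a 500-member CLP, a mean 6% turnout, and roughly the fitted national shares):

```r
set.seed(1)
# Illustrative: one 500-member CLP where a mean of 6% of members attend
shares <- c(Starmer = 0.313, RLB = 0.271, Nandy = 0.238, Thornberry = 0.178)
attendees <- rbinom(1, size = 500, prob = 0.06)
# attendees vote in proportion to national support, plus sampling luck
votes <- rmultinom(1, size = attendees, prob = shares)[, 1]
names(which.max(votes))  # whoever tops this small ballot takes the nomination
```

With only about 30 attendees the sampling noise is large, which is exactly why trailing candidates can still pick up nominations.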

Currently that is Starmer 31.3%, Long-Bailey 27.1%, Nandy 23.8% and Thornberry 17.8%. The figures at the bottom end tend to gyrate – it’s so hard for Thornberry to win that a small change in her numbers appears to generate a big change in share up or down – but at the top, between Starmer and Long-Bailey, the roughly 31 v. 27 pattern has been more or less stable over the last week and more.

This isn’t a perfect fit – the range of options over the four-dimensional “hyperspace” of the different candidates’ support is enormous – but it is the best fit the software has found after 1,000 (guided) guesses.

The guesses employ “Monte Carlo methods” – vary the parameters randomly (technically stochastically) around a central figure and see if the fit (as measured by the squared error) is better or worse than the current estimate. If the fit is better then we use those figures as the basis of the next guess and so on.
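In outline the search is a stochastic hill climb (a sketch only: `project` stands in for the full 645-meeting simulation and `target` for the heuristic projection above, both hypothetical names):

```r
# Guided guessing: perturb the best shares found so far, re-score against
# the projected nomination totals, and keep any guess that fits better
fit_shares <- function(target, project, start, n = 1000) {
  best <- start
  bestErr <- sum((project(best) - target)^2)   # squared error, as in the text
  for (i in 1:n) {
    guess <- abs(best + rnorm(length(best), 0, 0.005))
    guess <- guess / sum(guess)                # shares must sum to one
    err <- sum((project(guess) - target)^2)
    if (err < bestErr) {
      best <- guess
      bestErr <- err
    }
  }
  best
}
```

With the real `project` every scoring call means simulating all 645 meetings, which is why 1,000 guesses takes a while.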

From this – if I have the time (as it takes 8 or so hours to run) – I can then run a big Monte Carlo simulation of the election itself. Here, instead of using a 6% average for turnout I use 78% (the highest for recent Labour leadership elections), but again randomly vary that for each CLP. I also randomly vary the support each candidate might expect to win in each CLP to account for local factors (e.g. the candidate opened your jumble sale a while back and so people know and like them).

This simulation runs 7,000 times and allows me (like a weather forecaster saying there is a 90% chance of rain) to give some estimates of the uncertainty in the outcome. Last time it was run – on Saturday – the uncertainty was tiny: Starmer could expect to win the first ballot 99.4% of the time, and Nandy – who previously had a very small (0.5%) chance of coming second – was always coming third.
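Those percentages are just frequencies over the simulation runs. A sketch with a hypothetical three-run results frame shows the idea:

```r
# Hypothetical results: one row of first-round vote counts per run
results <- data.frame(Starmer    = c(310, 298, 305),
                      RLB        = c(270, 301, 264),
                      Nandy      = c(242, 236, 251),
                      Thornberry = c(178, 165, 180))
winner <- apply(results, 1, which.max)
mean(winner == 1)                   # fraction of runs Starmer tops: 2/3 here
mean(results$Nandy > results$RLB)   # chance of Nandy beating Long-Bailey: 0 here
```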

How reliable is this? Well, it’s a projection, not a prediction: the eligibility criteria for the main ballot are different (and not all the voters will be Labour members). It doesn’t account for preference voting and it doesn’t account for regional support (e.g., Long-Bailey has done well in the North West and while this boosts her nomination numbers it may also skew the estimate of her support higher than reality). Nobody really thinks that Emily Thornberry is winning 18% support either – it looks more likely that her team has targeted some local parties to win nominations.

But all that said, I don’t think it’s a bad model either – earlier opinion polls suggested Nandy’s support was in single figures, and the nominations (and the model) show that is wrong or outdated. The core message – that Starmer has a small but solid and stable lead – seems very right.

I have updated the model in several ways – to make it a slightly better analogue of the real world and to follow the developments in the contest itself.

No beating about the bush – the predictions for the outcome of the first round of balloting (and, as I say here, please don’t take this seriously) are:

Starmer has a 99.9% chance of topping the first ballot with 30.6% support.
Long-Bailey has a 0.1% chance of topping the first ballot with 26.3% support.
Nandy cannot top the ballot, but with 23.9% support she has a 5.1% chance of beating Long-Bailey (and so coming second at this point).
There isn’t really any good news for Thornberry on 19.2% support.

As a reminder: the support shares are the central case – we tested 7,000 examples where we stochastically (or, in simple terms, randomly) alter the variables around this central case (a so-called ‘Monte Carlo simulation’).

The core improvements are:

1. We stochastically alter the support for each candidate on a constituency-by-constituency basis against the central case – we think this is more likely to reflect the real world, where there will be pools of support or distrust for each candidate locally, e.g. because they recently spoke at a meeting and so on.
2. We assume a much lower turnout for the nomination meetings than before – this makes it easier for more marginal candidates to win them – so the central case is a 6% turnout but again there is a stochastic variation around that with an upper and lower limit.
3. We have a better fitting procedure for the core vote share – before I was just plugging values in and picking an answer that looked like a good fit. Now I still do that but then subject that guess to a Monte Carlo method test to see if we can find a better fit by exploring values in what is in effect the four-dimensional hyperspace of this problem – looking to minimise the squared difference error.

Code is below:

#!/usr/bin/env Rscript

region<-c('London', 'Scotland', 'WMids', 'EMids', 'Yorkshire',
'North', 'NWest', 'SEast', 'Eastern', 'SWest', 'Wales')
Membership<-c(115943, 20123, 39296, 34001, 50562, 27971,
73250, 66183, 40943, 46530, 26894)
CLPs<-c(73, 73, 59, 46, 54, 29, 74, 84, 58, 55, 40)

sharesX<-c(0.306, 0.263, 0.239, 0.192)
results.data<-data.frame(Starmer = integer(), RLB = integer(),
Nandy = integer(), Thornberry = integer(),
stringsAsFactors=FALSE)

for (x in 1:7000) {
starmerShare<-rnorm(1, sharesX[1], 0.01)
rlbShare<-rnorm(1, sharesX[2], 0.01)
nandyShare<-rnorm(1, sharesX[3], 0.01)
thornberryShare<-rnorm(1, sharesX[4], 0.01)

starmerW<-0
rlbW<-0
nandyW<-0
etW<-0

for (reg in 1:11)
{
nameRegion<-region[reg]
starmerC<-0
rlbC<-0
nandyC<-0
etC<-0
avMembership<-Membership[reg]/CLPs[reg]
# clamp the Gaussian membership draw to avoid tiny or negative CLPs
distMembership<-pmax(round(rnorm(CLPs[reg], avMembership, avMembership/2.5)), 30)
for (p in 1:CLPs[reg])
{
localStarmer<-rnorm(1, starmerShare, 0.05)
if (localStarmer < 0.02) {
localStarmer<-0.02
}
localRLB<-rnorm(1, rlbShare, 0.05)
if (localRLB < 0.02) {
localRLB<-0.02
}
localNandy<-rnorm(1, nandyShare, 0.05)
if (localNandy < 0.02) {
localNandy<-0.02
}
localThornberry<-rnorm(1, thornberryShare, 0.05)
if (localThornberry < 0.02) {
localThornberry<-0.02
}
localShares<-c(localStarmer, localStarmer + localRLB,
localStarmer + localRLB + localNandy,
localStarmer + localRLB + localNandy + localThornberry)
supportNorm<-localStarmer + localRLB +
localNandy + localThornberry
turnoutThreshold<-rnorm(1, 0.6, 0.1)
if (turnoutThreshold > 0.9) {
turnoutThreshold <- 0.9
}
if (turnoutThreshold < 0.3 ) {
turnoutThreshold <- 0.3
}
if (turnoutThreshold * distMembership[p] < 30)
{
turnoutThreshold <- 30/distMembership[p]
}
starmerV<-0
rlbV<-0
nandyV<-0
etV<-0
for (v in 1:distMembership[p])
{
turnout<-runif(1)
if (turnout > turnoutThreshold) {
next
}
ans<-runif(1) * supportNorm
if (ans <= localShares[1]) {
starmerV = starmerV + 1
next
}
if (ans <= localShares[2]) {
rlbV = rlbV + 1
next
}
if (ans <= localShares[3]) {
nandyV = nandyV + 1
next
}
etV = etV + 1
}
if (max(starmerV, rlbV, nandyV, etV) == starmerV) {
starmerC = starmerC + 1
starmerW = starmerW + 1
next
}
if (max(rlbV, nandyV, etV) == rlbV) {
rlbC = rlbC + 1
rlbW = rlbW + 1
next
}
if (max(nandyV, etV) == nandyV) {
nandyC = nandyC + 1
nandyW = nandyW + 1
next
}
etC = etC + 1
etW = etW + 1
}
regionalResult<-sprintf(
"In %s, Starmer won %i, RLB won %i, Nandy won %i, Thornberry won %i",
region[reg], starmerC, rlbC, nandyC, etC)
print(regionalResult)
}
result<-sprintf(
"Starmer won %i, RLB won %i, Nandy won %i, Thornberry won %i \n",
starmerW, rlbW, nandyW, etW);
print(result)
votesOutcomes<-sprintf("Starmer: %i   RLB: %i   Nandy: %i   Thornberry: %i",
starmerW, rlbW, nandyW, etW)
print(votesOutcomes)
# record this run's CLP-win totals for later analysis
results.data[x,]<-c(starmerW, rlbW, nandyW, etW)
print(x)
}
names(results.data)=c('Starmer', 'RLB', 'Nandy', 'Thornberry')



## Another go at modelling the Labour leadership election

I started doing this for fun and that’s still my motivation – so please do not take this seriously and even if I do slip into using the word “prediction”, above all – this is not a prediction.

Anyway my aim is to model the potential outcome of the first round of the ballot of the Labour leadership election using the concrete data that we actually have – namely Labour membership data (newly disclosed to the Daily Mirror) and nominations made by all-member meetings (as reported by the @CLPNominations twitter account). The model is built using the R programming language and the code is available below.

So dealing with the assumptions made…

On membership – unlike before, I now have an up-to-date figure for membership, and it’s easy to look up the number of CLPs in each region/country and therefore get an average membership. But what I now also do is distribute the modelled memberships as a Gaussian (normal) distribution around this average (in layperson’s terms, I assume there is a range of higher and lower memberships clustered around this average in a bell-curve shape). Totally arbitrarily, I chose the standard deviation of this distribution (a measure of how broad the curve is) to be 40% of the mean.

(Never tire of recommending this brilliant book – Statistics Without Tears – for anyone with more than a passing interest in polling and sampling.)

Why does this matter? If support is randomly distributed, it’s easier for relatively less supported candidates to win a nomination in a smaller CLP, and vice versa.
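A quick simulation illustrates this (a sketch with the fitted shares and the 60% turnout used later in the post): the fourth-placed candidate tops a 100-member meeting far more often than a 2,000-member one.

```r
set.seed(7)
shares <- c(0.2675, 0.2565, 0.2470, 0.2290)
underdog_rate <- function(members, reps = 2000) {
  mean(replicate(reps, {
    # 60% of members vote, in proportion to national support
    votes <- rmultinom(1, size = round(members * 0.6), prob = shares)[, 1]
    which.max(votes) == 4     # did the fourth-placed candidate top the vote?
  }))
}
underdog_rate(100)    # well above zero in a small CLP
underdog_rate(2000)   # close to zero in a big one
```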

On nominations – having robustly held up for most of the week the simple Zipf model I have been using for the nominations started to creak a bit last night – essentially Lisa Nandy, despite reports of some very good performances in terms of votes won, underperformed relative to the model to the benefit of Rebecca Long-Bailey (Keir Starmer kept his proportional share steady). Emily Thornberry had a bad night.

However I am going to cheat a little bit – juggle the coefficient and the rankings – the coefficient falls to 1.1 (which might indicate that the rate at which Starmer’s lead is increasing is slowing but still means that it is increasing) and drop Nandy to 4th place (Thornberry falls to 10th). And hope that the weekend – when I expect many more nominations – makes it all a bit clearer. It’s a kludge but we aren’t taking this all that seriously, are we?

Then we can estimate that if every CLP made a nomination Starmer would take 360, Long-Bailey 172, Nandy 80 and Thornberry 29.

What we want to do is get our model of support to match (reasonably closely) this outcome – but you may have noticed I’ve had to cheat again, because many (though by no means all) of the nomination meetings have been decided by preference balloting, where no candidate polled at least 50% + 1 of the votes on the first round. Them’s the breaks I’m afraid – modelling the preference voting requires making political decisions which go well beyond this simple maths model, and that’s not my purpose here. So I am just treating all of this as though it were a first-past-the-post process.

By trial and error I have found that setting the shares of support to the figures below gives a pretty good match:

Starmer 26.75%
Long-Bailey 25.65%
Nandy 24.70%
Thornberry 22.90%

Obviously these are very tightly grouped results – and that reflects another deep flaw in the model I’m afraid – we make no allowance at all for clusters of regional support and so have to try to draw out the result from a fully random distribution. So, for instance, if we thought that Nandy or Long-Bailey could pick up a lot of nominations in their home region (the North West) then they might be able to hit our target with lower levels of support (the same applies to Starmer in London). But that is a level of sophistication beyond this model.

The figures above typically generate a result like this:

[1] “In London, Starmer won 43, RLB won 21, Nandy won 9, Thornberry won 0”

[1] “In Scotland, Starmer won 36, RLB won 21, Nandy won 12, Thornberry won 4”

[1] “In WMids, Starmer won 29, RLB won 19, Nandy won 5, Thornberry won 6”

[1] “In EMids, Starmer won 26, RLB won 11, Nandy won 7, Thornberry won 2”

[1] “In Yorkshire, Starmer won 28, RLB won 19, Nandy won 5, Thornberry won 2”

[1] “In North, Starmer won 21, RLB won 3, Nandy won 5, Thornberry won 0”

[1] “In NWest, Starmer won 43, RLB won 20, Nandy won 11, Thornberry won 0”

[1] “In SEast, Starmer won 51, RLB won 21, Nandy won 9, Thornberry won 3”

[1] “In Eastern, Starmer won 36, RLB won 12, Nandy won 10, Thornberry won 0”

[1] “In SWest, Starmer won 35, RLB won 14, Nandy won 4, Thornberry won 2”

[1] “In Wales, Starmer won 16, RLB won 18, Nandy won 4, Thornberry won 2”

[1] “Starmer won 364, RLB won 179, Nandy won 81, Thornberry won 21”

But we can go further now and look at the range of outcomes grouped around these shares – in other words use some “Monte Carlo methods” to estimate what the probabilities of certain outcomes are.

To do this, we use the ‘predicted’ shares above as the mean of a normal distribution (with a standard deviation of 1% in each case). In simple terms that means that while our central case is that Starmer has the support of 26.75% of members, we might expect that in roughly one case in six he has support of less than 25.75%, and in one case in six he has more than 27.75% – and similar stipulations apply to the other candidates. We then run the simulation 1000 times and look at the distribution of outcomes.
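That ‘one case in six’ is simply the one-standard-deviation tail of a normal distribution, which can be checked directly:

```r
# Probability a draw falls more than one sd (1 point) below a 26.75% mean
pnorm(25.75, mean = 26.75, sd = 1)  # 0.1587 - roughly one case in six
```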

In fact the (lazy, but it’s only for fun) way I have done this means the variation is likely to be bigger for Long-Bailey and Nandy than for Starmer – Starmer’s share is set by a single randomly varied boundary, while Long-Bailey’s is the difference between two, so if Starmer’s support falls and Long-Bailey’s rises she can get up to double the benefit. This is an artefact of the way I have coded it up, but I will keep it because (a) I don’t want to wait another hour to finish this by re-running the code (R is great, but nobody has ever suggested it is fast) and (b) Starmer is the favourite so he should feel a bit more pressure! The difference can be seen in the shape of the density curves for the candidates in the featured image for this page – Long-Bailey’s and Nandy’s broader, shorter curves show their results taking a wider range of values, even though Starmer’s mean is well ahead.

I have assumed a 60% turnout and so we get Starmer’s maximum vote as 99,319 and his minimum as 76,685. For Long-Bailey the figures are 97,937 and 69,396, and Nandy 97,011 and 66,246. The notional results for Thornberry are 86,034 and 63,672.

Another problem…barring a big change in circumstances Emily Thornberry won’t be on the final ballot – she’s not getting anything like enough nominations – and so her supporters will have the choice of voting for someone else or just not bothering. Again this is a political question and not one for here.

This “Thornberry problem” makes what follows pretty worthless, unfortunately – certainly if it really is the case that there are 60,000 – 70,000 would-be Thornberry voters out there who will be forced to do something they would prefer not to… but here is the non-predictive prediction:
Keir Starmer has a 61.9% chance of topping the poll, Rebecca Long-Bailey has a 28.2% chance and Lisa Nandy has a 9.9% chance of winning the first ballot. Emily Thornberry does not top the poll in any of the 1000 simulations run.

Code is below – have fun with it.

#!/usr/bin/env Rscript

region<-c('London', 'Scotland', 'WMids', 'EMids', 'Yorkshire',
'North', 'NWest', 'SEast', 'Eastern', 'SWest', 'Wales')
Membership<-c(115943, 20123, 39296, 34001, 50562, 27971,
73250, 66183, 40943, 46530, 26894)
CLPs<-c(73, 73, 59, 46, 54, 29, 74, 84, 58, 55, 40)

# note: cumulative shares - Starmer, then +RLB, +Nandy, +Thornberry
sharesX<-c(0.2675, 0.524, 0.771, 1.0)
results.data<-data.frame(Starmer = integer(), RLB = integer(),
Nandy = integer(), Thornberry = integer(),
stringsAsFactors=FALSE)

for (x in 1:1000) {

starmerShare<-rnorm(1, sharesX[1], 0.01)
rlbShare<-rnorm(1, sharesX[2], 0.01)
nandyShare<-rnorm(1, sharesX[3], 0.01)
shares<-c(starmerShare, rlbShare, nandyShare, 1.0)

starmerW<-0
rlbW<-0
nandyW<-0
etW<-0

for (reg in 1:11)
{
nameRegion<-region[reg]
starmerC<-0
rlbC<-0
nandyC<-0
etC<-0
avMembership<-Membership[reg]/CLPs[reg]
# clamp the Gaussian membership draw to avoid tiny or negative CLPs
distMembership<-pmax(round(rnorm(CLPs[reg], avMembership, avMembership/2.5)), 30)
for (p in 1:CLPs[reg])
{
starmerV<-0
rlbV<-0
nandyV<-0
etV<-0
for (v in 1:distMembership[p])
{
turnout<-runif(1)
if (turnout > 0.6) {
next
}
ans<-runif(1)
if (ans <= shares[1]) {
starmerV = starmerV + 1
next
}
if (ans <= shares[2]) {
rlbV = rlbV + 1
next
}
if (ans <= shares[3]) {
nandyV = nandyV + 1
next
}
etV = etV + 1
}
if (max(starmerV, rlbV, nandyV, etV) == starmerV) {
starmerC = starmerC + 1
starmerW = starmerW + 1
next
}
if (max(rlbV, nandyV, etV) == rlbV) {
rlbC = rlbC + 1
rlbW = rlbW + 1
next
}
if (max(nandyV, etV) == nandyV) {
nandyC = nandyC + 1
nandyW = nandyW + 1
next
}
etC = etC + 1
etW = etW + 1
}
regionalResult<-sprintf(
"In %s, Starmer won %i, RLB won %i, Nandy won %i, Thornberry won %i",
region[reg], starmerC, rlbC, nandyC, etC)
print(regionalResult)
}
result<-sprintf(
"Starmer won %i, RLB won %i, Nandy won %i, Thornberry won %i \n",
starmerW, rlbW, nandyW, etW);
print(result)
votesOutcomes<-sprintf("Starmer: %i   RLB: %i   Nandy: %i   Thornberry: %i",
starmerW, rlbW, nandyW, etW)
print(votesOutcomes)
# record this run's CLP-win totals for later analysis
results.data[x,]<-c(starmerW, rlbW, nandyW, etW)
}
names(results.data)=c('Starmer', 'RLB', 'Nandy', 'Thornberry')



## Mathematically modelling the overall Labour result

The Zipf model I outlined here looks to be reasonably robust – though maybe the coefficient needs to drop to somewhere between 1.25 and 1.29 – but can we use this result to draw any conclusions about the actual result itself?

That’s what I am going to try to do here – but be warned there are a whole host of assumptions in here and this isn’t really anything other than a mathematical diversion.

The idea is this: if supporters of any given candidate are randomly distributed across all Constituency Labour Parties (dubious – discuss), and we make certain assumptions about the sizes of Constituency Labour Parties, what level of support tends to generate the sort of results for nomination meetings that we are seeing?

On the size of the 11 Labour Party regions and countries we also assume a Zipf distribution, and so work on the basis that 339,306 members vote: the biggest region (nominally London, but we’re not basing this on real membership figures for London, just using a simple model) has 120,000 voters and the smallest has 9,223. These figures decline with a coefficient of 1.07 over the rank of the ‘region’ (1.07 is the coefficient observed worldwide for the rank–size distribution of major national cities).

Each one of these notional regions has 56 CLPs which range in size from 2143 voters for the biggest to 165 at the smallest.
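The regional figures quoted above follow directly from the Zipf formula (a sketch reproducing them):

```r
# 11 notional regions: the biggest has 120,000 voters, and sizes decline
# with rank raised to the 1.07 coefficient
regionSize <- round(120000 / (1:11)^1.07)
regionSize[1]    # 120000 in the biggest 'region'
regionSize[11]   # 9223 in the smallest
sum(regionSize)  # about 339,000 voters in all
```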

The target we are trying to hit is the Zipf prediction (for a notional 616 nominations) of Starmer 358 nominations, Long-Bailey 145 nominations, Nandy 86 nominations and Thornberry 25 nominations.

OK, you’ve heard all the blah – here’s the bit you really came for: what does it say about support? Well, it’s sort of good news for Keir Starmer, who, this model suggests, is getting about 27% support. Rebecca Long-Bailey is picking up 25.5% so is close behind, but Lisa Nandy is not far off either at 25.0%, while Emily Thornberry has 22.5%. On a typical run (as the process is random, the precise numbers vary) this gives Starmer 335 nominations, Long-Bailey 154, Nandy 101 and Thornberry 26 – the precise figures don’t matter so much beyond showing that it’s close.

Now, YouGov’s poll – which I’d trust much more than my prognostications – had very different figures, with Starmer on 46% first preferences and Long-Bailey on 32%.

So why the difference and why do I trust the poll more than this model?

Firstly, and most importantly, because support for candidates isn’t randomly distributed – I reason Long-Bailey and Nandy are likely to have disproportionately more supporters in the North West, and Starmer in London – and there are many more members in London.

And secondly, because, as I’ve already said, the model makes far too many assumptions.

On the other hand – I do think Nandy has been doing better than the initial polling suggested so this model is probably right to suggest she’s doing relatively well.

Code (in R) used is shown below… but the bottom line is: this guess isn’t likely to be a very good one.

#!/usr/bin/env Rscript

# one notional CLP size per region, declining by Zipf rank
clpSize<-c(2142, 1021, 661, 486, 383, 315, 267, 232, 204, 182, 165)
# cumulative support shares: Starmer, then +RLB, +Nandy, +Thornberry
shares<-c(0.275, 0.525, 0.775, 1.0)

starmerW<-0
rlbW<-0
nandyW<-0
etW<-0

for (reg in 1:11)
{
starmerC<-0
rlbC<-0
nandyC<-0
etC<-0
for (p in 1:56)
{
starmerV<-0
rlbV<-0
nandyV<-0
etV<-0
for (v in 1:clpSize[reg])
{
ans<-runif(1)
if (ans <= shares[1]) {
starmerV = starmerV + 1
next
}
if (ans <= shares[2]) {
rlbV = rlbV + 1
next
}
if (ans <= shares[3]) {
nandyV = nandyV + 1
next
}
etV = etV + 1
}
if (max(starmerV, rlbV, nandyV, etV) == starmerV) {
starmerC = starmerC + 1
starmerW = starmerW + 1
next
}
if (max(rlbV, nandyV, etV) == rlbV) {
rlbC = rlbC + 1
rlbW = rlbW + 1
next
}
if (max(nandyV, etV) == nandyV) {
nandyC = nandyC + 1
nandyW = nandyW + 1
next
}
etC = etC + 1
etW = etW + 1
}
}
result<-sprintf("Starmer won %i, RLB won %i, Nandy won %i, Thornberry won %i \n", starmerW, rlbW, nandyW, etW);
print(result)


## Mathematically modelling the Labour leadership nomination race

No politics here – just some maths.

But if we use a Zipf distribution (see here for more about that) we get a pretty good fit for the three front runners – Keir Starmer, who currently has 43 nominations from Constituency Labour Parties, Rebecca Long-Bailey, who has 17, and Lisa Nandy, who has 10 – if we use a coefficient of 1.35 over their rank.

All three of these are on the ballot anyway because of trade union and other support, so the question is whether fourth placed candidate Emily Thornberry, currently with just three nominations, can make it.

The bad news for her is that this (admittedly simple) model suggests not. Indeed she is already seriously under-performing for her rank: if the coefficient is correct she ought to be on 7 or 8 nominations, but right now she is performing as if she were in seventh place.

If her performance remains at this level it’s essentially mathematically impossible for her to make the ballot threshold of 33 nominations.

So – a prediction: if (around) 400 CLPs nominate then the model points to 236 for Starmer, 92 for Rebecca Long-Bailey, 53 for Lisa Nandy and 17 for Emily Thornberry.

Update: People better informed than me suggest 400 is a low figure for the number of nominating constituencies and for 500 the figures are: Keir Starmer 295, Rebecca Long-Bailey 116, Lisa Nandy 67 and Emily Thornberry 21. For Thornberry to make the field (on current performance remember) there would have to be 750 nominations – which is about 100 more than the mathematically possible maximum. So either Thornberry’s performance will have to significantly improve or she is out.
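These projections come straight from the Zipf formula, with Thornberry treated as performing at seventh place as noted above (a sketch; rounding differs by a nomination here and there from the figures quoted):

```r
# Predicted nominations: candidate i gets N / rank^1.35, with Thornberry's
# effective rank set to 7 to reflect her under-performance
predict_noms <- function(total, ranks = c(1, 2, 3, 7), coef = 1.35) {
  weights <- 1 / ranks^coef
  round(total * weights / sum(weights))
}
predict_noms(400)  # close to the 236, 92, 53, 17 for 400 nominating CLPs
predict_noms(500)  # close to the 295, 116, 67, 21 for 500
```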

## Leslie Huckfield case exposes Wikipedia’s weaknesses

Les Huckfield is hardly likely to be famous outside his own household, but 35 years after he was a junior minister in Jim Callaghan’s Labour government he is back in the news again today – because, now living in Scotland, he has backed Scottish independence.

The pro-independence “Yes” campaign are, not surprisingly, doing all they can to milk this endorsement: they desperately need some “Labour” support if they are to have the remotest chance of winning.

Ordinary folk might be inclined to say “Leslie Huckfield [as he now calls himself], who’s he then?” and go to Wikipedia and look him up (click the link to see).

What they get there is a short article that is pretty light on detail and does not do much to impart the flavour of his politics – having once been a strong critic of far left entryism into Labour, Huckfield turned into one of the strongest defenders of the Militant Tendency’s/Revolutionary Socialist League’s presence in the Labour Party and, reports John Rentoul, once proposed banning all car imports into the UK.

But more importantly, it completely leaves out the one thing from his time as an elected politician that Huckfield should be famous for: successfully stopping his attempted prosecution for allegedly dishonestly obtaining expenses of more than £2,500 from the European Parliament by deception.

The story of that – and why it proved important in more recent British political history – is covered in this article in the Law Society Gazette.

There is no sign, that I can see, that someone has deleted this information from the Wikipedia article and certainly no suggestion that Huckfield himself has stopped this from getting out. (Nor, I should add, is there any suggestion that Huckfield did anything improper in seeking to stop his prosecution.)

But this is a warning against relying on Wikipedia as a complete source. And it is also a reminder of why paying someone to do a job thoroughly – such as compiling an encyclopaedia – may still have advantages over relying on crowd sourcing.

I love Wikipedia, it is surely one of the greatest things to come out of the Internet – but it is not something I would rely on when it really mattered.

## A bit more on Universal Credit and “Agile”

I think I need to give a little bit more background on the politics of the decision by the DWP to trumpet its use of “agile” methods and how, bluntly, the department has misused the potential of agile to give itself cover in its huge gamble with public money and the living standards of millions of the least well off.

Of course we have to start from the basic fact that most software projects – whether in the public or private sector – fail. The failure could be relatively small – a budget overshoot or a lack of sought-for capability. Or it could be huge – your spacecraft blows up on launch, your ambulances are never dispatched and people die, and so on (these last two are real and will be familiar to almost anyone who has taken a development-methodologies course).

The 1997–2010 Labour government had some successes in software-driven projects – the UK Passport Service, for instance, is now much more efficient. But it also had a fair number of high-profile failures, especially in its efforts to modernise computer use in the health service – above all the attempts to create a single electronic patient record. (The major ambulance dispatch failure was not in this period, though a software update did fail.)

The then Conservative opposition used this often – David Cameron in particular repeatedly suggesting that it was because the Labour government had tried to buy a single supercomputer to run the NHS: something he must have known was simply untrue but presumably worked well in focus groups.

So software projects were and are a hot political topic.

The new government, coming to office in May 2010, did several things to get a grip on failing projects. Firstly, it went for a good old-fashioned gouge of the contractors’ margins: essentially saying cut your prices on existing contracts if you even want to be considered for future work. Secondly, it said that new software projects had to seriously consider using free and open-source software to avoid proprietary lock-in. And thirdly, it said that a new centralised control mechanism had to be applied to ensure that No 10 and the Cabinet Office had a grip on costs and efficiency: it was out of this that the Major Projects Authority – which has now reported that UC is close to failure – came.

The three elements have generally worked well. It pains me to give this government political credit, but essentially they deserve it.

Yet UC has been allowed to escape this framework, and “agile” was the excuse given for this.

In, I think, early 2011 the Institute for Government published a report on government software projects and recommended that “agile” methods be used. They cited an experimental, relatively small scale project by the Metropolitan Police Service as an example of how agile could work successfully in government.

I attended the launch seminar which was rather more like a religious mission meeting than a serious seminar on how to get the best value for public money. The room, in one of the government’s finest buildings in St. James’s, was packed to bursting with representatives from small software houses, who saw agile as their ticket to the big time and certainly none of them were going to suggest that “world’s biggest” and “Agile” was a risky mix.

At the meeting the DWP announced that they would be using “Agile” as the basis on which UC was developed. I was only an MSc student but even I thought this looked like exactly the sort of project that the textbooks said Agile was not designed for – but I wasn’t confident enough to say that then and nobody else in the room seemed remotely interested in hearing such a thing.

But it didn’t take long once the meeting was over for people to point out that this was a high risk proposition: but it also became crystal clear that the Secretary of State in the department had decided that Agile was the secret sauce for government IT and that it, and it alone, would lead him to the promised land.

For well over a year it has been an open secret that the Treasury want to pull the plug on UC because they do not believe it can be delivered in anything like a working form to budget. And the signs are all there – essentially the project has already failed as its scope has been cut repeatedly and its final implementation date put back and back.

But the politics of the Conservative Party do not allow anyone in government to say this openly and even when the government’s own project watchdog says the system is on the brink of collapse the department come out and rubbish the assessment – even at the price of contradicting themselves.

This would be funny were it not for the fact that in just a few months millions of the poorest people in Britain will depend on this system working if they are to eat, to heat their homes and clothe their children.

## Getting away with it

One thing that working towards a PhD has taught me is that textbooks are of low value in academia.

Of course great textbooks are essential works, but in the end a textbook is not a peer reviewed publication: it’s what you and the publisher think you can get away with.

And, yes, some textbooks are, in effect, peer reviewed, as they are based on refereed research (so I can now plug my friend Dr Joanne Murphy’s new book – Policing for Peace in Northern Ireland: Change, Conflict and Community Confidence – as I know it is just such a work).

But the point about lack of peer review was brought home to me this afternoon when, fruitlessly wandering round the York University library looking for a desk, I stumbled on the politics shelves and scanned through Dominic Wring’s The Politics of Marketing the Labour Party: A Century of Stratified Electioneering.

Now this happens to be about something I know quite a lot about, and when it comes to the 1997 election campaign I certainly think I know rather more about it than Dominic Wring – whose book maybe ought to be subtitled “Labour must lose”. Words and phrases such as “authoritarianism”, “hollow populism” and “empty rhetoric” are sprinkled through the pages with no justification offered for such judgements – and this is what suggests to me it lacks any sort of peer review, for surely a reviewer would demand something more than the author’s political prejudices as a basis for such claims.

Wring is, of course, a member of the Labour Party.

Well, my point is not to tackle Wring’s hatred of New Labour (though it’s a bit much for him to describe the 1996 Littleborough and Saddleworth by-election as a turning point, as though it marked the start of the decline of Tony Blair – who went on to win three general elections), but to highlight the unreliable nature of textbooks. So I’ll stop there.

## Success of #downgradedChancellor suggests Parliament has a big TV audience

Ed Miliband, the Labour leader, used his televised reply on the floor of the House of Commons to the Chancellor of the Exchequer’s Budget to announce the “#downgradedChancellor” hashtag, and within an hour it was trending worldwide. It seems more people watch Parliament, at least on Budget day, than many give credit for.

I predict there will be an attempt to repeat this – probably from Prime Minister David Cameron – in next week’s Prime Minister’s Questions.

## Here we go again

I used to have a blog. It was meant to be about “politics and free software” (not the politics of free software) but ended up being mainly about politics. I wrote the last entry on that in January 2008 and subsequently took it off line (the content is still on my server at home and it was amusing to read it again just now, but it’s not going back up).

My politics haven’t changed – so if you want to do something to make Britain a better place to live I still recommend you start here.

But I am not going to write about politics here. The geeky title ought to give the game away – this one is about computing (and, I suppose, mathematics to an extent).

My inspiration came from this: generally speaking I am in the n – log(n) part of this matrix, and while I am not interested in pursuing a career in computing I am passionate about improving my skills and competency, so the comment that a log(n) programmer “maintains a blog in which personal insights and thoughts on programming are shared” left me with little choice.

Of course I’ll actually have to demonstrate some insights and thoughts too.