I have updated the model in several ways – to make it a slightly better analogue of the real world and to follow the developments in the contest itself.

No beating about the bush – the predictions for the outcome of the first round of balloting (as I say here please don’t take this seriously) are:

**Starmer** has a 99.9% chance of topping the first ballot with 30.6% support.**Long-Bailey** has a 0.1% chance of topping the first ballot with 26.3% support.**Nandy** cannot top the ballot, but with 23.9% support she has a 5.1% chance of beating Long-Bailey (and so coming second at this point).

There isn’t really any good news for **Thornberry** on 19.2% support.

As a reminder: the support shares are the central case – we tested 7000 examples where we stochastically (or randomly in simple terms) alter the variables around this central case (a so-called ‘Monte Carlo simulation‘).

The core improvements are:

- We stochastically alter the support for each candidate on a constituency by constituency basis against the central case – we think this is more likely to reflect the real world where there will be pools of support or distrust for each candidate locally – eg because they recently spoke at a meeting and so on.
- We assume a much lower turnout for the nomination meetings than before – this makes it easier for more marginal candidates to win them – so the central case is a 6% turnout but again there is a stochastic variation around that with an upper and lower limit.
- We have a better fitting procedure for the core vote share – before I was just plugging values in and picking an answer that looked like a good fit. Now I still do that but then subject that guess to a Monte Carlo method test to see if we can find a better fit by exploring values in what is in effect the four-dimensional hyperspace of this problem – looking to minimise the squared difference error.

Code is below:

#!/usr/bin/env Rscript region<-c('London', 'Scotland', 'WMids', 'EMids', 'Yorkshire', 'North', 'NWest', 'SEast', 'Eastern', 'SWest', 'Wales') Membership<-c(115943, 20123, 39296, 34001, 50562, 27971, 73250, 66183, 40943, 46530, 26894) CLPs<-c(73, 73, 59, 46, 54, 29, 74, 84, 58, 55, 40) sharesX<-c(0.306, 0.263, 0.239, 0.192) results.data<-data.frame(Starmer = integer(), RLB = integer(), Nandy = integer(), Thornberry = integer(), stringsAsFactors=FALSE) for (x in 1:7000) { starmerShare<-rnorm(1, sharesX[1], 0.01) rlbShare<-rnorm(1, sharesX[2], 0.01) nandyShare<-rnorm(1, sharesX[3], 0.01) thornberryShare<-rnorm(1, sharesX[4], 0.01) shares<-c(starmerShare, starmerShare + rlbShare, starmerShare + rlbShare + nandyShare, starmerShare + rlbShare + nandyShare + thornberryShare) starmerW<-0 rlbW<-0 nandyW<-0 etW<-0 votesStarmer<-0 votesRLB<-0 votesNandy<-0 votesET<-0 for (reg in 1:11) { nameRegion<-region[reg] starmerC<-0 rlbC<-0 nandyC<-0 etC<-0 avMembership<-Membership[reg]/CLPs[reg] distMembership<-rnorm(CLPs[reg], avMembership, avMembership/2.5) for (p in 1:CLPs[reg]) { localStarmer<-rnorm(1, starmerShare, 0.05) if (localStarmer < 0.02) { localStarmer<-0.02 } localRLB<-rnorm(1, rlbShare, 0.05) if (localRLB < 0.02) { localRLB<-0.02 } localNandy<-rnorm(1, nandyShare, 0.05) if (localNandy< 0.02) { localNandy<-0.02 } localThornberry<-rnorm(1, thornberryShare, 0.05) if (localThornberry < 0.02) { localThornberry<-0.02 } localShares<-c(localStarmer, localStarmer + localRLB, localStarmer + localRLB + localNandy, localStarmer + localRLB + localNandy + localThornberry) supportNorm<-localStarmer + localRLB + localNandy + localThornberry turnoutThreshold<-rnorm(1, 0.6, 0.1) if (turnoutThreshold > 0.9) { turnoutThreshold <- 0.9 } if (turnoutThreshold < 0.3 ) { turnoutThreshold <- 0.3 } if (turnoutThreshold * distMembership[p] < 30) { turnoutThreshold <- 30/distMembership[p] } starmerV<-0 rlbV<-0 nandyV<-0 etV<-0 for (v in 1:distMembership[p]) { turnout<-runif(1) if (turnout > turnoutThreshold) { next } ans<-runif(1) * supportNorm if (ans <= localShares[1]) { starmerV = starmerV + 1 next } if (ans <= localShares[2]) { rlbV = rlbV + 1 next } if (ans <= localShares[3]) { nandyV = nandyV + 1 next } etV = etV + 1 } votesStarmer<-votesStarmer + starmerV votesRLB<-votesRLB + rlbV votesNandy<-votesNandy + nandyV votesET<-votesET + etV if (max(starmerV, rlbV, nandyV, etV) == starmerV) { starmerC = starmerC + 1 starmerW = starmerW + 1 next } if (max(rlbV, nandyV, etV) == rlbV) { rlbC = rlbC + 1 rlbW = rlbW + 1 next } if (max(nandyV, etV) == nandyV) { nandyC = nandyC + 1 nandyW = nandyW + 1 next } etC = etC + 1 etW = etW + 1 } regionalResult<-sprintf( "In %s, Starmer won %i, RLB won %i, Nandy won %i, Thornberry won %i", region[reg], starmerC, rlbC, nandyC, etC) print(regionalResult) } result<-sprintf( "Starmer won %i, RLB won %i, Nandy won %i, Thornberry won %i \n", starmerW, rlbW, nandyW, etW); print(result) votesOutcomes<-sprintf("Starmer: %i RLB: %i Nandy: %i Thornberry: %i", votesStarmer, votesRLB, votesNandy, votesET) print(x) print(votesOutcomes) results.data<-rbind(results.data, c(votesStarmer, votesRLB, votesNandy, votesET)) } names(results.data)=c('Starmer', 'RLB', 'Nandy', 'Thornberry')