# Pointers versus references

Some people don’t like pointers – and for that reason, I think, we have references in C++. But as a confirmed pointer person, I find references very hard going.

I had a piece of C++ code that did this:

```cpp
PartialPage& DoubleTree::oldestPage()
{
    PartialPage& pageToKill = pageTree.begin()->second;
    long timeToKill = pageTree.begin()->second.getTime();
    map<long, PartialPage>::iterator itOld;
    for (itOld = pageTree.begin(); itOld != pageTree.end(); itOld++) {
        if (itOld->second.getTime() < timeToKill) {
            timeToKill = itOld->second.getTime();
            pageToKill = itOld->second;
        }
    }
    return pageToKill;
}
```


This produced rubbish results – because assigning to the reference didn’t re-bind it to a different element of the map: it copied the new element’s contents over the element the reference was originally bound to. In C++ you cannot re-seat a reference at all – it is bound once, at initialisation, and stays bound for its whole life.

Switching to pointers fixed the problem though.

```cpp
PartialPage* DoubleTree::oldestPage()
{
    PartialPage* pageToKill = &(pageTree.begin()->second);
    long timeToKill = pageTree.begin()->second.getTime();
    map<long, PartialPage>::iterator itOld;
    for (itOld = pageTree.begin(); itOld != pageTree.end(); itOld++) {
        if (itOld->second.getTime() < timeToKill) {
            timeToKill = itOld->second.getTime();
            pageToKill = &(itOld->second);
        }
    }
    return pageToKill;
}
```


# Going atomic … or concurrency is hard

In my PhD world a year’s worth of software experimentation has proved what we all knew already … that systems using traditional memory models struggle in the Network-on-Chip environment and so I am now trying something slightly different.

My “model” (it’s all in software) is of a 16-core system, with each core having a small amount (32KB) of on-chip memory; these local memories are combined to form a flat memory space. Memory in this space can be accessed quickly; memory outside it, in the next level up in the hierarchy, is roughly 100 times further away.

Using any form of traditional paging model (including Belady’s optimal page replacement algorithm) this system starts to thrash on even moderate loads – the cost of moving pages in and out of the local memory determines performance and so there is no benefit from adding additional processors (in fact it just slows the individual processors down).

Such an outcome makes any promise of improved performance from parallelism void – it does not really matter how efficiently you have parallelised the code (some corner cases excepted – eg if all chips were accessing the same memory at the same time), you are trapped by a memory I/O bound.

So now I want to look at alternatives beyond the usual 4k (or 2k) paging – but I have been struggling all week to get the locking semantics of my code right. Concurrency is hard.

The one thing that debugging parallel code and locks teaches you again and again is never to assume that some event will be so rare you don’t need to bother about it: because when you are executing millions of instructions a second, even rare events tend to happen.

It has also taught me to check return values – code that will “always” work in a single threaded environment may actually turn out to be quite a tricky customer when running in parallel with other instances of itself or when it is accessing shared memory.

But, finally, the main lesson this week has been about going atomic.

I have a tendency to think: if I release this lock for a few lines of code it might improve overall performance, and I can just take it again a little later. Beware of that thought.

If you need to make a series of actions atomic you need to hold the same lock across them all – releasing it for even a few lines breaks atomicity and will quite likely break your code.

# A probably entirely naïve question about the principle of relativity

Surely I can quite easily design an experiment that shows the relativity principle is false.

If turn around on the spot the principle, as I understand it, asserts that I cannot build an experiment that proves it was me that moved as opposed to everything else that moved while I stayed still.

But the rest of the universe is very massive – possibly of infinite mass – and so to move it through $2\pi$ radians takes a hell of a lot more energy than moving me.

# The real reason why you haven’t been polled in #indyref

Some people – essentially Yes supporters – are claiming it is plain that the opinion polls in the Scottish independence referendum – none of which (so far, at least – I hope I am not tempting fate) have reported a Yes lead – are rigged, because they themselves have never been asked.

Well, there is a simple reason for that: polls are small and the electorate is very large.

There are about 4 million electors able to vote in the Scottish independence referendum.

If we assume every elector has an equally random chance of being asked (which is not true for many cases: if you are not on an online panel it just won’t happen), and that each poll asks 1200 electors then the chances of you being asked in any given poll are 1200/4000000 or about 1 in 3,333: a bit better than winning the lottery jackpot I’d admit, but who bets on a 3332/1 chance?

Of course, though, there are multiple polls, but to have even a 1 in 100 chance of being asked, 33 polls would have to be taken. To make it more likely than not that you had been polled, about 1,667 polls would have to be taken.

What Scotland Thinks, at the time of writing, records 80 polls on the referendum question – so the chances of any individual elector being asked are (given all my approximations) about 1 in 42, or in bookies’ odds terms, it’s a 41/1 shot.

If you think a race is fixed because your 41/1 wager never comes home, I’d suggest you weren’t to be trusted in a betting shop.

Update: I should make it clear this is a pretty crude approximation to make a point – opinion poll sample sizes vary, and if they are closer to 1000 in sample size then the odds of you being asked lengthen to about 49/1 (i.e., it’s a fair bit less likely).

A further update: My intention in writing this was to demonstrate, in broad brush terms, why an argument based on “I have never been polled, therefore the polls are wrong” doesn’t hold any water. It seems the article is now being touted around as an exact prediction of how likely it is you’d have been asked: it’s not. As I say above, much (most, probably) polling these days is via online panel – if you are not on the panel you cannot be asked to begin with.

# (Scottish) opinion polls – a reminder

This is not about which way you should use your vote if you have one – that is here.

Instead it’s a reminder of the maths of opinion polling, because I suspect we are going to see great numbers of polls in the next few weeks.

So here are some things to remember:

1. The best an opinion poll can do is tell you what a good opinion poll would show. In other words, opinion polls cannot be thought of as reliable predictors of results. Simply put – if people systematically fail to tell the truth to opinion pollsters then no opinion poll is going to perfectly correct for that (the pollsters try but their work here is merely based on informed guessing). So when pollsters talk about “margins of error” they don’t mean in comparison to a real election result, but to what another – well taken – poll would show.

2. One in twenty opinion polls – no matter how well conducted – will be very wrong. This is the so-called “rogue” poll and its incorrectness is not because it has been conducted improperly but because sampling a small subset of a large population is inherently statistically risky.

3. Doubling the sample size does not make your poll twice as accurate. In fact it only makes it $\sqrt2$ times more accurate. The important point here is that when you look at small subsamples – such as Scottish regions – you are looking through the other end of this telescope – so a subsample that contained $\frac{1}{5}$th of the poll sample would actually have a margin of error $\sqrt 5$ (or about 2.2) times bigger (and that assumes the sampling in that region actually matches the population in that region as opposed to Scotland as a whole – if it doesn’t, and the chances are that it won’t, then you are better off just ignoring the subsample).

4. “Margin of error” is really a measure of how likely it is that other polls will give similar results. We have already covered this – but here’s a longer explanation. If we say that the margin of error on a poll is plus or minus three per cent, then typically what we mean is that 95% of polls (i.e., 19 out of 20) will give results where the figures do not differ by more than three per cent. This also means that if you describe a 1 per cent change in a rating as in some way significant then you are very wrong – your poll simply does not give you enough information to make that claim. To go from a 3% margin of error to a 1% margin requires you to increase the sample size by a factor of 9. To go to a margin of plus or minus 0.5% would require an increase in sample size by a factor of 36.

5. The margin of error actually depends on the score polled. The highest margin of error is at 50% – where for a 1000 sample poll it is:

$2 \times \sqrt{\frac{0.5 \times 0.5}{1000}} = \pm 3.2\%$

For 40% the margin becomes $2 \times \sqrt{\frac{0.6 \times 0.4}{1000}} = \pm 3.1\%$

(And these figures are for the numbers before the don’t knows are discounted.)

# All in the mind?

For the last 14 months running has been a big thing in my life – with the pinnacle (in distance, if not speed) being my completion of the Hackney half marathon in June (the picture is of me struggling to get through the 10th mile).

The core run each week has been the Finsbury (or occasionally some other) parkrun – an approximate 5km timed run.

The Finsbury Park course is famously tough – two laps of one long, relatively gentle climb and one short, relatively steep hill.

Back in June, in heavy training for the Hackney half, I got my PB down to 23 minutes and 17 seconds on that course. Since then I have run it three times – every time worse than the last and every time above 25 minutes (today’s was a very bad 25’59”).

Training runs, too, have not been distinguished by speed (though I am gradually returning to longer distances as I train for the Royal Parks half marathon in October – sponsor me (for Oxfam) here) and two 10k races have shown me post times that were slower than the first 10k race I ran (which was in May).

What’s gone wrong? Running performance is pretty much all in the mind or at least it is about the mind’s tolerance of pain and discomfort – and I just do not want it badly enough, I think. Today I did a pretty decent first lap – the GPS on the phone is a bit iffy, typically reporting too fast a time/over-reporting the distance run, but I managed the first 2.5km in about 11’50” – not brilliant but not a million miles from that PB time, but then I effectively decided I didn’t like the discomfort much and so the second lap was in around 14’08”.

The thing is, I had gone out this morning with the intention of ending my run of worsening performances, but in the end just didn’t want that enough. I can try again next week, I suppose.

# Islamism is bad for your health – and not just in the obvious ways

Thanks to the New Scientist I have discovered that Islamic fundamentalism can have more damaging effects than just its attack on science, reason, liberty and equality: it can also damage your health.

Evidence from Iran, where the 1979 revolution led to both men and women adopting far more conservative modes of dress, is that the incidence of multiple sclerosis also began to increase – in fact, according to this paper in the British Medical Journal, the incidence of MS increased eightfold between 1989 and 2006.

Scientists think the most likely reason is that the skin of Iranians was much less exposed to the Sun and consequently vitamin D production (as the New Scientist notes technically “vitamin D” produced in this way is not a vitamin at all, but that’s a different story) fell. The evidence that vitamin D production is closely linked to a variety of autoimmune diseases, including MS, is also growing.