A book recommendation

Lying in the bath this morning – relaxing into the Christmas holiday (despite the raging epidemic of omicron-variant Covid that is currently rampaging through London in an increasingly frightening manner), I had thoughts about several blog posts to write over the break, and this is the first…

It’s a book recommendation: Ian Stewart‘s Concepts of Modern Mathematics. It’s around 50 years since its first edition (and even this Dover edition is 26 years old) but – part from the disconcerting habit of assuming all humans are male – it doesn’t feel like a dated work at all.

In one other way, though, it is a product of its times – the controversy over “new mathematics” – the teaching methods adopted widely across the west in the 60s and 70s in an effort to improve mathematical understanding of students and move beyond rote-teaching. Interestingly, as the wikipedia article demonstrates, its adoption was driven, in part, by the post-Sputnik panic in US science and engineering.

It’s adoption was certainly controversial as Tom Lehrer makes clear here…

Now, having been educated in both Ireland (primary and secondary) and England (secondary) I have actually seen both approaches in operation at first hand – and its certainly the case that the traditional methods gave me a solid (algorithmic in the sense of using a simple and repetitive method) grounding in such things as long division which I think were missing in England, but it’s also not difficult to see why discussions of things like sets, modular arithmetic and matrices are pretty essential for anyone thinking of a career in science and technology.

That said the UK’s School Mathematics Project seems to have avoided what appears to be the excesses of “new math” in the US. I cannot produce any evidence for it but I am tempted to wonder if the UK’s 1980s (and continuing) software boom was in part driven by its innovation in maths teaching (read Francis Spufford’s excellent Backroom Boys for more on that).

Indeed these things are so fundamental to software development that it’s hard to imagine a case for not teaching them – but 50 years ago they were clearly seen as in some way esoteric or even fetishistic.

Professor Stewart’s book concentrates on understanding of the concepts and not rigorous proofs – which makes the book readable and digestible in settings like my bath. I am not going to say it could be mastered by anyone regardless of their current understanding of maths, but its not a traditional textbook and I would suggest it would be a good backgrounder if you had basic (O level/GCSE) maths qualifications and were interested in working as a software developer.

One of the points the author makes in his preface is that the UK has seen its science and technology bases erode as more and more graduates look for careers in law and financial services. I hope that will become the subject of another blog here shortly.

Coin tossing conundrum

Tossing a coin

This is from The Mathematics of Various Entertaining Subjects: Research in Recreational Math.

It left me so puzzled it took me a while to get my head around it.

It’s game – Flipping Fun – and the idea is that participants pick a set of three coin tosses (eg THH, HHH, THT and so on) and the winner is the person who has picked the first sequence that comes up.

The mind bending part of this is:

  • The coin is fair so the odds of it turning up as heads or tails on any toss are the same (1/2).
  • Thus any sequence of a given length is equally likely – i.e. if the coin is tossed ten times then HHHHHHHHHH is as likely as HTHTHTHTHT or any other sequence.
  • Despite both of the above facts some sequences of three tosses are much more likely to win than others.

To illustrate this point, think of HHT against TTT. Here HHT is much more likely to win – because once two heads have been tossed (odds 1/4) then it is only a question of waiting for a tail to come up, as tossing a further head keeps the HH sequence alive.

So, for instance, in five tosses the odds of HHT winning are 1/4 (8/32), but the odds of TTT winning inside five tosses are 5/32

What’s the evidence your numbers are faked?

PhD poster

Currently reading: Ten Great Ideas About Chance.

It can be a bit of a frustrating read – a lot of the chapters about gambling and judgement appear to be poorly explained to me – especially if you’ve ever actually placed a bet on anything (I am far from a regular gambler – I’m not sure I’ve placed a bet on anything in the last decade but I still know how it works).

But it also has lots of exceptionally interesting stuff: not least the revelation that naturally occurring number sets tend to begin with 1 with a proportion of 0.3 whilst fraudsters (including election fixers and presumably scientific experiment fixers) tend to use such numbers 0.111… (1/9) of the time.

So, taking my life in my hands – what about results in my PhD – (read thesis here). I am actually calculating this as I go along – so I really do hope it doesn’t suggest fraud!

Anyway – first table, which has a calculated SPECmark figure based on other researchers’ work – so not really my findings, though the calculations are mine.

Numbers are: 35, 22.76, 13.41, 7.26, 3.77, 1.94, 1.01, 0.55, 0.31, 0.20 and 0.14

And (phew) that means 4 out of 11 (0.364) begin with 1 and so looks fraud free on this test.

But as I say, those are my calculations, but they are not my findings at root.

So moving on to some real findings.

First, the maximum total length of time it takes for a simulation to complete a set of benchmarks in simulated cycles: 43361267, 1428821, 1400909, 5760495, 11037274, 2072418, 145291356, 5012280.

Here a whole half are numbers that begin with 1. Hmmmm.

The mean time for the same: 42368816, 1416176, 1388897, 5646235, 10964717, 2026200, 143276995, 4733750.

Again, half begin with one.

What about the numbers within the results – in this case the amount of cycles lost due to waiting on other requests?

The numbers are: 35921194, 870627, 844281, 4364623, 1088954, 1446305, 110996151, 3420685

Now the proportion has fallen to 3/8 (0.375) and I can feel a bit more comfortable (not that the other results suggested I was following the fraudsters’ one-ninth pattern in any case.)

Later I produce numbers for an optimised system. How do they perform?

The maxima now become: 2514678, 1357224, 1316858, 3840749, 10929818, 1528350, 102077157, 3202193.

So the proportion beginning with 1 has actually risen to 5/8.

And the means show a similar pattern. Worse news with the blocks though. They now become: 730514, 775433, 735726, 2524815, 806768, 952774, 64982307, 1775537. So I am left with 1/8 (0.125) starting with 1 – something dangerously close to 1/9.

Can I save myself? I hope so … the figures above are for the first iteration of the system, but when we look at subsequent iterations a different pattern (for the blocks) emerges: 130736, 0, 0, 1612131, 97209, 232131, 64450433, 1117599.

This is now back to 3/8.

Of course, I didn’t cheat and I also suspect (I would, wouldn’t I?) that the block count is a better measure of whether I had or not – because the overall execution time of the benchmarks is, in some sense, a function of how long their execution path is and effectively that is determined – the blocking in the system is an emergent property and if I faked the whole thing I’d have less to go on and be much more likely to make it up.

Well, that’s my story and I’m sticking to it.

The Sleeping Beauty Controversy

Waiting for her Philip

I am reading “The Best Writing on Mathematics 2018” and amongst the various fascinating articles is one on “The Sleeping Beauty Controversy” by Peter Winkler.

the problem

(Here’s a video link – which examines the problem but isn’t related directly to Peter Winkler’s piece)

To steal Professor Winkler’s own description from the article:

Sleeping Beauty agrees to the following experiment. On Sunday she is put to sleep, and a fair coin is flipped. If it comes up heads, she is awakened on Monday morning; if tails, she is awakened on Monday morning and again on Tuesday morning. In all cases, she is not told the day of the week, is put back to sleep shortly after, and will have no memory of any Monday or Tuesday awakenings.

When Sleeping Beauty is awakened on Monday or Tuesday, what – to her – is the probability that the coin came up heads?

The bulk of mathematicians are “thirders” – they think the correct answer is 1/3. But a sizable minority are “halfers” – believing the correct answer is 1/2. Still others think the question is undecidable.

For what it’s worth I am with the thirders – Sleeping Beauty would understand that if, for instance, the coin was tossed 100 times she’d be woken around 150 times and so the chances she’s being woken as a result of a head are one-in-three.

The problem restated?

But what really struck me in the article was what Professor Winkler says is a rephrasing (though admits some might disagree) of Sleeping Beauty – because this leads me to a “halfer” position:

Alice, Bob and Charlie have each taken a new sleeping pill. In hospital experiments half the subjects slept through the night… but the other half woke up once in the middle of the night then returned to sleep and woke up in the morning with no memory of the night awakening.

Alice wakes up in the middle of the night… her credence that the pill has worked drops to zero… Bob wakes up in the morning, his credence that the pill worked remains at 1/2…

Charlie, who has blackout shades in his bedroom, wakes up not knowing whether it is morning. According to the thirders, his credence in the efficacy of the pill is 1/3 until he raises the shades…

But how can this be? Charlie has no additional information beyond that he has woken up (as he would do in any case) and so surely his prior information – that the pill is 1/2 effective remains unchanged?

(Here’s another link to a different article on this question.)

“Waiting for her Phillip”by Omnimoving is licensed under CC BY-NC-SA 2.0

Mathematicians: please help!

from maths Stack Exchange

I am still stuck with a problem with the M/G/1 queue: not quite the same as my original problem (discussed here) as I understand that now – but the next stage really – involving some manipulation of Laplace transforms.

I won’t post all the details here, because you can read them here instead and, if this sort of thing matters to you (and why wouldn’t it?) pick up the bounty I am offering on the maths Stack Exchange.

Puzzle about an M/G/1 queue

patched and unpatched compilation times

I am deeply puzzled by a question about the behaviour of an M/G/1 queue – i.e., a queue with a Markovian distribution of arrival times, a General distribution of service times and 1 server. I have asked about this on the Math Stackexchange (and there’s now a bounty on the question if you’d like to answer it there – but as I am getting nowhere with it, I thought I’d ask it here too.

(This is related to getting a more rigorous presentation on thrashing into my PhD thesis.)

Considering an M/G/1 queue with Poisson arrivals of rate λ – this comes from Cox and Miller’s (1965) “The Theory of Stochastic Processes” (pp 240 – 241) and also Cox and Isham’s 1986 paper “The Virtual Waiting-Time and Related Processes“.

My question is what is the difference between (using the authors’ notation) p_0 and p(0,t)? The context is explained below…

In the 1965 book (the 1986 paper presents the differentials of the same equations), X(t) is the “virtual waiting time” of a process and the book writes of “a discrete probability p_0(t) that X(t)=0, i.e., that the system is empty, and a density p(x,t) for X(t)>0“.

The system consumes virtual waiting time in unit time, i.e., if X(t)\leq\Delta t and there are no arrivals in time \Delta t then X(t + \Delta t) = 0.

The distribution function of X(t) is then given by:

They then state:
p(x, t+ \Delta t) = p(x + \Delta t, t)(1 - \lambda \Delta t) +p_0(t)b(x)\lambda\Delta t + \int_{0}^{x}p(x - y, t)b(y)dy\lambda\Delta t + o(\Delta t)

I get all this – the first term on the RHS is a run-down of X(t)>0 with no arrivals, the second is adding b(x) of service time when the system is empty at t and the third, convolution-like, term is adding b(y) of service time from an arrival when it’s not empty at t. (The fourth accounts for their being more than one arrival in \Delta t but it tends to zero much faster than \Delta t so drops out as \Delta t approaches the limit.)

And … and this is where I have the problems …

p_0(t+\Delta t)=p_0(t)(1-\lambda\Delta t) +p(0,t)\Delta t(1 - \lambda\Delta t) + o(\Delta t)

The first term of the RHS seems clear – the probability that the system is empty at t multiplied by the probability there will be no arrivals in \Delta t, but the second is not clear to me at all.

I assume this term accounts for the probability of the system “emptying” during \Delta t but I don’t see how that works, is anyone able to explain?

In other words, how does p(0,t)\Delta t(1 - \lambda\Delta t) represent this draining? Presumably (1 - \lambda\Delta t) again represents the possibility of zero arrivals in \Delta t, so how does p(0, t)\Delta t represent the X(t) \leq \Delta t situation?

If we take the equilibrium situation where p_0(t) = p_0 and p(x, t) = p(x) then, if we differentiate and as p^{\prime}_0 = 0, we get p_0 = \lambda p(0) – so, again, what does p(0) represent?

Reviving this blog … with a question about binary trees

Work changes and a determination to actually finish my PhD mean I really should make a bit of an effort here and so I will.

Here is a puzzle that has been bothering me about binary trees which has come from my PhD research…

In that research I am investigating how to implement virtual memory on a many-core Network-on-Chip (NoC) system. Essentially I have been building and running emulators.

The core of the emulation is the “memory tree” – the connection between the cores of the NoC and the global memory controller. The “Bluetree” memory tree is a binary tree arrangement, with the root node (level 0) connecting to the memory controller and the leaves at the NoC tiles (with each tile having a single processor).

At every non-leaf level of the tree there are multiplexors (mux) with a buffer that can hold a single memory packet – so if two requests for global memory arrive at once and the buffer is free there needs to be an arbitration process to decide which packet gets to occupy the buffer and which has to wait.

We have 128 leaf nodes – as in this tree…

Binary tree with 128 leaf nodes

With this many leaf nodes, the question of the arbitration policy of the muxes is important. A naïve approach would be to say, if two packets contest for the buffer, let the packet on the left win the first time and then pass the packet on the right: after all this guarantees no starvation.

But it is actually a hopeless policy … the leftmost leaf node (shall number from zero) faces no opposition, the rightmost (number 127) loses out to every other node.

But number of leaf node alone is not a sufficient way of describing the relative number of blocks – for instance leaf node 16 appears to be in a better place than leaf node 15 – as going up the tree leaf node 15 can be blocked at level 6 (the root node is at level 0), at level 5, level 4, level 3 and level 2: while leaf node 16 is only blocked at level 1.

Practical testing of this naïvely fair approach in the emulator gives results like the following… as can be seen the right hand part of the tree performs appallingly but even there lots of variation can be seen:

some test results

My question is this: what is the correct (i.e., correct mathematically) way of describing the order of the leaf nodes?

From “The Foundations of Mathematics”

This is a fragment of a question from the first chapter of “The Foundations of Mathematics” (Amazon link):

Record whether you think the following statements are true or false:

(a) All of the numbers 2, 5, 17, 53, 97 are prime.

(b) Each of the numbers 2, 5, 17, 53, 97 is prime.

(c) Some of the numbers 2, 5, 17, 53, 97 are prime.

(d) Some of the numbers 2, 5, 17, 53, 97 are even.


For what-it’s-worth I thought all of these statements are true.

Could someone explain this contradiction to me?

Reading on with Julian Havil’s Gamma: Exploring Euler’s Constant and inspired by his discussion of the harmonic series, I come across this:

\frac{1}{1-e^x} = 1 + e^x + e^{2x} + e^{3x} + ...

Havil calls this a “non-legitimate binomial expansion” and it seems to me it can be generalised:

(1 - r^x)^{-1}= 1 + r^x + r^{2x} + r^{3x} + ...

as 1 = (1 - r^x)(1 + r^x +r^{2x}+r^{3x}+... )= 1 + r^x +r^{2x}+r^{3x}+...-r^x-r^{2x}-r^{3x}-...

And, indeed if we take x=-1, r=2 we get:

\frac{1}{1-2^{-1}} = 2 = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} +... at the limit.

But if we have x \geq 0 it is divergent and the identity, which seems algebraically sound to me, breaks down. E.g., r=2, x=2:

\frac{1}{1-4} = -\frac{1}{3} = 1 + 4 + 8 + 16 + ...

So what is the flaw in my logic?

In Pursuit of the Traveling Salesman

In Pursuit of the Traveling Salesman: Mathematics at the Limits of Computation

English: A representation of the relation amon...
English: A representation of the relation among complexity classes, which are subsets of each other. (Photo credit: Wikipedia)

This is a strange little book – and while I am happy to recommend it, I am not really clear in my own mind what sort of a book it is meant to be.

A popular description of a famous problem in the “NP” space and a gentle introduction to the whole issue of computational complexity and complexity classes? Not really, it assumes a bit too much knowledge for that.

So, a canter round the basic maths of the TSP and complexity? Not really that either, as it is light on mathematical specifics.

Still, it’s a decent book.