## Reviving this blog … with a question about binary trees

Work changes and a determination to actually finish my PhD mean I really should make a bit of an effort here and so I will.

Here is a puzzle that has been bothering me about binary trees which has come from my PhD research…

In that research I am investigating how to implement virtual memory on a many-core Network-on-Chip (NoC) system. Essentially I have been building and running emulators.

The core of the emulation is the “memory tree” – the connection between the cores of the NoC and the global memory controller. The “Bluetree” memory tree is a binary tree arrangement, with the root node (level 0) connecting to the memory controller and the leaves at the NoC tiles (with each tile having a single processor).

At every non-leaf level of the tree there are multiplexors (mux) with a buffer that can hold a single memory packet – so if two requests for global memory arrive at once and the buffer is free there needs to be an arbitration process to decide which packet gets to occupy the buffer and which has to wait.

We have 128 leaf nodes – as in this tree…

With this many leaf nodes, the question of the arbitration policy of the muxes is important. A naïve approach would be to say, if two packets contest for the buffer, let the packet on the left win the first time and then pass the packet on the right: after all this guarantees no starvation.

But it is actually a hopeless policy … the leftmost leaf node (shall number from zero) faces no opposition, the rightmost (number 127) loses out to every other node.

But number of leaf node alone is not a sufficient way of describing the relative number of blocks – for instance leaf node 16 appears to be in a better place than leaf node 15 – as going up the tree leaf node 15 can be blocked at level 6 (the root node is at level 0), at level 5, level 4, level 3 and level 2: while leaf node 16 is only blocked at level 1.

Practical testing of this naïvely fair approach in the emulator gives results like the following… as can be seen the right hand part of the tree performs appallingly but even there lots of variation can be seen:

My question is this: what is the correct (i.e., correct mathematically) way of describing the order of the leaf nodes?

## From “The Foundations of Mathematics”

This is a fragment of a question from the first chapter of “The Foundations of Mathematics” (Amazon link):

Record whether you think the following statements are true or false:

(a) All of the numbers 2, 5, 17, 53, 97 are prime.

(b) Each of the numbers 2, 5, 17, 53, 97 is prime.

(c) Some of the numbers 2, 5, 17, 53, 97 are prime.

(d) Some of the numbers 2, 5, 17, 53, 97 are even.

For what-it’s-worth I thought all of these statements are true.

## Could someone explain this contradiction to me?

Reading on with Julian Havil’s Gamma: Exploring Euler’s Constant and inspired by his discussion of the harmonic series, I come across this:

$\frac{1}{1-e^x} = 1 + e^x + e^{2x} + e^{3x} + ...$

Havil calls this a “non-legitimate binomial expansion” and it seems to me it can be generalised:

$(1 - r^x)^{-1}= 1 + r^x + r^{2x} + r^{3x} + ...$

as $1 = (1 - r^x)(1 + r^x +r^{2x}+r^{3x}+... )= 1 + r^x +r^{2x}+r^{3x}+...-r^x-r^{2x}-r^{3x}-...$

And, indeed if we take $x=-1, r=2$ we get:

$\frac{1}{1-2^{-1}} = 2 = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} +...$ at the limit.

But if we have $x \geq 0$ it is divergent and the identity, which seems algebraically sound to me, breaks down. E.g., $r=2, x=2$:

$\frac{1}{1-4} = -\frac{1}{3} = 1 + 4 + 8 + 16 + ...$

So what is the flaw in my logic?

## In Pursuit of the Traveling Salesman

This is a strange little book – and while I am happy to recommend it, I am not really clear in my own mind what sort of a book it is meant to be.

A popular description of a famous problem in the “NP” space and a gentle introduction to the whole issue of computational complexity and complexity classes? Not really, it assumes a bit too much knowledge for that.

So, a canter round the basic maths of the TSP and complexity? Not really that either, as it is light on mathematical specifics.

Still, it’s a decent book.

## Was new maths really such a disaster?

On holiday now, so I fill my time – as you do – by reading books on maths.

One of these is Julian Havil’s Gamma: Exploring Euler’s Constant.

The book begins with a foreword by Freeman Dyson, in which he discusses the general failure, as he sees it (and I’d be inclined to agree), of mathematical education. He first dismisses learning by rote and then condemns “New Mathematics” as – despite its efforts to break free of the failures of learning by rote – an even greater disaster.

Now, I was taught a “new maths” curriculum up to the age of 16 and I wonder if it really was such a disaster. The one thing I can say is that it didn’t capture the beauty of maths in the way that the ‘A’ level (16 – 18) curriculum came close to doing. At times I really wondered what was it all about – at the age of 15 matrix maths seems excessively abstract.

But many years later I can see what all that was about and do think that my new maths education has given me quite a strong grounding in a lot of the fundamentals of computing and the applied maths that involves.

To the extent that this blog does have an audience I know that it is read by those with an interest in maths and computing and I would really welcome views on the strengths and weaknesses of the new maths approach.

## A better demonstration of the product rule

Inspired by The Theoretical Minimum: What You Need to Know to Start Doing Physics: here’s a better proof/justification for the product rule in differential calculus than the one I set out here last month.

We will start with what we will treat as an axiomatic definition of the differential of the function $y=f(x)$:

$\frac{dy}{dx} = \frac{df(x)}{dx} = \frac{f(x+\Delta x) - f(x)}{\Delta x}$ as $\Delta x \rightarrow 0$

In this case we have $y=f(x)g(x)$, so $\frac{dy}{dx} = \frac{f(x + \Delta x)g(x +\Delta x) - f(x)g(x)}{\Delta x}$

From our definition we can substitute for $f(x+\Delta x)$ and $g(x + \Delta x)$ and simplifying our notation for presentational reasons so that $\frac{df(x)}{dx} = f^{\prime}$ etc:

$f(x+\Delta x) = f^{\prime}\Delta x + f(x)$

$g(x+\Delta x) = g^{\prime}\Delta x + g(x)$

Giving (after dividing through by $\Delta x$):

$y^{\prime} =f^{\prime}g^{\prime}\Delta x + g(x)f^{\prime} + \frac{f(x)g(x)}{\Delta x} + g^{\prime}f(x) - \frac{f(x)g(x)}{\Delta x}$

$=f^{\prime}g^{\prime}\Delta x + g(x)f^{\prime} +g^{\prime}f(x)$

As $\Delta x \rightarrow 0$ the first term falls to zero and so we are left with:

$y^{\prime}=f^{\prime}g(x) + g^{\prime}f(x)$

Which, of course, is the product rule.

Update: See this most excellent comment from Professor Rubin.

## Product rule

And here’s a quick recap of a demonstration of the product rule (after Leibniz):

$\frac{d}{dx}(u(x)v(x)) = (u(x) + \frac{du}{dx})(v(x) + \frac{dv}{dx}) - u(x)x(x)$

$= u(x)v(x) - u(x)v(x) + u(x)\frac{dv}{dx} + v(x)\frac{du}{dx}$

Which is, of course, the product rule.

## Learnt this week … 31 January 2014

1. Roots (solutions) to a polynomial in a single variable

Quite why I was not taught this for ‘A’ level maths is beyond me (or more likely I was and have simply forgotten) but, if we have a polynomial in a single variable:

$a_n x^n + a_{n-1} x^{n-1} + ... + n_0 = 0$

Then the general form of its factorisation is:

$a_n(x - q_n)(x - q_{n- 1}) ... (x - q_1)$

and it has the roots:

$q_n, q_{n - 1}, ..., q_1$

Here’s an (very) outline proof…

Take a polynomial $f(x) = 0$ with a known root $q$ then $f(x)$ will divide evenly (no remainders) by $x - q$ and so we can say $f(x) = g(x)(x - q)$ where $g(x)$ is of one degree less than $f(x)$. We can continue this until we are left with a function of degree 0 – i.e. the constant $a_n$ (possibly 1) and then we have the form $f(x) = a(x-q_n)...(x-q_1)$.

(And I recommend Elliptic Tales: Curves, Counting, and Number Theory for those who want to know more – I am on my second iteration of trying to read this, but it is good fun.)

2. I can construct a curve that has no tangent at any point

At least, algebraically, it has no tangent.

For instance, the line defined by $y^2 - 2xy + x^2 = 0$.

We define the tangents for such lines, $f(x,y) = 0$ as, at the point $(a, b)$ as being of the form: $\frac {\partial f}{\partial x} x + \frac{\partial f}{\partial y}y = f(a, b)$

Now, $f(a, b) = 0$, but note that $\frac {\partial f}{\partial x}$ and $\frac {\partial f}{\partial y}$ are also equal to zero for all points, so we have $0 = 0$ which describes no line.

But – geometrically – this line is the same as that described algebraically by $y = x$ which has a tangent line of $y = x$.

Are they different lines? They are, algebraically.

(This comes from the same source as point 1).

## Poisson distribution puzzle

I asked this on math.stackexchange but have yet to get an answer, so will try here too…

In their monograph “Queues“, Cox and Smith state (paraphrased – this is p5):

In interval (t,t+Δt) the probability of no arrivals in a completely random process is 1−αΔt+o(Δt), for one arrival αΔt+o(Δt) and for more than one arrival o(Δt) where α is the mean rate of arrival.

I cannot follow this… here is my thinking – we take N to be the probability of no arrivals, W to be the probability of one arrival, Z to be the probability of more than one arrival, and A to be the probability of any arrivals.

So 1=N+W+Z=N+A

By my understanding of Cox and Smith:

1=1−αΔt+o(Δt)+αΔt+o(Δt)+o(Δt) =1+3o(Δt) which is surely nonsense.

So, what have I got wrong here?

## So, how good are the English premiership top seven?

oops: got the maths wrong first time – this is the corrected version

Last weekend was said to be one of the worst in the history of English bookmakers (I know, your heart bleeds) as for the first time ever all seven top teams in the English premiership won – so ensuring a lot of fixed odds accumulator payouts.

Well, the English premier league has been going since the summer of 1992, so this is its 21st season. In the first two seasons there were 22 teams, and since then there have been 20.

So the total game “weeks” have been:

1992/93 – 1994/95: 42 x  2 = 84
1995/96 – 2012/13: 38 x 18 = 684
2013/14 (to last week): 21

Giving a total of 789. So just one week out of 789 makes it sound like the top seven are not very reliable.

But, of course, this ignores the fact that – on most weeks – at least one of the top seven is playing another team in top seven (this is certainly happening this week with Chelsea playing Manchester United).

On any given week the chances of this NOT happening are (for a 20 team league):

$\frac{13}{19} \times \frac{12}{17} \times \frac{11}{15} \times \frac{10}{13} \times \frac{9}{11} \times \frac{8}{9} \times \frac{7}{7} = 0.198$

While for a 22 team league these odds are:

$\frac{15}{21} \times \frac{14}{19} \times \frac{13}{17} \times \frac{12}{15} \times \frac{11}{13} \times \frac{10}{11} \times \frac{9}{9} = 0.248$

So, in other words there have been only 84 x 0.198 + 705 x 0.248 = 191 weeks (on average – I’d have to look at the precise run of games this season to be clear) where this was even possible.

And how often should the top teams win? If they decided games on tossing of coins (and we ignore draws as they are ‘wins’ for the bookies) then the top seven would only all win 1 out of 128 weeks. The fact they managed it after 191 weeks might suggest they were worse than that, but we cannot be sure – it probably needs a bigger sample size.