Recently I had to write some code to generate a pseudorandom number in a system with very limited sources of entropy. So, of course I turned to Donald Knuth and, in particular, Volume 2 – Seminumerical Algorithms – in the magisterial The Art of Computer Programming.
Reading through the questions/exercises I then came across this one:
Prove that the middle-square method using 2n-digit numbers to the base b has the following disadvantage: if the sequence includes any number whose most significant n digits are zero, the succeeding numbers will get smaller and smaller until zero occurs repeatedly.
(Knuth rates this question as ’14’ and, using his scale of difficulty which places a ’10’ as a minute to solve and a 20 as twenty minutes, probably means this should take about 7 or 8 minutes but I’ve spent much, much longer on it than that!)
A quick explanation: the middle-square method is a naive (though actually first suggested by none other than John von Neumann) random number generation method where we take a number n-digits long, square it and take the middle n-digits as our next seed or random number.
At first I thought I’d found an example where Knuth’s proposition appears to be false.
Let and the seed number be then every subsequent number in the sequence is also 60 (obviously that’s as useless as repeated zeros for a random number generator.) But the problem with that is, that although it demonstrates a weakness in the middle square method, it doesn’t fit Knuth’s definition of the problem. What is here? If ( is the minimum for middle values), then $N_o = 0060 $ and and so (thanks to Hagen von Eitzen for clarifying this for me.)
So let’s look at the general case. (I also this explanation to Hagen von Eitzen, as compared to my very long-winded first attempt – though any errors that follow are mine not his.)
So we have a number which we think of as a digit number – though the first digits are 0. Then (as the largest can be is .)
Thus as the largest can be is then:
If we have a digit number then the biggest number of digits we can get from the out put is , but in our case we only have digits to worry about so the biggest size can be is digits, as again the leading digits will be 0.
So, to apply the middle square method we need to lose the lower digits – i.e., take .
From the above:
If then will always , so the sequence decreases until .