Recently I had to write some code to generate a pseudorandom number in a system with very limited sources of entropy. So, of course I turned to Donald Knuth and, in particular, Volume 2 – Seminumerical Algorithms – in the magisterial The Art of Computer Programming.
Reading through the questions/exercises I then came across this one:
Prove that the middle-square method using 2n-digit numbers to the base b has the following disadvantage: if the sequence includes any number whose most significant n digits are zero, the succeeding numbers will get smaller and smaller until zero occurs repeatedly.
(Knuth rates this question as ’14’ and, using his scale of difficulty which places a ’10’ as a minute to solve and a 20 as twenty minutes, probably means this should take about 7 or 8 minutes but I’ve spent much, much longer on it than that!)
A quick explanation: the middle-square method is a naive (though actually first suggested by none other than John von Neumann) random number generation method where we take a number n-digits long, square it and take the middle n-digits as our next seed or random number.
At first I thought I’d found an example where Knuth’s proposition appears to be false.
Let and the seed number be
then every subsequent number in the sequence is also 60 (obviously that’s as useless as repeated zeros for a random number generator.) But the problem with that is, that although it demonstrates a weakness in the middle square method, it doesn’t fit Knuth’s definition of the problem. What is
here? If
(
is the minimum for middle values), then $N_o = 0060 $ and
and so
(thanks to Hagen von Eitzen for clarifying this for me.)
So let’s look at the general case. (I also this explanation to Hagen von Eitzen, as compared to my very long-winded first attempt – though any errors that follow are mine not his.)
So we have a number which we think of as a
digit number – though the first
digits are 0. Then
(as the largest
can be is
.)
Thus as the largest can be is
then:
And
If we have a digit number
then the biggest number of digits we can get from the out put is
, but in our case we only have
digits to worry about so the biggest size
can be is
digits, as again the leading
digits will be 0.
So, to apply the middle square method we need to lose the lower digits – i.e., take
.
From the above:
If then
will always
, so the sequence decreases until
.