Tag Archives: Stochastic process

Learnt this week… 24 January

My friend and former colleague Adam Higgitt every Friday posts a list of “five things I have learned this week”. It’s popular and good fun – especially as Adam is not afraid of an argument if you challenge some of his claims.

For a while I tried to do the same thing myself, but failed miserably.

I am not going to try again, but I am proposing to try something different, if inspired by Adam.

So here is the first list of things “learnt this week”: scientific or mathematical facts and amusements. I will aim for five, but this week I just did not make it.

1. A random walk can be used to build a binomial distribution – but not a very good one!

Imagine a left-right ruled line centred on zero and a marker that can, in every time step, move either left or right by one step, where the probability of moving left, p_l, and of moving right, p_r, are the same: i.e., p_l = p_r = 0.5. At the “beginning of time” the marker stands at 0.

Then if we count the times the marker is at any given position, they will be distributed binomially (well, as we approach infinite time). The BASIC code below (which I wrote using BINSIC) should give you an idea. This code runs the risk of an overflow, of course, and the most interesting thing about it is how unlike a binomial distribution the results can be.

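5 REM A(I) COUNTS VISITS TO POSITION I - 500
10 DIM A(1001)
12 FOR I = 1 TO 1001
14 LET A(I) = 0
16 NEXT I
20 LET POS = 500
30 FOR I = 1 TO 50000
35 REM X < 1 MOVES LEFT, OTHERWISE MOVE RIGHT
40 LET X = RND * 2
50 IF X < 1 THEN LET POS = POS - 1
55 IF X >= 1 THEN LET POS = POS + 1
60 LET A(POS) = A(POS) + 1
70 NEXT I
85 REM PRINT EACH POSITION AND ITS VISIT COUNT
90 FOR I = 1 TO 1001
95 LET X = I - 500
110 PRINT X," ",A(I)
120 NEXT I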
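For anyone without BINSIC to hand, here is a rough Python sketch of the same idea. It is not a translation of the listing: the function names, the seed and the run sizes are all my own assumptions. It tallies the visits of one long walk, as described above, and for comparison also records only the final position of many short independent walks, which is the quantity that is exactly binomial for a fair walk.

```python
import random
from collections import Counter
from math import comb

def occupation_counts(steps, rng):
    """Tally how often a single walk visits each position."""
    pos = 0
    visits = Counter()
    for _ in range(steps):
        pos += 1 if rng.random() < 0.5 else -1  # fair left/right step
        visits[pos] += 1
    return visits

def end_positions(walks, steps, rng):
    """Record only the final position of many independent walks."""
    ends = Counter()
    for _ in range(walks):
        pos = sum(1 if rng.random() < 0.5 else -1 for _ in range(steps))
        ends[pos] += 1
    return ends

def binomial_pmf(pos, steps):
    """Probability that a fair walk of `steps` steps ends at `pos`."""
    if (pos + steps) % 2 or abs(pos) > steps:
        return 0.0  # unreachable: wrong parity or too far away
    return comb(steps, (pos + steps) // 2) / 2 ** steps

rng = random.Random(24)                  # arbitrary seed, for repeatability
single = occupation_counts(50_000, rng)  # one long walk
many = end_positions(10_000, 100, rng)   # many short walks of 100 steps

print("one walk visited positions from", min(single), "to", max(single))
for pos in (-20, -10, 0, 10, 20):
    observed = many[pos] / 10_000
    print(pos, round(observed, 4), round(binomial_pmf(pos, 100), 4))
```

The last loop prints the observed share of short walks ending at a few positions next to the binomial probability, so the two columns can be compared by eye. The tally from the single long walk can be plotted the same way.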

Here’s a chart of the values generated by similar code (actually run for about 70,000 steps):

[Chart: Not much like a binomial distribution]

2. Things that are isomorphic have a one-to-one relationship

Up to this point I just had an informal “things that look different but are related through a reversible transformation” idea in my head. But that’s not fully correct.

A simple example might be the logarithms. Every positive real number has a unique real logarithm, and every real number is the logarithm of exactly one positive real number.
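The logarithm example can be poked at numerically. The sketch below is only a spot check on a handful of arbitrary values, not a proof, and the helper name and sample values are my own. It assumes the usual reading of the logarithm as a map from the positive reals under multiplication to the reals under addition: it checks that exp undoes log (the reversible, one-to-one part), that log(a × b) = log(a) + log(b) (the structure-preserving part), and that a negative number has no real logarithm.

```python
import math

def is_close(a, b):
    """Floating-point comparison with a small tolerance."""
    return math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)

samples = [0.25, 1.0, 2.0, 10.0, 1234.5]  # arbitrary positive reals

# Reversible: exp takes each logarithm back to the number it came from.
round_trip_ok = all(is_close(math.exp(math.log(x)), x) for x in samples)

# Structure-preserving: multiplication on one side is addition on the other.
structure_ok = all(
    is_close(math.log(a * b), math.log(a) + math.log(b))
    for a in samples
    for b in samples
)

# Outside the positive reals there is no real logarithm to map to.
try:
    math.log(-1.0)
    negative_has_log = True
except ValueError:
    negative_has_log = False

print(round_trip_ok, structure_ok, negative_has_log)
```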


Is the time pages are in the working set stochastic?

Reading about the Monte Carlo method has set me thinking about this and how, if at all, it might be applied to page reclaim in the Linux kernel.

In my MSc report my results show that working set size is not normally distributed – despite occasional claims to the contrary in computer science textbooks. But it is possible that a series of normal distributions is overlaid – see the graphic below:

[Graphic: Working set size for MySQL daemon]

The first question is: how do I design an experiment to verify that these are, indeed, a series of normal distributions?
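One possible starting point, offered as a sketch and not as the method used in the report, is to fit mixtures of 1, 2, 3, ... normal distributions to the measured sizes and compare them with a penalised score such as BIC. Everything below is an assumption for illustration: the data are synthetic (three made-up normal components, not the MySQL measurements), the function names are hypothetical, and BIC is only one of several reasonable scores. A lower BIC suggests a better trade-off between fit and complexity; it does not prove that the components are normal.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def fit_mixture(data, k, iterations=100):
    """Fit a k-component 1-D normal mixture by EM.

    Returns (weights, means, sigmas, log_likelihood).
    """
    data = sorted(data)
    n = len(data)
    # Crude initial guess: split the sorted data into k equal chunks.
    chunks = [data[i * n // k:(i + 1) * n // k] for i in range(k)]
    weights = [1.0 / k] * k
    means = [sum(c) / len(c) for c in chunks]
    sigmas = [
        max(1e-3, math.sqrt(sum((x - m) ** 2 for x in c) / len(c)))
        for c, m in zip(chunks, means)
    ]
    for _ in range(iterations):
        # E-step: how responsible is each component for each point?
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, s)
                 for w, m, s in zip(weights, means, sigmas)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pj / total for pj in p])
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            sigmas[j] = max(1e-3, math.sqrt(var))  # floor avoids collapse
    loglik = sum(
        math.log(sum(w * normal_pdf(x, m, s)
                     for w, m, s in zip(weights, means, sigmas)) or 1e-300)
        for x in data
    )
    return weights, means, sigmas, loglik

def bic(loglik, k, n):
    """Bayesian information criterion; 3k - 1 free parameters."""
    return (3 * k - 1) * math.log(n) - 2 * loglik

# Synthetic stand-in data: three overlaid normals (made-up sizes in pages).
rng = random.Random(1)
data = ([rng.gauss(2000, 100) for _ in range(300)]
        + [rng.gauss(5000, 200) for _ in range(200)]
        + [rng.gauss(8000, 150) for _ in range(150)])

bic_by_k = {}
for k in range(1, 5):
    weights, means, sigmas, loglik = fit_mixture(data, k)
    bic_by_k[k] = bic(loglik, k, len(data))
    print(k, round(bic_by_k[k], 1), [round(m) for m in means])
```

With real measurements the same loop would run over the recorded working set sizes in place of `data`. It would still be worth checking each fitted component separately, for example with a normality test on the points assigned to it, since EM can settle on a poor local optimum.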

(I may find out how I have done in the degree in the next week or so – wish me luck)