## Entropy and randomness in communications, a quick introduction

Last year I remarked on how Roger Penrose‘s discussion of entropy in Cycles of Time did more to explain the physical nature of the concept (to me, at least) than many years of formal education.

And now, another book, Information and Coding Theory has given me some real insight into how computers deliver pseudo-randomness and why they look for sources of entropy.

The maths of all this are deceptively simple, but are of fundamental importance in information theory.

First of all, we can think of what sort of information events convey. There is a famous aphorism about the habits of bears, the direction from which the Sun rises and the title of the head of the Catholic Church. No newspaper or TV bulletin would ever bother to tell us, on a daily basis, that the Sun has indeed risen from the East. In effect that event conveys no information because it is completely expected. Toss a coin, though, and we do not know the outcome in advance – the result is of interest because the outcome is not known in advance.

Hence we can posit an information function $I(e)$ which tells us how much information we can derive from event $e$, or, if we think of symbols in a data stream we can call this $I(s_i)$ – the amount of information we can derive if the next symbol we see is $s_i$.

From our earlier discussion we can see that the result of this function is dependent on the probability of the event occurring or the symbol being received – i.e. if we knew the Sun was always going to rise in the East we can derive no information from that very thing happening. Hence: $I(s_i)$ is a function of $p_i$ where $p_i$ is the probability that the next symbol to be seen is $i$. As we will only consider ‘memoryless’ events we will also insist that $p_i$ is independent of time or any previously seen symbol. Hence: $I(s_is_j)$ (i.e. the information we can gain from seeing the symbol $s_i$ followed by the symbol $s_j$ is $I(s_i) + I(s_j)$

Taking these points together we can define $I(s_i) = log\frac{1}{p_i} = -log(p_i)$

We can see that $I$ tends to infinity if the probability of an event tends to zero (think of the Sun rising in the West!) and tends to 0 if nothing unexpected happens.

The entropy of a system (communication) $H(S)$ then becomes: $H(S) = \sum\limits_i^q p_iI(s_i) = \sum\limits_i^q p_ilog\frac{1}{p_i} = -\sum\limits_i^q p_i log(p_i)$