## McMillan’s inequality

Kraft’s inequality states that there is an instantaneous $r$-ary code $C$ with word lengths $l_1...l_q$ if and only if:

$\sum_{i=1}^q \frac{1}{r^{l_i}} \leq 1$

But what of codes that are merely uniquely decodable, and not necessarily instantaneous?

The answer is given to us by McMillan’s inequality, and the surprising thing is that, despite the looser requirement, the condition is exactly the same:

There is a uniquely decodable $r$-ary code $C$ with word lengths $l_1...l_q$ if and only if :

$\sum_{i=1}^q \frac{1}{r^{l_i}} \leq 1$

And here’s a proof (again drawn from Information and Coding Theory):

We have a code $C$ with word lengths $l_1 ... l_q$ and we define:

$K = \sum_{i=1}^q \frac{1}{r^{l_i}}$.

Additionally we say that $x = \max(l_1 ... l_q)$ and $y = \min(l_1 ... l_q)$.

Let us now consider $K^n$ or $(\sum_{i=1}^q \frac{1}{r^{l_i}})^n$ where $n \geq 1$.

This is an $n$-fold sum and, when expanded, is the sum of a series of products of the form:

$\frac{1}{r^{l_{i_1}}}\times\frac{1}{r^{l_{i_2}}}\times ... \times\frac{1}{r^{l_{i_n}}}$

where each of the $n$ indices $i_1 ... i_n$ runs over $1 ... q$.

I found this a bit difficult to follow at first, but imagine we had $(\frac{1}{A} + \frac{1}{B})(\frac{1}{C} + \frac{1}{D})(\frac{1}{E} + \frac{1}{F})$ – then we could expand this as:

$(\frac{1}{A}\times\frac{1}{C} + \frac{1}{A}\times\frac{1}{D} + \frac{1}{B}\times\frac{1}{C} + \frac{1}{B}\times\frac{1}{D})(\frac{1}{E} + \frac{1}{F})$

$\frac{1}{A}\times\frac{1}{C}\times\frac{1}{E} +\frac{1}{A}\times\frac{1}{C}\times\frac{1}{F} + \frac{1}{A}\times\frac{1}{D}\times\frac{1}{E} +\frac{1}{A}\times\frac{1}{D}\times\frac{1}{F} + \frac{1}{B}\times\frac{1}{C} \times\frac{1}{E}+ \frac{1}{B}\times\frac{1}{C}\times\frac{1}{F}+\frac{1}{B}\times\frac{1}{D}\times\frac{1}{E}+\frac{1}{B}\times\frac{1}{D}\times\frac{1}{F}$

And we can also write:

$\frac{1}{r^{l_{i_1}}}\times\frac{1}{r^{l_{i_2}}}\times ... \times\frac{1}{r^{l_{i_n}}} = \frac{1}{r^j}$ where $j = l_{i_1} + l_{i_2} + ... + l_{i_n}$.

Now $yn \leq j \leq xn$ since $y \leq l_i \leq x$ for all $i$.

Now, here’s the fancy bit (at least for me!)…

This means we can rewrite $K^n$ as $\sum^{xn}_{j=yn}\frac{N_{j,n}}{r^j}$

The upper and lower bounds for $j$ follow from the inequalities above (we could sum over a wider range, but then the coefficient $N_{j,n}$ would be zero beyond these bounds). And what is this coefficient?

$N_{j,n}$ is the number of ways of writing the given $j$ as an ordered sum of $n$ of the word lengths (or, alternatively, the number of different sequences of $n$ code words whose total lengths sum to $j$).

Let’s take a simple example again – we have a binary code $C$ with code words $0, 01, 011$ (plainly not instantaneous, as $0$ is a prefix of $01$, which is in turn a prefix of $011$ – though it is uniquely decodable).

Here $K = \sum_{i=1}^q \frac{1}{r^{l_i}} = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} = \frac{7}{8}$

But what if we look at $K^2$?

Then we have:

$\frac{1}{2}\times\frac{1}{2} + \frac{1}{2}\times\frac{1}{4} + \frac{1}{2}\times\frac{1}{8} +\frac{1}{4}\times\frac{1}{2} + \frac{1}{4}\times\frac{1}{4} + \frac{1}{4}\times\frac{1}{8} + \frac{1}{8}\times\frac{1}{2} + \frac{1}{8}\times\frac{1}{4} + \frac{1}{8}\times\frac{1}{8}$

Writing this out in the form above we then have – and you can see the coefficients:

$\frac{1}{2^2} + \frac{2}{2^3} + \frac{3}{2^4} + \frac{2}{2^5} + \frac{1}{2^6}$
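These coefficients can be checked mechanically. Here is a quick sketch in Python – the word lengths come from the example above, but the code itself is mine, not the book’s:

```python
from itertools import product
from collections import Counter
from fractions import Fraction

lengths = [1, 2, 3]   # word lengths of the example code 0, 01, 011
n = 2                 # expanding K^n with n = 2

# N_{j,n}: how many n-term sequences of word lengths sum to j
N = Counter(sum(combo) for combo in product(lengths, repeat=n))
print(sorted(N.items()))  # [(2, 1), (3, 2), (4, 3), (5, 2), (6, 1)]

# and the sum of N_{j,n} / 2^j really is K^n = (7/8)^2
assert sum(Fraction(c, 2 ** j) for j, c in N.items()) == Fraction(7, 8) ** n
```

The coefficients 1, 2, 3, 2, 1 match the expansion written out by hand above.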

The coefficient $N_{j,n}$ can never be greater than $r^j$: because $C$ is uniquely decodable, each of the $r^j$ possible strings of $j$ symbols decodes as at most one sequence of code words, so $\frac{N_{j,n}}{r^j} \leq 1$.

We also know that $K^n$ is a sum of $xn + 1 - yn$ terms, so:

$K^n \leq(x - y)n + 1$.

As $n$ increases, if $K > 1$ then the left hand side increases exponentially while the right hand side (where $x$ and $y$ are independent of $n$) increases only linearly – a contradiction for large enough $n$, hence $K \leq 1$. (This establishes the “only if” direction; the “if” direction follows from Kraft’s inequality, since the instantaneous code it guarantees is also uniquely decodable.)
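To see that growth argument concretely, here is a sketch with a hypothetical set of word lengths for which $K > 1$ (three binary words of length 1 – more words than symbols, so no uniquely decodable code exists):

```python
lengths = [1, 1, 1]            # hypothetical: K = 3/2 > 1 for r = 2
K = sum(2 ** -l for l in lengths)
x, y = max(lengths), min(lengths)

for n in (1, 10, 50):
    lhs = K ** n               # grows exponentially in n
    rhs = (x - y) * n + 1      # grows at most linearly in n
    print(n, lhs <= rhs)       # the bound fails for every n
```

Since the bound $K^n \leq (x - y)n + 1$ must hold for every $n$ when the code is uniquely decodable, its failure here confirms no such code can exist with these lengths.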

I am sure this will be one of the less read postings on the blog (congratulations if you got this far), but if it helps just one person get a better understanding of McMillan’s inequality then it has been worth it!


## President Obama’s campaign and free software

It seems a row has broken out between staff on President Barack Obama’s re-election campaign over the fate of the free software it produced (the article linked here refers to it all as “open source” but on this issue I tend to side with RMS and not ESR).

Actually I do not blame the DNC at all for not wanting to release any source (if that is what they want to do – it is not entirely clear). It would simply be foolish to surrender an advantage they have over their opponents if there is no need to do so. Nor does there appear to be any ethical issue involved: the core idea of the free software movement is surely that any user of software should have access to the source code out of which it is built. If the DNC does not distribute the software then they are under no moral or any other obligation to hand out the source code.

By far the worst idea the article talks of is selling the software: that would truly be a breach of the ethics of free software – because plainly trying to use the built software as a revenue stream means keeping the software hidden or forcing users, 1970s Unix-style, to sign NDAs. Either of those is worse than keeping a piece of private software private.

There is a wider question, of course: could distributing the software help build a better world? But if the distribution helps the US Republican Party, then surely the answer for the DNC is no?

## Kraft’s inequality

Getting round to re-reading Information and Coding Theory – in physical form after the disappointment of the Kindle edition.

Maybe it’s because I’m six months older and wiser, but this time I am more determined to grasp more of the maths in the book, and so came to a basic precept of coding theory – Kraft’s inequality – which states:

There is an instantaneous $r$-ary code $C$ with word lengths $l_1...l_q$ if and only if :

$\sum_{i=1}^q \frac{1}{r^{l_i}} \leq 1$

Let’s examine a proof of this: I found the proof in the book a bit difficult to follow, so what follows uses the ideas in the book but in a little more detail.

So, we can imagine the space of the code words as a triangle (as shown):

The triangle $S$ contains the complete range of available code words and the small triangle $t$ shows that picking a non-leaf code word denies us the ability to use leaves that are dominated by the non-leaf code word (in an instantaneous code).

So, if the tree has $l$ levels in total and we pick a word at level $l_x$, then we lose access to $r^{l - l_x}$ leaves.

Now, we know that $r^l\sum_{i=1}^q\frac{1}{r^{l_i}} \leq r^l$ (as the sum is less than or equal to 1).

But we can also state that $r^{l-l_x} < r^l\sum_{i=1}^q\frac{1}{r^{l_i}}$ because the sum contains the term $\frac{1}{r^{l_x}}$ – think of a 32 leaf binary code triangle where we had picked a code word from the third level, then we would have:

$2^{5-3} < 2^5\sum_{i=1}^5\frac{1}{2^i}$

$2^2 < 2^4 + 2^3 + 2^2 + 2^1 + 2^0$

We thus see we can keep picking code words on this basis and we will always have an instantaneous code. But we also know that an instantaneous code is a prefix code and thus for a prefix code this inequality applies:

$\sum_{i=1}^qr^{l-l_i} \leq r^l$

Dividing both sides by $r^l$ we get:

$\sum_{i=1}^q\frac{r^{l-l_i}}{r^l} \leq 1$

$\sum_{i=1}^q\frac{1}{r^{l_i}} \leq 1$
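As a sanity check, the sum is easy to compute. The instantaneous binary code $\{0, 10, 110, 111\}$ below is my own example, not one from the book:

```python
from fractions import Fraction

def kraft_sum(r, lengths):
    """Left-hand side of Kraft's inequality for an r-ary code."""
    return sum(Fraction(1, r ** l) for l in lengths)

# 0, 10, 110, 111: no word is a prefix of another, so instantaneous
print(kraft_sum(2, [1, 2, 3, 3]))    # 1 -> the inequality holds
# no instantaneous binary code can have three words of length 1
print(kraft_sum(2, [1, 1, 1]) <= 1)  # False
```

Using exact fractions avoids any floating point fuzz when the sum lands exactly on 1.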

## Gamma ray bursts are not that rare

Yesterday it was reported that scientists have suggested that an anomalous peak in radioactive materials discovered in Antarctic ice sediments and in ancient Japanese cedar trees could be explained by a gamma ray burst hitting the Earth in the 8th century CE.

The BBC radio report I heard described gamma ray bursts as “extremely rare” and the website article – and much other coverage – repeats the idea that they are rare events.

But they are not.

There are an estimated $1.25\times10^{11}$ galaxies in the universe. It is estimated that a gamma ray burst happens at least once every million years in any given galaxy – or approximately once every $3\times10^8$ days. That means around 400 gamma ray bursts will go off somewhere in the universe today. Now, let’s assume that due to relativistic effects we can at most only observe one-tenth of the universe; that still means around 40 events in the observable part of the universe (how big the observable universe is in comparison to the universe as a whole is another matter, however).
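Redoing the back-of-the-envelope arithmetic (all the figures are the rough estimates quoted above, nothing more precise):

```python
galaxies = 1.25e11               # estimated galaxies in the universe
days_per_burst = 3e8             # ~a million years in days, per galaxy
bursts_today = galaxies / days_per_burst
print(round(bursts_today))       # ~417 bursts somewhere in the universe today
print(round(bursts_today / 10))  # ~42 in the tenth we might observe
```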

Now, gamma ray bursts are narrowly beamed, so even with this high rate of production not many are seen on Earth (probably a good thing given the energies involved), but they are far from rare.

Of course, events in our galaxy are rare (the fact that we are here at all is testament to that), but on the universal scale that is drawing a very tight boundary on the region being tested.

## After “comment spam”, comes “like spam”

Maybe if I had a blog that was visited by tens of thousands every day and on which hundreds wanted to comment it would not be so easy, but “comment spam” is not a problem here – anything which the algorithmic spam filter does not pick up can be chucked out by hand.

But in recent weeks I have seen a new form of blog spamming – “like spamming” – against which WordPress seems to have given me no protection. Spammers – usually SEO snake oil sellers or link farmers – come here and “like” a post: that gives them an automatic link back to their blog.

I don’t want to discourage anyone who genuinely likes my blog – but let me warn the spammers. From this point onwards spam-like likes will see your blog reported to WordPress.com as spam.

## Summing an infinite series

I dare say what I am about to write is no great revelation and indeed I was probably taught this for ‘A’ level maths, but forgot it…

Anyway, I was reading about the Fast Fourier Transform (FFT), which, discovered in the 1960s, radically improved the speed of signal analysis. Older algorithms had been able to analyse $N$ signal samples in $O(N^2)$ time, while the FFT manages it in $O(N \log_2 N)$ time.

It does this by relying on the fact that one can combine two $N$ samples to get a $2N$ sample in $N$ operations, ie.,

$N \rightarrow 2N$ in $N$ operations
$2N \rightarrow 4N$ in $2N$ operations
$4N \rightarrow 8N$ in $4N$ operations

Hence we can get $N$ samples in $\log_2 N$ doublings. But how many operations does that take?

For the above example ($N = 8$) we can see it is $\frac{N}{2} + \frac{N}{4} + \frac{N}{8}$, and for the more general case we can see this is $N(2^{-1} + 2^{-2} + ... )$ – that is, $N$ times an infinite series.

For the overall complexity of the algorithm to be $O(N \log_2 N)$ this series should add to 1. Does it?

Yes, it does, and here’s a proof…

We can rewrite our series as…

$X = r + r^2 + r^3 + r^4 + ...$

Then

$\frac{X}{r} = 1 + r + r^2 + r^3 + ... = 1 + X$

And $\frac{X}{r} - X = 1$.

As we know $r = \frac{1}{2}$ we get $2X - X = 1$ so $X = 1$
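The sum can also be checked numerically – the partial sums of the halving series creep up towards 1:

```python
# partial sums of 1/2 + 1/4 + 1/8 + ... approach 1
partial = 0.0
for k in range(1, 21):
    partial += 0.5 ** k
print(partial)   # 1 - 2**-20, i.e. 0.9999990463256836
```

Powers of one half are exact in binary floating point, so after 20 terms the partial sum is exactly $1 - 2^{-20}$.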

## I have another blog

Decided I wanted to write about the books I read, even if they are not (computer) science related – so Not Computer Science Related

## We lost more than our record shops because of the MP3

Today HMV, once His Master’s Voice and still one of the UK’s leading retailers of CDs and DVDs, went into administration (one step away from going bust). Hopes are high that at least some of the shops can be saved – not least because the record labels and the DVD manufacturers want something that will compete on scale with Amazon.

There has been a lot of online commentary on the company’s failure to realise the scale of competition it faced from both online retailers of CDs and DVDs and from downloaders – with particular attention being given to the fact that a one-time chief executive told their advertising agency that downloading was “just a fad” early in the new millennium. LOLZ all round.

Well, it is worth recalling that this was once a fairly common feeling and was not entirely based on ignoring the evidence – MP3 players took a long time to get going, and I remember extolling the virtues of the compressed download for a few years before anybody took any notice – indeed things only really got going once Apple swung their marketing budget into action. Then the herd stampeded. But for a while there appeared to be nothing inevitable about it (of course this ignored the fact that NAND memory was doubling in density in an even shorter time scale than Moore’s Law suggested for CPU transistor numbers).

Things could have been different – a dozen years ago the MP3 as a format was even being shunned by some manufacturers because it was (and is, at least in some jurisdictions) encumbered by a patent. There were even commercially available OGG (an open and free format) players out there. But along came the people’s favourite jailers in the form of Apple and the moment when we could have struck a small blow for user freedom was over.

## Renamed my project Postmeta

As Steven Kelly pointed out to me, there is a trade marked product called MetaEdit+, so I have renamed my project to build an editor/viewer of MetaPost files: it is now Postmeta – I don’t think that name is taken.

I also worked out why I could not see the mplib.h header for the MetaPost library – or indeed many of the source files: I had to download the code from the MetaPost SVN repo – here – and then build it. I think the fact that it took me a day and a question on tex.stackexchange.com to get that answer shows how sadly neglected the excellent MetaPost package is.

## New coding project – metaedit – started

I have started writing a MetaPost editor – metaedit (since renamed postmeta) – which is on GitHub.

I am not sure how much time and effort I will be able to devote to it, but hopefully I will produce something somebody (even if it is just me) can use.

Yesterday I discovered someone had written a DSL with Haskell to generate MetaPost output – and so maybe I could do something similar using Groovy, but to commit myself to that would be a step too far for now.

Still, it is good to be writing some code again.