Now, it’s not really relevant to the questions I may get asked and I am about to display some mathematical vulnerability here, but I just need to know the answers, so here we go anyway…(plus it also allows me to test out the native LaTeX support on the blog).
Tanenbaum’s book is structured as an examination of the various layers in a network stack and so begins its examination of the physical layer – starting with an examination of the impact of bandwidth and Nyquist and Shannon’s work on the fundamental limitations of any physical medium.
He starts his explanation of the impact of limited bandwidth with the concept of the Fourier series. [Fourier’s work showed that any periodic function could be represented as a superposition of other periodic, orthogonal, functions – crudely you can make any finite line by adding together lots of sine and cosine lines of different frequency and amplitude].
Now, I have obviously come across these before – both in A level maths and at university: it would be difficult to have a degree in the physical sciences without having done so! But that was a quarter of a century and more ago and I really cannot recall whether I was given a rigorous or semi-rigorous explanation or just asked to take it all on trust.
So I wanted to explain it all to myself – I don’t doubt it, clearly. Even intuitively it seems close to obvious, but I wanted to understand the maths a bit more.
Looking through various explanations I found the one on Wolfram Mathworld rather easier to grasp than the one on Wikipedia. But here’s the rub…
So I thought I’d try this out, and used a crude spread sheet to do it. Here is a graph of where .
And here it is when
And so on … the first function would clearly integrate to something close to – just assume it’s average value is 0.5 and you can see that (as the domain is to ) and similarly visual inspection shows the second to be zero.
Update: Below here I start to go all wrong … see here
And in fact this seem to work well for even fractional ratios over 2 – here’s 2.5 and 9.5:
But this is what I get with :
Plainly that does not integrate to 0 (nor for that matter). So what have I got wrong?
Binary trees are seen and used frequently in computing science and computing. They are a good abstraction for many naturally occurring relationships (most of our mathematics is based on binary operations, for instance) and have O(log n) complexity (ie if you went from searching a tree of 1000 elements to a tree of 100,000 elements then the search should not take 100 times longer but about 10.)
Of course that log n goodness requires the tree to be “balanced” ie for any given node there should be roughly equal numbers to the left and the right. One way of doing this is through a “red black tree” – here nodes in the tree are assigned a colour: the root is always black. The rule is that any traversal from the root to the leaves should always go through an equal number of black nodes and to ensure this is possible red nodes may be inserted in the tree, but no red node may have another red node as an immediate descendant. (A full explanation is in Introduction to Algorithms though one can also work off various explanations on the internet, though they tend to be less than complete.)
The Linux kernel natively implements a red black tree (in C) and the first bit of work I started on my MSc project was to write my own, userland, implementation so I could see processes in the same way the kernel did.
As I had got a less than glorious mark (still a pass, so that’s what counts) in the C++ exam last year I also decided that I would write this in C++ using templates. A few days after I started I discovered that actually the writers of the STL had got there before me, but as this was an academic exercise I ploughed on with my own implementation.
Essentially this is what I did on my two week summer holiday in Scotland last year! When I was there I also started (though completed when I got home) a couple of helper applications to position the tree according to Reingold and Tilford’s algorithm (which I had to translate from PASCAL) for “better drawing of trees” and a Qt application to display it all.
In fact I had a nagging problem with the Reingold-Tilford algorithm which I finally got around to fixing last night.
(Interestingly the code also allows you to use the Turing-complete capabilities of LaTeX by specifying a TeX output that uses LaTeX’s own positioning algorithm – something I picked up from The LATEX Graphics Companion – that is what the example shown above uses, though unfortunately for even moderately loaded systems the LaTeX processor objects to the width of the output).
Fancy trying it? I could do with someone giving it a bash on a BSD system – not needed for my course but interesting none the less.
The code is all at GitHub – http://github.com/mcmenaminadrian: memball gives the basic GraphML or TeX or plaintext output, treedraw will convert the GraphML to an SVG or serialiszed stream using Reingold and Tilford’s algorithm and treeqt will use Qt to display the tree using the serialized class. You may have to download various libraries to get it to work (certainly the libproc-dev package on Ubuntu/Debian) – I couldn’t get memball to work on a Windows machine using Cygwin but maybe even that is fixable.
There is a script in the treeqt repo to make it easier: download the sources form all three repos, build them and then run:
./setup | ./treeqt --r 1
(Update: Treeqt does not display an SVG – so I have updated the text to reflect that)
I use LaTeX (LyX as the front end) to generate the occasional bits of mathematical script that appear here. Is it bad to admit it gives me a small thrill to see how the LaTex is transformed into something mathematical – makes me feel like a proper scientist/mathematician. Well, I’ve done it now…
WordPress.com will not display postscript natively (or if it can I have never been able to get it to work), so I usually open the .ps files in GIMP and convert them to a PNG – PNGs being optimised to handle things like line drawings and graphical images of text.
The PNGs that come out of this are of a reasonable size but they could be smaller. Some years ago I wrote some Perl code to do that (I needed something that worked in a CGI environment and while Perl is not so fast for even integer maths it was the best option).
That code is available on CPAN as Image::Pngslimmer and on a typical Linux/BSD (OSX too?) install (I am assuming you have perl and the cpan module installed) you should be able to get it on to your box with sudo cpan Image::Pngslimmer – which may ask you a lot of questions before it installs if you have never used CPAN but while YMMV the default answers should probably work.
You can write your own script or just use this quick and dirty one for PNGs of 64K or less:
my $srcfile = shift;
open INF, $srcfile;
read(INF, $buffer, 65535);
my $blob2 = Image::Pngslimmer::indexcolours($buffer);
my $blob3 = Image::Pngslimmer::discard_noncritical($blob2);
To use this copy the script above into your editor and save it with a name eg shrinker.pl and then execute like this: perl shrinker.pl some.png > smaller.png which will create a new file smaller.png from some.png (which is untouched).
The results can be reasonably impressive. This file – created by cutting from the GIMP – comes in at 14.4KB (NB the graphics you see here are preprocessed by wordpress.com, click on the graphics to see the original):
But after a bit of processing with the above script, this comes in at 9.7KB:
Finally, please do not say this doesn’t matter because your pipe is so fat. Millions of these things are being pumped through the internet every minute and the collective saving of bandwidth really would matter if we could deliver it…
Anyone who works on software development and in the FOSS world generally is used to seeing books and documentation in English. It is certainly a great advantage to be a fluent speaker and reader.
But it is not always the case – as I have just found out.
Right now I have returned to writing my MSc project proposal and that means back to using the power of LyX and LaTeX. But with great power comes great complexity and it can be tough navigating all of this.
So I discovered there is an O’Reilly “Hacks” book for LaTeX – LaTeX Hacks.
Great! I was going to order it without even bothering to read a review, so sure was I that it would be helpful and useful: until I discovered it was in German and there is no sign of an English translation.
To make matters worse, it seems that the O’Reilly quick reference – LaTeX – is also auf Deutsch.
This all has an odd, and unsettling feel to it. A century ago German domination of the physical and mathematical sciences was near-complete. Think of 1905 and Einstein just for starters.
But since the tragedy and disaster of Hitler we are used to thinking of the Germans as great engineers but the US clearly as the world’s leading centre of scientific research. And when a threat to that is identified it is usually seen as being from China (as Barack Obama said only a few weeks ago in his state of the union address). But maybe the LaTeX domination of Germany suggests there is life in the old world yet.
Either that or O’Reilly need to pull their fingers out on translating this stuff.
(The graph shows the numbers of people in EU members states who speak German as a foreign or second language: I did a year of it at High School but would not claim to know much beyond some very basic vocabulary and grammar).
I started work last night on writing up a very early draft of my project proposal. For several reasons it was a lot more difficult than I had expected.
Firstly, and typically, I let the technology of the writing tool get in the way of the actual writing. I spent much more time fiddling with LyX and various templates than writing anything. Should I just write plain text in a word processor and then copy that into the LaTeX tool or soldier on with LyX (after all even the project proposal will have to include mathematical notation)?
Secondly, while I hoped to use Christmas, and the time off work, to find the time to work on this, doing without distraction means staying up to 3am as the house is only quiet after 1am or later at Christmas. Not really sustainable.
Thirdly, and this is the most difficult one – all my mental images of what I was going to write – here’s the hypothesis (currently framed as “a more thorough-going application of the working set concept in the Linux kernel will improve performance”) and everything else flows from that, just melted away.
Indeed I am sort of working on the idea that I really need to explain some of the core ideas and then present a hypothesis to be tested.