## How slow is a fast computer?

I am working on a simulation environment for a network-on-chip (NoC). The idea is that we can take a memory reference string – generated by Valgrind – and then test how long it will take to execute with different memory models for the NoC. It’s at an early stage and still quite crude.

The data set it is working with is of the order of 200GB, though that covers 18 threads of execution and so, very roughly speaking, it is 18 data sets of just over 10GB each. I have written some parallel Groovy/Java code to handle it and the code seems to work, though there is a lot of work to be done.

I am running it on the University of York’s compute server – a beast with 32 cores and a lot of memory. But it is slow, slow, slow. My current estimate is that it would take about 10 weeks to crunch a whole dataset. The code is slow because we have to synchronise the threads to model the inherent parallelism of the NoC. The whole thing is a demonstration – with a vengeance – of Amdahl’s Law.
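To put a rough number on that, here is a minimal Python sketch of Amdahl’s Law. The parallel fractions below are purely illustrative assumptions (I have not measured the real figure for my simulator), and the function name is made up for this example.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the work
    can be spread across `cores`; the remainder runs serially."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Illustrative (assumed) parallel fractions, evaluated for a 32-core machine.
for p in (0.5, 0.9, 0.99):
    print(f"parallel fraction {p:.2f}: at most {amdahl_speedup(p, 32):.2f}x on 32 cores")
```

The point of the sketch is simply that once the threads spend much of their time waiting on each other, adding cores buys very little.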

Even in as long a project as a PhD I don’t have 10 weeks per dataset going free, so this is a problem!

## Get your books more cheaply – possibly

I often link from this site to Amazon – the hope is that you might buy some of the things I talk about and then I get a (small) return which I can use to buy things I like at Amazon. It doesn’t work too well for me – I think I have made 26 pence in the last six months or so. Back when I ran a blog about politics it was a bit more successful – people obviously thought I had something worthwhile to say on politics, but less so on science and computing. For shame. (Actually, another factor is that the vast majority of this blog’s readership is from outside the UK.)

So, I guess I am not exactly cutting my own throat when I tell you that recent legal changes in the UK may mean that while Amazon is a great place to see what is available and even to see who to buy from, it might not make financial sense to buy from them directly.

Until recently Amazon exercised some pretty heavy controls over the prices third-party sellers could charge customers who bought directly from their (the third party’s) website. In essence it forbade them from charging less than they offered through Amazon, on pain of being delisted from Amazon.

But that approach has now been ruled anti-competitive and so it might just make sense to use Amazon to window shop and to buy directly from the third party seller.

Here’s a real example I have just spotted.

The Pragmatic Programmer is a very well regarded book on programming (often compared to the brilliant Programming Pearls, a book absolutely every programmer should read) – one I was thinking of buying for myself. Amazon sell it for £22.39 – including the price of delivery.

The two cheapest third party sellers offer it for £17.86 and £17.87 respectively, but also charge an additional £2.80 for delivery – bumping up the price to £20.66 and £20.67, still cheaper than buying directly from Amazon though. But if you go to the second of these two suppliers – UK Paperback Shop – they are selling it direct for £18.76.

Other sellers may even be cheaper – I haven’t checked.

Of course, there might be other reasons why you want to shop through Amazon, but it is worth remembering that it might pay to look around. I am guessing that the savings will be most easily realised if you are buying low volume or specialist or technical books.

Of course if you want to buy a Kindle Tablet I’d love it if you did it through that link 🙂

## Lousy wireless on OS X Mavericks

To be honest I have not noticed any improvement on upgrading to OS X Mavericks on this laptop, but I have noticed a significant degradation in the performance of the wireless networking – notably dropped connections and long re-connect times. Not impressed.

And, of course, it is proprietary software so there is little chance of a community-originated fix appearing.

A while back I reported that detailed statistics from the UK’s Higher Education Statistics Agency showed that recent computer science graduates had a pretty poor record in finding a job.

Yesterday new statistics from the UK’s Office for National Statistics showed that unemployment for all recent graduates has remained persistently high – it’s been close to 9% since 2008. Even so, that is much better than the position faced by younger non-graduates – where unemployment is around 14% – while older non-graduates might be seeing a small fall in the unemployment rate towards 6%.

The statistics published yesterday do not break down the position of recent graduates by subject area but they do show that maths and computer science graduates (the report insists on referring to “undergraduate degrees” when, of course, there are no such things – once you have a degree you are by definition no longer an undergraduate) have an employment rate of 89%. That is less than stellar and behind those with degrees in medicine, medical-related subjects, agriculture, technology, architecture, business and finance, media and information studies, physical sciences, and linguistics, and little better than engineering and biological sciences.

Graduates in maths and computer sciences do tend to be well paid though – the average salary is £34,008 – behind medicine, physical sciences, engineering, and architecture but ahead of everyone else.

Given that those in the software industry perennially complain that they cannot fill vacancies I am left thinking one of the following applies:

• Recruiters expect too much (and offer too little on-the-job training) and so have to pay a premium when they do recruit;
• Many computer science graduates are actually quite poor at programming and so are not easily employable;
• Many computer science graduates do not work in the industry and so recruitment is difficult.

## If you see this, then beware…

A friend posted a link to his blog on Facebook. Went there to be confronted by this…

It’s a complete scam. I am not even using Windows, nor is my Java plugin out of date. If you read the grey text you discover that what they are really trying to do is get you to download malware, or software that is as close to malware as one can go without actually letting someone else take control of your machine.

You have been warned!

## y^2 + x^2 = 1

This entry is based on the prologue for the book Elliptic Tales: Curves, Counting, and Number Theory (challenging but quite fun reading on the morning Tube commute!):
$y^2 + x^2 = 1$ is the familiar equation for the unit circle, and in the prologue the authors show that if a straight line with a rational slope intersects the circle at two points, one of which is rational, i.e. of the form $(x, y) = (\frac{a}{b},\frac{c}{d})$, then the second point is also rational, and that all such lines trace out the full set of rational points on the circle.

But then the book goes further –

We say that circle $C$ has a “rational parametrization”. This means that the coordinates of every point on the circle are given by a couple of functions that are themselves quotients of polynomials in one variable. The rational parametrization for $C$ we have found is the following:

$(x,y) = (\frac{1 - m^2}{1+m^2},\frac{2m}{1+m^2})$

So this is what I wanted to check… it surely isn’t claiming that all the points on the circle are rational, is it? Merely that the above, if $m$ (which corresponds to the slope of our line through the two points) is rational, generates the full set of rational points on the circle. Because if $m$ is not rational then the second point will not be either? Is that right?
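One way to convince yourself is to check it with exact arithmetic. The sketch below uses Python’s `fractions.Fraction`; the function names are my own, and I am assuming (as the formula implies, since $y/(x+1) = m$) that the lines all pass through the fixed point $(-1, 0)$. A rational $m$ always lands on a rational point, and going the other way, any rational point other than $(-1, 0)$ gives back a rational slope, so an irrational $m$ cannot produce a rational point.

```python
from fractions import Fraction


def circle_point(m):
    """Second intersection of the unit circle with the line of slope m
    through (-1, 0), using the parametrization quoted above."""
    denom = 1 + m * m
    return (1 - m * m) / denom, (2 * m) / denom


def slope_from_point(x, y):
    """Slope of the line joining (-1, 0) to (x, y); undefined at x == -1."""
    return y / (x + 1)


for m in (Fraction(0), Fraction(1, 2), Fraction(-3, 7), Fraction(5)):
    x, y = circle_point(m)
    assert x * x + y * y == 1            # exactly on the circle, no rounding
    assert slope_from_point(x, y) == m   # the slope is recovered from the point
    print(f"m = {m}: (x, y) = ({x}, {y})")
```

For example, $m = \frac{1}{2}$ gives the familiar $(\frac{3}{5}, \frac{4}{5})$.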

## More on the Summa Metaphysica

David Birnbaum’s office contacted me having seen what I wrote on the “Summa Metaphysica” and sent me this link – www.paradigmchallenge.com

I have to say, of course, that I am a deep sceptic and not endorsing these ideas in any way – quite the opposite – but as both he and his office were perfectly pleasant I haven’t really got the heart to argue about it here.

## Quickly deleting multiple lines from a text file

This may be useful to someone; it was for me.

To delete (say) lines 10 to 1000000000 (inclusive), use sed:

sed 10,1000000000d <infile >outfile
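If sed is not to hand, the same streaming idea can be sketched in Python. This is only an illustration under my own assumptions: the function name and its arguments are made up for the example, and it reads the file in binary mode so that odd encodings do not trip it up.

```python
def delete_lines(src_path, dst_path, first, last):
    """Copy src_path to dst_path, skipping lines first..last (1-based, inclusive).

    The file is streamed one line at a time, so memory use stays small
    however large the input is.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for number, line in enumerate(src, start=1):
            if not (first <= number <= last):
                dst.write(line)


# Hypothetical usage, mirroring the sed command above:
# delete_lines("infile", "outfile", 10, 1000000000)
```

For very big files I would still expect sed to be the quicker tool; the sketch is just to show what the address range is doing.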

## New, improved Hexxed

I have not had much luck in hunting down what is wrong with my code or the Xerces-c SAX2 parser – but I do think I have successfully updated my hex editor, Hexxed, to handle 64-bit (i.e. >4GB) files.

Indeed it performs rather better than vi for some editing tasks (Hexxed has a vi-like interface).

So, if a hex editor, capable of handling little and big endian code and able to display output in Unicode is what you are after, and if you are vi-conditioned, then maybe Hexxed is your thing.

Groovy code can be found at: https://github.com/mcmenaminadrian/hexxed

A runnable jar for those of you who have Java but are not yet Groovy can be downloaded at: http://88.198.44.150/hexxed.jar

And there is more about it here: https://cartesianproduct.wordpress.com/2012/06/03/hexxed-usage-options/

Just remember it is code for playing with – don’t bet the farm on it. But, that said, I have no reason to think it does not work.

## Going on a bug(?) hunt

I now have some code that is meant to parse an XML file of approximately 5 billion lines.

Unfortunately it fails, every time (it seems), on line 4,295,025,275.

This is something of a nightmare to debug – but it looks like an overflow bug (in the xerces-c parser) of some sort.
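The reason it smells like overflow is that the failing line number sits just past $2^{32}$. A quick Python check of the arithmetic follows. The idea that an unsigned 32-bit counter is involved somewhere is my assumption, not something I have confirmed in the parser source, and the function name is made up for the example.

```python
FAILING_LINE = 4_295_025_275   # line number reported above
UINT32_LIMIT = 2 ** 32         # 4,294,967,296: first value a 32-bit unsigned int cannot hold


def wrap_uint32(value: int) -> int:
    """What a (hypothetical) unsigned 32-bit counter would hold after counting to value."""
    return value % UINT32_LIMIT


print("lines past 2**32:", FAILING_LINE - UINT32_LIMIT)
print("wrapped counter :", wrap_uint32(FAILING_LINE))
```

Both come out as 57,979, so the failure is close to, but not exactly at, the point where such a counter would wrap. That fits an overflow of some sort without pinning down which counter is to blame.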

Do I try to find it by inspection or by repeating the runs (it takes about 4 – 5 hours to get to the bug point)?

Inspection is probably quite difficult but simple to organise. Repeating the runs is (relatively) easier – just step through the code – but is perhaps impossible to organise: how many weeks of wall clock time in a debugger before we get to that instance?