So, dear reader, I wrote a Gauss-Jordan solver


Number of steps in the Euclidean algorithm for GCD(x,y): red points indicate relatively few steps (quick), whereas yellow, green and blue points indicate successively more steps (slow). The largest blue area follows the line y = Φx, where Φ is the golden ratio. (Photo credit: Wikipedia)

Been a while…

…Anyway: I am now designing (as part of my PhD) a simulation of a NoC (network-on-chip) system and I need it to run some code – so I thought I would write something to solve a system of linear equations, built around Gauss-Jordan elimination.

It proved to be a lot more difficult than I had hoped: not that the maths was too complex, but that producing something I could later simplify from C++ into something that looks like synthetic assembly has proved, and is still proving, very difficult indeed.

My system needs to execute a simple synthetic assembly because what I am modelling is the way the memory interactions work – so I have to get close to the metal and not hide behind a high-level language (no, not even C will do). Obviously – or at least it feels like that to me – I don’t want to have to mess about with floating point implementations, so I wanted integer arithmetic.

And that is where the first problem arises – to implement a Gauss-Jordan solver (or even to solve linear equations in the way you used to at school: the Gauss-Jordan algorithm is merely a scaling up of that), you have to divide numbers, and sadly (for this purpose) not all numbers are multiples of one another – so the only way to preserve integer maths is to use rationals.
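A toy example (my numbers) shows where the divisions creep in, even when the answers themselves are integers: take 3x + 2y = 8 and 2x + 5y = 9. To eliminate x from the second equation you subtract \frac{2}{3} of the first, leaving \left(5 - \frac{2}{3}\times 2\right)y = 9 - \frac{2}{3}\times 8, i.e. \frac{11}{3}y = \frac{11}{3}, so y = 1 and x = 2 – integer answers, but fractions all the way through the working.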

That, of course, is not such a problem. The C++ pair utility even makes it easy to model a rational out of two integers (writing a separate rational class would be overkill for something as small as this).

But then, quickly, another problem becomes apparent. Even using long long types to represent the numerator and denominator of our rational fraction, the repeated multiplications needed for fractional subtraction break the integer bounds after little more than a handful of row operations (and I want to model a large system – with literally thousands of equations in each system).

Even with a gcd function repeatedly applied – Euclid's algorithm keeping the fractions in lowest terms – the numbers go out of range quickly.
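To make that concrete, here is a minimal sketch (separate from the solver itself, using plain long long rather than GMP) of the pair-as-rational idea: the fractional subtraction is all cross-multiplication, so even with a gcd reduction after every operation the numerators and denominators grow quickly:

#include <iostream>
#include <numeric>   // std::gcd (C++17)
#include <utility>

using rational = std::pair<long long, long long>;  // numerator, denominator

// reduce to lowest terms and keep the denominator positive
void reduce(rational& r)
{
	long long g = std::gcd(r.first, r.second);
	if (g != 0) {
		r.first /= g;
		r.second /= g;
	}
	if (r.second < 0) {
		r.first = -r.first;
		r.second = -r.second;
	}
}

// a - b by cross-multiplication: this is where the terms blow up
rational subtract(const rational& a, const rational& b)
{
	rational result(a.first * b.second - b.first * a.second,
		a.second * b.second);
	reduce(result);
	return result;
}

int main()
{
	rational x(355, 113), y(22, 7);
	rational diff = subtract(x, y);
	std::cout << diff.first << "/" << diff.second << std::endl;  // -1/791
}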

So I have been forced to rely on an arbitrary precision library – GNU's GMP. The code now works well (you can see it in its GitHub repo) and the solver code is below:

#include <iostream>
#include <sstream>
#include <fstream>
#include <string>
#include <vector>
#include <cstdlib>
#include <utility>
#include <gmpxx.h>



//Gauss-Jordan elimination

using namespace std;

//reduce a fraction (numerator, denominator) to its lowest terms
//and normalise the sign so the denominator is positive
void gcd(pair<mpz_class, mpz_class>& input)
{
	mpz_class first = abs(input.first);
	mpz_class second = abs(input.second);
	if (first > second) {
		first = second;
		second = abs(input.first);
	}
	while (second != 0) {
		mpz_class temp = second;
		second = first%second;
		first = temp;
	}
	input.first /= first;
	input.second /= first;
	if (input.second < 0) {
		input.first *= -1;
		input.second *= -1;
	}
}

int main()
{
	string path("./variables.csv");
	ifstream inputFile(path);

	vector<mpz_class> answers;
	//answer line
	string rawAnswer;
	getline(inputFile, rawAnswer);
	istringstream stringy(rawAnswer);
	string number;
	while(getline(stringy, number, ',')) {
		answers.push_back(atol(number.c_str()));
	}

	//now read in the system
	vector<vector<pair<mpz_class, mpz_class> > > lines;
	while(getline(inputFile, rawAnswer)) {
		istringstream stringy(rawAnswer);
		vector<pair<mpz_class, mpz_class> > innerLine;
		while (getline(stringy, number, ',')) {
			pair<mpz_class, mpz_class>
				addPair(atol(number.c_str()), 1);
			innerLine.push_back(addPair);
		}
		lines.push_back(innerLine);
	}

	for (int i = 0; i < lines.size(); i++) {
		pair<mpz_class, mpz_class> pivot(lines[i][i].second,
			lines[i][i].first);
		if (lines[i][i].first != 0) {
			lines[i][i].first = 1;
			lines[i][i].second = 1;
		} else {
			continue;
		}
		for (int j = i + 1; j <= lines.size(); j++) {
			lines[i][j].first *= pivot.first;
			lines[i][j].second *= pivot.second;
			gcd(lines[i][j]);
		}
		for (int j = i + 1; j < lines.size(); j++) {
			pair<mpz_class, mpz_class> multiple(lines[j][i].first,
				lines[j][i].second);	
			lines[j][i] = pair<mpz_class, mpz_class>(0, 1);
			for (int k = i + 1; k <= lines.size(); k++) {
				pair<mpz_class, mpz_class>
					factor(
					multiple.first * lines[i][k].first,
					multiple.second * lines[i][k].second);
				gcd(factor);
				lines[j][k] = pair<mpz_class, mpz_class>
					(lines[j][k].first * factor.second -
					factor.first * lines[j][k].second,
					lines[j][k].second * factor.second);
				gcd(lines[j][k]);
			}
			
		}
	}
	// now eliminate upper half
	for (int i = lines.size() - 1; i > 0; i--) {
		if (lines[i][i].first == 0) {
			continue;
		}
		pair<mpz_class, mpz_class> answer = lines[i][lines.size()];
		for (int j = i - 1; j >= 0; j--) {
			pair<mpz_class, mpz_class> multiple = lines[j][i];
			if (multiple.first == 0) {
				continue;
			}
			lines[j][i] = pair<mpz_class, mpz_class>(0, 1);
			lines[j][lines.size()].first =
				lines[j][lines.size()].first * multiple.second
				* answer.second - answer.first * multiple.first *
				lines[j][lines.size()].second;
			lines[j][lines.size()].second =
				lines[j][lines.size()].second *
				answer.second * multiple.second;
			gcd(lines[j][lines.size()]);
		}
	}
	cout << "DIAGONAL FORM" << endl;
	for (int i = 0; i < lines.size(); i++) {
		for (int j = 0; j < lines.size(); j++) {
			if (lines[i][j].first == 0) {
				cout << "0 , ";
			} else {
				cout << lines[i][j].first << "/" << lines[i][j].second << " , ";
			}
		}
		cout << " == " << lines[i][lines.size()].first << " / " << lines[i][lines.size()].second << endl;
	}

}

(NB: I know that GMP supports rationals “out of the box” but I am trying to model something that I can translate to my synthetic assembly and so hiding the details of the implementation is not what I want to do.)
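For comparison, here is a minimal sketch of what the "out of the box" version would look like – all the cross-multiplication and reduction happens inside the library, which is precisely the detail I need to keep visible:

#include <iostream>
#include <gmpxx.h>

int main()
{
	mpq_class a(1, 3), b(2, 7);   // GMP's own rational type
	mpq_class c = a - b * 5;      // cross-multiplication handled inside the library
	c.canonicalize();             // reduce to lowest terms - the gcd() above by another name
	std::cout << c << std::endl;  // prints -23/21
}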

How on Earth, though, am I going to model arbitrary precision code in my “assembly” code?

Not even on Wikipedia…


If you are old enough, like me, to remember the Cold War before the days of glasnost and perestroika, you will also recall that one of the strategic weaknesses of the Soviet Union was that it was forced to steal and copy advanced western technologies, seemingly unable to invent them itself.

In many cases that was plainly true – spies stole the secrets of the Manhattan Project to give Stalin his atomic bomb (though Soviet scientists devised H-bomb mechanisms independently).

But in the case of computing, the decision to copy the west was a deliberate and conscious one, taken despite real skill and specialism existing inside the Soviet Union. A while back I wrote about how Soviet computer scientists appeared to be some years ahead of the west in the study of certain algorithms that are important for operating system management. In hardware the Soviets did not have a lead, but the first electronic computer in continental Europe was built in the Soviet Union and was based on independent research – they certainly had real know-how. What killed that was a decision by the Soviet leadership to copy out-of-date IBM machines instead of continuing with their own research and development.

All this is recounted, in novelised form, in the brilliant Red Plenty. The book highlights the role of Sergey Alekseevich Lebedev, the Ukrainian known as "the Soviet Turing". Like Turing, Lebedev was taken from his work (as an electrical/electronic engineer rather than a mathematician) by the war and played an important role in Soviet tank design. Afterwards he returned to his studies with a vengeance and by the mid-fifties he was building some of the world's fastest and, arguably, best engineered computer systems – the so-called MESM (a Russian acronym for "Small Electronic Calculating Machine").

Yet today he does not even appear to rate an entry in Wikipedia.

The Soviet computer industry was not just killed by poor decisions at the top, but by the nature of the Soviet system. Without a market there was no drive to standardise or commoditise computer systems and so individual Soviet computers were impressive but the “industry” as a whole was a mess. Hopes that computers could revolutionise Soviet society also fell flat as the centralised planning system ran out of steam. Switching to copying IBM seemed like a way of getting a standardised system off the shelf, but it was a blow from which Soviet computing never recovered.

In continued praise of “Programming Pearls”


Algorithm City

I have read two books in the last year that have fundamentally changed the way I think about computers and programming – The Annotated Turing and Programming Pearls.

Programming Pearls in particular continues to inspire me in the way it makes you think about building better algorithms and using data structures better – and here’s a short tale about what I have just done (and why I did it) with my valext program.

Valext runs a program under step tracing to allow the detailed examination of patterns of memory use – interrogating the /proc/pid/pagemap at every step.

The way I was doing this was to query the /proc/pid/maps, find the blocks that were being mapped and then seek through /proc/pid/pagemap, reading off the page status.
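For context, the core of that interrogation looks roughly like this – a sketch rather than the actual valext code, assuming the usual 4KiB pages and the documented /proc/<pid>/pagemap layout (one 64-bit entry per virtual page, with bit 63 set when the page is present in RAM):

#include <cstdint>
#include <fstream>
#include <iostream>
#include <string>
#include <unistd.h>

// Look up one virtual address in /proc/<pid>/pagemap and report whether
// the page holding it is currently present in physical memory.
bool page_present(int pid, uint64_t vaddr)
{
    const int pageshift = 12;   // assuming 4KiB pages
    std::ifstream pagemap("/proc/" + std::to_string(pid) + "/pagemap",
                          std::ios::binary);
    if (!pagemap)
        return false;
    pagemap.seekg(static_cast<std::streamoff>((vaddr >> pageshift) * 8));
    uint64_t entry = 0;
    pagemap.read(reinterpret_cast<char*>(&entry), sizeof(entry));
    return pagemap && (entry & (1ULL << 63));   // bit 63: page present in RAM
}

int main()
{
    int local = 42;   // any mapped address in our own process will do
    std::cout << std::boolalpha
              << page_present(getpid(), reinterpret_cast<uint64_t>(&local))
              << std::endl;
}

Reading another process's pagemap needs suitable privileges, but that is the shape of the lookup that happens at every step.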

For every page that was mapped I would build a data structure and stick it into a linked list

struct blocklist {
    uint64_t address;
    struct blocklist *nextblock;
};
...
for (i = startaddr; i < endaddr; i++)
{
    block = malloc(sizeof (struct blocklist));
    block->address = i;
    block->nextblock = NULL;
    if (*lastblock == NULL){
        *lastblock = block;
        *header = block;
    } else {
        (*lastblock)->nextblock = block;
        *lastblock = block;
    }
}

But the problem with this code – an \mathcal{O}(n) algorithm – is that performance stinks. I was running it on a very underpowered laptop and got to about four days of wallclock time, in which time about 2 million steps had been taken.

So today I rewrote it with an essentially \mathcal{O}(1) algorithm – I allocate an array of arbitrary size (512 elements in current code) and fill that up – with a 0 value being the guard (ie marking the end). If more than 512 elements are needed then another 512 will be allocated and chained to the top and so on…


#include <stdint.h>
#include <stdlib.h>
#include <regex.h>

#define MEMBLOCK 512    /* elements per chunk, as described above */
#define PAGESHIFT 12    /* assuming the usual 4KiB pages */

struct blockchain {
    int size;
    uint64_t *head;
    struct blockchain *tail;
};

uint64_t* blockalloc(int size)
{
    uint64_t *buf = calloc(size, sizeof(uint64_t));
    return buf;
}

struct blockchain *newchain(int size)
{
    struct blockchain *chain = NULL;
    chain = calloc(1, sizeof(struct blockchain));
    if (!chain)
        return NULL;
    chain->head = blockalloc(size);
    if (chain->head == NULL) {
        free(chain);
        return NULL;
    }
    chain->size = size;
    return chain;
}

/* recursively free the list */
void cleanchain(struct blockchain *chain)
{
    if (!chain)
        return;
    cleanchain(chain->tail);
    free(chain->head);
    free(chain);
    return;
}

/* set up a list */
struct blockchain* getnextblock(struct blockchain** chain,
struct blockchain** header, char* buf)
{
    int match, t = 0;
    uint64_t startaddr;
    uint64_t endaddr;
    uint64_t i;
    const char* pattern;
    regex_t reg;
    regmatch_t addresses[3];

    pattern = "^([0-9a-f]+)-([0-9a-f]+)";
    if (regcomp(&reg, pattern, REG_EXTENDED) != 0)
        goto ret;
    match = regexec(&reg, buf, (size_t)3, addresses, 0);
    if (match == REG_NOMATCH || match == REG_ESPACE)
         goto cleanup;
    startaddr = strtoul(&buf[addresses[1].rm_so], NULL, 16) >> PAGESHIFT;
    endaddr = strtoul(&buf[addresses[2].rm_so], NULL, 16) >> PAGESHIFT;
    if (*chain == NULL) {
        *chain = newchain(MEMBLOCK);
        if (*chain == NULL)
            goto cleanup;
        *header = *chain;
    }
    for (i = startaddr; i < endaddr; i++)
    {
        if (t >= MEMBLOCK) {
            struct blockchain *nxtchain = newchain(MEMBLOCK);
            if (!nxtchain)
                goto cleanup;
            (*chain)->tail = nxtchain;
            *chain = nxtchain;
            t = 0;
        }
        (*chain)->head[t] = i;
        t++;
    }
cleanup:
    regfree(&reg);
ret:
    return *chain;
}

As I am only testing this code now, I don't know whether it will be faster – though everything tells me it should be (although I am not entirely clear whether this memory allocation issue is the real bottleneck in the program, or whether a Pentium III is simply a Pentium III and there is nothing much I can do about that).

What I do know is that if it was not for the inspiration of Programming Pearls I probably would not even have bothered trying.

“Basically, you would be able to compute anything you wanted”


The quote that forms the title here comes from Lance Fortnow, a computer scientist at Northwestern University, in an article (here – subscription required) in the current edition of the New Scientist on the P = NP question.

It’s an odd statement for a computer scientist to make – almost all real numbers are fundamentally incomputable: there are only \aleph_{0} (aleph null – the smallest of Cantor’s infinite cardinals) computable numbers, because there are only countably many algorithms, yet there are uncountably many real numbers between 0 and 1 (or in any other interval).

But besides that oddity it is actually a very good article – calling the world where P = NP Algorithmica – “the computing nirvana”.

I have written before of how much I hope we do live in Algorithmica, though the consensus is we live in a world of NDAlgorithmica (ND for non-deterministic).

The article’s beauty is that it also examines the states between the two: what if, for instance, we discovered that the class of P problems were identical to the class of NP problems but that we could not find the P algorithm, or that the P algorithm was of such a degree of complexity it “would not amount to a hill of beans”?

The opposite of science, but could be fun


Venn diagram for P, NP, NP-complete and NP-hard problems (image via Wikipedia)

I have created a prediction market on Vladimir Romanov’s P=NP proposal. Starting price for it being proved correct by the end of 2011 is very low – or the odds are long, depending on how you look at it (10 cents wins $99.90 ie 999/1).

Have a bet – it’s free and it’s fun.

It is here.

Possibly the most important news you will read this year


An example of how a public and private key system works (image via Wikipedia)

Apparently P==NP. (So public key encryption – used for internet commerce – is broken and many more problems than we previously thought are quickly solvable).

At least that is the suggestion you can read here. Slashdot also has this here.

If it’s true then the revolution has just begun. If it’s false, well, tomorrow’s another day…

What if P = NP?


Update (5 March): read a better version here.

I admit I am now going slightly out of my depth, but I will try to explain what this is about and why it is interesting.

It is known that computers can solve some problems in what is called “polynomial time“: that is to say, in a number of steps bounded by a polynomial in the size of the input. The key thing is that these problems are computable using mechanical steps (ie algorithms) in a way that is (sometimes euphemistically) described as “quick”.

These can be simple problems – like what is the sum of the first ten integers – or more complex ones, such as creating self-balancing trees, sorting database records alphabetically and so on.
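For the first of those, even the most naive method – adding the numbers one at a time – takes a number of steps proportional to n, and the closed form gets there in one: \sum_{i=1}^{n} i = \frac{n(n+1)}{2}, so \sum_{i=1}^{10} i = \frac{10 \times 11}{2} = 55. Either way the work is bounded by a polynomial in the size of the input.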

These are the so-called “P” (for polynomial time class) problems. Here’s a definition:

A problem is assigned to the P (polynomial time) class if there exists at least one algorithm to solve that problem, such that the number of steps of the algorithm is bounded by a polynomial in n, where n is the length of the input.

Then there is another class of problems which seem fiendishly difficult to solve but for which it is relatively simple to verify the correctness of any solution offered – and to verify it in polynomial time. These are the NP (nondeterministic polynomial time) problems: a nondeterministic Turing machine – a theoretical machine able, in effect, to explore many possibilities at once – could solve them in polynomial time, so an algorithmic solution certainly exists, but on a real, deterministic machine the only general approach we know of amounts, in the worst case – it is thought – to an exhaustive search of the possibilities: in other words a form of “brute force“.
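To make the distinction concrete, take subset sum as an example: given a set of integers, is there a subset that adds up to some target? Checking a proposed subset is a single pass over it; the only general way we know of to find one is, in effect, to search the possibilities. A minimal sketch of the “easy to verify” half:

#include <iostream>
#include <vector>

// The "easy" half of an NP problem: checking a proposed answer.
// One pass tells us whether the candidate subset sums to the target.
bool sums_to(const std::vector<long>& subset, long target)
{
    long sum = 0;
    for (long v : subset)
        sum += v;
    return sum == target;
}

int main()
{
    // Finding a subset of {3, 34, 4, 12, 5, 2} that sums to 9 seems to need
    // a search over all 2^6 possible subsets; checking a candidate does not.
    std::cout << std::boolalpha << sums_to({4, 5}, 9) << std::endl;   // true
}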

Now most mathematicians think that NP does not equal P, and that may or may not be a good thing, as much of our internet commerce relies on encryption whose breaking is thought to be an NP problem outside P.

(In Afghanistan in 2001 US cryptanalysts seemingly brute forced a Taliban Windows NT box but it used much weaker encryption than most electronic commerce.)

But what if it were the case that all seemingly NP problems were actually P problems? There are a lot of people studying this issue – but according to the New Scientist (their Christmas edition, the first in my subscription and delivered this morning, inspired me to write this) we should expect to wait at least until 2024 for an answer (by which point the problem – first formulated in 1971 – will have reached the age at which 50% of major mathematical problems will have been solved).

Some problems thought to be NP have already been shown to be P and there was a big fuss earlier in 2010 when a draft proof of P = NP (edit: it was actually P != NP) was published (the proof was flawed). And unlike, say, Fermat’s last theorem, proving P = NP is likely to have more or less immediate effects on the lives of many of us (how would you feel if you knew that it was possible, even if not likely, that someone had developed a way to quickly crack open all your internet credit card transactions?)

Of course, proving P = NP for all problems does not necessarily mean we will have determined polynomial time based solutions for all the current NP problems out there. But I would expect it would quickly lead to the solution of a multitude of them.

And, actually, I think the benefits to humanity would be enormous. The most likely immediate effects would be improvements in computing and operating system efficiency: faster computers for less money and/or less energy consumption would be a huge benefit. From that alone many other benefits would flow in terms of information availability and universality.