### In continued praise of “Programming Pearls”

I have read two books in the last year that have fundamentally changed the way I think about computers and programming – The Annotated Turing and Programming Pearls.

Programming Pearls in particular continues to inspire me in the way it makes you think about building better algorithms and using data structures better – and here’s a short tale about what I have just done (and why I did it) with my valext program.

Valext runs a program under step tracing to allow the detailed examination of patterns of memory use – interrogating the /proc/pid/pagemap at every step.

The way I was doing this was to query the /proc/pid/maps, find the blocks that were being mapped and then seek through /proc/pid/pagemap, reading off the page status.

For every page that was mapped I would build a data structure and stick it into a linked list -

struct blocklist {
struct blocklist *nextblock;
};
...
{
block = malloc(sizeof (struct blocklist));
block->nextblock = NULL;
if (*lastblock == NULL){
*lastblock = block;
} else {
(*lastblock)->nextblock = block;
*lastblock = block;
}
}


But the problem with this code – an $\mathcal{O}(n)$ algorithm – is that performance stinks. I was running it on a very underpowered laptop and got to about four days of wallclock time, in which time about 2 million steps had been taken.

So today I rewrote it with an essentially $\mathcal{O}(1)$ algorithm – I allocate an array of arbitrary size (512 elements in current code) and fill that up – with a 0 value being the guard (ie marking the end). If more than 512 elements are needed then another 512 will be allocated and chained to the top and so on…


struct blockchain {
int size;
struct blockchain *tail;
};

uint64_t* blockalloc(int size)
{
uint64_t *buf = calloc(size, sizeof(uint64_t));
return buf;
}

struct blockchain *newchain(int size)
{
struct blockchain *chain = NULL;
chain = calloc(1, sizeof(struct blockchain));
if (!chain)
return NULL;
free(chain);
return NULL;
}
chain->size = size;
return chain;
}

/* recursively free the list */
void cleanchain(struct blockchain *chain)
{
if (!chain)
return;
cleanchain(chain->tail);
free(chain);
return;
}

/* set up a list */
struct blockchain* getnextblock(struct blockchain** chain,
{
int match, t = 0;
uint64_t i;
const char* pattern;
regex_t reg;

pattern = "^([0-9a-f]+)-([0-9a-f]+)";
if (regcomp(&reg, pattern, REG_EXTENDED) != 0)
goto ret;
match = regexec(&reg, buf, (size_t)3, addresses, 0);
if (match == REG_NOMATCH || match == REG_ESPACE)
goto cleanup;
if (*chain == NULL) {
*chain = newchain(MEMBLOCK);
if (*chain == NULL)
goto cleanup;
}
{
if (t >= MEMBLOCK) {
struct blockchain *nxtchain = newchain(MEMBLOCK);
if (!nxtchain)
goto cleanup;
(*chain)->tail = nxtchain;
*chain = nxtchain;
t = 0;
}
t++;
}
cleanup:
regfree(&reg);
ret:
return *chain;
}



As I am only testing this code now I don’t know if it will be faster – though everything tells me that it should be (though I am not entirely clear if this memory allocation issue is the real bottleneck in the program or whether it is just that a Pentium III is a Pentium III and there is nothing much I can do about that).

What I do know is that if it was not for the inspiration of Programming Pearls I probably would not even have bothered trying.