I have a problem.
I need to parse 220GB of XML to find an OPT reference string (i.e. to profile the memory references of a running application, so that I can build a page replacement algorithm that always knows which page is the most efficient one to replace).
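For anyone who has not met OPT before, here is a minimal sketch in C of the decision it makes: on a fault with every frame full, evict the page whose next reference lies furthest in the future. This is a naive illustration over a small in-memory array, not the code I am running against the 220GB trace, and the names `opt_faults`, `next_use` and `MAX_FRAMES` are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 64 /* hypothetical upper bound, chosen for this sketch only */

/* Index of the next reference to `page` after position `from`,
 * or `n` if the page is never referenced again. */
static size_t next_use(const long *refs, size_t n, size_t from, long page)
{
    for (size_t i = from + 1; i < n; i++)
        if (refs[i] == page)
            return i;
    return n;
}

/* Count the page faults OPT incurs on refs[0..n-1] with `frames` frames. */
size_t opt_faults(const long *refs, size_t n, size_t frames)
{
    long resident[MAX_FRAMES];
    size_t used = 0, faults = 0;

    assert(frames > 0 && frames <= MAX_FRAMES);

    for (size_t i = 0; i < n; i++) {
        size_t j;

        /* Is the referenced page already resident? */
        for (j = 0; j < used; j++)
            if (resident[j] == refs[i])
                break;
        if (j < used)
            continue; /* hit: nothing to do */

        faults++;
        if (used < frames) {
            resident[used++] = refs[i]; /* a free frame is still available */
            continue;
        }

        /* All frames full: evict the page whose next use is furthest away. */
        size_t victim = 0, farthest = 0;
        for (j = 0; j < used; j++) {
            size_t nu = next_use(refs, n, i, resident[j]);
            if (nu > farthest) {
                farthest = nu;
                victim = j;
            }
        }
        resident[victim] = refs[i];
    }
    return faults;
}
```

On the familiar textbook reference string 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1 with three frames, `opt_faults` returns 9. The forward scan in `next_use` makes this quadratic in the length of the trace, which is fine for twenty references and hopeless for real profiling data.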
I had one algorithm in mind for this and I wrote the Groovy code, but it kept exhausting the memory pool of the VM (as it generated a lot of anonymous memory). Even whacking up the size of the VM quite considerably didn’t fix the problem.
So I used another algorithm – and it works. After a fashion. Because even with 18 threads running in parallel on the University of York’s big iron compute server I think it might take around two years to complete. And I don’t have two years.
So I need to find something better – something where I have much more control over the memory allocation code and where I can also be sure of performance.
C is the obvious answer. Or, in this case, C with an interface to some C++ code I wrote a few years ago to build a red-black tree.
The strange thing is that, despite Groovy’s claim that it removes the really tedious bits of code writing by doing away with boilerplate, this weekend I have found writing all that getter/setter code quite therapeutic: perhaps this is a programmer’s comfort blanket?