I now have some code that is meant to parse an XML file of approximately 5 billion lines.
Unfortunately it fails, every time (it seems), on line 4,295,025,275.
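This is something of a nightmare to debug, but it looks like an overflow bug of some sort in the xerces-c parser: the failing line is only 57,979 lines past 2^32 (4,294,967,296), the point at which an unsigned 32-bit counter wraps around.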
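The arithmetic behind that suspicion can be checked in a few lines of C++. This is only a sketch of the number-crunching: the names `kFailingLine`, `kWrapPoint`, `truncateTo32`, `linesPastWrap` and `reportFailingLine` are made up for illustration, and nothing here shows where (or whether) xerces-c actually keeps a 32-bit count.

```cpp
#include <cstdint>
#include <iostream>

// Line number at which the parse fails, taken from the post.
constexpr std::uint64_t kFailingLine = 4295025275ULL;

// Number of distinct values an unsigned 32-bit counter can hold (2^32).
constexpr std::uint64_t kWrapPoint = 1ULL << 32;  // 4,294,967,296

// What a 64-bit count becomes if it is stored in an unsigned 32-bit variable.
// (Hypothetical helper: it only models the truncation, not any parser code.)
constexpr std::uint32_t truncateTo32(std::uint64_t value) {
    return static_cast<std::uint32_t>(value);  // keeps value mod 2^32
}

// How far past the 2^32 boundary a count lies (0 if it has not reached it).
constexpr std::uint64_t linesPastWrap(std::uint64_t value) {
    return value >= kWrapPoint ? value - kWrapPoint : 0;
}

// Prints both figures for the failing line.
void reportFailingLine() {
    std::cout << "lines past 2^32:       " << linesPastWrap(kFailingLine) << '\n'
              << "value as 32-bit count: " << truncateTo32(kFailingLine) << '\n';
}
```

Calling `reportFailingLine()` from `main` prints both figures. If some 32-bit quantity inside the parser really is wrapping, the crash happens a little after the wrap rather than exactly on it, which would suggest looking for code that misbehaves once a count or offset has gone back to a small value.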
Do I try to find it by inspection or by repeating the runs (it takes about 4 – 5 hours to get to the bug point)?
Inspection is probably quite difficult but simple to organise. Repeating the runs is (relatively) easier, since I would just step through the code, but is perhaps impossible to organise: how many weeks of wall clock time in a debugger before we get to that instance?