Regular expressions are surely one of the greatest pleasures, puzzles and pains any programmer has to deal with.
So, here’s one for you to figure out. I have already solved it, but I will be intrigued if someone comes up with a better version than mine.
BASIC syntax includes the IF ... THEN construct, e.g. IF X > 5 THEN GOTO 150.
Now, for BINSIC, the BASIC-like domain-specific language I am building using Groovy, I have to parse these structures, putting brackets round the IF clause and so on. So I have to be able to pull out the conditional. But BASIC might also have code like this:
IF X > 5 THEN IF Y < 10 THEN GOTO 150
(NB: I know this can be replicated with a single conditional using boolean operators, but that’s not the point: the language allows THEN to be followed by any other valid BASIC statement, and that must include another IF ... THEN clause.)
So, what regex would you use to pick out the first conditional statement but not any subsequent statement (these can be passed recursively into the parser)?
You can cheat by looking at the BINSIC code on GitHub, but what would be the point? I’ll post an/my answer sometime later this weekend…
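For anyone who wants a starting point, here is one possible reading of the puzzle, sketched in Python rather than Groovy. It is not the BINSIC answer, just a minimal illustration of a lazy quantifier stopping at the first THEN. The names FIRST_IF and split_first_conditional are made up for this example, and it assumes the keyword THEN never appears inside a string literal in the condition.

```python
import re

# Hypothetical pattern: the lazy (.+?) grows only until the first
# whitespace-delimited THEN, so nested IF ... THEN clauses stay in group 2.
FIRST_IF = re.compile(r"^\s*IF\s+(.+?)\s+THEN\s+(.+)$", re.IGNORECASE)


def split_first_conditional(line):
    """Return (condition, remainder) for the outermost IF ... THEN, or None."""
    match = FIRST_IF.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    print(split_first_conditional("IF X > 5 THEN IF Y < 10 THEN GOTO 150"))
```

The remainder can then be fed back into the same function to peel off the next conditional. A condition such as A$ = "NOW THEN" would defeat this sketch, which is part of why it is only a starting point.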