Your assembly is my domain-specific language

A few years ago I set out to write a clone of Sinclair BASIC as a domain-specific language (DSL) in Groovy. The end result was BINSIC, but it was much closer to an interpreter than a DSL: it turned out that the strange way Groovy handled capitalised function/closure names meant that I could not match keywords at all, and so interpretation became the only option (though there were DSL-like features lurking under the covers).

Now I have set out to write an interpreter for RISCV assembly and have written something which is much more like a DSL. My aim was to track the memory reference patterns of various real-time benchmarks running on RISCV systems – based on the disassembly generated by Spike, the RISCV emulator.

Getting that done requires tracking register state – because in true RISC fashion loads and stores do not name a memory address directly: the address is held in a register, with an immediate used as an offset.
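To make the addressing point concrete, here is a minimal sketch in Python (not the Groovy of the actual interpreter). The register names and values are invented for illustration, and `effective_address` is a hypothetical helper rather than anything from the real code.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # keep addresses within 64 bits

# Hypothetical register state: these values are made up for the example.
registers = {"sp": 0x7FFFF000, "a0": 0}

def effective_address(base_reg, offset):
    """Address touched by a load/store: base register plus signed offset."""
    return (registers[base_reg] + offset) & MASK64

# `ld a0, -16(sp)` reads from the address sp - 16.
addr = effective_address("sp", -16)
```

Without the current value of `sp`, the disassembled line alone says nothing about which address was touched, which is why the register file has to be simulated.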

To make this happen every (or at least every used) RISCV instruction is mapped to a closure and the operands treated as closure parameters. The closures are all inside the RegisterFile class so that they can all access the registers to keep the state updated. This makes it quite like a DSL, but I make no claims for purity: every statement is passed through a regular-expression-based interpreter to separate out the parameters. If nothing else, that eliminates a lot of boilerplate code that would otherwise have to be stuck inside the closures.
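The shape of that design can be sketched roughly as follows. This is a Python analogue, not the Groovy source: only the class name `RegisterFile` comes from the description above, while the method names, the regular expression and the use of `x0`..`x31` register names are all assumptions made for the example.

```python
import re

MASK64 = 0xFFFFFFFFFFFFFFFF

class RegisterFile:
    # Splits "mnemonic operand, operand, ..." into mnemonic and operand text.
    LINE = re.compile(r"^\s*(\w+)\s+(.*)$")

    def __init__(self):
        self.regs = {f"x{i}": 0 for i in range(32)}
        # Each supported mnemonic maps to a method that updates the registers.
        self.ops = {"addi": self.addi, "add": self.add}

    def _write(self, rd, value):
        if rd != "x0":  # x0 is hard-wired to zero in RISC-V
            self.regs[rd] = value & MASK64

    def addi(self, rd, rs1, imm):
        self._write(rd, self.regs[rs1] + int(imm, 0))

    def add(self, rd, rs1, rs2):
        self._write(rd, self.regs[rs1] + self.regs[rs2])

    def execute(self, line):
        match = self.LINE.match(line)
        if match is None:
            raise ValueError(f"cannot parse: {line!r}")
        mnemonic, rest = match.groups()
        operands = [part.strip() for part in rest.split(",")]
        self.ops[mnemonic](*operands)

rf = RegisterFile()
rf.execute("addi x5, x0, 42")
rf.execute("add x6, x5, x5")
```

Because the parsing happens once in `execute`, each instruction body stays a one-liner that only deals with register state.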

Memory is treated as a sparse array through a Groovy Map instance – with an address (64-bit Long) mapped to an 8-bit Byte.
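As an illustration of the sparse-array idea, here is a small Python sketch in which a dictionary stands in for the Groovy Map. The `store` and `load` helpers are hypothetical names; the byte order follows RISC-V's little-endian convention.

```python
# Sparse memory: only addresses that have actually been written are present.
memory = {}

def store(address, value, size):
    """Write `size` bytes of `value` starting at `address`, little-endian."""
    for i in range(size):
        memory[address + i] = (value >> (8 * i)) & 0xFF

def load(address, size):
    """Read `size` bytes starting at `address` as a little-endian integer."""
    return sum(memory[address + i] << (8 * i) for i in range(size))

store(0x80000000, 0x1234, 2)
value = load(0x80000000, 2)
```

A map keyed by address avoids allocating anything for the vast untouched stretches of a 64-bit address space.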

The process isn’t perfect – a disassembly doesn’t give you the contents of initialised data, so an attempt to access those addresses is quite likely to raise an exception, which needs to be trapped and the memory accounted for – by simply assigning zero to the address. It’s a kludge but it seems to work.
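The kludge amounts to something like the following Python sketch. It is an analogue under assumptions: the real Groovy code will fail in its own way on a missing key, whereas here the failure is Python's `KeyError`, and the `untracked` list is an invented extra for seeing which addresses were patched.

```python
memory = {}      # sparse memory: address -> byte value
untracked = []   # hypothetical log of addresses that had to be zero-filled

def load_byte(address):
    """Return the byte at `address`, zero-filling it on first touch if unknown."""
    try:
        return memory[address]
    except KeyError:
        # Probably initialised data that the disassembly never showed us.
        memory[address] = 0
        untracked.append(address)
        return 0

first = load_byte(0x1000)
```

The access pattern is still recorded correctly; only the data value read back is a guess.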

Groovy mandating camel case?

“Camel case” is the capitalisation system preferred by Java programmers and many others – classes are capitalised e.g. FooBar, while members have a lower case initial e.g. fooBar.

It seems like Groovy may be trying to enforce this or something like it, and that is what is causing my BINSIC code to fail – as I have put PRINT all in capitals (as in BASIC), it is being treated much less flexibly than if it were lower case. Indeed the issue seems to be the capitalisation of the first letter.

I have not read anywhere that this is a mandatory part of the language, and so it looks like a bug to me.

BASIC as a domain specific language

A few posts back I was bemoaning the end of the simplicity of the BASICs I used thirty years ago – then I could just write a few lines to visually solve an equation and so on.

That got me thinking about how to recreate that as a domain-specific language (DSL) – writing some other code that interprets the graphics primitives to put dots and lines on the screen. That bit seems quite simple. But, of course, to be able to plot a function you need to implement the maths code too – including some relatively complex stuff like SIN, COS, TAN etc.

And, presumably, you would also want loops to advance your parameters along a bit as well – pretty soon you would end up implementing a fairly substantial BASIC interpreter.
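To show how quickly the pieces pile up, here is a rough Python sketch of just one graphics primitive, one maths function and one loop. Everything in it is an assumption for illustration: the text-grid "screen", its size, and the `plot` name are invented, and the BASIC line in the comment is only an approximate equivalent.

```python
import math

WIDTH, HEIGHT = 32, 9
screen = [[" "] * WIDTH for _ in range(HEIGHT)]

def plot(x, y):
    """Graphics primitive: set one 'pixel', ignoring points off the screen."""
    if 0 <= x < WIDTH and 0 <= y < HEIGHT:
        screen[HEIGHT - 1 - y][x] = "*"  # row 0 is the top of the screen

# Roughly: FOR X = 0 TO 31 : PLOT X, 4 + 4 * SIN(X / 5) : NEXT X
for x in range(WIDTH):
    plot(x, round(4 + 4 * math.sin(x / 5)))

picture = "\n".join("".join(row) for row in screen)
```

Even this toy needs a primitive, an expression evaluator (borrowed here from Python itself) and a loop, which is most of the way to an interpreter.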

Still a project worth thinking about in my view – but it seems someone, not surprisingly, has already done this – treating BASIC as a DSL using Scala.