An x86-64 backend for raven-uxn
Uxn is a virtual CPU, used as a target for various applications in the Hundred Rabbits ecosystem. It's a simple stack machine with 256 instructions.
My implementation of the Uxn CPU now has an x86-64 assembly backend, which is about twice as fast as my Rust implementation. This required porting about 2000 lines of ARM64 assembly to x86-64, which I did with the help of a robot buddy.
Let me provide a little more context.
A few years back, I wrote a Rust implementation of the CPU and peripherals, which was 10-20% faster than the reference implementation. For more background info, see that project's writeup:
The Rust implementation is fast, but suffers from the usual downside of a bytecode-based VM: the main dispatch statement is an unpredictable branch, because its target depends on whichever opcode happens to come next.
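To make the dispatch problem concrete, here is a minimal sketch of a bytecode interpreter loop. It is not raven-uxn's actual code: the four opcodes, their numbering, and the `run` function are made up for illustration, and the stack is a plain `Vec<u8>` rather than anything Uxn-specific. The point is the single `match op` in the middle, which a compiler will typically lower to one jump through a table, so every instruction in the program funnels through the same hard-to-predict branch.

```rust
// Hypothetical opcodes for a toy stack machine (not Uxn's real encoding).
const BRK: u8 = 0x00; // stop execution
const LIT: u8 = 0x01; // push the next program byte onto the stack
const INC: u8 = 0x02; // add 1 to the top of the stack
const ADD: u8 = 0x03; // pop two values, push their wrapping sum

/// Runs `program` and returns the final stack.
/// Popping from an empty stack yields 0, to keep the sketch panic-free.
pub fn run(program: &[u8]) -> Vec<u8> {
    let mut stack: Vec<u8> = Vec::new();
    let mut pc = 0usize;
    while let Some(&op) = program.get(pc) {
        pc += 1;
        // The main dispatch: one branch whose target depends on `op`.
        match op {
            BRK => break,
            LIT => {
                // A truncated LIT at the end of the program pushes nothing.
                if let Some(&value) = program.get(pc) {
                    stack.push(value);
                }
                pc += 1;
            }
            INC => {
                let a = stack.pop().unwrap_or(0);
                stack.push(a.wrapping_add(1));
            }
            ADD => {
                let b = stack.pop().unwrap_or(0);
                let a = stack.pop().unwrap_or(0);
                stack.push(a.wrapping_add(b));
            }
            _ => {} // unknown opcodes are treated as no-ops
        }
    }
    stack
}
```

For example, `run(&[LIT, 2, LIT, 3, ADD, INC, BRK])` pushes 2 and 3, adds them, increments the result, and leaves a single 6 on the stack.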
I then wrote an assembly implementation of the interpreter, which proved to be about 30% faster than the Rust version. This was hard: it took several days of work, and there were lingering bugs that I didn't discover until I added a fuzz tester to check for discrepancies between the Rust and assembly implementations.
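The idea behind that kind of fuzz tester can be sketched in a few lines: feed the same random program to two implementations and report the first input where they disagree. Everything below is a stand-in chosen for illustration, not the real harness. `run_reference` and `run_candidate` are two hypothetical interpreters for a toy accumulator machine, and `xorshift` is a small hand-rolled PRNG used only to avoid external crates.

```rust
/// Straightforward interpreter for a toy accumulator machine.
/// The low two bits of each byte select the operation (hypothetical encoding).
pub fn run_reference(program: &[u8]) -> u8 {
    let mut acc: u8 = 0;
    for &op in program {
        acc = match op & 3 {
            0 => acc.wrapping_add(1), // increment
            1 => acc.wrapping_mul(2), // double
            2 => acc.wrapping_neg(),  // negate
            _ => acc,                 // no-op
        };
    }
    acc
}

/// Stand-in for the "optimized" implementation: same semantics,
/// expressed with different operations.
pub fn run_candidate(program: &[u8]) -> u8 {
    let mut acc: u8 = 0;
    for &op in program {
        acc = match op & 3 {
            0 => acc.wrapping_add(1),
            1 => acc << 1,               // doubling as a shift
            2 => (!acc).wrapping_add(1), // two's-complement negation
            _ => acc,
        };
    }
    acc
}

/// Minimal xorshift64 PRNG; `state` must be nonzero.
fn xorshift(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

/// Generates `iterations` random programs (up to 31 bytes each) and returns
/// the first one on which the two interpreters disagree, if any.
pub fn find_discrepancy(seed: u64, iterations: usize) -> Option<Vec<u8>> {
    let mut state = seed | 1; // force a nonzero PRNG state
    for _ in 0..iterations {
        let len = (xorshift(&mut state) % 32) as usize;
        let program: Vec<u8> = (0..len).map(|_| xorshift(&mut state) as u8).collect();
        if run_reference(&program) != run_candidate(&program) {
            return Some(program);
        }
    }
    None
}
```

Returning the failing program, rather than just a boolean, gives you a reproducible test case to debug. In a real setup the comparison would presumably cover the whole machine state (stacks, memory, program counter) instead of a single byte.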
The assembly implementation is written for an ARM64 target, for two reasons:
I'm working on an ARM MacBook
Writing ARM assembly by hand is a fun intellectual exercise because the ISA is pleasantly orthogonal and well-organized, while x86 assembly is... less so