Tech News
← Back to articles

Poking holes into bytecode with peephole optimisations

read original related products more articles

This article highlights the first optimizations I made while redesigning and semi-porting the runtime from C to Rust. These aren’t benchmarked or verified, since the virtual machine is currently under construction and will probably be finished this week.

At a high level, purple-garden current works like this, with 2+3*4-1 as an exemplary input:

TEXT

1 . 2 +- Tokenizer 3 | 4 ]: Token(2) Token(+) Token(3) 5 ]: Token(*) 6 ]: Token(4) Token(-) Token(1) 7 | 8 \ 9 +- Parsing (Tokens -> Abstract Syntax Tree) 10 | 11 ]: (Asteriks 12 ]: (Plus 13 ]: Integer("2") 14 ]: Integer("3") 15 ]: ) 16 ]: (Minus 17 ]: Integer("4") 18 ]: Integer("1") 19 ]: ) 20 ]: ) 21 | 22 | 23 24 \ 25 +- Planned IR and Optimisation Boundary 26 | 27 / \ 28 | +- JIT Compiler (IR -> x86/ARM) 29 | ^ 30 | \ 31 | \ 32 | \ 33 | \ 34 | \ 35 |Calls 36 | |JIT'ed 37 \ |functions 38 +- Compiler (AST/IR -> bytecode) | 39 | / 40 ]: __entry: / 41 ]: load_imm r0, #2 | 42 ]: load_imm r1, #3 | 43 ]: add r2, r0, r1 | 44 ]: load_imm r1, #4 | 45 ]: load_imm r0, #1 | 46 ]: sub r3, r1, r0 | 47 ]: mul r0, r2, r3 | 48 | | 49 \ | 50 +- Peephole Optimiser | 51 | | 52 ]: __entry: | 53 ]: load_imm r2, #5 | 54 ]: load_imm r3, #3 | 55 ]: mul r0, r2, r3 | 56 | / 57 \ / 58 +- Baseline interpreter --+ 59 | 60 ] [vm][0000] LoadImm { dst: 2, value: 5 } 61 ] [vm][0001] LoadImm { dst: 3, value: 3 } 62 ] [vm][0002] Mul { dst: 0, lhs: 2, rhs: 3 } 63 | 64 '

Peephole Optimizations

Peephole optimisations are, as the name implies, optimisations performed on a small section of a larger input. For a virtual machine, like purple-garden this means using a window of size 3 (anything larger is no longer local subject to IR optimisation, not peephole and will have happened before peephole optimisation is reached in the compilation pipeline) and merging operators, rewriting redundant or useless operations.

So to summarize:

peephole optimisations are done on the bytecode in a local setting (non-global)

in this case they are fallback optimisations for things the IR optimisation rewriting missed

these are local, single-pass, and meant only to catch what the IR opt didn’t

... continue reading