In this post, we write the compiler for our AST to bytecode, and a decompiler for the bytecode.

In this series of posts, we write a fast bytecode compiler and a virtual machine for arithmetic in Haskell. We explore the following topics:

- Parsing arithmetic expressions to Abstract Syntax Trees (ASTs).
- Unit testing for our parser.
- Interpreting ASTs.
- Compiling ASTs to bytecode.
- Disassembling and decompiling bytecode.
- Unit testing for our compiler.
- Property-based testing for our compiler.
- Efficiently executing bytecode in a virtual machine (VM).
- Unit testing and property-based testing for our VM.
- Benchmarking our code to see how the different passes perform.

All the while keeping an eye on performance. This is the second post in the series:

1. A Fast Bytecode VM for Arithmetic: The Parser
2. A Fast Bytecode VM for Arithmetic: The Compiler
3. A Fast Bytecode VM for Arithmetic: The Virtual Machine

Introduction

AST interpreters are well known to be slow because of how AST nodes are represented in the computer's memory. The AST nodes contain pointers to other nodes, which may be anywhere in the memory. So while interpreting an AST, the interpreter jumps all over the memory, causing a slowdown. One solution to this is to convert the AST into a more compact and optimized representation known as bytecode.

Bytecode is a flattened and compact representation of a program, usually manifested as a byte array. Bytecode is essentially an Instruction Set (IS), but custom-made to be executed by a Virtual Machine (VM) instead of a physical machine. Each bytecode instruction is one byte in size (that's where it gets its name from). A bytecode and its VM are created in synergy so that the execution is as efficient as possible. Compiling source code to bytecode and executing it in a VM also allows the program to be run on all platforms that the VM supports, without the developer caring much about portability concerns. The most popular combo of bytecode and VM is probably the Java bytecode and the Java Virtual Machine.

VMs can be stack-based or register-based. In a stack-based VM, all values created during the execution of a program are stored only in a stack data structure residing in memory. In a register-based VM, there is also an additional fixed set of registers that are used to store values in preference to the stack. Register-based VMs are usually faster, but stack-based VMs are usually simpler to implement. For our purpose, we choose to implement a stack-based VM.

We are going to write a compiler that compiles our expression AST to bytecode. But first, let's design the bytecode for our stack-based VM.

The Bytecode

Here is our expression AST as a reminder:

```haskell
data Expr
  = Num !Int16
  | Var !Ident
  | BinOp !Op Expr Expr
  | Let !Ident Expr Expr
  deriving (Eq, Generic)

newtype Ident = Ident BS.ByteString
  deriving (Eq, Ord, Generic, Hashable)

data Op = Add | Sub | Mul | Div
  deriving (Eq, Enum, Generic)
```
ArithVMLib.hs
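For concreteness, here is what a small expression looks like as a value of this type. This is an illustrative sketch of mine, assuming OverloadedStrings for the ByteString inside Ident:

```haskell
-- "let x = 4 in x + 1" parses to this AST:
example :: Expr
example = Let (Ident "x") (Num 4) (BinOp Add (Var (Ident "x")) (Num 1))
```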
Let's figure out the right bytecode for each case. First, we create Opcodes for each bytecode, which are sort of mnemonics for the actual bytecode. Think of them as what Assembly is to Machine Code.

Num

For a number literal, we need to put it directly in the bytecode so that we can use it later during the execution. We also need an opcode to push it on the stack. Let's call it OPush, with an Int16 parameter.

BinOp

Binary operations recursively use Expr for their operands. To evaluate a binary operation, we need its operands to be evaluated first, so we compile them to bytecode before it. After that, all we need is an opcode per operator. Let's call them OAdd, OSub, OMul, and ODiv for the Add, Sub, Mul, and Div operators respectively.

Var and Let

Variables and Let expressions are more complex. In the AST interpreter we chucked the variables in a map, but we cannot do that in a VM. There is no environment map in a VM, and all values must reside in the stack. How do we have variables at all then? Let's think for a bit.

Each expression, after being evaluated in the VM, must push exactly one value on the stack: its result. Num expressions are a trivial case. When a binary operation is evaluated, first its left operand is evaluated. That pushes one value on the stack. Then its right operand is evaluated, and that pushes another value on the stack. Finally, the operation pops the two values from the top of the stack, does its thing, and pushes the resultant value back on the stack—again one value for the entire BinOp expression.

A Let expression binds a variable's value to its name, and then the variable can be referred to from the body of the expression. But how can we refer to a variable when the stack contains only values, not names? Let's imagine that we are in the middle of evaluating a large expression, wherein we encounter a Let expression. First we evaluate its assignment expression, and that leaves a value on the top of the stack. Let's say that the stack has n values at this point. After this we get to evaluate the body expression. At all times while we are doing that, the value from the assignment stays at the same position in the stack, because evaluating sub-expressions, no matter how complicated, only adds new values to the stack, without popping existing values from before. Therefore, we can use the stack index of the assignment value (n−1) to refer to it within the body expression.

So, we encode Var as an opcode and an integer index into the stack. We choose to use a Word8 to index the stack, limiting us to a stack depth of 256. We encode variable references with an opcode OGet, which, when executed, gets the value from the stack at the given index and pushes it on the stack.

For a Let expression, after we compile its assignment and body expressions, we need to make sure that the exactly-one-value invariant holds. Evaluating the assignment and body pushes two values on the stack, but we can have only one! So we overwrite the assignment value with the body value, and pop the stack to remove the body value. We invent a new opcode OSwapPop to do this, called so because its effect is equivalent to swapping the topmost two values on the stack, and then popping the new top value.

Putting all the opcodes together, we have the Opcode ADT:

```haskell
data Opcode
  = OPush !Int16 -- 0
  | OSwapPop     -- 1
  | OGet !Word8  -- 2
  | OAdd         -- 3
  | OSub         -- 4
  | OMul         -- 5
  | ODiv         -- 6
  deriving (Show, Read, Eq, Generic)

instance NFData Opcode
```
ArithVMLib.hs

Notice that we also assigned bytecodes—that is, a unique byte value—to each Opcode above, which are just their ordinals. Now we are ready to write the compiler.
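To make the encoding concrete, here is a small sketch that spells out the byte layout each opcode compiles down to. The `assemble` helper is hypothetical, for illustration only; it assumes the Opcode type above is in scope:

```haskell
{-# LANGUAGE LambdaCase #-}

import Data.Bits ((.&.), shiftR)
import Data.Word (Word8, Word16)

-- Hypothetical helper: the bytes each Opcode is encoded as.
assemble :: [Opcode] -> [Word8]
assemble = concatMap $ \case
  OPush n ->
    [ 0
    , fromIntegral (n .&. 0xff)                            -- low byte
    , fromIntegral ((fromIntegral n :: Word16) `shiftR` 8) -- high byte
    ]
  OSwapPop -> [1]
  OGet i   -> [2, i]
  OAdd     -> [3]
  OSub     -> [4]
  OMul     -> [5]
  ODiv     -> [6]

-- assemble [OPush 1, OPush 2, OAdd] == [0,1,0, 0,2,0, 3]
```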
The Compiler

The compiler takes an expression along with its bytecode size, and compiles it to a strict ByteString of that size. Recall that in the previous post, we wrote our parser such that the bytecode size for each AST node was calculated while parsing it. This allows us to pre-allocate a bytestring of the required size before compiling the AST. We compile to actual bytes here, and don't use the opcodes.

```haskell
type Bytecode = BS.ByteString

compile :: SizedExpr -> Result Bytecode
compile = compile' defaultStackSize

compile' :: Int -> SizedExpr -> Result Bytecode
compile' stackSize (expr, bytecodeSize) =
  uncurry (fmap . const) . BSI.unsafeCreateUptoN' bytecodeSize $ \fp -> do
    (bytecodeSize,)
      <$> fmap
        Right
        (compileIO bytecodeSize stackSize fp fp expr >>= checkSize fp . TS.fst)
      `catch` (pure . Left)
  where
    checkSize fp ip = do
      let actualBytecodeSize = ip `minusPtr` fp
      unless (actualBytecodeSize == bytecodeSize)
        . throwIO
        . Error Compile
        $ "Compiled bytecode size "
          <> show actualBytecodeSize
          <> " is not same as expected size: "
          <> show bytecodeSize

compileIO :: Int -> Int -> Ptr Word8 -> Ptr Word8 -> Expr -> IO (Pair (Ptr Word8) Int)
compileIO bytecodeSize stackSize fp ip = go Map.empty 0 ip
  where
    ep = fp `plusPtr` bytecodeSize

    go env !sp !ip = \case
      Num n
        | sp + 1 <= stackSize -> do
            let !lb = fromIntegral $ n .&. 0xff
                !mb = fromIntegral $ ((fromIntegral n :: Word16) .&. 0xff00) `shiftR` 8
            writeByte ip 0 -- OPush
            writeByte (ip `plusPtr` 1) lb
            writeByte (ip `plusPtr` 2) mb
            pure (ip `plusPtr` 3 :!: sp + 1)
      Num _ -> throwCompileError "Stack overflow"
      BinOp op a b -> do
        (ip' :!: sp') <- go env sp ip a
        (ip'' :!: sp'') <- go env sp' ip' b
        writeByte ip'' $ translateOp op
        pure (ip'' `plusPtr` 1 :!: sp'' - 1)
      Let x assign body -> do
        (ip' :!: sp') <- go env sp ip assign
        (ip'' :!: sp'') <- go (Map.insert x sp env) sp' ip' body
        writeByte ip'' 1 -- OSwapPop
        pure (ip'' `plusPtr` 1 :!: sp'' - 1)
      Var x
        | sp + 1 <= stackSize -> case Map.lookup x env of
            Nothing -> throwCompileError $ "Unknown variable: " <> show x
            Just varScope
              | varScope < stackSize && varScope < fromIntegral (maxBound @Word8) -> do
                  writeByte ip 2 -- OGet
                  writeByte (ip `plusPtr` 1) $ fromIntegral varScope
                  pure (ip `plusPtr` 2 :!: sp + 1)
            Just _ -> throwCompileError "Stack overflow"
      Var _ -> throwCompileError "Stack overflow"

    writeByte :: Ptr Word8 -> Word8 -> IO ()
    writeByte !ip !val
      | ip < ep = poke ip val
      | otherwise =
          throwCompileError $
            "Instruction index "
              <> show (ip `minusPtr` fp)
              <> " out of bound "
              <> show (bytecodeSize - 1)

    translateOp = \case
      Add -> 3 -- OAdd
      Sub -> 4 -- OSub
      Mul -> 5 -- OMul
      Div -> 6 -- ODiv

    throwCompileError = throwIO . Error Compile

defaultStackSize :: Int
defaultStackSize = 256
```
ArithVMLib.hs

We use the unsafeCreateUptoN' function from the Data.ByteString.Internal module, which allocates enough memory for the provided bytecode size, and gives us a pointer to the allocated memory. We call this pointer fp for frame pointer. Then we traverse the AST recursively, writing bytes for opcodes and their arguments for each case. We use pointer arithmetic and the poke function to write the bytes. Int16 numbers are encoded as two bytes in little-endian fashion.

In the recursive traversal function go, we pass and return the current stack pointer sp and instruction pointer ip, updating them correctly for each case. We also take care of checking that the pointers stay within the right bounds, failing which we throw appropriate errors. We also pass an env parameter that is similar to the variable-names-to-values environment we use in the AST interpreter, but this one tracks variable names to the stack indices at which they reside. We update this information before compiling the body of a Let expression, to capture the stack index of its assignment value. When compiling a Var expression, we use the env map to look up the variable's stack index, and encode it in the bytecode.

At the end of compilation, we check that the entire bytestring is filled with bytes till the very end, failing which we throw an error. This check is required because otherwise the bytestring may have garbage bytes, and may fail inexplicably during execution. All the errors are thrown in the IO monad using the throwIO function, and are caught after compilation using the catch function. The final result or error is returned wrapped in Result.
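To see how go threads sp and env, here is a hand trace of compiling "let x = 4 in x + 1". This is my own sketch in comment form; the opcode names stand for the bytes actually written:

```haskell
-- go {}       sp=0  (Let x (Num 4) (BinOp Add (Var x) (Num 1)))
--   go {}     sp=0  (Num 4)                   => OPush 4   (sp becomes 1)
--   -- x is now bound to stack index 0
--   go {x->0} sp=1  (BinOp Add (Var x) (Num 1))
--     go {x->0} sp=1 (Var x)                  => OGet 0    (sp becomes 2)
--     go {x->0} sp=2 (Num 1)                  => OPush 1   (sp becomes 3)
--     -- BinOp pops two operands, pushes one  => OAdd      (sp becomes 2)
--   -- Let squashes assignment and body       => OSwapPop  (sp becomes 1)
-- resulting opcodes: [OPush 4, OGet 0, OPush 1, OAdd, OSwapPop]
```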
Let's see it in action:

```
$ echo -n "1 + 2 - 3 * 4" | arith-vm compile | hexdump -C
00000000  00 01 00 00 02 00 03 00  03 00 00 04 00 05 04   |...............|
0000000f
$ echo -n "let x = 4 in let y = 5 in x + y" | arith-vm compile | hexdump -C
00000000  00 04 00 00 05 00 02 00  02 01 03 01 01         |.............|
0000000d
```

You can verify that the resultant bytes are indeed correct. I assume that it is difficult for you to read raw bytes. We'll fix this in a minute. Meanwhile, let's ponder some performance characteristics of our compiler.

Compiling, Fast and Slow

You may be wondering why I chose to write the compiler in this somewhat convoluted way, pre-allocating a bytestring and using pointers. The answer is: performance. I didn't actually start with pointers. I iterated through many different data and control structures to find the fastest one. The table below shows the compilation times for a benchmark expression file when using different data structures to implement the compile' function:

| Data structure | Time (ms) | Incremental speedup | Overall speedup |
|---|---|---|---|
| List | 4345 | 1x | 1x |
| Seq | 523 | 8.31x | 8.31x |
| DList | 486 | 1.08x | 8.94x |
| BS Builder | 370 | 1.31x | 11.74x |
| Pre-allocated BS | 54 | 6.85x | 80.46x |
| Bytearray | 52 | 1.02x | 83.55x |

I started with the bread-and-butter data structure of Haskellers, the humble and known-to-be-slow List, which was indeed quite slow. Next, I moved on to Seq and thereafter DList, which are known to be faster at concatenation/consing. Then I abandoned the use of intermediate data structures completely, choosing to use a bytestring Builder to create the bytestring. Finally, I had the epiphany that the bytestring size was known at compile time, and rewrote the function to pre-allocate the bytestring, thereby reaching the fastest solution.

I also tried using Bytearray, which has more-or-less the same performance as bytestring, but it is inconvenient to use because there are no functions to do IO with bytearrays. So I'd anyway need to use bytestrings for reading from STDIN or writing to STDOUT, and converting back and forth between bytearray and bytestring is a performance killer. Thus, I decided to stick to bytestrings.

The pre-allocated bytestring approach is 80 times faster than using lists, and almost 10 times faster than using Seq. For such gain, I'm okay with the complications it brings to the code.
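For comparison with the final pointer-based version, here is roughly what the Builder variant from the table above might have looked like. This is my illustration, not the post's original code; error handling and stack-depth checks are omitted for brevity:

```haskell
{-# LANGUAGE LambdaCase #-}

import qualified Data.ByteString.Builder as BB
import qualified Data.Map.Strict as Map

-- Illustrative Builder-based variant: concatenates Builders instead of
-- poking bytes into pre-allocated memory.
compileB :: Expr -> BB.Builder
compileB = go Map.empty 0
  where
    go env sp = \case
      Num n -> BB.word8 0 <> BB.int16LE n -- OPush with little-endian argument
      Var x -> BB.word8 2 <> BB.word8 (fromIntegral (env Map.! x)) -- OGet
      BinOp op a b ->
        go env sp a <> go env (sp + 1) b <> BB.word8 (translateOp op)
      Let x assign body ->
        go env sp assign
          <> go (Map.insert x sp env) (sp + 1) body
          <> BB.word8 1 -- OSwapPop
    translateOp = \case
      Add -> 3
      Sub -> 4
      Mul -> 5
      Div -> 6

-- BB.toLazyByteString (compileB expr) then yields the (lazy) bytecode.
```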
Here are the numbers in a chart (smaller is better):

[Chart: Compilation time using different data structures]

The other important data structure used here is the map (or dictionary) in which we add the mappings from identifiers to their stack indices. This data structure needs to be performant because we do a lookup for each variable we encounter while compiling. I benchmarked compilation for some data structures:

| Data structure | Time (ms) | Slowdown |
|---|---|---|
| Data.HashMap.Strict.HashMap | 55 | 1.00x |
| Data.List.List | 63 | 1.14x |
| Data.Map.Strict.Map | 71 | 1.29x |
| Data.Trie.Trie | 80 | 1.45x |
| Data.Vector.Hashtables.Dictionary | 104 | 1.89x |
| Data.HashTable.IO.BasicHashTable | 312 | 5.67x |

The strict hashmap turns out to be the fastest one, but interestingly, the linked list is a close second. The mutable hashtable is the slowest, even though I expected it to be the fastest. Here are the times in a chart (smaller is better):

[Chart: Compilation time using different map data structures]

Another choice I had to make was how to write the go function. I ended up passing and returning the pointers and the environment map, and throwing errors in IO, but a number of solutions are possible. I tried out some of them, and noted the compilation times for the benchmark expression file:

| Control structure | Time (ms) | Slowdown |
|---|---|---|
| IO | 57.4 | 1.00x |
| IO + IORef | 65.0 | 1.13x |
| IO + ReaderT | 60.9 | 1.06x |
| IO + StateT | 65.6 | 1.14x |
| IO + ExceptT | 65.9 | 1.15x |
| IO + ReaderT + ExceptT | 107.1 | 1.87x |
| IO + StateT + ExceptT | 383.9 | 6.69x |
| IO + StateT + ReaderT | 687.5 | 11.98x |
| IO + StateT + ReaderT + ExceptT | 702.0 | 12.23x |
| IO + CPS | 78.2 | 1.36x |
| IO + DCPS | 78.4 | 1.37x |
| IO + ContT | 76.5 | 1.33x |

I tried putting the pointers in IORefs and StateT state instead of passing them back and forth. I tried putting the environment in a ReaderT config. I tried using ExceptT for throwing errors instead of using IO errors. Then I tried various combinations of these monad transformers. Finally, I also tried converting the go function to be tail-recursive using Continuation-Passing Style (CPS), then defunctionalizing the continuations, as well as using the ContT monad transformer. All of these approaches resulted in slower code. The times are interesting to compare (smaller is better):

[Chart: Compilation time using different control structures]

There is no reason to use IORefs here because they result in slower and uglier code. Using one monad transformer at a time results in slight slowdowns, which may be worth it for the improvement in the code. But using more than one of them degrades performance by a lot. Also, there is no improvement from the CPS conversion, because GHC is smart enough to optimize the non-tail-recursive code to be faster than handwritten tail-recursive code that allocates a lot of closures (or objects, in the case of defunctionalization).
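To make the comparison concrete, here are the rough shapes of the explicit-passing go next to one of the transformer-stack variants. These are illustrative signatures of mine, not code from the post:

```haskell
-- Explicit passing, errors thrown in IO (the fastest variant):
--
--   go :: Map.Map Ident Int -> Int -> Ptr Word8 -> Expr
--      -> IO (Pair (Ptr Word8) Int)
--
-- A transformer-stack variant (among the slowest in the table):
--
--   goT :: Expr
--       -> ExceptT Error (ReaderT (Map.Map Ident Int)
--            (StateT (Pair (Ptr Word8) Int) IO)) ()
```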
Moving on …

The Decompiler

It is a hassle to read raw bytes in the compiler output. Let's write a decompiler to aid us in debugging and testing the compiler. First, a disassembler that converts bytes to opcodes:

```haskell
type Program = Seq Opcode

disassemble :: Bytecode -> Result Program
disassemble bytecode = go 0 Seq.empty
  where
    !size = BS.length bytecode

    go !ip !program
      | ip == size = pure program
      | otherwise = case readInstr bytecode ip of
          0 | ip + 2 < size ->
              go (ip + 3) $ program |> OPush (readInstrArgInt16 bytecode ip)
          0 -> throwIPOOBError $ ip + 2
          1 -> go (ip + 1) $ program |> OSwapPop
          2 | ip + 1 < size ->
              go (ip + 2) $ program |> OGet (readInstrArgWord8 bytecode ip)
          2 -> throwIPOOBError $ ip + 1
          3 -> go (ip + 1) $ program |> OAdd
          4 -> go (ip + 1) $ program |> OSub
          5 -> go (ip + 1) $ program |> OMul
          6 -> go (ip + 1) $ program |> ODiv
          n ->
            throwDisassembleError $
              "Invalid bytecode: " <> show n <> " at: " <> show ip

    throwIPOOBError ip =
      throwDisassembleError $
        "Instruction index " <> show ip <> " out of bound " <> show (size - 1)

    throwDisassembleError = throwError . Error Disassemble
```
ArithVMLib.hs

A disassembled program is a sequence of opcodes. We simply go over each byte of the bytecode, and append the right opcode for it to the program, along with any parameters it may have. Note that we do not verify that the disassembled program is correct. Here are the helpers that read instruction bytes and their arguments from a bytestring:

```haskell
readInstr :: BS.ByteString -> Int -> Word8
readInstr = BS.unsafeIndex
{-# INLINE readInstr #-}

readInstrArgWord8 :: BS.ByteString -> Int -> Word8
readInstrArgWord8 bytecode ip = readInstr bytecode (ip + 1)
{-# INLINE readInstrArgWord8 #-}

readInstrArgInt16 :: BS.ByteString -> Int -> Int16
readInstrArgInt16 bytecode ip =
  let lb = readInstr bytecode (ip + 1)
      mb = readInstr bytecode (ip + 2)
      b1 :: Word16 = fromIntegral lb
      b2 = fromIntegral mb `shiftL` 8
   in fromIntegral (b1 .|. b2)
```
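For example, here is how the two argument bytes of an OPush reassemble into an Int16. This is a GHCi session I would expect with illustrative byte values (BS.pack builds the bytestring):

```haskell
-- λ> readInstrArgInt16 (BS.pack [0, 0x2A, 0x00]) 0  -- bytes for OPush 42
-- 42
-- λ> readInstrArgInt16 (BS.pack [0, 0x00, 0x80]) 0  -- bytes for OPush (-32768)
-- -32768
```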
{-# INLINE readInstrArgInt16 #-}
```
ArithVMLib.hs

Next, we decompile the opcodes to an expression:

```haskell
decompile :: Program -> Result Expr
decompile program = do
  stack <- go Seq.empty program
  checkStack Decompile maxBound $ length stack
  let ast :<| _ = stack
  pure ast
  where
    go stack = \case
      Seq.Empty -> pure stack
      opcode :<| rest -> case opcode of
        OPush n -> go (stack |> Num n) rest
        OAdd -> decompileBinOp Add >>= flip go rest
        OSub -> decompileBinOp Sub >>= flip go rest
        OMul -> decompileBinOp Mul >>= flip go rest
        ODiv -> decompileBinOp Div >>= flip go rest
        OGet i -> go (stack |> Var (mkIdent $ mkName $ fromIntegral i)) rest
        OSwapPop -> decompileLet >>= flip go rest
      where
        decompileBinOp op = case stack of
          stack' :|> a :|> b -> pure $ stack' |> BinOp op a b
          _ ->
            throwDecompileError $
              "Not enough elements to decompile binary operation: " <> show op

        decompileLet = case stack of
          stack' :|> a :|> b ->
            pure $ stack' |> Let (mkIdent $ mkName $ length stack - 2) a b
          _ -> throwDecompileError "Not enough elements to decompile let"

    mkName i = names `Seq.index` i

    names = Seq.fromList $ tail $ combinations 2

    combinations = \case
      0 -> [""]
      n ->
        let prev = combinations (n - 1)
         in prev <> [x : xs | x <- ['a' .. 'z'], xs <- prev]

    throwDecompileError = throwError . Error Decompile

checkStack :: (MonadError Error m) => Pass -> Int -> Int -> m ()
checkStack pass stackSize = \case
  1 -> pure ()
  0 -> throwError $ Error pass "Final stack has no elements"
  n | n > stackSize -> throwError . Error pass $ "Stack overflow"
  n | n > 1 -> throwError . Error pass $ "Final stack has more than one element"
  _ -> throwError . Error pass $ "Stack underflow"
```
ArithVMLib.hs

Decompilation is the opposite of compilation. While compiling, there is an implicit stack of expressions that are yet to be compiled. We make that stack explicit here, capturing expressions as they are decompiled from opcodes. For compound expressions, we inspect the stack and use the already decompiled expressions as the operands of the expression being decompiled. This way we build up larger expressions from smaller ones, culminating in the single top-level expression at the end. Finally, we check the stack to make sure that there is only one expression left in it. Note that, like the disassembler, we do not verify that the decompiled expression is correct.

There is one tricky thing in decompilation: we lose the names of the variables when compiling, and are left with only stack indices. So while decompiling, we generate variable names from their stack indices by indexing a list of unique names.
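Concretely, here is how the expression stack evolves while decompiling the bytecode for "let x = 4 in let y = 5 in x + y". This is a hand trace of mine in pseudo-notation; the names a and b come from the generated name list:

```haskell
-- opcodes: [OPush 4, OPush 5, OGet 0, OGet 1, OAdd, OSwapPop, OSwapPop]
--
--   OPush 4  => [4]
--   OPush 5  => [4, 5]
--   OGet 0   => [4, 5, a]
--   OGet 1   => [4, 5, a, b]
--   OAdd     => [4, 5, a + b]
--   OSwapPop => [4, let b = 5 in a + b]           -- name b: stack had 3 entries
--   OSwapPop => [let a = 4 in let b = 5 in a + b] -- name a: stack had 2 entries
```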
Let's see it in action:

```
$ echo -n "1 + 2 - 3 * 4" | arith-vm compile | arith-vm disassemble
OPush 1
OPush 2
OAdd
OPush 3
OPush 4
OMul
OSub
$ echo -n "1 + 2 - 3 * 4" | arith-vm compile | arith-vm decompile
((1 + 2) - (3 * 4))
$ echo -n "let x = 4 in let y = 5 in x + y" | arith-vm compile | arith-vm disassemble
OPush 4
OPush 5
OGet 0
OGet 1
OAdd
OSwapPop
OSwapPop
$ echo -n "let x = 4 in let y = 5 in x + y" | arith-vm compile | arith-vm decompile
(let a = 4 in (let b = 5 in (a + b)))
```

That's all for compilation and decompilation. Now, we use them together to make sure that everything works.

Testing the Compiler

We write some unit tests for the compiler, targeting both success and failure cases:

```haskell
compilerSpec :: Spec
compilerSpec = describe "Compiler" $ do
  forM_ compilerSuccessTests $ \(input, result) ->
    it ("compiles: \"" <> BSC.unpack input <> "\"") $ do
      parseCompile input `shouldBe` Right (Seq.fromList result)

  forM_ compilerErrorTests $ \(input, err) ->
    it ("fails for: \"" <> BSC.unpack input <> "\"") $ do
      parseCompile input `shouldSatisfy` \case
        Left (Error Compile msg) | err == msg -> True
        _ -> False

  it "fails for greater sized expr" $ do
    compile (Num 1, 4) `shouldSatisfy` \case
      Left (Error Compile "Compiled bytecode size 3 is not same as expected size: 4") -> True
      _ -> False

  it "fails for lesser sized expr" $ do
    compile (Num 1, 2) `shouldSatisfy` \case
      Left (Error Compile "Instruction index 2 out of bound 1") -> True
      _ -> False
  where
    parseCompile = parseSized >=> compile' 4 >=> disassemble

compilerSuccessTests :: [(BSC.ByteString, [Opcode])]
compilerSuccessTests =
  [ ("1", [OPush 1]),
    ( "1 + 2 - 3 * 4 + 5 / 6 / 1 + 1",
      [OPush 1, OPush 2, OAdd, OPush 3, OPush 4, OMul, OSub,
       OPush 5, OPush 6, ODiv, OPush 1, ODiv, OAdd, OPush 1, OAdd]
    ),
    ( "1 + (2 - 3) * 4 + 5 / 6 / (1 + 1)",
      [OPush 1, OPush 2, OPush 3, OSub, OPush 4, OMul, OAdd,
       OPush 5, OPush 6, ODiv, OPush 1, OPush 1, OAdd, ODiv, OAdd]
    ),
    ("let x = 4 in x + 1", [OPush 4, OGet 0, OPush 1, OAdd, OSwapPop]),
    ( "let x = 4 in let y = 5 in x + y",
      [OPush 4, OPush 5, OGet 0, OGet 1, OAdd, OSwapPop, OSwapPop]
    ),
    ( "let x = 4 in let x = x + 1 in x + 2",
      [OPush 4, OGet 0, OPush 1, OAdd, OGet 1, OPush 2, OAdd, OSwapPop, OSwapPop]
    ),
    ( "let x = let y = 3 in y + y in x * 3",
      [OPush 3, OGet 0, OGet 0, OAdd, OSwapPop, OGet 0, OPush 3, OMul, OSwapPop]
    ),
    ( "let x = let y = 1 + let z = 2 in z * z in y + 1 in x * 3",
      [OPush 1, OPush 2, OGet 1, OGet 1, OMul, OSwapPop, OAdd,
       OGet 0, OPush 1, OAdd, OSwapPop, OGet 0, OPush 3, OMul, OSwapPop]
    ),
    ("1/0", [OPush 1, OPush 0, ODiv]),
    ("-32768 / -1", [OPush (-32768), OPush (-1), ODiv])
  ]

compilerErrorTests :: [(BSC.ByteString, String)]
compilerErrorTests =
  [ ("x", "Unknown variable: x"),
    ("let x = 4 in y + 1", "Unknown variable: y"),
    ("let x = y + 1 in x", "Unknown variable: y"),
    ("let x = x + 1 in x", "Unknown variable: x"),
    ("let x = 4 in let y = 1 in let z = 2 in y + x", "Stack overflow"),
    ("let x = 4 in let y = 5 in x + let z = y in z * z", "Stack overflow"),
    ("let a = 0 in let b = 0 in let c = 0 in let d = 0 in d", "Stack overflow")
  ]
```
ArithVMSpec.hs

In each test, we parse and compile an expression, and then disassemble the compiled bytes, which we match against an expected list of opcodes, or an error message.
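These unit tests pin down specific outputs. A natural complement is a round-trip property; here is a sketch of what one could look like, assuming a Show instance for Expr, where genExpr (a generator for well-scoped expressions), exprSize (the size calculation from the parser post), and alphaEq (equality modulo variable names) are hypothetical helpers:

```haskell
import Test.QuickCheck

-- Sketch: compiling, disassembling, and decompiling should give back the
-- original expression up to variable renaming.
prop_compileDecompileRoundTrip :: Property
prop_compileDecompileRoundTrip =
  forAll genExpr $ \expr ->
    case compile (expr, exprSize expr) >>= disassemble >>= decompile of
      Right expr' -> expr' `alphaEq` expr
      Left _ -> False
```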
Let's put these tests with the parser tests, and run them:

```haskell
main :: IO ()
main = hspec $ do
  parserSpec
  astInterpreterSpec
  compilerSpec
```
ArithVMSpec.hs

Output of the test run:

```
$ cabal test -O2
Running 1 test suites...
Test suite specs: RUNNING...
Parser
  parses: "1 + 2 - 3 * 4 + 5 / 6 / 0 + 1" [✔]
  parses: "1+2-3*4+5/6/0+1" [✔]
  parses: "1 + -1" [✔]
  parses: "let x = 4 in x + 1" [✔]
  parses: "let x=4in x+1" [✔]
  parses: "let x = 4 in let y = 5 in x + y" [✔]
  parses: "let x = 4 in let y = 5 in x + let z = y in z * z" [✔]
  parses: "let x = 4 in (let y = 5 in x + 1) + let z = 2 in z * z" [✔]
  parses: "let x=4in 2+let y=x-5in x+let z=y+1in z/2" [✔]
  parses: "let x = (let y = 3 in y + y) in x * 3" [✔]
  parses: "let x = let y = 3 in y + y in x * 3" [✔]
  parses: "let x = let y = 1 + let z = 2 in z * z in y + 1 in x * 3" [✔]
  fails for: "" [✔]
  fails for: "1 +" [✔]
  fails for: "1 & 1" [✔]
  fails for: "1 + 1 & 1" [✔]
  fails for: "1 & 1 + 1" [✔]
  fails for: "(" [✔]
  fails for: "(1" [✔]
  fails for: "(1 + " [✔]
  fails for: "(1 + 2" [✔]
  fails for: "(1 + 2}" [✔]
  fails for: "66666" [✔]
  fails for: "-x" [✔]
  fails for: "let 1" [✔]
  fails for: "let x = 1 in " [✔]
  fails for: "let let = 1 in 1" [✔]
  fails for: "let x = 1 in in" [✔]
  fails for: "let x=1 inx" [✔]
  fails for: "letx = 1 in x" [✔]
  fails for: "let x ~ 1 in x" [✔]
  fails for: "let x = 1 & 2 in x" [✔]
  fails for: "let x = 1 inx" [✔]
  fails for: "let x = 1 in x +" [✔]
  fails for: "let x = 1 in x in" [✔]
  fails for: "let x = let x = 1 in x" [✔]
AST interpreter
  interprets: "1" [✔]
  interprets: "1 + 2 - 3 * 4 + 5 / 6 / 1 + 1" [✔]
  interprets: "1 + (2 - 3) * 4 + 5 / 6 / (1 + 1)" [✔]
  interprets: "1 + -1" [✔]
  interprets: "1 * -1" [✔]
  interprets: "let x = 4 in x + 1" [✔]
  interprets: "let x = 4 in let x = x + 1 in x + 2" [✔]
  interprets: "let x = 4 in let y = 5 in x + y" [✔]
  interprets: "let x = 4 in let y = 5 in x + let z = y in z * z" [✔]
  interprets: "let x = 4 in (let y = 5 in x + y) + let z = 2 in z * z" [✔]
  interprets: "let x = let y = 3 in y + y in x * 3" [✔]
  interprets: "let x = let y = 1 + let z = 2 in z * z in y + 1 in x * 3" [✔]
  fails for: "x" [✔]
  fails for: "let x = 4 in y + 1" [✔]
  fails for: "let x = y + 1 in x" [✔]
  fails for: "let x = x + 1 in x" [✔]
  fails for: "1/0" [✔]
  fails for: "-32768 / -1" [✔]
Compiler
  compiles: "1" [✔]
  compiles: "1 + 2 - 3 * 4 + 5 / 6 / 1 + 1" [✔]
  compiles: "1 + (2 - 3) * 4 + 5 / 6 / (1 + 1)" [✔]
  compiles: "let x = 4 in x + 1" [✔]
  compiles: "let x = 4 in let y = 5 in x + y" [✔]
  compiles: "let x = 4 in let x = x + 1 in x + 2" [✔]
  compiles: "let x = let y = 3 in y + y in x * 3" [✔]
  compiles: "let x = let y = 1 + let z = 2 in z * z in y + 1 in x * 3" [✔]
  compiles: "1/0" [✔]
  compiles: "-32768 / -1" [✔]
  fails for: "x" [✔]
  fails for: "let x = 4 in y + 1" [✔]
  fails for: "let x = y + 1 in x" [✔]
  fails for: "let x = x + 1 in x" [✔]
  fails for: "let x = 4 in let y = 1 in let z = 2 in y + x" [✔]
  fails for: "let x = 4 in let y = 5 in x + let z = y in z * z" [✔]
  fails for: "let a = 0 in let b = 0 in let c = 0 in let d = 0 in d" [✔]
  fails for greater sized expr [✔]
  fails for lesser sized expr [✔]

Finished in 0.0147 seconds
73 examples, 0 failures
Test suite specs: PASS
```

Awesome, it works!

That's it for this post. In the next part, we write a virtual machine that runs our compiled bytecode, and do some benchmarking.

If you have any questions or comments, please leave a comment below. If you liked this post, please share it. Thanks for reading!