The Easiest Way to Build a Type Checker
Type checkers are a piece of software that feel incredibly simple, yet incredibly complex. Seeing Hindley-Milner written in a logic programming language is almost magical, but it never helped me understand how it was implemented. Nor does actually trying to read anything about Algorithm W or any academic paper explaining a type system. But thanks to David Christiansen, I have discovered a setup for type checking that is so conceptually simple it demystified the whole thing for me. It goes by the name Bidirectional Type Checking.
Bidirectional Type Checking
The two directions in this type checker are inferring types and checking types. Unlike Hindley-Milner, we do need some type annotations, but these are typically at function definitions. So code like the sillyExample below is completely valid and fully type checks despite lacking annotations. How far can we take this? I'm not a type theory person. Reading papers in type theory takes me a while, and my comprehension is always lacking, but this paper seems like a good starting point for answering that question.
function sillyExample ( x : number ) : number { let a = 10 ; let b = 20 ; let e = a ; let f = b ; let q = a + e ; let g = "hello" ; let h = "world" ; let i = 100 + q ; return x ; }
So, how do we actually create a bidirectional type checker? I think the easiest way to understand it is to see a full working implementation. So that's what I have below for a very simple language. To understand it, start by looking at the types to figure out what the language supports, then look at each of the infer cases. But don't worry, if it doesn't make sense, I will explain in more detail below.
export type Type = | { kind : "number" } | { kind : "string" } | { kind : "function" ; arg : Type ; returnType : Type } ; export type Expr = | { kind : "number" ; value : number } | { kind : "string" ; value : string } | { kind : "varLookup" ; name : string } | { kind : "function" ; param : string ; body : Expr } | { kind : "call" ; fn : Expr ; arg : Expr } | { kind : "let" ; name : string ; value : Expr ; type ? : Type } | { kind : "block" ; statements : Expr [ ] ; return : Expr } ; export type Context = Map < string , Type > ; export function infer ( ctx : Context , expr : Expr ) : Type { switch ( expr . kind ) { case "number" : return { kind : "number" } ; case "string" : return { kind : "string" } ; case "varLookup" : const type = ctx . get ( expr . name ) ; if ( ! type ) { throw new Error ( ` Unbound variable: ${ expr . name } ` ) ; } return type ; case "call" : const fnType = infer ( ctx , expr . fn ) ; if ( fnType . kind !== "function" ) { throw new Error ( "Cannot call non-function" ) ; } check ( ctx , expr . arg , fnType . arg ) ; return fnType . returnType ; case "function" : throw new Error ( "Cannot infer type for function without annotation" ) ; case "let" : const valueType = infer ( ctx , expr . value ) ; if ( expr . type ) { if ( ! typesEqual ( valueType , expr . type ) ) { let expected = JSON . stringify ( expr . type ) ; let actual = JSON . stringify ( valueType ) ; throw new Error ( ` expected ${ expected } , got ${ actual } ` ) ; } } ctx . set ( expr . name , valueType ) ; return valueType ; case "block" : let blockCtx = new Map ( ctx ) ; for ( const stmt of expr . statements ) { infer ( blockCtx , stmt ) ; } return infer ( blockCtx , expr . return ) ; } } export function check ( ctx : Context , expr : Expr , expected : Type ) : void { switch ( expr . kind ) { case "function" : if ( expected . kind !== "function" ) { throw new Error ( "Function must have function type" ) ; } const newCtx = new Map ( ctx ) ; newCtx . set ( expr . param , expected . arg ) ; check ( newCtx , expr . body , expected . returnType ) ; break ; case "block" : let blockCtx = new Map ( ctx ) ; for ( const stmt of expr . statements ) { infer ( blockCtx , stmt ) ; } check ( blockCtx , expr . return , expected ) ; break ; default : const actual = infer ( ctx , expr ) ; if ( ! typesEqual ( actual , expected ) ) { throw new Error ( ` Type mismatch: expected ${ expected } , got ${ actual } ` ) ; } } } export function typesEqual ( a : Type , b : Type ) : boolean { if ( a . kind !== b . kind ) return false ; if ( a . kind === "function" && b . kind === "function" ) { return typesEqual ( a . arg , b . arg ) && typesEqual ( a . returnType , b . returnType ) ; } return true ; }
Here we have, in ~100 lines, a fully functional type checker for a small language. Is it without flaw? Is it feature complete? Not at all. In a real type checker, you might not want to know only if something typechecks, but you might want to decorate the various parts with their type; we don't do that here. We don't do a lot of things. But I've found that this tiny bit of code is enough to start extending to much larger, more complicated code examples.
Explanation
If you aren't super familiar with the implementation of programming languages, some of this code might strike you as a bit odd, so let me very quickly walk through the implementation. First, we have our data structures for representing our code:
... continue reading