Professor Yoshihiko Futamura
You might not have encountered Futamura projections or partial evaluation before, so what is this strange sounding thing?
The core idea is to automatically transform the code of your interpreter to create individual JIT compiled user methods. Instead of needing to carefully implement the language semantics in two places (interpreter and hand-crafted JIT), it’s sufficient to implement it just once. As the interpreter is memory safe and the transform preserves interpreter semantics, the compiled version of the user’s code is guaranteed to match the interpreter’s behavior and is therefore also automatically memory safe. This makes it much harder to slip up and write an exploitable VM.
There are several tricks that make this possible. The most important is a new form of constant-ness, added to Java using annotations. In normal programming a variable is either mutable or immutable. An immutable variable is marked with a special keyword such as final or const and must be set only once, at the declaration site. Constants are great for compilers because they can be folded, meaning that references to them can be replaced with their value. Consider the following bit of code:
class Example {
private static final int A = 1;
private static final int B = 2;
static int answer() {
return A - B;
}
static String doSomething() {
if (answer() < 0)
return "OK"
else
throw new IllegalStateException();
}
}
It’s easy to see the answer() method will always return the same number. A good compiler will substitute 1 and 2 into the expression yielding return 1 — 2 and pre-compute the answer. Then it will inline any calls to answer (i.e. copy/paste the implementation into the call site), substituting those with -1 and thus removing the call overhead as well. That in turn may trigger even more constant folding, such as in the doSomething method where the compiler will prove that the exception can never be thrown and delete it entirely. Having done that, doSomething can also be optimized out by simply replacing it with “OK”, and so on.
That’s neat, but every compiler can do that … as long as the constant values are known at compile time. Truffle changes that by introducing a third kind of const-ness called compilation final. If in your interpreter implementation you declare a variable like this:
@CompilationFinal private int a = 1;
then it will change its const-ness depending on when it’s being accessed. From inside your interpreter, it’s mutable. You will use such variables to implement your interpreter. They’ll be set when you load your user’s program and maybe also whilst it runs. Once a function in the user’s script becomes hot, Truffle will work together with the Graal compiler to recompile the parts of the interpreter corresponding to the user’s code, and this time a will be treated as if it was a constant, i.e. the same as the literal value 1 .
This works for any kind of data, including complex objects. Consider the following highly simplified pseudocode:
import com.oracle.truffle.api.nodes.Node;
class JavaScriptFunction extends Node {
@CompilationFinal Node[] statements;
Object execute() {
for (var statement : statements) statement.execute();
}
}
This is the sort of class you might find in a typical abstract syntax tree interpreter. The statements array is marked compilation-final. When the program is first loaded we can initialize the array with objects representing the different things a user’s JavaScript function is doing, because it’s mutable. Now imagine that the function represented by this object gets hot. Truffle will start a special compilation of the execute() method in which Graal is told that the this pointer should be treated implicitly as compilation-final. Because the object is treated as constant, so can this.statements also be treated as constant. It’ll be substituted with the exact contents of a specific JavaScriptFunction object on the interpreter heap enabling the compiler to unroll the loop inside execute , transforming it to look like this:
Object execute() {
this.statements[0].execute();
this.statements[1].execute();
this.statements[2].execute();
}
Here Node is a superclass and execute() is virtual, but that doesn’t matter. Because the list is compilation-final the individual objects in the list are also constant folded, so the execute method can be de-virtualized (resolved to whatever concrete type it really is) and then inlined as well.