Project Valhalla, Explained: How a Decade of Work Arrives in JDK 28

On June 15, Oracle engineer Lois Foltan confirmed what a good chunk of the industry had stopped believing would ever happen: JEP 401 (Value Classes and Objects) will be integrated into the main OpenJDK repository, targeting JDK 28.

The change is so large that other committers were asked to hold off on big commits while the integration is in progress. The pull request alone adds more than 197,000 lines of code across 1,816 files.

Before we pop the champagne, though: this is a preview feature, disabled by default, and, as Brian Goetz was quick to remind everyone, “only the first part of Valhalla.” Goetz also made a great observation: the “they’ll never ship it” crowd will now smoothly switch over to “but they didn’t ship the most important part.” (For years a joke has been going around the community that we’ll end up in Valhalla ourselves, the Norse-afterlife one, sooner than the project ships.)

You have to earn your own haters.

So this is a good moment to tell the whole story. This issue is one big deep-dive, written on the assumption that you’ve never followed the work on Valhalla before: from the 2014 problem, through the evolution of ideas (a fair number of which ended up in the trash), all the way to what exactly we’ll be getting our hands on in JDK 28. Brew yourself a coffee. I’ve been sitting on this edition for a long time, saving it for exactly this occasion.

1. Introduction - what this is even about

The slogan Valhalla has carried from the start is: “codes like a class, works like an int.” In a single sentence it captures the whole point of the project: we want to write normal, readable classes with methods, constructor validation, and sensible field names, but we want the JVM to be able to treat them as efficiently as primitives.

To understand why this is a problem, you have to go back to Java’s foundation. In this language, with the exception of the eight primitives (int, long, double, boolean, and the rest), every type is a reference type. When you write Point p = new Point(1, 2), the variable p isn’t a point. The variable p is a pointer, a coat-check number: somewhere on the heap sits an object, and you’re holding a slip of paper with its address. Every time you want to read a field, the JVM has to “go to the coat check,” following the pointer to the object (a pointer indirection).
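To make the coat-check picture concrete, here is a minimal sketch in today’s Java, with no preview features. The Point class and the ReferenceDemo wrapper are made up for illustration (the text above only names Point); the point is simply that two objects with identical fields are still two different slips of paper.

```java
class ReferenceDemo {
    // Hypothetical identity class for illustration; not taken from the JDK or from JEP 401.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // On ordinary objects, == compares the references (the "slips of paper"), not the fields.
    static boolean sameObject(Point a, Point b) {
        return a == b;
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = new Point(1, 2); // same field values, separate heap object
        Point alias = p;           // copies the reference, not the object

        System.out.println(sameObject(p, q));         // false
        System.out.println(sameObject(p, alias));     // true
        System.out.println(p.x == q.x && p.y == q.y); // true
    }
}
```

Note that assigning alias = p copies only the reference, so both variables lead to the same object on the heap.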

For a single object, that’s nothing. The problem starts at scale. Every object on the heap has its own header: a dozen-or-so bytes of metadata (typically 12 or 16 on 64-bit HotSpot) that tell the JVM, among other things, what type the object is and whether anyone is synchronizing on it. Incidentally, shrinking that header is exactly what Project Lilliput has been working on lately. But header size isn’t everything. Every object has to be allocated and later garbage collected. And since objects are scattered across the heap, an array of a million Points is in practice a million slips of paper pointing at a million boxes strewn across the whole warehouse.

Brian Goetz, in his “State of Valhalla” documents, calls such a memory layout “fluffy”: puffed up, bloated. What we dream of is a dense layout, one where the data lies side by side.
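One way to see the two layouts side by side in today’s Java is to compare an array of Point objects with the hand-rolled workaround of parallel int arrays. This is only a sketch: LayoutDemo, sumXFluffy and sumXDense are hypothetical names, and the parallel arrays illustrate the manual trick people use now, not how the JVM will lay out value objects.

```java
class LayoutDemo {
    // Hypothetical identity class: every instance is a separate heap object with its own header.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // "Fluffy": the array stores references, so each element read follows a pointer.
    static long sumXFluffy(Point[] points) {
        long sum = 0;
        for (Point p : points) {
            sum += p.x;
        }
        return sum;
    }

    // Hand-rolled dense layout: the x coordinates sit side by side in one int[].
    static long sumXDense(int[] xs) {
        long sum = 0;
        for (int x : xs) {
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        Point[] points = new Point[n]; // a million references to a million objects
        int[] xs = new int[n];         // a million ints in one contiguous block
        int[] ys = new int[n];
        for (int i = 0; i < n; i++) {
            points[i] = new Point(i, -i);
            xs[i] = i;
            ys[i] = -i;
        }
        // Same answer either way; what differs is how the data is arranged in memory.
        System.out.println(sumXFluffy(points) == sumXDense(xs)); // true
    }
}
```

The dense version gives up exactly what the slogan wants to keep: there is no Point type any more, no constructor validation, and nothing stops xs and ys from drifting out of sync.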
