Ohm's PEG-to-WASM Compiler

About Ohm: Ohm is a user-friendly parsing toolkit for JavaScript and TypeScript. You can use it to parse custom file formats or quickly build parsers, interpreters, and compilers for programming languages.

A few weeks ago, we announced the Ohm v18 beta, which involved a complete rewrite of the core parsing engine. Since then, we've implemented even more performance improvements: v18 is now more than 50x faster for real-world grammars while using about 10% of the memory.

The new parsing engine works by compiling an Ohm grammar (a form of parsing expression grammar, or PEG) into a WebAssembly module that implements a parser. In this post, we'll dive into the technical details of how that works and talk about some of the optimizations that made it even faster.

In previous versions of Ohm (up to and including v17), the parsing engine used an approach called AST interpretation. Here's how that works.

When you instantiate a grammar with Ohm, it parses your grammar and converts it to an abstract syntax tree. You can think of this tree as a kind of program, which describes a parser for the language. The nodes of the tree are parsing expressions, or PExprs as they're called in the source code.
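To make the idea of "a tree that is a kind of program" concrete, here is a minimal sketch of an AST interpreter for a PEG. It is not Ohm's actual implementation: the `PExpr` node shapes, `evalExpr`, `matches`, and the toy rules are all hypothetical names chosen for illustration, and the sketch leaves out things a real engine needs, such as memoization, implicit whitespace skipping, and building a parse tree.

```typescript
// Hypothetical node types for illustration; Ohm's real PExpr classes differ.
type PExpr =
  | { kind: "terminal"; value: string }
  | { kind: "seq"; factors: PExpr[] }
  | { kind: "alt"; terms: PExpr[] }
  | { kind: "apply"; ruleName: string };

type Rules = Record<string, PExpr>;

// Walks the tree at parse time. Returns the input position after a
// successful match, or -1 if the expression fails at `pos`.
function evalExpr(expr: PExpr, rules: Rules, input: string, pos: number): number {
  switch (expr.kind) {
    case "terminal":
      return input.startsWith(expr.value, pos) ? pos + expr.value.length : -1;
    case "seq": {
      // Every factor must match, each starting where the previous one ended.
      let cur = pos;
      for (const factor of expr.factors) {
        cur = evalExpr(factor, rules, input, cur);
        if (cur < 0) return -1;
      }
      return cur;
    }
    case "alt":
      // PEG ordered choice: the first alternative that matches wins.
      for (const term of expr.terms) {
        const end = evalExpr(term, rules, input, pos);
        if (end >= 0) return end;
      }
      return -1;
    case "apply": {
      // Rule application: look up the rule body and evaluate it.
      const body = rules[expr.ruleName];
      if (body === undefined) throw new Error(`Undefined rule: ${expr.ruleName}`);
      return evalExpr(body, rules, input, pos);
    }
  }
}

// A match succeeds only if the start rule consumes the entire input.
function matches(rules: Rules, startRule: string, input: string): boolean {
  const start: PExpr = { kind: "apply", ruleName: startRule };
  return evalExpr(start, rules, input, 0) === input.length;
}

// Toy rules (an assumption, not taken from the post):
//   bool = "true" | "false"
//   pair = bool "," bool
const toyRules: Rules = {
  bool: {
    kind: "alt",
    terms: [
      { kind: "terminal", value: "true" },
      { kind: "terminal", value: "false" },
    ],
  },
  pair: {
    kind: "seq",
    factors: [
      { kind: "apply", ruleName: "bool" },
      { kind: "terminal", value: "," },
      { kind: "apply", ruleName: "bool" },
    ],
  },
};
```

With these definitions, `matches(toyRules, "pair", "true,false")` walks the tree once per parse, dispatching on the node kind at every step. That per-node dispatch is the overhead an interpreter pays each time it runs.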

We'll use the following grammar as an example:

JSONLike {
  Value = Object
        | "true"
        | "false"

...