JSON Schema Demystified: Dialects, Vocabularies and Metaschemas

If you’ve ever tried to dive into JSON Schema, you’ve probably encountered a wall of terminology that makes your head spin: schemas, metaschemas, dialects, vocabularies, keywords, anchors, dynamic references. It feels like the community invented new words for things that already had perfectly good names, just to make the rest of us feel inadequate.

I’ve been working on a Haskell JSON Schema library that’s actually fully spec-compliant, which meant I had to figure all of this out. The problem isn’t that the concepts are inherently difficult; it’s that the terminology creates artificial barriers to understanding.

This post will break down the key concepts in JSON Schema in a way that actually makes sense, connecting the dots between all these terms that seem designed to confuse. By the end, you’ll understand not just what these words mean, but how they fit together into a coherent system.

Starting simple

Before we dive into terminology, let’s look at what we’re actually trying to accomplish. JSON Schema is fundamentally about describing the shape and constraints of JSON data. Here’s a simple example:

{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "number", "minimum": 0 }
  },
  "required": ["name"]
}

This schema says: “I expect a JSON object with a string name field (required) and an optional numeric age field that must be non-negative.” Simple enough, right?
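To make that reading concrete, here is a minimal sketch in Python of what checking data against this schema could look like. It is a toy, not a real validator: the validate function and the TYPES table are names I made up for illustration, and the code handles only the four keywords used in the example (type, properties, required, minimum).

```python
import json

# The example schema, parsed from its JSON text.
SCHEMA = json.loads("""
{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "number", "minimum": 0 }
  },
  "required": ["name"]
}
""")

# JSON Schema type names mapped to Python types (only the ones used here).
TYPES = {"object": dict, "string": str, "number": (int, float)}


def is_number(value):
    # bool is a subclass of int in Python, but true/false are not JSON numbers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate(schema, instance):
    """Return a list of error messages; an empty list means the instance is valid.

    Hypothetical toy validator: supports type, properties, required, minimum.
    """
    expected = schema.get("type")
    if expected is not None:
        ok = is_number(instance) if expected == "number" else isinstance(instance, TYPES[expected])
        if not ok:
            return [f"expected {expected}"]

    errors = []
    if isinstance(instance, dict):
        # "required" lists property names that must be present.
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"missing required property '{key}'")
        # "properties" constrains a property only when it is present.
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors += [f"{key}: {e}" for e in validate(subschema, instance[key])]
    # "minimum" applies only to numbers.
    if "minimum" in schema and is_number(instance) and instance < schema["minimum"]:
        errors.append(f"must be >= {schema['minimum']}")
    return errors


print(validate(SCHEMA, {"name": "Ada", "age": 36}))
print(validate(SCHEMA, {"age": -1}))
```

The first call reports no errors; the second reports both the missing name and the negative age. Note that age is only checked when it is present, which is exactly what “optional” means here.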

Now here’s where it gets interesting: this schema is itself valid JSON. And since schemas describe the structure of JSON documents, we can describe the structure of schemas using more schemas. This recursive property is what gives rise to metaschemas, and it’s where the terminology starts to get confusing.
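Here is a small Python sketch of that idea. Everything in it is an assumption made for illustration: TOY_METASCHEMA is a deliberately tiny schema-for-schemas I wrote for this example (it is not the official metaschema), and conforms is a hypothetical checker that understands only type and properties. It also does not descend into nested subschemas, which a real metaschema would.

```python
import json

# The example schema is just JSON text, so it parses like any other document.
schema = json.loads("""
{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "number", "minimum": 0 }
  },
  "required": ["name"]
}
""")

# A toy "schema that describes schemas": it says what shape each keyword's
# value must have. Written in the same vocabulary as the schema it describes.
TOY_METASCHEMA = {
    "type": "object",
    "properties": {
        "type": {"type": "string"},
        "properties": {"type": "object"},
        "required": {"type": "array"},
        "minimum": {"type": "number"},
    },
}

TYPES = {"object": dict, "array": list, "string": str, "number": (int, float)}


def conforms(schema, instance):
    """Toy check supporting only the `type` and `properties` keywords."""
    expected = schema.get("type")
    if expected is not None:
        # Booleans match none of the four types above (bool is an int subclass
        # in Python, so it has to be excluded explicitly).
        if isinstance(instance, bool) or not isinstance(instance, TYPES[expected]):
            return False
    if isinstance(instance, dict):
        for key, subschema in schema.get("properties", {}).items():
            if key in instance and not conforms(subschema, instance[key]):
                return False
    return True


print(conforms(TOY_METASCHEMA, schema))          # the example schema is well-formed
print(conforms(TOY_METASCHEMA, {"type": 42}))    # "type" must be a string here
print(conforms(TOY_METASCHEMA, TOY_METASCHEMA))  # the toy metaschema is itself a schema
```

The last line is the recursion in miniature: the document that describes schemas is written as a schema, so it can be checked by the same machinery.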

What’s a schema anyway?

A schema is just a JSON document that describes constraints on other JSON documents. That’s it. The example above is a schema.
