The yaml document from hell
written by Ruud van Asseldonk
published 11 January 2023
For a data format, yaml is extremely complicated. It aims to be a human-friendly format, but in striving for that it introduces so much complexity that I would argue it achieves the opposite result. Yaml is full of footguns and its friendliness is deceptive. In this post I want to demonstrate this through an example.
This post is a rant, and more opinionated than my usual writing.
Yaml is really, really complex
Json is simple. The entire json spec consists of six railroad diagrams. It’s a simple data format with a simple syntax and that’s all there is to it. Yaml, on the other hand, is complex. So complex that its specification consists of 10 chapters with sections numbered four levels deep and a dedicated errata page.
The json spec is not versioned. There were two changes to it in 2005 (the removal of comments, and the addition of scientific notation for numbers), but it has been frozen since, for almost two decades now. The yaml spec, on the other hand, is versioned. The latest revision is fairly recent: 1.2.2, from October 2021. Yaml 1.2 differs substantially from 1.1: the same document can parse differently under different yaml versions. We will see multiple examples of this later.
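To make the version difference concrete, here is a minimal sketch of one place where the two versions disagree: which unquoted scalars count as booleans. It is not a yaml parser. The helper `resolve_plain_scalar` is a hypothetical name introduced only for illustration, and the two patterns are my transcription of the yaml 1.1 bool type and the yaml 1.2 core schema. Real libraries do not always follow either list exactly.

```python
import re

# Spellings resolved to booleans by the yaml 1.1 bool type.
YAML_1_1_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)
# Spellings resolved to booleans by the yaml 1.2 core schema.
YAML_1_2_BOOL = re.compile(r"true|True|TRUE|false|False|FALSE")

TRUTHY = {"y", "yes", "true", "on"}


def resolve_plain_scalar(text, version):
    """Hypothetical helper: return a bool if `text` is a boolean under the
    given yaml version ("1.1" or "1.2"), otherwise return it as a string."""
    pattern = YAML_1_1_BOOL if version == "1.1" else YAML_1_2_BOOL
    if pattern.fullmatch(text):
        return text.lower() in TRUTHY
    return text


# Print how each scalar resolves under 1.1 and under 1.2.
for scalar in ["no", "on", "true", "hello"]:
    print(scalar, resolve_plain_scalar(scalar, "1.1"), resolve_plain_scalar(scalar, "1.2"))
```

Under these assumptions, `no` and `on` resolve to booleans in 1.1 but stay strings in 1.2, while `true` is a boolean in both.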
Json is so obvious that Douglas Crockford claims to have discovered it, not invented it. I couldn’t find any reference for how long it took him to write up the spec, but it was probably hours rather than weeks. The change from yaml 1.2.1 to 1.2.2, on the other hand, was a multi-year effort by a team of experts:
“This revision is the result of years of work by the new YAML language development team. Each person on this team has a deep knowledge of the language and has written and maintains important open source YAML frameworks and tools.”
Furthermore, this team plans to actively evolve yaml rather than freeze it.