Formal specs as sets of behaviors

Amazon’s recent announcement of their spec-driven AI tool, Kiro, inspired me to write a blog post on a completely unrelated topic: formal specifications. In particular, I wanted to write about how a formal specification is different from a traditional program. It took a while for this idea to really click in my own head, and I wanted to motivate some intuition here.

In particular, there have been a number of formal specification tools that have been developed in recent years which use programming-language-like notation, such as FizzBee, P, PlusCal, and Quint. I think these notations are more approachable for programmers than the more set-theoretic notation of TLA+. But I think the existence of programming-language-like formal specification languages makes it even more important to drive home the difference between a program and a formal spec.

The summary of this post is: a program is a list of instructions, a formal specification is a set of behaviors. But that’s not very informative on its own. Let’s get into it.

What kind of software do we want to specify

Generally speaking, we can divide the world of software into two types of programs. There is one type where you give the program a single input, and it produces a single output, and then it stops. The other type is a program is one that runs for an extended period of time and interacts with the world by receiving inputs over time, and generating outputs over time. In a paper published in the mid 1980s, the computer scientists David Harel (developer of statecharts) and Amir Pneuli (the first person to apply temporal logic to software specifications) made a distinction between programs they called transformational (which is like the first kind) and the another which they called reactive.

A compiler is an example of a transformational tool, but you can think of many command-line tools as falling into this category. An example of the second type is the flight control software in an airplane, which runs continuously, taking in inputs and generating outputs over time. In my world, we call services are a great example of reactive systems. They’re long-running programs that receive requests as inputs and generate responses as outputs. The specifications that I’m talking about here apply to the more general reactive case.

A motivating example: a counter

Let’s consider the humble counter as an example of a system whose behavior we want to specify. I’ll describe what operations I want my counter to support using Python syntax:

class Counter: def inc() -> None: ... def get() -> int: ... def reset() -> None: ...

My example will be sequential to keep things simple, but all of the concepts apply to specifying concurrent and distributed systems as well. Note that implementing a distributed counter is a common system design interview problem.

... continue reading