A meta-analysis of three different notions of software complexity
I want to discuss three different notions of software complexity:
Rich Hickey’s notion of complexity, as explained in his talk Simple Made Easy.
John Ousterhout’s notion of complexity, as explained in his book A Philosophy of Software Design.
Zach Tellman’s notion of complexity, as explained in his newsletter Explaining Software Design.
I’ve picked these three because I’ve found them to be at least somewhat coherent, and the first two to be (relatively) well-known; this blog post is not meant to be an exhaustive survey of the definitions of complexity that have been offered over the years.
The definitions summarized
(Formatting note: exact quotes are taken from the above-mentioned work.)
Hickey complexity
Hickey defines something as simple if it has some kind of oneness - he uses the phrases “one fold/braid, one role, one task, one concept, one dimension.” Hickey states, “When you’re looking for something simple, you want to see it have focus in these areas. You don’t want to see it combining things.”
Hickey contrasts ‘easy’ with ‘simple’. For ‘easy’, the central idea is proximity – in terms of access (e.g. something already being installed, or being easy to install), in terms of existing skills and capabilities, and so on. Hickey points out that ‘easy’ is subjective, because it is relative to the person making the judgement.
As one example here, Hickey states that the parentheses in Clojure are hard (i.e. the opposite of easy) but simple, and that it is the users’ responsibility to fix this.
On the other hand, Hickey states that this notion of simplicity is objective: “we can probably go and look and see, I don’t see any connections, I don’t see where this twists with something else.”
Hickey stresses the difference between simplicity and low cardinality (“But not: One instance, one operation.”). He points out that the important thing is “lack of interleaving, not cardinality.”
Hickey sums up his argument against complexity as follows:
“We can only hope to make reliable those things we can understand”
“We can only consider a few things at a time”
“Intertwined things must be considered together”
“Complexity undermines understanding”
According to Hickey, simplicity has the following benefits:
Ease of understanding
Ease of change
Easier debugging
Flexibility
In terms of examples, Hickey has a few slides in his presentation with some tables. The table below is a more condensed representation of Hickey’s tables: I may have misunderstood something; corrections welcome.
| Complexity | Complects | Simpler alternative |
| --- | --- | --- |
| State, objects | Everything that touches it | Values |
| Objects | State, identity, value | Values |
| Methods | Function and state, namespaces | Functions, namespaces |
| Variables | Value, time | Managed refs |
| Inheritance | Types | Polymorphism a la carte |
| Switch/matching | Multiple who/what pairs | Polymorphism a la carte |
| Syntax | Meaning, order | Data |
| Imperative loops, fold | what/how | Set functions |
| Actors | what/who | Queues |
| Variables | Value, time | Values |
| ORM | OMG | Declarative data manipulation |
| Conditionals | Why, rest of program | Rules |
As one example, Hickey points out that “having state in your program is never simple” and “The only time you can really get rid of it, is if you present a functional interface on the outside, a true functional interface, same input, same output.”
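To make this concrete, here is a minimal Python sketch (my example, not Hickey’s) contrasting an interface that leaks state with one that presents a true functional facade:

```python
from functools import lru_cache

# An interface that leaks state: the output depends on call history,
# so everything that calls next_id() is complected with that state.
_count = 0

def next_id() -> int:
    global _count
    _count += 1
    return _count

# A true functional interface: mutable state (the cache) exists inside,
# but it is unobservable from the outside; same input, same output.
@lru_cache(maxsize=None)
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)

assert next_id() != next_id()  # call history leaks out
assert fib(30) == fib(30)      # indistinguishable from a pure function
```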
Hickey contrasts the words “complect” (to intertwine, interleave, braid) and “compose” (to place together).
Ousterhout complexity
According to Ousterhout, “Complexity is anything related to the structure of a software system that makes it hard to understand and modify the system.” (Sec 2.1)
A closely related aspect is obviousness: a system is considered obvious if “a developer can quickly understand how the existing code works and what is required to make a change”, if they “can make a quick guess about what to do, without thinking very hard, and yet be confident that the guess is correct” (Sec 2.2), and if “their first guesses about the behavior or meaning of the code will be correct” (Ch. 18). Ousterhout points out that obviousness is one of the most important goals for system design, and that obviousness is the opposite of high cognitive load and unknown unknowns (see definitions below). In Chapter 18, Ousterhout emphasizes that obviousness is “in the mind of the reader” and that “If someone reading your code says it’s not obvious, then it’s not obvious, no matter how clear it may seem to you.”
“Complexity is caused by two things: dependencies and obscurity” (Sec 2.3) which are defined as:
“A dependency exists when a given piece of code cannot be understood and modified in isolation; the code relates in some way to other code, and the other code must be considered and/or modified if the given code is changed”. According to Ousterhout, dependencies are a “fundamental part of software and can’t be completely eliminated” but “one of the goals of software design is to reduce the number of dependencies and to make the dependencies that remain as simple and obvious as possible.” Ousterhout gives an example of a web site design, where specifying the background color for banners on individual pages creates an implicit dependency across pages. In that situation, factoring out the background color into a central location (e.g. by referencing a CSS variable) amounts to replacing a “nonobvious and difficult-to-manage dependency with a simpler and more obvious one.”
“Obscurity occurs when important information is not obvious.” Ousterhout identifies the following as being associated with obscurity:
Generic variable names which don’t carry much useful information
Dependencies whose existence is not obvious (Ousterhout gives the example of a new error status being added, which may require modifying a table of strings for messages)
Inconsistency (e.g. reusing the same variable name for different purposes; see the second sketch after this list)
“Inadequate documentation” (although Ousterhout adds a cautionary note: “If a system has a clean and obvious design, then it will need less documentation. The need for extensive documentation is often a red flag that the design isn’t quite right”)
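Ousterhout’s banner-color example above is in terms of CSS, but the same before/after shape shows up in any language; here is a rough Python analogue (the page functions and names are mine, purely for illustration):

```python
# Before: each page repeats the color literal, creating an implicit
# dependency between pages; changing the banner means finding every copy.
def home_banner() -> str:
    return '<div style="background: #2a9d8f">Home</div>'

def about_banner() -> str:
    return '<div style="background: #2a9d8f">About</div>'

# After: the dependency still exists, but it is now explicit, obvious,
# and managed in a single place.
BANNER_COLOR = "#2a9d8f"

def banner(title: str) -> str:
    return f'<div style="background: {BANNER_COLOR}">{title}</div>'
```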
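And as a small (made-up) illustration of the inconsistency point, consider a variable name reused for different purposes:

```python
# Obscure: "result" means three different things within one function,
# so the reader must track which meaning is in play at each line.
def summarize(orders: list[dict]) -> str:
    result = [o for o in orders if o["paid"]]   # a list of paid orders
    result = sum(o["total"] for o in result)    # now a revenue number
    result = f"${result:.2f}"                   # now a display string
    return result
```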
Ousterhout states that complexity has three different manifestations or symptoms:
“Change amplification: The first symptom of complexity is that a seemingly simple change requires code modifications in many different places.” This is where Ousterhout introduces the web site example, where the original version of the web site suffers from change amplification when the banner’s background color needs to be changed.
“Cognitive load [..] which refers to how much a developer needs to know in order to complete a task. A higher cognitive load means that developers have to spend more time learning the required information, and there is a greater risk of bugs because they have missed something important”
“Unknown unknowns: it is not obvious which pieces of code must be modified to complete a task, or what information a developer must have to carry out the task successfully”
Commentary:
The somewhat circular nature and similarity of the definitions make it hard to understand how the various terms differ. In particular:
Non-obviousness is taken to comprise two aspects – high cognitive load and unknown unknowns.
When discussing the two causes of complexity: the discussion of dependencies itself reuses the terms “simple” (opp. complex) and “obvious”, while the definition of obscurity uses “not obvious”.
Of the three manifestations of complexity: the definition of change amplification uses the phrase “seemingly simple”, and the definition of unknown unknowns refers to non-obviousness.
In visual form:
One specific oddity: it’s weird for one of the purported causes of complexity to be defined in terms of one of its symptoms. By analogy, it would be strange for a potential cause of a disease to be defined in terms of one of that same disease’s symptoms.
Of course, this doesn’t mean that Ousterhout’s points are wrong or meaningless. If you’d said the same things to me in conversation, I could probably understand what you were getting at. But as definitions, or as a framework, I think these points could be better organized to make operationalizing them easier.
Tellman complexity
Tellman offers the following definition of complexity: “The sum of every explanation. Weighted heavily towards future explanations. Measured in bits, but only relative to your audience’s expectations.”
Here, ‘explanation’ is defined as: “The core task of software development. When we try to understand software, we explain it to ourselves. When we change software, we explain it to others.”
So the word ‘explanation’ is used in both a concrete and somewhat more abstract sense – the concrete sense applies to things like pull request descriptions, code comments, and commit messages, whereas the more abstract sense applies to actions such as reading and debugging.
Tellman also uses a distinct term ‘surprisal’ across several posts, such as this phrasing in the newsletter’s original post: “the information conveyed by a message, its surprisal, depends on its audience. For something to be surprising, it must be unexpected. And so, simplicity is not intrinsic; it does not arise from our code’s size or syntax. Simplicity is a fitness between software and our expectations.”
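‘Surprisal’ is a standard term from information theory (this formula is the textbook definition, not one Tellman gives): the surprisal of a message x is

$$I(x) = -\log_2 p(x)$$

where p(x) is the probability the audience assigns to the message. A message the audience fully expects carries close to zero bits, while an unexpected one carries many, which is presumably what “measured in bits, but only relative to your audience’s expectations” is getting at.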
Tellman also introduces several other concepts in his newsletter, such as the three different parts of an explanation:
The prefix is what your audience already knows, and is unstated.
The content is the code, diagrams, etc. that comprise your explanation in the now.
The suffix is what you expect you’ll explain in the future.
Tellman defines coupling as “the degree to which two things tend to be explained together”, and notes that coupling has both costs and benefits – the cost being that the coupled concepts need to be explained together, and the benefit being that each concept offers some explanatory power for the other.
Commentary:
I reached out to Tellman before writing this post for some clarifications. According to him, his definition of complexity largely coincides with Ousterhout’s. I technically agree with this point, but I believe that there is a noticeable difference in the overall notion if you consider the entirety of the published work. I cover this in more detail in the next section.
Comparing the three different notions of complexity
Subjectivity
The first thing that sticks out is that Hickey’s definition of ‘complex’ is put forward as objective, whereas Ousterhout’s and Tellman’s definitions both admit more subjectivity.
To me, this makes Ousterhout’s and Tellman’s definitions more in line with colloquial usage, where we expect developers being on-boarded to find things “more complex”, and expect this perception to fade as they gain familiarity.
Although Ousterhout does mention subjectivity in a few parts of the book (e.g. in the discussion of code reviews), I’d argue that multiple parts of the book run counter to this point – in particular, when criticizing (or praising) existing APIs, Ousterhout spends little-to-no time seriously discussing alternate viewpoints.
For example, when praising the Unix I/O APIs, there is no discussion of the flaws in the APIs, or the complexities imposed by the API on applications, such as when debugging issues.
So the message comes across as a bit muddled.
Tellman’s writing doesn’t have a lot of concrete code examples, but given the overall definitions, I’d read Tellman’s notion of complexity as placing the greatest weight on considering different interpretations of the same artifacts.
Hickey does define “easy” as being subjective, but according to him, attempting to maximize ease is a non-goal, and sometimes even counter-productive – the primary goal should be to minimize the objective notion of complexity. Consistent with that, the features he lists as being “simple” (opp. complex) are notably all features supported by Clojure. He even goes so far as to say that it is the users’ responsibility to fix issues stemming from lack of familiarity. For alternate approaches, see Pyret and Rhombus, which have come out of the Racket community.
Different perspectives on coupling
Hickey considers “composition” good, but “complecting” (unconditionally) bad. However, his framing does not flesh out the distinction with enough substance to apply it to examples beyond the ones he provides in his presentation.
For example, consider foreign keys in a relational database. In a sense, using foreign keys across tables “complects” the tables – suddenly, code performing write operations on one table potentially needs to worry about another table at the same time. Does that mean that using foreign key relationships is a bad idea per Hickey’s definition?
If we apply the same example to Ousterhout’s notion of complexity, a foreign key relationship enforced by the database would be considered a dependency (which generally should be avoided). However, Ousterhout points out that “obvious” dependencies are better than “non-obvious” ones, so per that, an explicit foreign key relationship would be better than an implicit one (if those are the intended semantics).
Tellman, on the other hand, doesn’t really ascribe any intrinsic value judgement to coupling. Rather, he treats it as another tool in one’s toolbox. According to Tellman’s definition of coupling as co-explanation, the foreign key relationship should be added if the two tables often need to be explained together (e.g. if the two tables are meant to be joined together, or data in one needs to be deleted when a row is deleted in the other table).
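To make the foreign-key example concrete, here is a minimal sketch (the schema and names are mine, using SQLite for brevity):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only on request

conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id) ON DELETE CASCADE,
        title     TEXT
    )
""")

conn.execute("INSERT INTO authors VALUES (1, 'Ada')")
conn.execute("INSERT INTO posts VALUES (1, 1, 'Hello')")

# The coupling is explicit and enforced: deleting an author also deletes
# their posts, precisely because the two tables are routinely explained
# (and modified) together.
conn.execute("DELETE FROM authors WHERE id = 1")
assert conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0] == 0
```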
As another example, consider distributed tracing. Distributed tracing introduces coupling across different parts of a distributed system, because you need to propagate trace IDs, collect spans somewhere, be able to query the spans in a flexible way, etc.
Let’s say you have a system with structured logging, and the question is under what circumstances does it make sense to also have distributed tracing.
By its very nature, distributed tracing (and generally any kind of observability tooling) involves a kind of “intertwining” of application code with instrumentation code. You can make this explicit, such as by passing a “tracer” as a parameter to all the functions which are to be traced, or you can make it implicit by having tooling do that for you (e.g. via thread-local variables, code rewriting tools, etc.). But it’s still there. So per Hickey, should we always avoid adding tracing if we can get away with it?
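A minimal sketch of the explicit and implicit styles (the handler names are hypothetical; a real system would likely use something like OpenTelemetry):

```python
import contextvars
import uuid

# Explicit: the trace ID is threaded through every signature, so the
# intertwining with instrumentation is visible at each call site.
def handle_request(payload: dict, trace_id: str) -> None:
    validate(payload, trace_id)

def validate(payload: dict, trace_id: str) -> None:
    print(f"[{trace_id}] validating {payload}")

# Implicit: a context variable carries the trace ID instead, so the
# signatures stay clean; the intertwining is still there, just hidden.
_trace_id = contextvars.ContextVar("trace_id", default="no-trace")

def handle_request_implicit(payload: dict) -> None:
    _trace_id.set(str(uuid.uuid4()))
    validate_implicit(payload)

def validate_implicit(payload: dict) -> None:
    print(f"[{_trace_id.get()}] validating {payload}")

handle_request({"user": 1}, trace_id=str(uuid.uuid4()))
handle_request_implicit({"user": 1})
```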
A Philosophy of Software Design doesn’t talk about debugging much at all. If we extrapolate the ideas on reducing dependencies and preferring simplicity of implementation, then it seems that, according to Ousterhout, we should avoid distributed tracing if we can get away without it. OTOH, the original request is in some sense a “dependency” for all subsequent work it triggers, so perhaps it’s worth making it explicit (e.g. by using distributed tracing with a unique trace ID per request)?
In contrast, Tellman’s definition of coupling offers a clear answer in the form of a question: how often are you finding yourself trying to explain disparate parts of the system – potentially running on different machines – at the same time? If you’re doing that often (e.g. as part of debugging issues), then it makes sense to couple the two using the available mechanisms such as distributed tracing.
Competence floors
Let’s say you have a team of 4-5 software engineers at work, with tenures ranging from 1 to 7 years, and things are going alright in terms of shipping velocity.
Say there’s a new engineer who has been hired recently, and has been assigned tasks of varying levels of difficulty based on the level they were hired at, in one subsystem that is under active development/maintenance.
The new programmer has not been able to accomplish many of the tasks assigned to them. It is near the end of the probationary period, and a decision needs to be made about whether to keep this person on the team or let them go.
Say you are the engineering manager. You ask the new person why they believe they haven’t been able to accomplish many of their assigned tasks. The new engineer cites the difficulty of understanding and modifying the existing code as the primary barrier. They state that they would have been able to succeed if:
They had received more thorough task descriptions.
They had received more detailed answers to questions posted in the team’s Slack channel.
The code had been written in a simpler way.
So in a sense, their response is tied to a notion of software complexity.
How should you, as the engineering manager, factor this perspective into your final decision?
Hickey’s notion of complexity as an objective thing doesn’t really offer much here. According to Hickey, either the subsystem the person was assigned to work on is highly “complected” or not, and that should be unambiguous.
If the subsystem is highly “complected”, then according to Hickey, you should go and simplify it, regardless of whether you decide to keep this person. But given that other people have been able to work on the same subsystems just fine, does it make business sense to go and simplify the subsystem?
If the subsystem is not highly “complected”, then you have a mismatch between perceptions and expectations. Hickey’s notion of ease captures the subjective experience of unfamiliarity, such as during on-boarding. But unfamiliarity can be fixed over time. So should you chalk the struggles up to unfamiliarity (and hence not give them too much weight)?
On the other hand, I think it’s fair to argue that Ousterhout’s description of complexity offers a clearer answer. If the code is hard to understand and modify, that means the code is complex, and that is not good. And the main goal of software design is to reduce complexity. So this person is a blessing in disguise, because they have surfaced this complexity to you, and the onus is on you to go and try to reduce it.
Finally, coming to Tellman’s ideas about complexity being a measure of future explanations, I think it’s reasonable to interpret them as offering you two main choices:
You retain the new person and try to make them succeed. In that case:
Other engineers will likely have to write more thorough task descriptions, write longer comments in Slack explaining more details/concepts, potentially delegate smaller tasks, etc.
The code may need additional docs for things which are obvious to the existing programmers but not to the new programmer (i.e. the unstated ‘prefix’ needs to become part of the on-going explanations).
You let the new person go. In that case:
Existing engineers can likely move faster, as they need to do less hand-holding for the new colleague.
The code will not need additional docs that are useful only to the new colleague, but not to other people.
You should probably reflect on your interview process to understand what you could’ve done to better judge the skills necessary to succeed at the job.
In essence, this difficult situation is forcing you to better articulate a competence floor – yes, our system has this minimum baseline of complexity because of XYZ reasons, and ABC details need to be understood by people who join the team, because otherwise, the sum of future explanations that the team needs to make to on-board this person is beyond our “explanation budget.” (Yes, I just made up that term.)
Closing thoughts
In my mind, the goodness of a concept’s definition depends on three things:
To what extent does the definition line up with the concept’s instantiation in reality.
To what extent can different people agree on whether something falls under that definition or not.
To what extent can the definition be tested against examples and situations which were not articulated by the definition’s author.
I think Tellman’s notion of complexity in terms of explanations is probably one of the best definitions of software complexity we have today.
It captures the subjective experience of attempting to write software, both by oneself and with other people.
It is precise, without being over-prescriptive. The related definitions for various terms – coupling, surprisal, prefix and suffix – crisply capture closely related ideas.
When applied to situations outside of those discussed in the newsletter, not only does it provide a concrete way of thinking, it also illuminates a garden of forking paths, which is more representative of how real-world decision-making works.
Having a good shared definition for software complexity is essential if we as an industry are to move beyond the simplistic “complexity bad, simplicity good” meme or the slightly more refined “accidental complexity bad, essential complexity inevitable” meme. See also: Dan Luu’s Against essential and accidental complexity and Lorin Hochstein’s more provocatively titled Dijkstra never took a biology course.
Next time you’re having a discussion with a colleague about complexity, I encourage you to frame the discussion in terms of what future explanations you expect to provide, and I suspect you’ll likely end up with a better conversation than the ones you’ve been having so far. ✌🏽