Ever since 2010, I have studied the “meta” of software: the continued dialogue between programming language designers, computer designers, and programmers.
The following constitutes a snapshot of my current thinking.
Epistemological context

During the period 2008-2012, I was asked to help design and build programming tools for a proposed new microprocessor architecture. The details of said architecture do not matter here; what is interesting is that the folk in that research group had the following idea: their architecture had many design knobs, and they did not know which position to choose in the design spectrum. So instead, they decided to abstract over all the possible design points and spell out a model that describes them all. Then they asked me to make programming tools that would work over all the possible-but-not-yet-existing platforms described by that model.

Of note, I did not know that this was what was being asked of me at the beginning; I only figured it out towards the end. The more remarkable aspect is that these researchers did not understand that this was what they were doing either. Instead, what they said they were doing was:

- “we’re trying to invent a microprocessor.”
- “we specified a virtual machine for it and we implemented an emulator.”
- “it’s not finished.” (What they meant, but did not understand, is that too many design points remained open; plus, the emulator was always changing and incomplete.)
- “please make a C compiler happen.”

What wasn’t clear to me at the time, nor to them, is that the particular design choices that go into a hardware microprocessor have heavy implications for the design and implementation of programming tools. In particular, we found that when the hardware changes too much, it becomes hard to design efficient algorithms using existing languages, even if a compiler/toolchain exists.

So the particular project above wasn’t exactly successful (there was no hardware, and too many knobs were left unspecified). I explained as much in the preface to my doctoral thesis, later also ranted about the arrogance of such a scientific premise in the context of software history, and then that project stopped. But the question behind this “software meta” remained open: why are programming languages so intimately linked to hardware architectures? I continued studying this question throughout the period 2012-2018, and it remains a hobby to this day.

❦❦❦

As I started my own “research program” on this topic, I spent time scrutinizing the vocabulary in use in the field and in the academic community. It turns out that computer architecture practitioners really like working with models, because models are cheaper to work with than real hardware. After all, there is never enough money in academia. But then, they also ask computer scientists and programmers to work with them, and everyone ends up working in model-land, with emulators and simulators.

To simplify and satirize the situation, it is as if millions of euros were spent finding solutions to important social and mechanical problems by analyzing and programming the PICO-8 fantasy console. Everyone says “my architecture, and my software for it, do things” and nods politely to each other, without ever acknowledging that there is no way for anyone to hold an artifact in their hands that works like what their model predicts.
By far the most insidious intellectual fallacy I found in that environment is that enough scientists confuse specification and description. They design a model to describe something and make predictions about it (that’s fine), then they change their model and say “we can build a physical system that follows this new model.” The latter is scientific nonsense. Changing a descriptive model is not an act of design. It does not “create” anything in the real world. If one is lucky, the new model happens to describe something else that already exists. If one is unlucky, the new model describes nothing that exists; that new model is useless and pointless.

That was my first self-guided foray into computing epistemology: modeling and specification (& system design) are two fundamentally different intellectual activities. I held my first public talk about this in 2013.
Boundary of functional semantics: syntactic variance

One of the major obstacles I encountered on my way to the above insight was the existence of Haskell (the programming language), and a community of peers who were strong Haskell advocates and practitioners. Haskell presented an obstacle because Haskell has denotational semantics: a machine-independent model of what Haskell programs “do” when they run. It is machine-independent because it does not require the decomposition of a computation into hardware steps to predict the result.

At face value, back then, I thought that Haskell could be used to specify programs and their behavior in the abstract, but also simultaneously their behavior in the real world. It felt, to me, as if Haskell’s descriptive abstract model somehow had “specification power” over the physical world. As long as I was stuck there, I was not able to see the difference between description and specification.

My breakthrough happened when I saw these three specifications of an integer sort function:

```haskell
f :: [Int] -> [Int]
f = reverse . map sum . transpose . transpose . map (flip replicate 1)
```

```haskell
f [] = []
f (p:xs) = f [y | y <- xs, y < p] ++ [p] ++ f [y | y <- xs, y >= p]
```

```haskell
f [] = []
f (h:tl) = m : f r
  where (m, r) = foldl g (h, []) tl
        g (s, acc) x | x < s     = (x, s:acc)
                     | otherwise = (s, x:acc)
```

(The first one is an implementation of bead sort, the second one a quicksort, and the third one a selection sort.)

In Haskell’s denotational semantics, these three implementations are functionally equivalent: we could swap one for another in any program without change in semantics. And then, because Haskell’s denotational semantics only provides knowledge about functional semantics, this means that there is no difference, under that model, between these three implementations.

And yet, these three implementations exist, and they differ from each other. Why would a programmer ever want to choose one over another?
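The fragments above are not complete programs; here is a minimal runnable sketch of the equivalence claim. The names (`beadSort`, `quickSort`, `selectSort`) are mine, I make the quicksort base case explicit, and I reverse the bead-sort result so that all three agree on ascending order:

```haskell
import Data.List (transpose)

-- Bead sort, reversed to yield ascending order; only valid for
-- non-negative integers (replicate n 1 is empty when n < 0).
beadSort :: [Int] -> [Int]
beadSort = reverse . map sum . transpose . transpose . map (flip replicate 1)

-- Quicksort, with the base case made explicit.
quickSort :: [Int] -> [Int]
quickSort [] = []
quickSort (p:xs) = quickSort [y | y <- xs, y < p]
                ++ [p]
                ++ quickSort [y | y <- xs, y >= p]

-- Selection sort: one fold extracts the minimum, then recurse.
selectSort :: [Int] -> [Int]
selectSort [] = []
selectSort (h:tl) = m : selectSort r
  where (m, r) = foldl g (h, []) tl
        g (s, acc) x | x < s     = (x, s:acc)
                     | otherwise = (s, x:acc)

main :: IO ()
main = do
  let xs = [3, 1, 4, 1, 5, 9, 2, 6]
  print (beadSort xs)                  -- [1,1,2,3,4,5,6,9]
  print (quickSort xs == beadSort xs)  -- True
  print (selectSort xs == beadSort xs) -- True
```

Checking equality on sample inputs is of course no proof of equivalence; the point is only that the denotational model sees one function here where the programmer sees three programs.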
There was clearly something that mattered in the expression of each of these three functions, but that the denotational semantics model was not able to represent.

❦❦❦

That “special something”, of course, had to do with run-time performance. In casual conversations, and in some less-than-stellar writeups by fellow practitioners, I often heard the following theory: since the model predicts these functions are equivalent, “it should be possible” to make an optimizing compiler which, given one of them, automatically derives the others, so as to choose the best one for a given target hardware platform. The suggestion made to me was that to build software in general, it should be good enough to invent one program that works in an abstract model, and then to build “sufficiently clever” compilers that take care of translating that program optimally to any target hardware without additional programmer input.

And so I investigated this. Could it be true? Maybe the differences above were just inconsequential noise?

Alas for my fellow practitioners, I was vindicated by the discovery of the following article, which cemented my distrust of descriptive semantic models forever:

Jeroen Voeten, On the fundamental limitations of transformational design, ACM Transactions on Design Automation of Electronic Systems, Volume 6, Issue 4, October 2001, pp 533–552, DOI 10.1145/502175.502181.

In short, Jeroen Voeten demonstrated mathematically that if a specification system is sufficiently general (like Haskell’s semantics), the automatic derivation of all functionally equivalent specifications from a given starting point is undecidable. So much for universally general optimizing compilers.

In other words, the choice made by programmers of one implementation over another, when they are functionally equivalent, matters somehow in a way that cannot be described in the model of their functional semantics. In hindsight, I recognize that this sounds obvious, almost tautological. Yet, virtually none of my peers at the time believed me at first, and they were annoyed that my statements could put their sources of funding at risk.
Abstract Machine Models

Introduction

From there, I focused on the following question: “what’s in the mind of programmers when they choose one way of doing things over another that is functionally equivalent?”

One thing that was clear from the start is that most programmers “simulate” the behavior of their program in their mind, to predict how the program will behave at run-time. As we’ve determined above, that simulation does not happen in the functional model of the programming language. Meanwhile, I knew from my teaching practice that nobody really understands hardware computers, and so this mental simulation was also not happening with a model of a hardware platform. In fact, I found that folk would rather not think about hardware at all, and thankfully so: this is what made it possible, over and over, to port software from one hardware platform to another without rewriting the software.

This meant that all programmers are able to construct a somewhat abstract model of their computer in their mind, but not so abstract that it becomes purely functional. That is when I coined the phrase abstract machine model (AMM), and it became the anchor of my subsequent study.

I then made a prediction of what AMMs would and would not include. AMMs extend functional models with models/intuition of extra-functional behavior, including:

- Time to result.
- Memory usage.
- Available I/O primitives.
- Interfaces with debuggers and tracing facilities.
- Throughput/latency of operations.
- Jitter of operations.
- Energy expenditure.
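The second item on this list, memory usage, has a classic Haskell illustration: `foldl` and `foldl'` denote the same function, yet any working Haskell programmer's AMM distinguishes them sharply. A minimal sketch (the names `sumLazy` and `sumStrict` are mine):

```haskell
import Data.List (foldl')

-- Two denotationally equal sums with different memory behavior.
sumLazy, sumStrict :: [Int] -> Int
sumLazy   = foldl  (+) 0   -- lazy fold: builds an O(n) chain of thunks
sumStrict = foldl' (+) 0   -- strict fold: constant-space accumulator

main :: IO ()
main =
  -- Functionally indistinguishable: the same result for every input...
  print (sumLazy [1..100000] == sumStrict [1..100000])
  -- ...but sumLazy allocates a 100000-deep thunk chain before forcing
  -- it, an extra-functional difference the denotational model cannot
  -- express, and exactly the kind of fact an AMM carries.
```

A programmer choosing `foldl'` over `foldl` is reasoning inside an AMM: the choice changes nothing in the functional model and everything in the memory-usage prediction.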