To be a better programmer, write little proofs in your head

This is a brief write-up of a trick I learned that helps me write code faster and more accurately. I say "trick", but it's really something I started to do without noticing as I moved further into my career.

When you're working on something difficult, sketch a proof in your head as you go that your code will actually do what you want it to do. A simple idea, but easier said than done: doing this "online" without interrupting your flow takes a lot of practice. But once you get really good at it, you'll find that a surprising amount of the time your code will work on the first or second try. It feels a little magical.

There are lots of ways to pull this off, and I don't want to be too prescriptive. I'll just list a few examples of the kinds of things I find myself reasoning about on the fly, so you get the general idea.

Monotonicity

Something to keep an eye out for when proving things to yourself about your code is which parts are monotonic.

You're probably familiar with monotonic functions from mathematics. Informally, they're functions that don't "go backwards" - i.e. an increasing monotonic function can only increase or stay the same, while a decreasing monotonic function can only decrease or stay the same (these are also known as nondecreasing and nonincreasing functions, respectively.)

The concept of monotonic code is a little more nebulous than the concept of a monotonic function, but it captures the same idea of a process that can only proceed in one direction. Check-pointing, for example, is a great example of monotonicity. If you have (say) a script that needs to perform multiple tasks in sequence, you can keep a bit of state around on disk that details how many tasks you have completed so far. If something goes wrong and your script crashes, it can check the on-disk state to figure out how far it got, then start again from the earliest state that hasn't been run yet.

Checkpointing means that the "current step" pointer in your script can only go forwards, since the script can't regress and re-run a step it's already done. In this sense the script progresses monotonically, and it's apparent that if the script ever completes successfully, it will have run every step exactly once.

Keeping this kind of activity log is a simple idea that often pops up in surprising places, such as journaling file systems and database write-ahead logs. Another, more involved database example is an LSM tree. LSM trees are used in some databases to store rows in-memory and on-disk, and most of the time they are purely additive. Put loosely - an LSM tree keeps a log of all inserts, deletes, and updates, and scans the log to reconstruct the appropriate value of the row when the row is read. Stale operations are periodically discarded to save space in a process called compaction. The space taken by an LSM tree only grows (except during compaction, when it only shrinks.)

... continue reading