
Programming Affordances That Invite Mistakes


Many of my philosophies in my work life, my volunteer life, and my personal life stem from experience. As a developer, many of those philosophies come from being burnt by rough edges and mistakes. Just as health & safety principles come from accidents, my development practices come from bugs, errors, and mistakes.

With that in mind, here’s a war story from my days running an R&D startup, about the time we lost all the data we thought we had gathered from a psychology study.

I founded an R&D startup that worked closely with psychology researchers. We would build tools for gathering data from psychology exercises, which the researchers would use in their studies or assessments.

One of these tests was the Stroop test: users would be shown four words in different colours, with one of those colours shown in the middle of the four words. The user had to respond as quickly as possible to pick the right word by pressing the up/down/left/right arrow keys on their keyboard. The user would first do this with test words (usually colours, office supplies, or words without significant meaning to the user). Once their average response time fell within an acceptable range (measured in milliseconds), we would introduce words whose effect on the user's reaction time we wanted to measure. For example, if a user was particularly addicted to tea, their reaction time would differ when "tea" was one of the words to choose from, compared with a control word set containing no mention of tea. These tests require complete concentration for a significant period of time.

Here, the participant is expected to press the → arrow to choose “Chair”

When an addictive word is added, the participant's reaction time differs from the control case (here, the expected choice is → “Water”)
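
To make that concrete, here is a rough sketch of what a single trial's timing logic looks like in the browser. The names and structure are illustrative (a reconstruction, not our actual code), and renderTrial is an assumed helper that paints the four words and the centre colour:

```javascript
// One Stroop trial: show four words (one per arrow key) and time the
// participant's response in milliseconds. All names here are
// illustrative; this is a reconstruction, not our production code.
const ARROW_TO_INDEX = { ArrowUp: 0, ArrowRight: 1, ArrowDown: 2, ArrowLeft: 3 };

function runTrial(trial, onResult) {
  // trial example: { words: ["Table", "Chair", "Lamp", "Desk"], correctIndex: 1 }
  renderTrial(trial); // assumed helper: paints the four words and the centre colour
  const shownAt = performance.now(); // high-resolution timestamp when the words appear

  function handleKey(event) {
    if (!(event.key in ARROW_TO_INDEX)) return; // ignore everything but arrow keys
    document.removeEventListener("keydown", handleKey);
    onResult({
      word: trial.words[ARROW_TO_INDEX[event.key]],
      correct: ARROW_TO_INDEX[event.key] === trial.correctIndex,
      reactionMs: performance.now() - shownAt, // reaction time in milliseconds
    });
  }
  document.addEventListener("keydown", handleKey);
}
```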

We built the Stroop test tooling to work with any group of users the researchers wished to analyze: sometimes people addicted to substances, sometimes groups chosen to measure societal factors (e.g. fear of terrorism). In the normal workflow, participants would enter their study email address so that the researchers could follow up with their results. We would also email participants their own results, in case they were curious. The tech was boring: a PHP backend, a MySQL database, and a minimal JavaScript frontend. The results would be sent after the user had completed the full test, then stored in the database.
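
In other words, nothing was persisted until the participant had finished every trial, at which point the frontend submitted the full result set in one go. Roughly like this sketch (the endpoint and payload shape are hypothetical):

```javascript
// Illustrative submission flow: the full set of trial results is only
// POSTed to the PHP backend once the whole test is complete; the backend
// then emails the results and stores them in MySQL. The endpoint and
// payload shape here are hypothetical.
async function submitResults(email, trialResults) {
  const response = await fetch("/api/results.php", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ email, trials: trialResults }),
  });
  if (!response.ok) {
    throw new Error(`Saving results failed with status ${response.status}`);
  }
}
```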

One of the studies assessed a group of people who worked in a high-security workplace. They would not have internet access, and we would need to provide a locked-down device used only for the study. No problem: we set up a laptop with a locked-down Linux install and put the code on there to run locally, just as it had in development. We disabled WiFi, Bluetooth, and anything else we could. These instructions came on the same morning as the test, so we were fixing and testing the code as we drove to the location.

We arrived, set up the test, and the study participants came and took it. It all seemed to go well: each participant completed the test successfully. We left feeling good.

As we arrived back at the office, we realized that the data was missing. All of the data. We had not saved any of the participant responses, and it wasn't clear why. The code had worked on the machine when we tested it locally, but not at the secure location. We did, however, have logs, and they seemed fine: nothing particularly unusual. Then we tried running the test without internet access. The problem became clear.
