
A human postmortem of the 1996 AOL outage

Why This Matters

The 1996 AOL outage highlights the critical importance of reliable infrastructure as internet usage surged globally. It underscores how technical failures can have widespread societal impacts, especially during pivotal moments of digital adoption, prompting the tech industry to prioritize resilience and robustness in network systems.

Key Takeaways

Mac is a Senior Platform Engineer who specializes in site reliability engineering and security. Aside from blogging, you can find Mac trying to fight the system from both within and without in Raleigh, North Carolina.

Disclaimer: We, ngrok, have sponsored Mac to write this post because we think it’s an underexplored perspective on the topic of reliability. We’re glad to have the opportunity to give writers the space and time to do this, but the opinions are Mac’s, not the company’s. Enjoy!

Artwork by arthurxmedic.

Picture yourself traveling back to August 7th, 1996. Close your eyes and imagine a world where tensions are high with Russia, China, and in the Middle East, people are concerned about a tech bubble, and bell-bottoms are back in style. Difficult to imagine, I know.

Open your eyes, you’re in 1996 now. You probably just got back from work or school, hoping to unwind. Maybe you put something on the stereo, still clinging to the waning grunge era. You sit down in your squeaky desk chair and are welcomed by the Windows 95 boot screen. But this time, when you try to connect to America Online, instead of your email inbox, info about popular sitcoms, or NASA announcing evidence of life on Mars, you see:

Image credit to CBS News.

America Online was down, and it would stay down for 19 hours. It pushed that news of life on Mars right off the front page of the New York Times.

Now, technically this outage shouldn’t have been that notable. America Online went down for maintenance regularly, and that regular maintenance was what triggered the outage in the first place. There was even a similar outage during peak hours a few months prior that didn’t make the news at all (I only found out about it through oral history, which I’ll get into later). Why did this one make the front page?

At that time, the world was joining the internet in droves. The number of people online was beginning to hockey-stick. My theory is that we had clearly passed some kind of inflection point where the internet was starting to become integral to our daily lives. And we humans really don’t like being reminded of the fragility of things we depend on.

As someone who works in the field of site reliability engineering (SRE), I became a little obsessed with researching this outage. It was essentially the first example of people outside of the industry realizing how important it is for internet stuff to keep running. And that collective desire is what keeps me employed.
