Build vs Buy: What This Week’s Outages Should Teach You
11/19/2025 • Todd H. Gardner • Engineering, Leadership, Business
A few years back, I gave a conference talk called “Build vs Buy: Software Systems at Jurassic Park” where I argued that the real villain wasn’t the velociraptors or the T-Rex—it was Dennis Nedry’s custom software. The park’s catastrophic failure wasn’t just about one disgruntled programmer; it was about choosing to build critical infrastructure that should have been bought. You can watch the whole thing here, but this week’s events make the lesson worth revisiting.
In the span of a few days, we’ve watched some of the internet’s most critical infrastructure go down. Cloudflare had a major outage today that took down huge swaths of the web. GitHub went down. AWS had issues last week. And while each failure had its own specific cause, they all highlight the same fundamental problem: we’ve built our businesses on top of abstractions we don’t understand, controlled by companies we can’t influence.
The Simple Rule That Everyone Gets Wrong
Here’s the thing: if your core business function depends on some capability, you should own it if at all possible. You need to control your destiny, and you need to take every opportunity to be better than your competitors. If you just buy “the thing you do,” then why should anyone buy it from you?
But tech leaders consistently get this backwards. They’ll spend months building their own analytics tools while running their entire product on a cloud provider they don’t understand. They’ll craft artisanal monitoring solutions while their actual business logic—the thing customers pay for—runs on someone else’s computer.
The Infrastructure Trap
Of course, there are exceptions. Sometimes you can’t build something you depend on because you lack the expertise or can’t afford it. As a software provider, I need servers, networks, and datacenters to deliver my software, but I couldn’t afford to build a datacenter.
But here’s where most companies go wrong: just because I need some infrastructure doesn’t mean I should jump to a full-on cloud provider. I need some servers. I don’t need a globally-redundant PaaS that allows me to ignore how computers work. In my experience, that’s an outage waiting to happen.
This is what I mean about controlling your own destiny. Building my product on hardware is transparent. When something goes wrong, it’s understandable. A DIMM went bad. We lost a drive. The system needs to be swapped out. I have a timeline and alternatives that I can control.