Tech News
← Back to articles

Pulling an Inverse Conway Maneuver at Netflix (2023)

read original related products more articles

Pulling an Inverse Conway Maneuver at Netflix

When I first joined the Netflix Platform team circa 2020, the Observability offering was composed of a series of tools serving different purposes. There was Atlas for metrics, Edgar for distributed tracing, Radar for Logs and Alerts, Lumen for dashboards, Telltale for app health, etc. It was a portfolio of about 20 different apps. Big and small, ranging from business-specific tools to analyze playback sessions to low-level tools for CPU profiling.

The Observability org was composed of three different teams. Each team had a mix of front-end, back-end and full-stack engineers. We also had one designer and one PM shared across the three teams. Each team was further subdivided into sub-teams of two to four engineers working on a specific sub-domain.

It was no coincidence that this org structure produced a set of independent apps. That’s the kind of architecture we’d expect based on Conway’s Law:

Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure. Melvin E. Conway

Simply put, the system’s architecture will be shaped like the org that produced it. This is because, to build a complex system, people must communicate to ensure the different pieces fit well together. Therefore, the design that emerges will be a map of the communications paths in the organization.

Netflix’s approach to building software further intensified this. Netflix embraces the Full Cycle Development; this means teams are fully responsible for all the stages of the software lifecycle, from Design to Operate and Support.

We were organized as highly aligned, loosely coupled teams with a high level of independence. ICs wholly owned every aspect of their work, from tech stack choices to which ticketing platform to use to track bugs. Netflix provides a recommended set of tools (known as “paved path”) but doesn’t mandate their adoption. Each team is free to pick whatever tools and practices suit them best.

Netflix has a “paved road” set of tools and practices that are formally supported by centralized teams. We don’t mandate adoption of those paved roads but encourage adoption by ensuring that development and operations using those technologies is a far better experience than not using them. Extract from Full Cycle Development blog post

This produced a set of heterogeneous apps to serve the observability needs of the company. Each team would focus on its own sub-domain and individual products to deliver the best possible experience. And users were happy with the result. At least for a while…

... continue reading