Flat Datacenter Networks at Scale at Amazon

The research roots of “optimal routing” networks trace back to the 1970s. Mathematicians defined a special class of networks called “expanders”: graphs with strong connectivity properties, guaranteeing that no subset of vertices can be easily cut off from the rest. In 1976, Leslie Valiant gave one of the earliest discussions of such graphs. Following work by Alon and Boppana on understanding the best possible expanders, mathematicians (notably Lubotzky, Phillips, and Sarnak) gave constructions of such optimal expanders. These designs were intricate, relied on advanced number theory, and worked only for specific network sizes and degrees.

Could there be a simpler, general-purpose construction? In 1991, Friedman showed that a randomly wired network is, with high probability, nearly as good an expander as the best explicit construction. (A mathematical result from 2023 shows that random graphs actually match the optimal bound.) The implication was tantalizing: if you want an optimal network for routing, you could simply wire it at random.
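To make the "wire it at random" idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a construction taken from any of the papers mentioned: the function names, the network size, and the degree are all made up for the example. It wires a random regular graph and estimates its second-largest adjacency eigenvalue (in absolute value), the standard measure of expansion. For a d-regular graph, the Alon-Boppana bound says this value cannot be much smaller than 2·sqrt(d−1), so a value near that benchmark indicates a near-optimal expander.

```python
import math
import random


def random_regular_graph(n, d, rng):
    """Wire n switches at random so that each has exactly d distinct neighbours.

    Pairing model with rejection: give every switch d port "stubs", shuffle
    them, join consecutive stubs, and start over if a self-loop or a
    duplicate link shows up.
    """
    if (n * d) % 2 != 0 or not 0 < d < n:
        raise ValueError("need n*d even and 0 < d < n")
    while True:
        stubs = [v for v in range(n) for _ in range(d)]
        rng.shuffle(stubs)
        adj = [set() for _ in range(n)]
        simple = True
        for i in range(0, len(stubs), 2):
            u, v = stubs[i], stubs[i + 1]
            if u == v or v in adj[u]:
                simple = False
                break
            adj[u].add(v)
            adj[v].add(u)
        if simple:
            return adj


def second_eigenvalue_estimate(adj, rng, iterations=300):
    """Estimate the largest non-trivial |eigenvalue| of the adjacency matrix.

    Power iteration restricted to vectors orthogonal to the all-ones vector
    (the trivial eigenvector, with eigenvalue d). The estimate never exceeds
    the true value and approaches it as the iteration count grows.
    """
    n = len(adj)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    estimate = 0.0
    for _ in range(iterations):
        mean = sum(x) / n
        x = [xi - mean for xi in x]                  # remove the all-ones component
        norm = math.sqrt(sum(xi * xi for xi in x))
        x = [xi / norm for xi in x]                  # rescale to unit length
        y = [sum(x[u] for u in adj[v]) for v in range(n)]  # multiply by A
        estimate = math.sqrt(sum(yi * yi for yi in y))     # |A x| for unit x
        x = y
    return estimate


if __name__ == "__main__":
    rng = random.Random(0)
    n, d = 200, 4                                    # illustrative sizes only
    adj = random_regular_graph(n, d, rng)
    lam = second_eigenvalue_estimate(adj, rng)
    print(f"estimated second eigenvalue: {lam:.3f}")
    print(f"2*sqrt(d-1) benchmark:       {2 * math.sqrt(d - 1):.3f}")
```

The rejection step is only practical for small degrees, since the chance of drawing a simple graph shrinks quickly as d grows; that is acceptable for a demonstration but would need a different sampler at realistic switch radixes.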

Meanwhile, the networking industry took a different path entirely. Inspired by Clos interconnects in switches, communications networks have been built on the fat-tree topology (a folded Clos) since the mid-1980s, with two, three, or more layers of switches. As cloud computing exploded in the late 2000s, fat-trees were scaled up with increasing sophistication. In 2009, nine of us led by Albert Greenberg published “VL2: A Scalable and Flexible Data Center Network”, which pushed the fat-tree architecture to new heights by introducing flat addressing and, notably, randomized Valiant Load Balancing to spread traffic uniformly across network paths. In 2019, the VL2 paper received the SIGCOMM Test of Time Award. The VL2 work demonstrated that even within structured topologies, randomization of traffic (if not of topology) improved performance. But the underlying network remained hierarchical, rigid, and complex to cable.
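The core idea of Valiant Load Balancing can be sketched in a few lines of Python. This is a toy model and not a description of how VL2 implements it: the switch names and the functions `vlb_route` and `spread` are hypothetical. Each flow is bounced off an intermediate switch chosen uniformly at random, so even a skewed traffic pattern (here, every flow between the same two racks) ends up spread roughly evenly across the intermediates.

```python
import random
from collections import Counter


def vlb_route(src, dst, intermediates, rng):
    """Return a two-stage path: src -> random intermediate -> dst."""
    via = rng.choice(intermediates)
    return [src, via, dst]


def spread(flows, intermediates, rng):
    """Count how many flows land on each intermediate switch."""
    load = Counter()
    for src, dst in flows:
        _, via, _ = vlb_route(src, dst, intermediates, rng)
        load[via] += 1
    return load


if __name__ == "__main__":
    rng = random.Random(0)
    intermediates = ["int-0", "int-1", "int-2", "int-3"]  # hypothetical names
    # Worst case for a single fixed path: all flows share one rack pair.
    flows = [("tor-a", "tor-b")] * 10_000
    print(spread(flows, intermediates, rng))
```

The random choice is made per flow rather than per packet in this sketch, which keeps the packets of one flow on one path and avoids reordering.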

In 2012, researchers at the University of Illinois connected random graphs and data center networks in a proposal called Jellyfish. The proposal inspired considerable follow-on work, but because it rested on simple theoretical models and simulations, it left critical hard problems open. Routing in random graphs is tricky because data can take many more, and more diverse, paths; cabling is harder because endpoints are chosen randomly; and operations become unpredictable. Building random networks at scale remained an elusive target, with routing, cabling, and operations as the three unsolved challenges.

RNG (Resilient Network Graphs) history

In 2023, Giacomo Bernardi (AWS principal scientist) started to investigate whether datacenter routers could be arranged in a flat network following Penrose tiling, a geometrical construction where shapes tessellate without ever quite repeating. Ratul Mahajan, an Amazon Scholar and Professor at the University of Washington, was intrigued. The two spent months exploring the idea, building simulations, and pushing the concept as far as it would go.
