Systems Correctness Practices at Amazon Web Services
Published on: 2025-06-13 11:43:13
Amazon Web Services (AWS) strives to deliver reliable services that customers can trust completely. This requires maintaining the highest standards of security, durability, integrity, and availability—with systems correctness serving as the cornerstone for achieving these priorities. An April 2015 article published in Communications of the ACM, titled “How Amazon Web Services Uses Formal Methods,” highlighted the approach for ensuring the correctness of critical services that have since become among the most widely used by AWS customers.21 Central to this approach was TLA+,14 a formal specification language developed by Leslie Lamport. Our experience at AWS with TLA+ revealed two significant advantages of applying formal methods in practice. First, we could identify and eliminate subtle bugs early in development—bugs that would have eluded traditional approaches such as testing. Second, we gained the deep understanding and confidence needed to implement aggressive performance optimizat
... Read full article.