Architecture Payments Resiliency
The American Express core payments ecosystem is a global platform relied on by Card Members and partners around the world. Every day, it processes live payment transactions that require high availability, low latency, and predictable performance.
Resiliency is not an afterthought; it has been encoded into the system’s design from the beginning. Localized faults are contained within defined boundaries, and recovery is designed to be fast and predictable.
To achieve this, the platform is built around a cell-based architecture that isolates failures, maintains low-latency processing, and scales capacity without expanding the failure domain.
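To make the idea of a bounded failure domain concrete, here is a minimal sketch of cell-style routing. It is an illustration under stated assumptions, not a description of how the platform is actually implemented: the `Cell` and `CellRouter` classes, the `account_id` routing key, and the hash-based placement are all hypothetical names and choices introduced for this example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Cell:
    """A self-contained unit of capacity (hypothetical model, not the real platform)."""
    name: str
    healthy: bool = True
    processed: list = field(default_factory=list)

    def process(self, account_id: str) -> str:
        # A failed cell rejects only the traffic that was routed to it.
        if not self.healthy:
            raise RuntimeError(f"{self.name} is unavailable")
        self.processed.append(account_id)
        return f"{account_id} processed by {self.name}"


class CellRouter:
    """Pins each account to exactly one cell using a stable hash (assumed scheme)."""

    def __init__(self, cells):
        self.cells = list(cells)

    def cell_for(self, account_id: str) -> Cell:
        # sha256 is used instead of hash() so placement is stable across processes.
        digest = hashlib.sha256(account_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.cells)
        return self.cells[index]

    def blast_radius(self, accounts) -> float:
        """Fraction of the given accounts whose assigned cell is unhealthy."""
        accounts = list(accounts)
        if not accounts:
            return 0.0
        affected = sum(1 for a in accounts if not self.cell_for(a).healthy)
        return affected / len(accounts)


# Usage: fail one of four cells and measure how many accounts are affected.
router = CellRouter(Cell(f"cell-{i}") for i in range(4))
accounts = [f"acct-{n}" for n in range(1000)]
router.cells[0].healthy = False
print(f"accounts affected: {router.blast_radius(accounts):.1%}")
```

In this sketch, a fault in one cell touches only the accounts pinned to that cell, and the other cells keep processing. Note that simple modulo placement remaps accounts whenever the number of cells changes, so a real system would likely need a more stable assignment scheme to add capacity without moving existing traffic.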
This blog outlines the principles that guide this architecture and how they help us build a resilient payments platform at global scale.
Core Payments Ecosystem
In 2018, we started a journey to modernize our core payments ecosystem. This platform processes live card and payment transactions and is mission-critical to our Card Members and partners.
As we modernized the platform, resiliency remained a primary design requirement. We needed an architecture that could continue processing transactions reliably, even when individual components failed. Our choice of architecture was heavily influenced by our historical design patterns, which predate the term “cell-based architecture” but share many of its principles.
Our new platform targeted cloud-native technologies, which meant we needed to think differently about how we designed for resiliency and scalability.
In the next sections, we’ll discuss some of the design principles we follow in our core payments ecosystem and how they not only improve our ability to process payments reliably but also help us reduce latency and scale more easily.