Throttling is a widely accepted concept in computer science, but it is also the field’s most misused and misunderstood lever. When new engineers join Roblox, their first proposals often include something like, “If we could just tell our creators to tweak this config or slow down their events…” Veteran Roblox engineers then gently explain that we value respecting the community, and that we don’t tell our creators what to do.
For example, most gaming systems have a simple solution for matchmaking when millions of players click play simultaneously: they throttle the joins, make players wait, or skip the matchmaking algorithm and send players to random servers. At Roblox, we do the opposite. We redesigned our entire matchmaking system to handle thundering herds of players. At peak, this system evaluates up to 4 billion possible join combinations per second. Years ago, we set the objective of 10 million joins in 10 seconds, and we continue to iterate toward that goal.
To avoid throttling due to capacity, we’re experimenting with cloud bursting as part of our transition to a cellular infrastructure, which allows for dynamic and compute-efficient scaling. This architecture handles peak demand by matching users to both on-premises and cloud edge data center cells. We’re working toward a fully automated bring-up and tear-down of cloud-based edge data centers that are fully abstracted for the matchmaking algorithm.
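To make the idea of bursting across cells concrete, here is a minimal sketch in Python. It is not our production matchmaker. The `Cell` type, the `place_player` function, and the policy of filling on-premises cells before cloud cells are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    """A hypothetical unit of capacity (illustrative only)."""
    name: str
    kind: str          # "on_prem" or "cloud"
    capacity: int      # maximum players the cell can host
    load: int = 0      # players currently placed in the cell

    @property
    def free(self) -> int:
        return self.capacity - self.load


def place_player(cells: List[Cell]) -> Optional[Cell]:
    """Place one player, trying on-premises cells first, then cloud cells.

    The on-premises-first ordering is an assumed policy for this sketch.
    Returns the chosen cell, or None if every cell is full.
    """
    for kind in ("on_prem", "cloud"):
        candidates = [c for c in cells if c.kind == kind and c.free > 0]
        if candidates:
            # Pick the cell with the most free capacity to spread load.
            chosen = max(candidates, key=lambda c: c.free)
            chosen.load += 1
            return chosen
    return None


cells = [
    Cell("onprem-a", "on_prem", capacity=2),
    Cell("cloud-a", "cloud", capacity=3),
]
placements = [place_player(cells) for _ in range(6)]
names = [p.name if p else None for p in placements]
print(names)
```

In this toy run, the first two players land on the on-premises cell, the next three burst to the cloud cell, and the sixth finds no capacity. In a bursting design, that last case would be the signal to bring up another cloud cell instead of throttling the join.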
Another example is our text-filter system, which at peak handles 250,000 requests per second. That’s a large model inference running 250,000 tokens with constantly expanding context windows. And with more than 300 AI inference pipelines running in production, Roblox service owners invest a lot of time in finding the ideal mix of inference profiles between GPUs and CPUs. Even under peak loads, Roblox engineers respect the community by prioritizing creator freedom and user safety.