Queueing Requests Queues Your Capacity Problems, Too

Here’s an exchange I had on Twitter a few months ago:

twitter, the world’s largest town hall

My intent here is not to single this gentleman out. From what I can tell, he runs a very successful business and is likely smarter than me in many ways. That said, he and I have a different understanding of queueing, a topic dear to my heart that I’ve written about before:

Allow me to paint you a picture: your p90 latency graph looks perfectly healthy at or around 1 second, while every customer request submitted in the last hour is experiencing 1 hour of latency because it is waiting behind 3.6 million other requests in the queue.

How could this happen?

A Few Numbers

Let’s say you’re providing an API, and real-world constraints lead you to believe that offering a queue-like interface would be a good alternative to your existing real-time request/response model.

Some more specifics:

Your request rate is a flat 1000 requests/second, 24 hours a day, 7 days a week.

Every request takes 1 second to process.
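
To make the arithmetic behind these numbers concrete, here is a minimal sketch in Python. The function names are illustrative, and the drain rate is an assumption made for the example: it supposes the system works through the queue at exactly the arrival rate of 1000 requests/second. Under that assumption, Little's Law gives the number of requests in service at once, and a simple division gives the wait behind the 3.6 million request backlog from the picture above.

```python
def required_concurrency(arrival_rate_per_s: float, service_time_s: float) -> float:
    """Little's Law: average requests in service = arrival rate * service time."""
    return arrival_rate_per_s * service_time_s


def queue_wait_s(backlog: int, drain_rate_per_s: float) -> float:
    """Seconds a new request waits behind `backlog` earlier requests in a FIFO queue."""
    return backlog / drain_rate_per_s


ARRIVAL_RATE = 1000   # requests/second, from the text
SERVICE_TIME = 1.0    # seconds per request, from the text
BACKLOG = 3_600_000   # queued requests, from the picture painted earlier

# Assumption: the queue drains at exactly the arrival rate.
DRAIN_RATE = ARRIVAL_RATE

print(required_concurrency(ARRIVAL_RATE, SERVICE_TIME))  # 1000.0 requests in flight
print(queue_wait_s(BACKLOG, DRAIN_RATE) / 3600)          # 1.0 hour of queue wait
```

Note that the 1 second of processing time is all a latency graph measured at the worker would see; the hour spent waiting in the queue only shows up if latency is measured from the moment the request is submitted.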
