Suppose you have a function called `postPuppies`. It makes a POST request to upload puppy pictures to a user's profile. This is serious business. Your customers will be upset if they can't access pictures of puppies just because your network request failed.
Linear vs. exponential retries
To make sure no one goes without their dog pictures, you write some code that calls `postPuppies()` every n seconds until it succeeds.
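As a sketch, that naive linear strategy might look like this (the function name and parameters are mine, not from the post):

```typescript
// Naive linear retry: call the task, and on failure wait n seconds
// and try again, forever. This is the strategy we'll improve on below.
async function retryEvery<T>(task: () => Promise<T>, nSeconds: number): Promise<T> {
  while (true) {
    try {
      return await task();
    } catch {
      // Same fixed delay every time, no matter how many attempts have failed.
      await new Promise((resolve) => setTimeout(resolve, nSeconds * 1000));
    }
  }
}
```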
If the photos aren't being saved after a few tries, there's probably something wrong. Maybe the server is down or there's an issue with the user's network connection. Retrying as soon as possible is like walking face-first into a closed door 10 times in a row and expecting a different outcome.
Instead of retrying every n seconds, we could retry every n^k seconds, where k is the number of attempts. This is a simplified description of exponential backoff. When a server is swamped, an exponential delay between retries can dramatically reduce its workload.
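In code, the difference between the two strategies is just the delay formula. A minimal sketch (the helper names are illustrative, not from the post):

```typescript
// Linear retry: the delay is always n seconds, regardless of attempt count.
function linearDelay(n: number, _k: number): number {
  return n;
}

// Exponential backoff: the delay is n^k seconds on the k-th attempt,
// so a struggling server sees rapidly fewer requests over time.
function exponentialDelay(n: number, k: number): number {
  return Math.pow(n, k); // e.g. with n = 2: 2, 4, 8, 16... seconds
}
```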
Marc Brooker from AWS wrote this great article on exponential backoff. Here's a graph he made to show how the approach reduces network requests:
Improve the algorithm with jitter
If many client requests are turned away at the same time, they may return to the server in waves when you least expect it.
We can further improve the algorithm by adding jitter, statistical randomness, to each delay so that requests are more spread out over time. Here's another graph from Marc's article to show how jitter makes a difference:
The chart compares several approaches to generating jitter. Notice how most of them greatly reduce calls among competing clients. I won't rehash the details here, but I recommend reading the AWS article for a deeper dive.
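As a sketch, the "full jitter" variant from the AWS article replaces the fixed exponential delay with a random value between zero and that delay (the function and parameter names here are my own assumptions):

```typescript
// "Full jitter": compute the capped exponential delay, then wait a
// uniformly random fraction of it so competing clients spread out.
function fullJitterDelay(baseSeconds: number, attempt: number, capSeconds: number): number {
  const exp = Math.min(capSeconds, baseSeconds * Math.pow(2, attempt));
  return Math.random() * exp;
}
```

Because every client picks a different random delay, retries that would have arrived in synchronized waves arrive smeared out over the whole backoff window instead.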
Implementation in ES6 / TypeScript
I implemented this retry strategy as an excuse to play around with TypeScript and some of the newer features in ES6. The module is called `Backoff`. Here's an example to show you how it's instantiated:
And here's the module itself:
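The original code block didn't survive, so here is a minimal sketch of what a `Backoff` module along these lines might look like. Only the class name comes from the post; the options shape and method names are my assumptions:

```typescript
// Hedged reconstruction of a Backoff module: exponential delays with
// full jitter between retries, giving up after a fixed number of attempts.
interface BackoffOptions {
  baseDelayMs: number;  // initial delay (the "n" from earlier)
  maxAttempts: number;  // give up after this many tries
  maxDelayMs?: number;  // cap on any single delay
}

class Backoff {
  constructor(private options: BackoffOptions) {}

  // Delay before retrying after a failed attempt: base * 2^attempt,
  // capped, then multiplied by a random factor in [0, 1) for jitter.
  delayFor(attempt: number): number {
    const { baseDelayMs, maxDelayMs = 30000 } = this.options;
    const exp = Math.min(maxDelayMs, baseDelayMs * Math.pow(2, attempt));
    return Math.random() * exp;
  }

  // Run the task, retrying on rejection until it succeeds or we run out
  // of attempts, in which case the last error is rethrown.
  async run<T>(task: () => Promise<T>): Promise<T> {
    let lastError: unknown;
    for (let attempt = 0; attempt < this.options.maxAttempts; attempt++) {
      try {
        return await task();
      } catch (err) {
        lastError = err;
        if (attempt + 1 < this.options.maxAttempts) {
          await new Promise((resolve) => setTimeout(resolve, this.delayFor(attempt)));
        }
      }
    }
    throw lastError;
  }
}
```

Under these assumptions, usage would look something like `new Backoff({ baseDelayMs: 100, maxAttempts: 5 }).run(() => postPuppies(photo))`.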
Now that `postPuppies()` has succeeded thanks to our retry strategy, I'll leave you with this final thought: