When building REST APIs for microservices, there are a few design decisions to make about responses. Some responses are clearly a product of the protocols surrounding your microservice – the 3xx codes, for instance, are all about redirects and routing.
In general, you will be trying to get the right 2xx codes for success. If in doubt, it’ll be 200 (OK), but consider 201 (CREATED) for requests intended to create data and 202 (ACCEPTED) for requests that will go on to be processed later.
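As a rough illustration of that rule of thumb, here is a minimal Python sketch. The function name and its two flags are hypothetical, made up for this example; only the status codes come from the HTTP standard.

```python
from http import HTTPStatus


def success_status(creates_data: bool = False, processed_later: bool = False) -> HTTPStatus:
    """Pick a 2xx code for a successful request (hypothetical helper)."""
    if processed_later:
        # The work has been queued and will be completed after we respond.
        return HTTPStatus.ACCEPTED
    if creates_data:
        # The request created something new.
        return HTTPStatus.CREATED
    # If in doubt, plain success.
    return HTTPStatus.OK
```

Checking the "processed later" case first is a design choice in this sketch: a request that will eventually create data but has not done so yet is reported as accepted rather than created.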
In this article, I’d like to discuss the 4xx and 5xx responses, used for errors. I’d also like to consider whether your service is going to attempt to tolerate downstream errors. The more sophisticated you want your software to be, the more precise you want your internal exceptions to be.
And by precise, I mean simple.
A clumsy exception handling strategy will ultimately lead to hard work to hit all use cases.
Simplify Simplify Simplify
Every time I’m asked for an opinion on exceptions and errors in our microservices at the moment, I reply with the same answer.
There are two categories of error… it went wrong, or you’re wrong.
The client error is the easiest to detect, and its handling needs to be as precise as the response codes themselves. Often a 404 is not really an exception, so much as the return of zero results. For other client errors, you’ve essentially got:
- Security violations, which should be checked for in a suitable framework before you process the request
- Invalid request – usually a malformed body
It’s easy to forget that a random JSON parsing exception could simply be classified as a “you’re wrong” error, if it happens at the right layer.
Once you know what classification you’re trying to prove, in a simple form, it’s relatively easy to see what to do and what to test.
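To make the two categories concrete, here is a minimal Python sketch. The class names `YoureWrong` and `ItWentWrong` and the helper functions are assumptions invented for this example, not part of any framework. It shows a JSON parsing failure being reclassified as a client error at the layer where the request body is read, and anything unclassified falling back to a server error.

```python
import json
from http import HTTPStatus


class YoureWrong(Exception):
    """The client sent something we can never process (hypothetical name)."""


class ItWentWrong(Exception):
    """Something failed on our side (hypothetical name)."""


def parse_request_body(raw: str) -> dict:
    """Parse the request body, treating bad JSON as the client's fault."""
    try:
        body = json.loads(raw)
    except json.JSONDecodeError as err:
        # At this layer the only JSON we parse is the client's input,
        # so a parse failure is a "you're wrong" error.
        raise YoureWrong(f"malformed JSON body: {err.msg}") from err
    if not isinstance(body, dict):
        raise YoureWrong("request body must be a JSON object")
    return body


def status_for(error: Exception) -> HTTPStatus:
    """Map an exception to a response code using the two categories."""
    if isinstance(error, YoureWrong):
        return HTTPStatus.BAD_REQUEST
    # ItWentWrong, and anything we failed to classify, is our problem.
    return HTTPStatus.INTERNAL_SERVER_ERROR
```

The same `json.JSONDecodeError` raised while parsing a downstream service’s reply would not be the client’s fault, which is why the reclassification lives in `parse_request_body` rather than in a global handler.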
It Went Wrong
These errors fall into two categories:
- My algorithm doesn’t stretch to this edge case – sorry
- Some downstream service isn’t working
Of these two, the latter may have some variants where a retry strategy needs to be applied, to have another go at the request before giving up. That avoids failing on network blips, and avoids playing some sort of game of chance with several dependent services, any one of which might be blipping at the moment.
Clue: if getting a response feels like a game of Yahtzee, you need to add some retries, and these should be around clearly defined retryable it went wrong errors.
If your retry strategy is wrong, it will retry things that another attempt cannot fix:
- My algorithm can’t cope
- The request can never be valid
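A retry wrapper along these lines might look like the following Python sketch. All the names here (`RetryableItWentWrong`, `call_with_retries`, and the parameters) are hypothetical, and the linear back-off is just one possible choice. The point is that only the explicitly retryable category is caught, so the two cases in the list above fail on the first attempt.

```python
import time


class YoureWrong(Exception):
    """The request can never be valid (hypothetical name)."""


class ItWentWrong(Exception):
    """Something failed on our side (hypothetical name)."""


class RetryableItWentWrong(ItWentWrong):
    """A failure that may succeed on another attempt, e.g. a network blip."""


def call_with_retries(operation, attempts=3, delay_seconds=0.1, sleep=time.sleep):
    """Run operation(), retrying only on RetryableItWentWrong."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except RetryableItWentWrong:
            if attempt == attempts:
                # Out of attempts: let the caller see the failure.
                raise
            # Wait a little longer after each failed attempt.
            sleep(delay_seconds * attempt)
        # YoureWrong and plain ItWentWrong are not caught here,
        # so they propagate immediately without a retry.
```

Passing `sleep` as a parameter is only there to keep the sketch easy to test without real delays.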
Surely Things In Life Can’t Be This Binary?
There are two types of people. There are those that think everything is a binary choice, and then there are some others…
Starting with the binary choice of the title is a strong start. Dividing each category into subcategories, where necessary, can then help you deal with specific nuances.
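One way to picture the subdivision is as a small exception hierarchy, sketched here in Python. Every class name is hypothetical, and the specific codes chosen for the subcategories (403 for security violations, 502 for downstream failures) are assumptions for illustration rather than a recommendation.

```python
from http import HTTPStatus


class ServiceError(Exception):
    """Root of the hierarchy; unclassified errors default to 500."""
    status = HTTPStatus.INTERNAL_SERVER_ERROR


class YoureWrong(ServiceError):
    status = HTTPStatus.BAD_REQUEST


class SecurityViolation(YoureWrong):
    status = HTTPStatus.FORBIDDEN  # assumed mapping


class InvalidRequest(YoureWrong):
    pass  # inherits 400 from YoureWrong


class ItWentWrong(ServiceError):
    pass  # inherits 500 from ServiceError


class UnsupportedEdgeCase(ItWentWrong):
    pass  # "my algorithm doesn't stretch to this"


class DownstreamFailure(ItWentWrong):
    status = HTTPStatus.BAD_GATEWAY  # assumed mapping


def to_response(error: Exception) -> tuple[int, dict]:
    """Turn any exception into a (status code, body) pair."""
    if isinstance(error, ServiceError):
        status = error.status
    else:
        # Anything outside the hierarchy counts as "it went wrong".
        status = HTTPStatus.INTERNAL_SERVER_ERROR
    return int(status), {"error": type(error).__name__, "message": str(error)}
```

Because each subcategory inherits from one of the two top-level classes, code that only cares about the binary split can keep catching `YoureWrong` or `ItWentWrong`, and new subcategories can be added later without touching it.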
So far this is working for us.
You can build on it incrementally.
There are two types of people: people who understand how to build things incrementally, and… I’ll tell you about the other type another day.