Things I Regret: Returning Modified Data In API Response Payloads
When you work on the same web application for the better part of a decade, your architectural choices have plenty of time to teach you a lesson, showing you what works and what definitely does not work. And, one software architectural choice that has bitten me in the butt time and time again is the decision to return modified data in an API response payload. If I could go back and rebuild all mutation requests, I would design them to return confirmation data only - no entity data.
To understand what I mean, imagine that we have a web application that has a Contact entity. And, I have an API end-point that allows me to rename the Contact with a POST request:
POST /api/contact/4/rename?name=Hanna+Banana
Historically, in my web application, this POST request would execute the use-case - renaming the Contact - and would then return the Contact data with the new name field in tow. Something like:
200 OK
{
    "id": 4,
    "name": "Hanna Banana", // <--- the new name.
    "phone": "917-555-1234",
    "isFavorite": true
}
If I were to build this POST request today, however, I would code this to return a 204 No Content response. A 204 is a success code, confirming that the POST request was handled by the application; however, it wouldn't contain any other data.
If the client-side application needed the modified data, it could either optimistically update its own view model locally as part of the request processing; or, it could make a subsequent request back to the server to get the modified data for its view.
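As a rough illustration of the optimistic option, here is a minimal TypeScript sketch. The ContactView type, the renameContact() function, and the injected send transport are all hypothetical names invented for this example; in a real client, send would wrap whatever HTTP call issues the POST and resolve to its status code. The sketch applies the new name locally, then rolls it back if the server does not confirm with a 204.

```typescript
// Hypothetical client-side view model: only the fields this view renders.
interface ContactView {
  id: number;
  name: string;
}

// Hypothetical transport: resolves to the HTTP status code of the rename call.
type SendRename = (id: number, name: string) => Promise<number>;

// Apply the new name locally first, then roll back if the server does not
// confirm the command with a 204 No Content.
async function renameContact(
  contacts: ContactView[],
  id: number,
  name: string,
  send: SendRename
): Promise<boolean> {
  const contact = contacts.find((c) => c.id === id);
  if (!contact) {
    return false;
  }

  const previousName = contact.name;
  contact.name = name; // Optimistic update of the local view model.

  try {
    const status = await send(id, name);
    if (status === 204) {
      return true;
    }
  } catch {
    // Transport failure: fall through to the rollback below.
  }

  contact.name = previousName; // Roll back the optimistic update.
  return false;
}
```

If the view needs fields that the server computes, the same function could trigger a follow-up read request after the 204 instead of (or in addition to) trusting the local update.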
In the case of a create request, such as to create a new Contact record within the application, I would return a 200 OK with a response payload; but, it would only contain the unique identifier of the newly-created Contact (assuming synchronous processing). Something like:
200 OK
{
    "id": 5 // <--- the primary key of the newly-created entity.
}
And, again, if the client-side application needed the fleshed-out data for the newly-created Contact record, it could either optimistically update its own view-model locally; or, it could make a subsequent request - using the returned id value - to get the fleshed-out Contact record. The client-side application could even do both (an optimistic update locally plus a subsequent request to the server as the source of truth).
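To make the shape of these responses concrete, here is a small server-side sketch in TypeScript. It is not tied to any web framework: the ContactApi class, the ApiResponse type, and the in-memory Map standing in for a database are all assumptions made for illustration. The two commands return confirmation data only, and a separate query returns the entity.

```typescript
interface Contact {
  id: number;
  name: string;
  phone: string;
  isFavorite: boolean;
}

// Minimal response shape for this sketch; not a real framework type.
interface ApiResponse {
  status: number;
  body?: unknown;
}

// Hypothetical API backed by an in-memory Map instead of a database.
class ContactApi {
  private contacts = new Map<number, Contact>();
  private nextId = 1;

  // Command: create a contact and return only its identifier.
  create(name: string, phone: string): ApiResponse {
    const id = this.nextId++;
    this.contacts.set(id, { id, name, phone, isFavorite: false });
    return { status: 200, body: { id } };
  }

  // Command: rename a contact and confirm with an empty 204 No Content.
  rename(id: number, name: string): ApiResponse {
    const contact = this.contacts.get(id);
    if (!contact) {
      return { status: 404 };
    }
    contact.name = name;
    return { status: 204 };
  }

  // Query: read the current state of a contact (returns a copy).
  get(id: number): ApiResponse {
    const contact = this.contacts.get(id);
    return contact ? { status: 200, body: { ...contact } } : { status: 404 };
  }
}
```

A client that wants the full record after a create would call get() with the returned id, which is the extra "chattiness" in exchange for commands that never need to know what any particular view renders.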
This makes the API a little more "chatty". But, in the long-run, I think it would have forced us to keep our code much, much cleaner and easier to maintain and evolve over time.
This is Really Just Command-Query Responsibility Segregation (CQRS)
Ultimately, this is really just a take on CQRS: Command-Query Responsibility Segregation. Which means exactly what it sounds like, that "commands" (ie, mutating the state in a target) should be separated from "queries" (ie, reading the state out of a target).
Inherently, there's nothing really wrong with returning state in a "command" request. The problem is, People are sloppy; and lazy; and have an irrational fear of code duplication. And this causes them to make code worse over time by coupling command-request responses to application views.
ASIDE: When I say that People are sloppy, lazy, and irrational, understand that I am pointing this finger at myself as much as at anyone else. The benefit of working on an application for a decade is that I get to see all the poor choices that I have made. And, learning from my own mistakes is probably the most effective way I can evolve as an engineer.
To illustrate how sloppiness affects code over time, imagine that this Contact record (mentioned above) is used in a number of views within the application. And one view that allows a user to create new Contact records needs to show an avatar next to each Contact item. So, in order to fulfill that requirement, the developer goes into the "Create Contact" end-point and adds an avatarUrl to the response payload.
This works great for this one view and this one engineer. But, the problem is, every other user interface (UI) that allows new Contact records to be created now also receives that avatarUrl regardless of whether or not they need it.
Now imagine how this sloppiness can build up over time. First it's an avatarUrl, then it's an isLoggedIn flag, then it's a lastContactedAt timestamp. Slowly but surely, the data requirements for different application views leak out into the rest of the application.
Not only does this mean more data over the network, it also means more processing time on the server and a markedly increased chance that any change to the API response payload will accidentally break some unexpected view within the application.
All of this cross-contamination could have been avoided if the "command" to create a new Contact record was separated from the "query" to read the details of that newly-created Contact record. Each query request can then be tailor-made for a use-case, allowing the read-models for each view to evolve separately - and cleanly - over time.
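As a sketch of what "tailor-made" queries might look like, the TypeScript below defines two read-models over the same stored record. The view names (a list view with avatars, a detail view without them), the ContactRecord type, and both query functions are hypothetical; the point is only that adding avatarUrl to one read-model leaves the other untouched.

```typescript
// Hypothetical stored record, combining fields mentioned in this post.
interface ContactRecord {
  id: number;
  name: string;
  phone: string;
  isFavorite: boolean;
  avatarUrl: string;
  lastContactedAt: string;
}

// Read-model for a hypothetical list view that shows an avatar per contact.
interface ContactListItem {
  id: number;
  name: string;
  avatarUrl: string;
}

// Read-model for a hypothetical detail view that never renders the avatar.
interface ContactDetail {
  id: number;
  name: string;
  phone: string;
  isFavorite: boolean;
}

// Query for the list view: projects only what that view renders.
function queryContactList(records: ContactRecord[]): ContactListItem[] {
  return records.map(({ id, name, avatarUrl }) => ({ id, name, avatarUrl }));
}

// Query for the detail view: a different projection of the same records.
function queryContactDetail(
  records: ContactRecord[],
  id: number
): ContactDetail | undefined {
  const record = records.find((r) => r.id === id);
  if (!record) {
    return undefined;
  }
  const { name, phone, isFavorite } = record;
  return { id, name, phone, isFavorite };
}
```

The cost is some duplication between the projections, which is the kind of duplication this approach accepts in order to keep the views decoupled.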
Public APIs are a "Use Case"
One place in which it can be quite nice to return modified data in a "command request" response payload is within a public API. And that's because a public API represents a "use case". Meaning, a public API is built for a very specific type of workflow. As such, I would think of a public API as being more akin to a "single view" that can evolve over time, separate from other views.
I don't build public APIs; so, I can't speak to them in any real depth. I'm only bringing them up here because I believe that what makes a Public API valuable is not the same thing that makes internal APIs valuable. And, I don't want one use case to be extrapolated as a truth that can be applied to all other use cases.
Regret is a "Good Thing" ™
As a closing thought, I wanted to say that it's good to regret things. Because regret means that we're learning - that we're evolving. If I didn't regret the architectural choices that I've made, it would mean that I'm making the same choices today that I made a decade ago. And that would be a sad state.
It reminds me of Danny DeVito's monologue from The Big Kahuna:
Of course, for software engineers, it's never too late - we can always refactor!
Reader Comments
Ben,
no need to regret this. I think it comes down to consistency once the decision is made.
For example, the GraphQL Apollo world is now embracing the pattern you are regretting, returning changed data as a call result:
https://www.apollographql.com/docs/apollo-server/schema/schema/#designing-mutations
To break my own advice:
The most practical API I built had a hybrid model. Where some calls would return data and others would only return acknowledgments. The API provided introspection and documented when data was returned and when not. So you could even find that out programmatically.
We did this when we noticed that most use cases would immediately query the API again for the changed data. Returning changed data was less overhead for everyone.
@Bilal,
A GraphQL context is interesting. And, I think different enough to have different rules. Since GraphQL allows the client to tell the server what it needs, I think you get less coupling, in general, to any particular end-point. After all, there really are no "end points" in GraphQL - there's just one end-point (from what I understand) that manages everything.
Where we've run into issues is when one engineer decides to start returning a data-point that now every other call has to get as well. But, with GraphQL, if you never ask for that data-point, then presumably no resolver on the server-side will calculate it. And, then it doesn't really matter.
But, I have no actual hands-on GraphQL experience, so pardon me if I'm missing something here.