Designing Distributed Systems: Patterns And Paradigms For Scalable, Reliable Services By Brendan Burns
Over the weekend, I read Designing Distributed Systems: Patterns and Paradigms for Scalable, Reliable Services by Brendan Burns. I really enjoyed this book, although I must admit it wasn't exactly what I expected. The description of the book barely mentions containers. And, while the patterns discussed in the book aren't necessarily confined to containerization, almost all of the patterns in the book are defined in a container-based context and then exercised within a Kubernetes deployment pipeline. This was a good thing. As someone who is very new to container-based development and deployment, seeing architectural patterns discussed from a container-centric viewpoint was eye-opening. And, it gave me a lot of perspective on how to define cohesive boundaries around feature-focused services in a microservice landscape.
Author Brendan Burns is a cofounder of the Kubernetes open source project. So, it's not surprising that his "hands on" examples all revolve around deploying containers using Kubernetes configuration files. I myself know only a little bit about Docker, and nothing at all about Kubernetes. As such, I tried to extract the "intent" of the Kubernetes configuration files as I was reading them. But, I believe I would have gotten more out of the book had I had at least some experience with Kubernetes.
As I was reading the book, I couldn't help but think of Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions by Gregor Hohpe and Bobby Woolf. Many of the patterns discussed by Burns were quite reminiscent of the message queue patterns discussed by Hohpe and Woolf. In fact, I'm pretty sure that both books even use similar names for some of their design patterns (e.g., "Scatter and Gather"). This makes sense given that both books are focused on splitting complicated monolithic systems into compositions of smaller, cohesive, and reusable services. I think you could even argue that Burns lays out concrete approaches that one could use to implement the Producer and Consumer services partaking in the message-based workflows described in the Enterprise Integration Patterns book.
If nothing else, I think that drawing parallels between the two books speaks to the power of design patterns. Both, in their ability to guide solutions and in their ability to facilitate conversation through the use of a common vocabulary.
That said, one of the approaches that I found most fascinating in Designing Distributed Systems was actually in relation to message queue consumption. Rather than having a Worker container pull messages directly from a system like Amazon's Simple Queue Service (SQS), Burns advocates for the creation of a "Work Queue Source Ambassador". This Ambassador container would be deployed alongside the Worker container and would expose a generic API for queue manipulation. The Ambassador container abstracts away the implementation details of the message queue persistence, keeping the Worker container completely decoupled from any specific technology choices.
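To make the idea concrete for myself, here's a minimal sketch of how I picture that separation working. All of the names and the in-memory queue are my own invention (the book's actual examples use Kubernetes configuration files, not application code); a real ambassador would wrap something like SQS behind the same generic interface.

```python
# Sketch of the ambassador idea: the worker codes against a generic queue
# protocol; the ambassador translates that protocol to a concrete backend
# (here, an in-memory stand-in for something like SQS).
import collections
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class WorkItem:
    id: str
    body: str


class WorkQueueSource(Protocol):
    """The generic API the Worker container depends on -- nothing more."""
    def dequeue(self) -> Optional[WorkItem]: ...
    def acknowledge(self, item_id: str) -> None: ...


class InMemoryAmbassador:
    """Stand-in ambassador; a real one would wrap SQS, Kafka, etc."""
    def __init__(self) -> None:
        self._pending: collections.deque[WorkItem] = collections.deque()
        self._in_flight: dict[str, WorkItem] = {}

    def enqueue(self, item: WorkItem) -> None:
        self._pending.append(item)

    def dequeue(self) -> Optional[WorkItem]:
        if not self._pending:
            return None
        item = self._pending.popleft()
        self._in_flight[item.id] = item  # held until acknowledged
        return item

    def acknowledge(self, item_id: str) -> None:
        self._in_flight.pop(item_id, None)


def worker_loop(queue: WorkQueueSource) -> list[str]:
    """Worker logic: no knowledge of the queue technology behind the API."""
    results = []
    while (item := queue.dequeue()) is not None:
        results.append(item.body.upper())  # placeholder "work"
        queue.acknowledge(item.id)
    return results
```

The appeal, as I understand it, is that you could swap the SQS-backed ambassador for a Kafka-backed one in the deployment configuration without touching the Worker container at all.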
The Work Queue Source Ambassador was just one example of an overarching theme woven throughout the book: compose collections of small containers so that each individual container can stay maximally focused and reusable. Burns examines this concept at the individual container level, talking about parameterizing containers using command-line arguments and environment variables. He then steps up a level, talking about multi-container sidecar and ambassador patterns on a single node. And then, ultimately, he shows how multi-node patterns can be used to create a powerful microservices architecture.
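The parameterization point, at least, is easy to sketch. The names below are invented for illustration; the point is just that a container's entrypoint takes its configuration from the environment and command line rather than hard-coding it, so the same image can play different roles in different deployments.

```python
# Illustrative entrypoint for a reusable container: all configuration
# arrives via CLI arguments or environment variables (the names QUEUE_URL
# and BATCH_SIZE are made up for this example).
import argparse
import os


def load_config(argv=None):
    parser = argparse.ArgumentParser(description="generic worker entrypoint")
    parser.add_argument(
        "--queue-url",
        default=os.environ.get("QUEUE_URL", ""),
        help="where to pull work from (env: QUEUE_URL)",
    )
    parser.add_argument(
        "--batch-size",
        type=int,
        default=int(os.environ.get("BATCH_SIZE", "10")),
        help="items to pull per poll (env: BATCH_SIZE)",
    )
    return parser.parse_args(argv)


# The same image can now be deployed in different roles purely through
# Kubernetes-style env/args configuration -- no rebuild required.
config = load_config(["--queue-url", "sqs://example"])
print(config.queue_url, config.batch_size)
```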
I happen to think that Burns does a great job of articulating the "dream" of the microservices landscape:
Microservices break down an application into small pieces, each focused on providing a single service. This reduced scope enables each service to be built and maintained by a single "two pizza" team. Reduced team size also reduces the overhead associated with keeping a team focused and moving in one direction.
Additionally, the introduction of formal APIs in between different microservices decouples the teams from one another and provides a reliable contract between the different services. This formal contract reduces the need for tight synchronization among the teams because the team providing the API understands the surface area that it needs to keep stable, and the team consuming the API can rely on a stable service without worrying about its details. This decoupling enables teams to independently manage their code and release schedules, which in turn improves each team's ability to iterate and improve their code.
Finally, the decoupling of microservices enables better scaling. Because each component has been broken out into its own service, it can be scaled independently. (Kindle Location 990)
I say "dream" because, as Burns even concedes, designing and architecting a microservices system is hard. And, monitoring and debugging such a system is much harder than monitoring and debugging its monolithic counterpart. This is something that I can easily agree with. In my limited experience with microservices, I can tell you that decoupled services quickly become tightly coupled; and, that "reliable contracts" between services quickly become moving targets with ever-breaking changes.
As a final thought, I wanted to touch on FaaS - Functions as a Service. The use of systems like Amazon's Lambda service confuses me. While I like them in the most abstract sense, I have absolutely no handle on how they come into play in a concrete application. Burns touches on FaaS in his section on "Serving Patterns"; but, unfortunately, my lack of clarity with regard to FaaS persists.
I did like that Burns prescribes FaaS for only a subset of known problems:
As with many tools for developing a distributed system, it can be tempting to see a particular solution like event-driven processing as a universal hammer. However, the truth is that it is best suited to a particular set of problems. Within a particular context it is a powerful tool, but stretching it to fit all applications or systems will lead to overly complicated, brittle designs. (Kindle Location 1826)
And, I did appreciate that he talks about the complications in using FaaS:
As described in the previous section, developing systems using FaaS forces you to strongly decouple each piece of your service. Each function is entirely independent. The only communication is across the network, and each function instance cannot have local memory, requiring all state to be stored in a storage service. This forced decoupling can improve the agility and speed with which you can develop services, but it can also significantly complicate the operations of the same service.
In particular, it is often quite difficult to obtain a comprehensive view of your service, determine how the various functions integrate with one another, and understand when things go wrong, and why they go wrong. Additionally, the request-based and serverless nature of functions means that certain problems are quite difficult to detect. (Kindle Location 1854)
And, I was intrigued by the fact that most FaaS systems aren't good for tasks that require too much processing:
.... because of the serverless nature of the implementation of these services, the runtime of any particular function instance is generally time bounded. This means that FaaS is usually a poor fit for situations that require processing. (Kindle Location 1880)
And, I was titillated by the idea that FaaS is no longer economically viable once you can keep the CPU running long enough:
But if you have a sufficient number of requests to keep a function active, then it's likely you are overpaying for the requests you are processing.
.... as a service grows, the number of requests that you are servicing grows to the point where you can keep a processor continuously active servicing user requests. At this point, the economics of a pay-per-request model start to become bad, and only get worse because the cost of cloud virtual machines generally decreases as you add more cores (and also via committed resources like reservations or sustained use discounts), whereas the cost per-request largely grows linearly with the number of requests. (Kindle Location 1909)
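The arithmetic behind that quote is simple enough to sketch. The prices below are made up purely for illustration; they are not real Lambda or EC2 rates. The shape of the curves is what matters: per-request cost grows linearly with traffic, while a provisioned machine that can absorb the same traffic is a flat cost.

```python
# Toy break-even arithmetic for pay-per-request vs. a provisioned VM.
# All prices are invented for illustration; NOT real cloud rates.
def monthly_cost_faas(requests: int, price_per_million: float) -> float:
    """Per-request pricing: cost scales linearly with request volume."""
    return requests / 1_000_000 * price_per_million

def monthly_cost_vm(vm_count: int, price_per_vm: float) -> float:
    """Provisioned pricing: flat cost regardless of request volume."""
    return vm_count * price_per_vm

low_traffic  = monthly_cost_faas(5_000_000,  price_per_million=20.0)   # 100.0
high_traffic = monthly_cost_faas(50_000_000, price_per_million=20.0)   # 1000.0
vm           = monthly_cost_vm(1, price_per_vm=150.0)                  # 150.0

# At low volume FaaS is cheaper; past the break-even point, the VM wins.
print(low_traffic < vm < high_traffic)
```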
Which left me reflecting on my existing confusion: when and why would I use Functions as a Service? To be clear, Burns does go on to outline the use of event-based, short-running, low-processing tasks like Two-Factor Authentication (2FA). But, given that these are small, low-overhead, non-continuous tasks, I question why I would need to scale them independently. Why not just include these Functions in another container-based service that is closely related?
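For my own benefit, here's the kind of thing I imagine the book's 2FA example looks like in practice. This is my sketch, not the book's code: the `(event, context)` signature mimics AWS Lambda's handler convention, the secret is hard-coded only for illustration, and a real function would fetch it from a secret store, since function instances keep no local state between invocations.

```python
# Sketch of a short, event-driven 2FA task in a Lambda-style handler.
# Everything the function needs arrives in the event; nothing is kept in
# local memory between invocations.
import hashlib
import hmac

SECRET = b"demo-secret"  # illustrative only; real code uses a secret store


def code_for(user: str, window: int) -> str:
    """Derive a 6-digit code from the user and a time window."""
    digest = hmac.new(SECRET, f"{user}:{window}".encode(), hashlib.sha256)
    return str(int(digest.hexdigest(), 16) % 1_000_000).zfill(6)


def handler(event: dict, context=None) -> dict:
    """Stateless entrypoint: issue a code, or verify one."""
    user, window = event["user"], event["window"]
    if event["action"] == "issue":
        return {"code": code_for(user, window)}
    return {"valid": event["code"] == code_for(user, window)}
```

Seeing it written out, I can at least appreciate why the task fits the FaaS constraints: it is short-running, event-triggered, and needs no warm local state.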
I am sure more clarity will come as I start to get some hands-on experience with FaaS technologies.
FaaS confusion aside, I really enjoyed this book. It was a quick, mostly-easy read. And, as someone who is very new to the world of containerized development, it was helpful to see design patterns surrounding container-based systems. It was also another great reminder of how powerful decoupling is at all levels of application development. And, if nothing else, I'll now have a better sense of what our Platform team is talking about when they mention things like "Sidecar containers."