The Problem Of Tracking Failures When Using Fallback Values In A Circuit Breaker In ColdFusion
Thinking about Circuit Breakers has been a really fascinating thought experiment. Every time that I think I'm getting closer to a "good" solution, I realize that there's some aspect or edge-case or unfortunate coupling that I hadn't considered (for example, I'm still using the "system time" rather than a time "implementation" that can be mocked). The problem that I'm grappling with right now is the tracking of errors while using a Fallback value in my Circuit Breaker. And, as of this moment, I don't really have a satisfactory solution.
In an earlier post, I extracted the state management from the Circuit Breaker control flow. I felt like this was a really positive step because it created a clean separation of concerns and teased the state management out into an abstraction that could be implemented in various ways. Essentially, it made the Circuit Breaker generic, allowing it to be used with a number of independently-testable state management implementations.
Now, in my version of the Circuit Breaker, the concept of "fallback values" is handled inside the Circuit Breaker's flow of control. In fact, it is handled at the very top of the action execution. Here's the source code for one of the public "execute" methods:
/**
 * I marshal the given action inside the Circuit Breaker.
 *
 * @target I am the function or closure to be invoked.
 * @fallback I am the value to be evaluated if the action fails to complete successfully.
 * @output false
 */
public any function execute(
	required any target,
	any fallback
	) {

	try {

		return( run( target ) );

	} catch ( any error ) {

		// If a fallback has been provided, return the fallback instead of letting
		// the error propagate to the calling context.
		if ( structKeyExists( arguments, "fallback" ) ) {

			return( evaluateFallback( fallback ) );

		}

		rethrow;

	}

}
As you can see, the top-level execute() function wraps the underlying run() method and tries to evaluate the fallback should the underlying run() method fail. The problem here, when it comes to tracking errors, is that nothing beyond the scope of this method actually knows about the error handling. Externally, the fallback value is used in lieu of a thrown error (to hide the error). Internally, the run() method doesn't know that a fallback value was provided in the higher-level function.
This means that externally, we can't track the error because there's no propagated error to track. And, internally, we can't track the error in the low-level run() method because nothing inside the run() method knows about the fallback value; and, if the run() method were to naively track the error, the error might get double-logged by both the Circuit Breaker and the application (in the cases where no fallback value was provided and the error propagates).
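To make the visibility problem concrete, here is a minimal sketch that can be run inside a CFScript block. The breaker struct, the failingAction closure, and the stripped-down run() and evaluateFallback() bodies are assumptions made for illustration (the real run() also talks to the state implementation); only the execute() control flow mirrors the method shown above.

```cfml
// HYPOTHETICAL, stripped-down stand-in for the Circuit Breaker component.
breaker = {};

// ASSUMPTION: the real run() also consults the state implementation. This one only
// invokes the target so that the error-handling flow is easy to see.
breaker.run = function( required any target ) {
	return( target() );
};

// ASSUMPTION: a fallback may be a static value or a function to be invoked.
breaker.evaluateFallback = function( required any fallback ) {
	if ( isCustomFunction( fallback ) || isClosure( fallback ) ) {
		return( fallback() );
	}
	return( fallback );
};

// Same control flow as the execute() method shown above.
breaker.execute = function( required any target, any fallback ) {
	try {
		return( breaker.run( target ) );
	} catch ( any error ) {
		if ( structKeyExists( arguments, "fallback" ) ) {
			return( breaker.evaluateFallback( fallback ) );
		}
		rethrow;
	}
};

// A made-up action that always fails.
failingAction = function() {
	throw( type = "Demo.UpstreamFailure", message = "Upstream service timed out." );
};

// Record whatever the calling context is able to observe.
errorsSeenByCaller = [];

try {
	result = breaker.execute( failingAction, "cached-value" );
} catch ( any error ) {
	arrayAppend( errorsSeenByCaller, error.message );
}

// The fallback masks the failure: the catch block above is never entered, so the
// calling context has nothing to log.
writeOutput( "Result: #result# / errors seen by caller: #arrayLen( errorsSeenByCaller )#" );
```

The caller gets a perfectly usable value back, which is the whole point of the fallback; but, the failure itself has vanished by the time control returns.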
So, the only place to track the error would be in the top-level try-catch of the execute() method; and, only in cases where a fallback value was provided (meaning the error wouldn't propagate). But, I really liked the idea that the Circuit Breaker had no dependency other than the "state" implementation. If the Circuit Breaker itself needed to log errors, we'd have to inject some sort of Logger into it. That means creating another Interface just for logging errors and dealing with the increased complexity of instantiating the Circuit Breaker with these dependencies.
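For comparison, injecting a logger might look roughly like the sketch below. Everything in it is hypothetical: the createCircuitBreaker() factory, the logError() method, and the in-memory logger are made-up names, and the state handling is left out entirely. It is only meant to show the cost of this option: the Circuit Breaker picks up a second collaborator whose sole job is to record errors that a fallback would otherwise hide.

```cfml
// HYPOTHETICAL factory: none of these names come from the actual Circuit Breaker,
// and the state-machine checks are omitted to keep the sketch short.
function createCircuitBreaker( required struct logger ) {

	var breaker = {};

	breaker.execute = function( required any target, any fallback ) {
		try {
			return( target() );
		} catch ( any error ) {
			if ( structKeyExists( arguments, "fallback" ) ) {
				// Only log when the error is about to be hidden by the fallback. If
				// the error propagates, the application is free to log it instead,
				// which avoids double-logging.
				logger.logError( error );
				return( fallback );
			}
			rethrow;
		}
	};

	return( breaker );

}

// A made-up logger that just collects messages in memory.
loggedMessages = [];
inMemoryLogger = {};
inMemoryLogger.logError = function( required any error ) {
	arrayAppend( loggedMessages, error.message );
};

failingAction = function() {
	throw( type = "Demo.UpstreamFailure", message = "Upstream service timed out." );
};

breaker = createCircuitBreaker( inMemoryLogger );
result = breaker.execute( failingAction, "cached-value" );

writeOutput( "Result: #result# / logged errors: #arrayLen( loggedMessages )#" );
```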
Another option would be to get the State implementation to expose an error-tracking method that the Circuit Breaker could consume from within that top-level try-catch block. But, this also feels funky since error logging has nothing to do with the "state" of the Circuit Breaker. Meaning, the failure to execute the command is already affecting the state of the system - the State doesn't actually need the Error object to do this. As such, it feels like such a method would be overloading the responsibilities of state management.
If I squint hard enough, I could probably justify tracking "fallback" usage in the state implementation. Such a method would grant me reason to pass the error object into the state implementation under the guise of "metrics". But, really, there's no inherent value in tracking fallback usage since fallbacks don't influence the state machine.
This conundrum might indicate that the fallback value is being handled in the wrong place. But, I don't think it is; keeping it in the Circuit Breaker makes the Circuit Breaker easier to consume. I wouldn't want to push the try/catch functionality into the calling context - that would spread a lot of unnecessary boilerplate logic throughout the consuming code.
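As a rough illustration of that boilerplate, the sketch below imagines a Circuit Breaker with no fallback support at all. The breaker struct, the two failing actions, and the default values are all made up for the example. Error tracking becomes trivial, since every error propagates; but, each call site has to repeat the same try/catch shape.

```cfml
// HYPOTHETICAL breaker with no fallback support: errors always propagate.
breaker = {};
breaker.execute = function( required any target ) {
	return( target() );
};

loggedMessages = [];

// Two made-up actions that always fail.
loadProfile = function() {
	throw( type = "Demo.UpstreamFailure", message = "Profile service timed out." );
};
loadSettings = function() {
	throw( type = "Demo.UpstreamFailure", message = "Settings service timed out." );
};

// Call site 1: the caller owns both the logging and the fallback.
try {
	profile = breaker.execute( loadProfile );
} catch ( any error ) {
	arrayAppend( loggedMessages, error.message );
	profile = "anonymous";
}

// Call site 2: the same shape, repeated.
try {
	settings = breaker.execute( loadSettings );
} catch ( any error ) {
	arrayAppend( loggedMessages, error.message );
	settings = "defaults";
}

writeOutput( "#profile# / #settings# / logged errors: #arrayLen( loggedMessages )#" );
```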
Ultimately, I'll probably just add an optional method to the State interface for tracking the error. I don't love this approach; I don't think it makes sense; but, at this time, it seems like it would provide the most value for the least effort. And, it wouldn't require any additional dependencies to be injected into the Circuit Breaker.
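A minimal sketch of what that optional method might look like, assuming a hypothetical trackError() method name and a hypothetical createCircuitBreaker() factory (the state-machine calls are omitted). The Circuit Breaker only invokes the method if the given state implementation happens to provide it, so existing state implementations keep working unchanged.

```cfml
// HYPOTHETICAL factory; the real Circuit Breaker also drives the state machine.
function createCircuitBreaker( required struct state ) {

	var breaker = {};

	breaker.execute = function( required any target, any fallback ) {
		try {
			return( target() );
		} catch ( any error ) {
			if ( structKeyExists( arguments, "fallback" ) ) {
				// trackError() is an assumed, optional method: only call it if this
				// particular state implementation exposes it.
				if ( structKeyExists( state, "trackError" ) ) {
					state.trackError( error );
				}
				return( fallback );
			}
			rethrow;
		}
	};

	return( breaker );

}

// A made-up state implementation that opts in to error tracking.
trackedMessages = [];
trackingState = {};
trackingState.trackError = function( required any error ) {
	arrayAppend( trackedMessages, error.message );
};

// A made-up state implementation that does not.
plainState = {};

failingAction = function() {
	throw( type = "Demo.UpstreamFailure", message = "Upstream service timed out." );
};

trackingBreaker = createCircuitBreaker( trackingState );
plainBreaker = createCircuitBreaker( plainState );

resultA = trackingBreaker.execute( failingAction, "fallback-a" );
resultB = plainBreaker.execute( failingAction, "fallback-b" );

writeOutput( "#resultA# / #resultB# / tracked errors: #arrayLen( trackedMessages )#" );
```

The trade-off is the one described above: no new dependency gets injected, but the state implementation ends up carrying a method that has nothing to do with state.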
To be continued.
Reader Comments
@All,
I finally took all of my noodling on the concept of Circuit Breakers and turned it into a GitHub project. While it's not the end of the journey, this forced me to clean it up and add unit tests:
www.bennadel.com/blog/3190-coldfusion-circuit-breaker-project-on-github.htm
Now, I'll have a more directed way to continue evolving my understanding of the concept.