Calling Timeout.unref() With setTimeout() Does Not Appear To Be A Best Practice In Node.js
When I was building my Node.js Circuit Breaker, I spent a lot of time looking at existing Circuit Breaker implementations to see how different people approached the problem. And, as I was digging through source code, I consistently came across something that I had never seen before: developers were calling .unref() on the timeout object returned by setTimeout(). Since this was showing up in a number of Circuit Breaker implementations, I assumed it was a "best practice"; but, I didn't understand it. So, I started looking into the .unref() functionality. And, from what I can tell, there doesn't appear to be any reason to consider this a "best practice". In fact, there's reason to think that it is a bad practice.
If you've never seen the .unref() method before - as I had not - it's a way to tell Node.js not to hold the current process open if the given timer is the only thing left to execute on the event-loop queue. To see this in action, we can set up a trivial example that does nothing but initiate a setTimeout() call and then .unref() it:
console.log( "App started." );

var timer = setTimeout(
	function timeoutCallback() {
		console.log( "Timer callback." );
	},
	1000
);

// At this point, the timer is the only action that would hold the node-process open. By
// unref-ing it, the process will exit prior to the timeout callback execution and we
// should never see the second console.log() statement.
timer.unref();
As you can see, our setTimeout() callback would log a message to the console. But, we immediately .unref() the timeout object after it has been created. This tells Node.js that it doesn't need to keep the process open if the timeout is the only pending operation. And, since it is (in this demo), the only terminal output we get is the "App started." message.
We never see the logging statement performed from within the setTimeout() callback since the callback is never executed. Once we .unref() the timer, Node.js no longer needs to hold the process open (waiting for the timer to execute), and it quietly exits.
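It's worth underscoring the "only thing left" part of that behavior. As a minimal sketch (the timer names and delays below are my own, chosen purely for illustration), an unref'd timer still fires if some other ref'd timer keeps the event loop alive long enough, and timeout.hasRef() (available in Node.js 11+) reports whether a given timer is holding the process open:

```javascript
// This timer is unref'd, so on its own it would NOT keep the process open.
var background = setTimeout(
	function backgroundCallback() {
		console.log( "Unref'd timer fired." );
	},
	50
);
background.unref();

// This timer is left ref'd (the default), so it holds the process open for
// 200ms, which is long enough for the unref'd timer above to fire.
var keepAlive = setTimeout(
	function keepAliveCallback() {
		console.log( "Ref'd timer fired; the process can now exit." );
	},
	200
);

console.log( background.hasRef() ); // false
console.log( keepAlive.hasRef() ); // true
```

When run as a standalone script, both callbacks log their messages because the 200ms ref'd timer outlives the 50ms unref'd one.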
If you're unfamiliar with the concept of Circuit Breakers, they are essentially proxies to brittle resources that will "fail fast" (i.e., prevent communication) if the given resource appears to be unhealthy. This is very much like the physical Circuit Breaker that you have in your home's electrical system. In a Circuit Breaker context, this .unref() call was being used in conjunction with the setTimeout() timer that rejects a long-running command passing through the Circuit Breaker. Across the various implementations that I investigated, the approach looked like some variation of the following:
// Require the application modules.
var api = require( "./api" );

// ----------------------------------------------------------------------------------- //
// ----------------------------------------------------------------------------------- //

var TEN_SECONDS = ( 10 * 1000 );

function runWithTimeout( command, timeout = TEN_SECONDS ) {

	// Since we can't cancel a command once it has been executed, we have to
	// conditionally turn a command into either a resolved or a rejected promise. In
	// this case, we'll use a timer to generate a rejected promise prior to command
	// completion if the command takes too long to return.
	var promise = new Promise(
		function( resolve, reject ) {

			// Setup our rejection timer to preemptively kill hanging commands.
			var timer = setTimeout(
				function rejectHangingCommand() {
					reject( new Error( "Command took too long (and was timed-out)." ) );
				},
				timeout
			);

			// ************************************************************** //
			// ************************************************************** //
			// CAUTION: This is the line that seemingly makes no sense (to me).
			// I cannot figure out what purpose it actually serves.
			timer.unref();
			// ************************************************************** //
			// ************************************************************** //

			// Catch any synchronous execution errors.
			try {
				var commandPromise = Promise.resolve( command() );
			} catch ( error ) {
				var commandPromise = Promise.reject( error );
			}

			// Once the command returns, we need to use the result to fulfill the
			// contextual Promise. However, the command may return before OR after the
			// timeout has already rejected the contextual Promise. As such, the
			// following operations may actually be No-Op instructions.
			commandPromise.then(
				function handleResolve( result ) {
					clearTimeout( timer );
					resolve( result );
				},
				function handleReject( error ) {
					clearTimeout( timer );
					reject( error );
				}
			);

		}
	);

	return( promise );

}

// ----------------------------------------------------------------------------------- //
// ----------------------------------------------------------------------------------- //

// Run the command with a 1-Second timeout. Since our command will take 5-Seconds to
// execute, we know that the command-runner's timer will terminate the request early.
var promise = runWithTimeout(
	function command() {
		return( api.echoAfterTimeout( "[data payload]", 5000 ) );
	},
	1000
);

promise.then(
	function handleResolve( result ) {
		console.log( "Success:", result );
	},
	function handleReject( error ) {
		console.log( "Error:", error );
	}
);
As you can see, right after we initiate the rejectHangingCommand() timer, we're calling .unref() on it. This is the line of code that makes no sense to me (and is the entire point of this post). That said, when we try to run a 5-second command with a 1-second timeout, the rejection handler logs the "Command took too long" error.
In other words, our 5-second API request is being preemptively timed-out by our 1-second setTimeout() call.
So, what purpose is the .unref() method playing in our Circuit Breaker?
Well, first, let's look at the Node.js documentation on Timers. If we look at the .unref() explanation, there is a clear warning that .unref() should be used with caution:
Note: Calling timeout.unref() creates an internal timer that will wake the Node.js event loop. Creating too many of these can adversely impact performance of the Node.js application.
So, immediately, the Node.js documentation warns us that we should exercise caution when using the timeout.unref() feature. Which means that we, as developers, had better have a really good reason for using .unref() in our Circuit Breaker code (or other code inspired by it). But, I'm struggling to think of a good reason. In fact, I can only think of reasons why .unref() seems silly.
For starters, the underlying Promise may still hold the process open. If you look back at the previous terminal output, you can see the 1-Second timeout; but, you can also see that the underlying API call still returns after 5-Seconds. As such, a long-running command will hold the process open longer than its sibling timeout.
Of course, if the command settles prior to the 1-Second timeout, we clear the pending timer for both the Resolution and the Rejection outcomes. As such, the timer will never hold the process open longer than the command execution itself, whether it's long-running or short-lived.
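To make that point concrete, here is a rough sketch of the same idea with no .unref() call at all. To be clear, this is not the code from any of the libraries that I looked at; it's my own variation that assumes Promise.race() and .finally() (Node.js 10+) are available, and it always clears the timer as soon as the command settles:

```javascript
function runWithTimeout( command, timeout ) {

	var timer = null;

	// This promise can only ever reject, and only if the timer fires.
	var timeoutPromise = new Promise(
		function( resolve, reject ) {
			timer = setTimeout(
				function rejectHangingCommand() {
					reject( new Error( "Command took too long (and was timed-out)." ) );
				},
				timeout
			);
		}
	);

	// Invoking the command inside .then() turns synchronous errors into rejections.
	var commandPromise = Promise.resolve().then( command );

	// Whichever promise settles first wins; either way, the timer is cleared so
	// that it can't hold the process open after the outcome is known.
	return Promise.race( [ commandPromise, timeoutPromise ] ).finally(
		function clearRejectionTimer() {
			clearTimeout( timer );
		}
	);

}

// A fast command resolves well within the 1-second timeout.
runWithTimeout(
	function fastCommand() {
		return( "done" );
	},
	1000
).then(
	function handleResolve( result ) {
		console.log( "Success:", result );
	}
);
```

Since the timer is cleared the moment the fast command resolves, the process doesn't sit around waiting for the remainder of the 1-second timeout, even though the timer was never unref'd.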
Now, let's say for the sake of argument that something deep down inside the command goes "wrong" and an uncaught exception is thrown or an unhandled Promise rejection occurs. In that case, it is common - for reasons that still elude me - for a Node.js application to log the error and terminate the running process. In such cases, it doesn't matter if there are pending timers - an explicit call to exit the process will take precedence.
Which brings up the greater context of the application itself. Circuit Breakers make sense inside applications. They operate based on metrics that are collected over time. So, it really only makes sense to use them inside long-running contexts. As such, a Circuit Breaker will generally be used in conjunction with something like an Express.js or Koa application that will be holding the process open regardless of what the Circuit Breaker is doing with its internal timers.
Of course, you may be in a clustered Express.js application and need to disconnect a Worker process from the Master process. If we assume that something somewhere goes wrong and a Circuit Breaker timer is somehow left running, does it really matter? In a few seconds, the timer will execute its callback and the Worker will be allowed to die.
NOTE: I don't really know all that much about clustering and disconnecting Workers. As such, the preceding paragraph may not be entirely accurate.
Now, I'm not saying that Timeout.unref() has no place in a Node.js application. I can certainly understand a situation in which a long-running timeout or an interval on a message queue consumer (for example) shouldn't hold a process open. But, this feels more like the exception than the rule. I may very well be missing something; but, I can't think of any reason why .unref() would be meaningful in most contexts (especially Circuit Breakers). And, considering the warning in the Node.js documentation, calling .unref() may have a detrimental effect on the application's performance.
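For example, a background heartbeat feels like a reasonable candidate. The following is only a sketch under my own assumptions: startHeartbeat() and its callback are hypothetical names that I made up for illustration, not part of any library:

```javascript
// Hypothetical helper: run a periodic background task without holding the
// process open once all of the "real" work is done.
function startHeartbeat( callback, intervalInMilliseconds ) {

	var interval = setInterval( callback, intervalInMilliseconds );

	// Without this call, the interval would keep the process alive indefinitely.
	interval.unref();

	return( interval );

}

var heartbeat = startHeartbeat(
	function logHeartbeat() {
		console.log( "Still alive at", new Date().toISOString() );
	},
	100
);

// Simulate 350ms of "real" work. Once this timer fires, nothing ref'd remains
// and the process exits even though the heartbeat was never cleared.
setTimeout(
	function finishWork() {
		console.log( "Work complete." );
	},
	350
);
```

Here, the interval exists for the lifetime of the process rather than for the lifetime of a single command, which is what separates it from the Circuit Breaker timer that gets cleared as soon as the command settles.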