Aggregating Values In A Promise-Based Workflow In JavaScript
I love promises. I find a promise-based workflow much easier to use than the average callback-based workflow (though, of course, Promises still use callbacks). That said, promises are not a panacea; they don't magically turn asynchronous tasks into trivial matters. Asynchronous workflows are still hard; and aggregating values across a series of asynchronous tasks is not exactly straightforward. I'm far from an expert on Promises and continue to explore and experiment in an effort to find my sweet spot. So, I thought I'd share a couple of the approaches that I've used, with varying degrees of success.
In my mind, there's the approach that I'm most comfortable with; but, when it comes to aggregating values in a promise-based workflow, each approach has various pros and cons and may also depend on the Promise library that your team has chosen to work with (if any). Of the three approaches outlined below, I happen to use the "lexical binding" approach the most because I like that it keeps all the moving parts in close proximity; but, I'll definitely use different approaches depending on the constraints of the context.
Lexical Binding
In this approach - which is the approach I use most often - the aggregated values are stored in the same lexical scope in which the individual asynchronous steps are defined. This gives the function bodies of each asynchronous step the ability to read and write to the same bindings. As such, each step of the promise-based workflow can read previously-calculated values from the lexical scope and store new values back into the lexical scope for consumption by subsequent asynchronous steps.
In this demo, and all the following demos, we're going to run three asynchronous tasks in series, assuming that each step cannot proceed until the prior step has returned. Please, just take that as fact (whether or not it is actually true in the demo); coming up with "simple" promise-based scenarios is not always easy.
// Require the node modules (chalk and Q are installed from npm).
var chalk = require( "chalk" );
var Q = require( "q" );
// ----------------------------------------------------------------------------------- //
// ----------------------------------------------------------------------------------- //
// Here, we have a series of asynchronous tasks that are required to aggregate some
// result. For the sake of the demo, let's assume that each one of the tasks has to be
// run in serial.
(function someWorkflow( userID ) {
// As the workflow progresses, we need to aggregate the results of each step. In
// this approach, we're going to depend on lexical binding to allow each step to
// save data back into these commonly-accessible variables such that they can be
// used by any subsequent step in the process.
var user = null;
var subscription = null;
var projects = null;
var promise = Q
.when()
.then( getUser )
.then( getSubscription )
.then( getProjects )
.then(
function handleResolve() {
// When outputting the results, we're reaching back into that
// lexical scoping.
console.log( chalk.magenta( "User ID:" ), user.id );
console.log( chalk.magenta( "Subscription ID:" ), subscription.id );
console.log( chalk.magenta( "Project Count:" ), projects.length );
}
)
.catch(
function handleReject( reason ) {
// This method has access to all the lexically-bound values.
}
)
.finally(
function handleDone() {
// This method has access to all the lexically-bound values.
}
)
;
// Since this approach depends on lexical binding, the individual steps of the
// workflow need to be defined in the same lexical scope. Not only does that mean
// that they have to be defined in this function, it means that we re-define these
// functions every time the workflow is executed.
function getUser() {
var promise = db_getUser( userID ).then(
function handleResolve( result ) {
// Store the result back into the lexically-bound variable.
user = result;
}
);
return( promise );
}
function getSubscription() {
var promise = db_getSubscription( user.accountID ).then(
function handleResolve( result ) {
// Store the result back into the lexically-bound variable.
subscription = result;
}
);
return( promise );
}
function getProjects() {
var promise = db_getProjects( user.id ).then(
function handleResolve( result ) {
// Store the result back into the lexically-bound variable.
projects = result;
}
);
return( promise );
}
})( 47 ); // Self-executing function block.
// ----------------------------------------------------------------------------------- //
// ----------------------------------------------------------------------------------- //
// Mock database access for user.
function db_getUser( id ) {
var user = {
id: id,
accountID: "AFC-9-3S",
name: "Lisa"
};
return( Q.when( user ) );
}
// Mock database access for subscription.
function db_getSubscription( accountID ) {
var subscription = {
id: 97,
accountID: accountID,
startedAt: "2016-02-14"
};
return( Q.when( subscription ) );
}
// Mock database access for projects.
function db_getProjects( userID ) {
var projects = [ "Project A", "Project B", "Project C" ];
return( Q.when( projects ) );
}
As you can see, in this approach, the aggregated values are defined in the same context as the asynchronous step implementations. A big benefit of this approach is that it is relatively easy to reason about because everything is right there. It's not necessarily concise; but, all the moving parts are right there in the same enclosing function. This approach also means that .catch(), .finally(), and .done() handlers can also access the aggregated values because they, too, are in the same context. Using lexical bindings also means that we are less dependent on any "this" reference when it comes to reading and writing values.
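To make that .catch() point a bit more concrete, here's a minimal sketch in which the second step is forced to fail. Note that it uses native Promises rather than Q, purely to keep the snippet self-contained, and that loadUser(), loadSubscription(), and runWorkflow() are hypothetical names that don't appear in the demo above. The rejection handler reaches into the lexical scope to report how far the workflow got before the error:

```javascript
// Hypothetical mock data-access helpers (assumptions for this sketch only).
function loadUser( id ) {
	return( Promise.resolve({ id: id, accountID: "AFC-9-3S" }) );
}

function loadSubscription( accountID ) {
	// Force a failure so that the rejection handler has something to inspect.
	return( Promise.reject( new Error( "Subscription service unavailable" ) ) );
}

function runWorkflow( userID ) {
	// Lexically-bound aggregates, visible to every handler defined below.
	var user = null;
	var subscription = null;

	var promise = Promise
		.resolve()
		.then( getUser )
		.then( getSubscription )
		.then(
			function handleResolve() {
				return({ ok: true, user: user, subscription: subscription });
			}
		)
		.catch(
			function handleReject( reason ) {
				// The handler can read the lexically-bound values to see which
				// steps completed before the rejection.
				return({
					ok: false,
					message: reason.message,
					userLoaded: ( user !== null ),
					subscriptionLoaded: ( subscription !== null )
				});
			}
		)
	;

	return( promise );

	// Function declarations are hoisted, so the steps can live below the return.
	function getUser() {
		return( loadUser( userID ).then( function ( result ) { user = result; } ) );
	}

	function getSubscription() {
		return(
			loadSubscription( user.accountID ).then(
				function ( result ) { subscription = result; }
			)
		);
	}
}

runWorkflow( 47 ).then(
	function ( outcome ) {
		console.log( outcome );
	}
);
```

In this sketch, the rejection handler can tell that the user was loaded but the subscription was not, without anything extra being passed down the chain.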
The downside to this approach is that we have to re-define the functions every time the workflow is executed. This may or may not have some memory and performance considerations. But, it's likely that any effects would be trivial and not a cause for concern.
Explicit Reduction
In this approach, rather than relying on lexical bindings, we're going to treat the promise chain like a reducer. Each step is going to be given an object that contains the aggregated values. Then, each step has to populate that object with new data and pass it down to the next promise as the reduction.
// Require the node modules (chalk and Q are installed from npm).
var chalk = require( "chalk" );
var Q = require( "q" );
// ----------------------------------------------------------------------------------- //
// ----------------------------------------------------------------------------------- //
// Here, we have a series of asynchronous tasks that are required to aggregate some
// result. For the sake of the demo, let's assume that each one of the tasks has to be
// run in serial.
(function someWorkflow( userID ) {
var promise = Q
// In this approach, we still need to aggregate values as the workflow
// progresses; however, rather than relying on the lexical scoping of variables,
// we're going to "reduce" the workflow by passing a running-aggregate down
// through each promise.
.when({
userID: userID,
user: null,
subscription: null,
projects: null
})
.then( getUser )
.then( getSubscription )
.then( getProjects )
.then(
function handleResolve( result ) {
// When outputting the results, we're using only the running-aggregate
// that was passed to the resolution handler.
console.log( chalk.magenta( "User ID:" ), result.user.id );
console.log( chalk.magenta( "Subscription ID:" ), result.subscription.id );
console.log( chalk.magenta( "Project Count:" ), result.projects.length );
return( result );
}
)
.catch(
function handleReject( reason ) {
// This method has NO ACCESS to the "reduction".
}
)
.finally(
function handleDone() {
// This method has NO ACCESS to the "reduction".
}
)
;
})( 47 ); // Self-executing function block.
// By using a running aggregate, our workflow steps no longer need to be in same closure
// as the workflow itself. This means that we don't have to re-define the steps every
// time the workflow is executed; but, it does mean that each individual step has to be
// responsible for updating the aggregate and passing it on down the promise chain.
function getUser( reduction ) {
var promise = db_getUser( reduction.userID ).then(
function handleResolve( result ) {
// Store the result into the running aggregate.
reduction.user = result;
// Pass the running aggregate down the promise chain.
return( reduction );
}
);
return( promise );
}
function getSubscription( reduction ) {
var promise = db_getSubscription( reduction.user.accountID ).then(
function handleResolve( result ) {
// Store the result into the running aggregate.
reduction.subscription = result;
// Pass the running aggregate down the promise chain.
return( reduction );
}
);
return( promise );
}
function getProjects( reduction ) {
var promise = db_getProjects( reduction.user.id ).then(
function handleResolve( result ) {
// Store the result into the running aggregate.
reduction.projects = result;
// Pass the running aggregate down the promise chain.
return( reduction );
}
);
return( promise );
}
// ----------------------------------------------------------------------------------- //
// ----------------------------------------------------------------------------------- //
// Mock database access for user.
function db_getUser( id ) {
var user = {
id: id,
accountID: "AFC-9-3S",
name: "Lisa"
};
return( Q.when( user ) );
}
// Mock database access for subscription.
function db_getSubscription( accountID ) {
var subscription = {
id: 97,
accountID: accountID,
startedAt: "2016-02-14"
};
return( Q.when( subscription ) );
}
// Mock database access for projects.
function db_getProjects( userID ) {
var projects = [ "Project A", "Project B", "Project C" ];
return( Q.when( projects ) );
}
As you can see, the first step in the promise chain provides the initial values for the running aggregate. Each subsequent step then receives this reduction as its resolved value and is responsible for passing this reduction onto the next step in the asynchronous workflow. One benefit of this approach is that the individual steps no longer need to be defined in the same lexical context which means that we can reuse the steps without having to redefine them. Of course, that moves the individual steps farther away from the initial definition of the reduction, which makes them harder to reason about.
The biggest drawback to this approach is that the .catch(), .finally(), and .done() style callbacks do not have access to the reduction value. A rejection handler, for example, only receives the rejection reason, not the last value that was resolved. This means that you are more limited in how you respond to the errors and outcomes that take place during the asynchronous workflow.
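Here's a small sketch of that limitation, using native Promises instead of Q so that it stands alone. The stepOne(), stepTwo(), and runReduction() names are hypothetical and exist only for this illustration. The first step populates the reduction, the second step rejects, and the rejection handler reports what it was actually given:

```javascript
// Hypothetical steps (assumptions for this sketch only).
function stepOne( reduction ) {
	reduction.user = { id: reduction.userID };
	return( Promise.resolve( reduction ) );
}

function stepTwo( reduction ) {
	// Always fail, discarding the reduction that was passed in.
	return( Promise.reject( new Error( "Step two failed" ) ) );
}

function runReduction( userID ) {
	var promise = Promise
		.resolve({ userID: userID, user: null, subscription: null })
		.then( stepOne )
		.then( stepTwo )
		.catch(
			function handleReject( reason ) {
				// The only argument is the rejection reason. The running
				// aggregate that stepOne() populated is not passed in.
				return({
					argumentCount: arguments.length,
					isError: ( reason instanceof Error ),
					hasUser: ( "user" in reason )
				});
			}
		)
	;

	return( promise );
}

runReduction( 47 ).then(
	function ( report ) {
		console.log( report );
	}
);
```

If a rejection handler really does need the aggregate, one option is to hold a reference to the initial object in the enclosing function; but, at that point, you're leaning on lexical binding again.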
Implicit Reduction
In an attempt to bridge the gap between the Explicit Reduction and the Lexical Binding, we can try to use Bluebird's .bind() method as a means to use the reduction object as the this-binding for the promise handlers. In this approach, each asynchronous step is still responsible for writing to the reduction object; but, it no longer needs to explicitly pass it down to the next promise - the aggregated values are implicitly passed as the "this" context.
CAUTION: I am not all that familiar with Bluebird - I use Q in my work. So, sorry if this is an atypical use of the .bind() method.
// Require the node modules (bluebird and chalk are installed from npm).
var Bluebird = require( "bluebird" );
var chalk = require( "chalk" );
// ----------------------------------------------------------------------------------- //
// ----------------------------------------------------------------------------------- //
// Here, we have a series of asynchronous tasks that are required to aggregate some
// result. For the sake of the demo, let's assume that each one of the tasks has to be
// run in serial.
(function someWorkflow( userID ) {
var promise = Bluebird
// In this approach, we're still going to pass the running-aggregate down
// through the promise chain; but, instead of passing it explicitly, we're
// going to pass it down IMPLICITLY as the THIS binding for all the promise
// handlers. What this means is that each promise handler can leverage its
// own this-binding as the reduction.
// --
// NOTE: This is a feature of Bluebird, not promises in general.
.bind({
userID: userID,
user: null,
subscription: null,
projects: null
})
.then( getUser )
.then( getSubscription )
.then( getProjects )
.then(
function handleResolve() {
// When outputting the results, notice that we're using the THIS
// binding as the implicitly-passed running-aggregate.
console.log( chalk.magenta( "User ID:" ), this.user.id );
console.log( chalk.magenta( "Subscription ID:" ), this.subscription.id );
console.log( chalk.magenta( "Project Count:" ), this.projects.length );
}
)
.catch(
function handleReject( reason ) {
// This method has access to the "reduction" via "this".
}
)
.finally(
function handleDone() {
// This method has access to the "reduction" via "this".
}
)
;
})( 47 ); // Self-executing function block.
// By using the IMPLICITLY-PASSED running aggregate, our workflow steps no longer need
// to be in same closure as the workflow itself. This means that we don't have to
// re-define the steps every time the workflow is executed; but, it does mean that each
// individual step has to be responsible for updating the implicit aggregate.
function getUser() {
var promise = db_getUser( this.userID ).then(
( result ) => {
// Store the result into the running aggregate which is being implicitly
// passed down through the promise chain.
this.user = result;
}
);
return( promise );
}
function getSubscription() {
var promise = db_getSubscription( this.user.accountID ).then(
( result ) => {
// Store the result into the running aggregate which is being implicitly
// passed down through the promise chain.
this.subscription = result;
}
);
return( promise );
}
function getProjects() {
var promise = db_getProjects( this.user.id ).then(
( result ) => {
// Store the result into the running aggregate which is being implicitly
// passed down through the promise chain.
this.projects = result;
}
);
return( promise );
}
// ----------------------------------------------------------------------------------- //
// ----------------------------------------------------------------------------------- //
// Mock database access for user.
function db_getUser( id ) {
var user = {
id: id,
accountID: "AFC-9-3S",
name: "Lisa"
};
return( Bluebird.resolve( user ) );
}
// Mock database access for subscription.
function db_getSubscription( accountID ) {
var subscription = {
id: 97,
accountID: accountID,
startedAt: "2016-02-14"
};
return( Bluebird.resolve( subscription ) );
}
// Mock database access for projects.
function db_getProjects( userID ) {
var projects = [ "Project A", "Project B", "Project C" ];
return( Bluebird.resolve( projects ) );
}
As you can see, in this approach, each asynchronous step still has to save its results into the reduction. However, as the reduction value is the "this" binding, each value is implicitly passed down to the next promise handler. The big benefit of this approach is that the .catch(), .finally(), and .done() style callbacks all have access to the reduction via "this".
That said, I find this approach the hardest to reason about. Not only are the moving parts spread out, but the overloading of the "this" binding can easily lead to confusion and unexpected reference problems. And, on top of that, you have to go out of your way to maintain the "this" binding in each individual step (the demo uses fat-arrow functions for this); otherwise, access to the reduction value will be severed.
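To illustrate how easily that "this" binding gets severed, here's a minimal sketch that compares a function expression with a fat-arrow function inside a single step. It makes a couple of assumptions: it uses native Promises, and it uses a plain .call() to stand in for the binding that Bluebird's .bind() sets up. The fetchUser() and inspectBindings() names are hypothetical.

```javascript
// Hypothetical mock (an assumption for this sketch only).
function fetchUser( id ) {
	return( Promise.resolve({ id: id, name: "Lisa" }) );
}

// This step is invoked with the reduction as its "this" binding. It reports
// whether each style of nested handler still sees that same binding.
function inspectBindings() {
	var reduction = this;

	var viaFunction = fetchUser( this.userID ).then(
		function handleResolve() {
			// A function expression gets its own "this" from whoever calls it
			// (the promise machinery), not from inspectBindings().
			return( this === reduction );
		}
	);

	var viaArrow = fetchUser( this.userID ).then(
		() => {
			// A fat-arrow function has no "this" of its own, so it sees the
			// same "this" as inspectBindings().
			return( this === reduction );
		}
	);

	return( Promise.all([ viaFunction, viaArrow ]) );
}

inspectBindings.call({ userID: 47, user: null }).then(
	function ( results ) {
		console.log( "Function expression kept the binding:", results[ 0 ] );
		console.log( "Fat-arrow function kept the binding:", results[ 1 ] );
	}
);
```

The function expression loses the reduction while the fat-arrow function keeps it, which is why the steps in the demo above write to "this" from inside fat-arrow handlers.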
All of these approaches have the same outcome: each demo logs a User ID of 47, a Subscription ID of 97, and a Project Count of 3 to the console.
I know that future versions of JavaScript will have "async" and "await" type constructs. But, I have zero experience with these features. And, from what I've read, I'm also a little cautious of them as they appear to make the code a bit too magical. Promise chains are verbose; but, they're hella explicit and I definitely believe in the principle of least surprise.
As I said before, I personally use the lexical binding approach in the vast majority of cases. It does have a little more overhead since it needs to redefine the individual methods every time the workflow is executed. But, to me, the benefits far outweigh the costs. The fact that all of the moving parts are co-located makes it much easier to reason about. And, since it depends on lexical binding, all of the promise handlers, including .catch(), .finally(), and .done(), have easy access to the aggregated values.
Reader Comments
For me, there is one more drawback of the approach you call "Explicit Reduction". The functions are not defined now in the same lexical scope, which means they are intended to be somehow independent of each other. But relying on that object that is passed into each of them implicitly couples them all together. I consider it a code smell.
That's why I prefer your first approach, though continue to use all of them in different cases.
Thank you for that comparison!
I'm personally a huge fan of the AsyncJS (https://github.com/caolan/async) library and I use it all the time. I've found it to meet every scenario that I have ever had to code against. I wonder if you have considered it before writing this and what your thoughts are on it?
Great article as always.
@Anton,
I tend to agree, I certainly favor the first approach as well. I like that it keeps them all together. Though, one thing I've been thinking about is breaking out the workflow into its own object that does nothing but aggregate values and return a promise. Something like:
function DoSomethingWorkflow() { ... }
... where this object can have prototype methods that aggregate values on the instance. Then, I could use it like:
var promise = new DoSomethingWorkflow( userID ).execute();
But, that might be a lot more trouble than it's worth. I don't really know all that much about Object Oriented Programming ... so I like to think up little object-experiments like that.
@Villy,
I haven't tried async.js yet; but, I have heard nothing but good things about it. I'll put it on my list of things to investigate, thanks!