Managing And Reporting Errors During Parallel Array Iteration In Lucee CFML 5.3.7.47
Hands-down, one of the most bad-ass features of Lucee CFML is parallel iteration over collections. This feature allows us to call map, filter, each, every, and some using low-level threading. Except, we don't have to care about the threading - Lucee just handles it for us. That said, we do have to care about errors; especially since an error won't halt the parallel iteration. My current approach for this is to map the collection in parallel, generating a "result" value for each operation. Since I've found this to be a very helpful pattern, I wanted to put together a small demo in Lucee CFML 5.3.7.47.
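As a minimal sketch of what "parallel iteration" looks like at the call site: the second argument to the member function turns on parallel mode and the third caps the number of threads. The values array and the doubling callback are hypothetical stand-ins, not part of the demo that follows.

```cfml
<cfscript>

	// Hypothetical sample data; any array works here.
	values = [ 1, 2, 3, 4 ];

	doubled = values.map(
		( value ) => {
			return( value * 2 );
		},
		// Parallel iteration.
		true,
		// Maximum number of threads to use.
		2
	);

	writeDump( doubled );

</cfscript>
```

The callback itself is unchanged from a serial .map() call; only the two trailing arguments differ.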
At InVision, we generate a lot of ZIP files that consist of remote images. The problem is, those remote images don't always exist. And, even if they do exist, the download of the image object itself can fail for a variety of reasons. It's hard to know what the best user-experience (UX) for this is. Do we want the overall ZIP generation to fail due to a single missing image? Or, do we want to generate the ZIP and then let the user know that some of the images failed to download?
I don't think there's a one-size-fits-all solution. But, at least in some cases, I've opted to go with the latter approach: generating the ZIP archive and then letting the user know that some of the images failed. That's the pattern that I want to demonstrate in this post.
When using this approach, I'm going to invoke a parallel .map() on a given collection of image URLs. Each invocation of the .map() operator is responsible for downloading a single image URL and returning the result of the download. Typically, I like to create a Happy Path result object that gets returned if all goes well; and then, modify that result to represent the Sad Path in the case of an error.
To demonstrate some sad path operations, I have purposefully put two invalid URLs in the target Array:
<cfscript>

	// Let's download these images in parallel.
	imageUrls = [
		"https://picsum.photos/500/300",
		"https://picsum.photos/501/301",
		"https://picsum.photos/502/302",
		"https://picsum-BAD-DNS.photos/503/303", // This will fail (invalid domain name).
		"https://picsum.photos/504/304",
		"https://picsum.photos/505/305",
		"https://picsum-BAD-DNS.photos/506/306" // This will fail (invalid domain name).
	];

	results = withTempDirectory(
		( tempDirectory ) => {

			// Since we know that some downloads may fail (bad files, network errors,
			// server errors, etc), we're going to use a MAP instead of an EACH. This
			// way, we can create a "result" for each parallel image download operation
			// that reports the success or failure.
			var iterationResults = imageUrls.map(
				( imageUrl, i ) => {

					// For this type of "results-based" approach, I like to set up a
					// "happy path" result that can be returned AS IS; or, updated in the
					// case of an error.
					var result = {
						success: true,
						error: nullValue(),
						imageUrl: imageUrl,
						imageIndex: i,
						imageFilename: "image-#i#.jpg"
					};

					try {

						fileCopy( imageUrl, "#tempDirectory#/#result.imageFilename#" );

					} catch ( any error ) {

						// Modify the "happy path" result to represent the "sad path".
						result.success = false;
						result.error = error;

					}

					return( result );

				},
				// Parallel iteration.
				true,
				// Maximum number of threads to use.
				10
			);

			return( iterationResults );

		}
	);

	successCount = countWithProperty( results, "success", true );
	errorCount = countWithProperty( results, "success", false );

</cfscript>
<cfoutput>

	<h1>
		Download Results
	</h1>

	<p>
		<strong>Success:</strong> #successCount#,
		<strong>Error:</strong> #errorCount#
	</p>

	<cfif errorCount>

		<p>
			The following images failed to download - re-uploading the images may
			fix some of these errors.
		</p>

		<ul>
			<cfloop value="result" array="#results#">
				<cfif ! result.success>
					<li>
						#encodeForHtml( result.imageUrl )#
					</li>
				</cfif>
			</cfloop>
		</ul>

	</cfif>

</cfoutput>
<cfscript>

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I create a temp directory and pass it to the given operator. Any value returned
	* from the operator is automatically passed back to the calling context. The temp
	* directory is deleted once the operator finishes executing.
	*/
	public any function withTempDirectory( required function operator ) {

		var path = expandPath( "./temp-#createUniqueId()#" );

		directoryCreate( path );

		try {

			return( operator( path ) );

		} finally {

			// NOTE: Normally I would delete all the temp directory stuff once I was done
			// with it; however, for the demo, I'm going to keep this around.
			// --
			// directoryDelete( path, true );

		}

	}

	/**
	* I count the number of items that have the given key-value pair.
	*/
	public numeric function countWithProperty(
		required array collection,
		required string key,
		required any value
		) {

		var count = 0;

		for ( var item in collection ) {

			if ( item[ key ] == value ) {

				count++;

			}

		}

		return( count );

	}

</cfscript>
As you can see, each .map() iteration starts out by declaring a result object. This result object is the "happy path" to be returned if the fileCopy() operation is a success. And, if the download fails, I update the success and error properties of the result so as to represent the "sad path". This result object is then returned in either case.
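Because every iteration returns a result with the same shape, the outcomes can also be split apart with the built-in .filter() member function instead of a custom counting function. The following is only a sketch of that alternative, not the approach used in the demo; the three-item results array is hypothetical sample data standing in for the array that .map() returns.

```cfml
<cfscript>

	// Hypothetical sample data standing in for the demo's "results" array.
	results = [
		{ success: true, imageUrl: "https://picsum.photos/500/300" },
		{ success: false, imageUrl: "https://picsum-BAD-DNS.photos/503/303" },
		{ success: true, imageUrl: "https://picsum.photos/504/304" }
	];

	// Keep only the "sad path" results.
	failedResults = results.filter(
		( result ) => {
			return( ! result.success );
		}
	);

	errorCount = failedResults.len();
	successCount = ( results.len() - errorCount );

	// Pull out just the URLs for reporting back to the user.
	failedUrls = failedResults.map(
		( result ) => {
			return( result.imageUrl );
		}
	);

</cfscript>
```

This trades two passes of a counting loop for one filter, and it leaves the failed results on hand for the error list.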
At the end of this, we've mapped our imageUrls collection onto a "results" collection which we can then use to report the outcome to the user. And, when we run this ColdFusion code, we get the following browser output:
As you can see, by mapping over our collection in parallel, we're able to report the number of successes, the number of errors, and the actual URLs of the failed image downloads. In a production context, the user might then be able to re-upload said images; or, retry the overall download-and-ZIP operation.
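The retry idea can be sketched with the same result-based pattern. The retryFailures() function below is a hypothetical helper, not part of the demo: it assumes each result has the success and error keys shown earlier, and it takes the work to retry as an operation callback so that it makes no assumptions about how the download is performed.

```cfml
<cfscript>

	/**
	* Hypothetical helper (not part of the original demo): I re-run the given operation
	* for each failed result, in parallel, and return a new results array. Results that
	* already succeeded are passed through untouched.
	*/
	public array function retryFailures(
		required array results,
		required function operation
		) {

		return results.map(
			( result ) => {

				if ( result.success ) {

					return( result );

				}

				// Shallow-copy the failed result and reset it to the "happy path".
				var retryResult = structCopy( result );
				retryResult.success = true;
				retryResult.error = nullValue();

				try {

					operation( retryResult );

				} catch ( any error ) {

					retryResult.success = false;
					retryResult.error = error;

				}

				return( retryResult );

			},
			// Parallel iteration.
			true,
			// Maximum number of threads to use.
			10
		);

	}

</cfscript>
```

In the demo, a call like this would have to sit inside the withTempDirectory() callback, since that is the only place the temp directory path is known; the operation callback would then repeat the fileCopy() call using the result's imageUrl and imageFilename.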
Epilogue on Parallel Programming in ColdFusion
One of the most amazing features of ColdFusion is how gosh darn easy the language makes parallel programming. Between the CFThread tag, parallel iteration, and scheduled tasks, performing work asynchronously in CFML is something ColdFusion developers take for granted - we don't realize how hard this type of stuff is for some of the other programming communities out there.
It's just awesome!