Racing To Show Asynchronous Report-Generation Results With CFThread In Lucee CFML 5.3.6.61
At InVision, I've been building a reporting module for our enterprise clients. And, depending on the size of the account, any given report may run in seconds; or, for rather large companies, minutes. The problem is, this report generation sits at the very back of a long chain of intermediaries: CDNs (Content-Delivery Networks), ELBs (Elastic Load-Balancers), K8s (Kubernetes) ingresses, Nginx proxies, Java Servlet containers, and finally, a Lucee CFML / ColdFusion runtime. All of these intermediaries have some sort of request-timeout setting which may, at any moment, terminate an in-flight HTTP request. Which means, even if the end-user were patient enough to sit-and-wait, letting the report generation run indefinitely isn't a viable option. But, treating all reports as "asynchronous" would also be a poor user experience (UX). As such, I wanted to think about a way in which I could show "fast reports" to the user immediately while still allowing "slow reports" to run asynchronously in the background in Lucee CFML 5.3.6.61.
The goal here is to let the user download a report if the report can be generated within a reasonable amount of time - something like 10-seconds. But, if the report is going to take longer than said timeout, we want to show the user a "Still running" message and then alert the user via email when the report finally completes.
To do this, I think I can use two of the ColdFusion CFThread tag actions:
- action = "run" (the default action)
- action = "join"
The join action is very cool because it accepts a timeout attribute. This timeout attribute tells the parent ColdFusion request how long to block-and-wait for the given CFThread to complete execution. And, if the CFThread hasn't completed within the given time, the parent request continues on processing the top-level request.
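To make the join timeout a little more concrete, here is a minimal sketch that is separate from the report demo. The thread name (demoThread) and the durations are made up for illustration. It joins a slow thread with a short timeout and then reads the status key that Lucee exposes on the cfthread scope:

```cfml
<cfscript>

	// Spawn a thread that simulates 2 seconds of work.
	thread
		name = "demoThread"
		action = "run"
		{

		sleep( 2000 );
		thread.message = "done";
	}

	// Block-and-wait for AT MOST 500ms. The thread will not be finished by then, so
	// the parent request simply carries on.
	thread
		name = "demoThread"
		action = "join"
		timeout = 500
	;

	// The thread should still be running at this point.
	systemOutput( "After timed join: #cfthread.demoThread.status#", true, false );

	// A join WITHOUT a timeout blocks until the thread has finished.
	thread
		name = "demoThread"
		action = "join"
	;

	systemOutput( "After full join: #cfthread.demoThread.status#", true, false );

</cfscript>
```

The timed join does not stop the thread; it only stops the parent request from waiting on it.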
So, from a high-level algorithm standpoint, I want to do something like this:
1. Spawn asynchronous report generation using CFThread(run).
2. Block-and-wait for N-seconds using CFThread(join).
3. If the asynchronous report has not yet finished, tell the ColdFusion application to email the results of the report to the user; and, in the meantime, show the user a "still running" HTML page.
4. However, if the asynchronous report has finished, tell the ColdFusion application to immediately download the results as a file-attachment using CFHeader and CFContent.
Since I am not really generating any reports in this demo, I am going to simulate report-generation latency with a sleep() function call. And, different degrees of simulated latency can be defined using a URL parameter. Here's my report initiation page:
<h1>
	Reporting Module
</h1>

<!---
	Since we're just simulating latency for report generation, let's provide a number
	of links each with an increasingly large latency.
	--
	NOTE: By using TARGET="_BLANK", the report will open in a NEW TAB. And, if the report
	completes within a reasonable amount of time, the browser will AUTOMATICALLY CLOSE
	the tab and download the report to the user's computer (via Content-Disposition).
	However, if the report doesn't complete within the given timeout, having the new tab
	will give us an appropriate place in which to show the "still running" message.
--->
<p>
	<a href="report.cfm?latency=1000" target="_blank">Run report (1,000)</a> ,
	<a href="report.cfm?latency=3000" target="_blank">Run report (3,000)</a> ,
	<a href="report.cfm?latency=5000" target="_blank">Run report (5,000)</a>
</p>
As you can see, we can call the report-generation page with latencies ranging from 1-second to 5-seconds. Each report is opened in a new browser tab. This is important because it dove-tails with the browser's native behavior: if a new tab ends-up with a Content-Disposition of attachment, the browser will automatically download the file and close the tab. However, if the new tab ends-up with an HTML page - telling the user that the report is running in the background - it allows us to communicate with the end-user without disrupting the original report-generation context.
The report.cfm ColdFusion file is going to define a runReport(timeout) function. This user defined function (UDF) will either return the report data if it can be generated in a timely manner; or, it will return null if the report is going to continue running in the background. If the result of the runReport() function is non-null, we will download the file immediately; and, if the result is null, we will present the user with an HTML page telling them about the asynchronous notification.
In order to alter the outcome of the CFThread tag, we need a way for the CFThread tag to communicate with the parent request. And for this, I'm going to pass a ColdFusion Closure into the CFThread tag as an attribute. The last thing the asynchronous report generation will do is invoke this closure. Then, the closure will determine what to do with the given results.
The default "outcome" will be to just return the results. However, if the CFThread(join) action indicates that the report is still running, we'll switch that "outcome" over to being an asynchronous email notification.
<cfscript>

	param name = "url.latency" type = "numeric" default = "0";

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I run the report asynchronously. But, I will BLOCK-AND-WAIT for a brief period of
	* time; and, if the report finishes prior to this timeout, the report data will be
	* returned. Otherwise, NULL will be returned and the user will be notified by email
	* when the report generation has completed.
	*/
	public any function runReport( required numeric waitTimeout ) {

		// The default action for report generation will be "noop" (No-Operation) since
		// it will be the action performed if the report finishes prior to the timeout.
		var onResultsAction = "noop";

		// The OnResults closure is the TUNNEL through which the CFTHREAD will
		// communicate with the top-level page. It is the very last processing command
		// that the following CFThread will execute.
		var onResults = ( required array results ) => {

			if ( onResultsAction == "email" ) {

				systemOutput( "Sending Report Email:", true, false );
				systemOutput( results, true, false );
				systemOutput( "- -", true, false );

			} else {

				systemOutput( "Report data consumed by parent request", true, false );
				systemOutput( "No action taken by async-report.", true, false );
				systemOutput( "- -", true, false );

			}

		};

		// Initiate the ASYNC report generation.
		thread
			name = "asyncReport"
			action = "run"
			simulatedLatency = url.latency
			onResults = onResults
			{

			// Sometimes a report may run really quickly; and, sometimes, it may run
			// really slowly. This could have to do with any number of things: database
			// load, cache state, CPU load, remote API issues, etc. For this experiment,
			// we're just going to simulate the latency with a sleep() command.
			sleep( simulatedLatency );

			thread.results = [
				[ "ID", "Name", "CreditScore" ],
				[ 1, "Timmy Timmerson", 78 ],
				[ 2, "Samantha Samarita", 62 ],
				[ 3, "Arnold Arnoldi", 83 ]
			];

			onResults( thread.results );

		}

		// BLOCK-AND-WAIT for the report to run - if the CFThread doesn't complete in the
		// given timeout, the top-level request will continue processing.
		thread
			name = "asyncReport"
			action = "join"
			timeout = waitTimeout
		;

		// At this point, the report MAY OR MAY NOT have completed. Regardless, let's
		// update the action to indicate that the user should be notified by email. If
		// the CFThread has already completed, this change is meaningless; however, if
		// the CFThread is still running, this will change the outcome of the internal
		// onResults() call.
		onResultsAction = "email";

		// CAUTION: There is a TINY RACE CONDITION here where the THREAD.RESULTS value
		// may have been assigned BUT the onResults() call MAY NOT HAVE BEEN EXECUTED.
		// However, all this means is that the user will be notified by email about the
		// report which will be superfluous if we're about to show them the data on-
		// screen - this is not something worth being concerned about.
		return( cfthread.asyncReport.results ?: nullValue() );

	}

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	// Run the report and WAIT UP TO 3-seconds for it to complete.
	data = runReport( 3000 );

	// If the data is NULL, it means the report did not complete in a reasonable amount
	// of time and will continue to run in the background.
	if ( ! isNull( data ) ) {

		reportFilename = "report-#createUniqueID()#.json";

		header
			name = "Content-Disposition"
			value = "attachment; filename=""#encodeForUrl( reportFilename )#""; filename*=UTF-8''#encodeForUrl( reportFilename )#"
		;
		content
			type = "application/x-json; charset=utf-8"
			variable = charsetDecode( serializeJson( data ), "utf-8" )
		;

		// NOTE: The Abort here is not needed - CFContent will implicitly abort. I'm
		// putting it here to underscore that nothing else will be processed in the
		// top-level page request.
		abort;

	}

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	// CAUTION: We will only make it this far in the page request if the report did not
	// complete within the CFThread JOIN-timeout.

</cfscript>
<p>
	<h2>
		Your report is still running.
	</h2>
	<p>
		Your report is still running in the background.
		You <strong>will be notified by email</strong> when it has completed.
	</p>
	<p>
		<a href="./index.cfm">Run another report</a>
	</p>
</p>
As you can see, if the runReport() UDF returns data, we use the CFContent tag to immediately download it to the user's computer, which implicitly aborts the rest of the page request. However, if the runReport() UDF returns null, we allow the rest of the request to process, which renders the HTML view.
Now, if we run this demo and try the different simulated latency values, we get the following output:
As you can see, when we run the report with 1,000-ms of simulated latency, the report runs in a reasonable amount of time and we download it immediately to the user's computer. However, when we run the report with 5,000-ms of simulated latency, the request blocks for 3-seconds (our timeout) and then shows the user an "asynchronous notification message". We can then see from the terminal output that an "email" was sent several seconds later.
Now, if you look closely, you'll see something interesting when we use 3,000-ms of simulated latency: we both download the report data immediately to the user's computer and send an asynchronous email (which you can see in the terminal). This is a tiny race condition when the report-generation time is roughly equal to the join timeout. In some cases we end up switching to the "email" outcome just after the report results have been generated, but before the onResults() closure is invoked.
For me, this is not a problem. In fact, it errs on the side of safety. I'd rather have an edge-case where the user receives an unnecessary email instead of an edge-case where the user never receives any report.
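If the duplicate email did matter, one possible way to close that window is a named lock, so that the "has the report finished?" check and the switch to "email" happen as a single atomic step. The following is only a sketch of that idea, not what the demo above does. The runReportWithLock() function, the state struct, the lock name, and the 5-second lock timeout are all hypothetical names and values introduced here:

```cfml
<cfscript>

	// HYPOTHETICAL variation on runReport(). It returns the report data if the thread
	// finished first; or, NULL if the parent switched the outcome to "email" first.
	public any function runReportWithLock(
		required numeric waitTimeout,
		required numeric simulatedLatency
		) {

		var lockName = "asyncReport-#createUUID()#";
		var state = { action: "noop", isDone: false };

		var onResults = ( required array results ) => {

			var shouldEmail = false;

			// Mark the report as done and read the action as ONE atomic step.
			lock name = lockName type = "exclusive" timeout = 5 {
				state.isDone = true;
				shouldEmail = ( state.action == "email" );
			}

			if ( shouldEmail ) {
				systemOutput( "Sending Report Email (simulated).", true, false );
			}

		};

		thread
			name = "asyncReportWithLock"
			action = "run"
			simulatedLatency = arguments.simulatedLatency
			onResults = onResults
			{

			sleep( simulatedLatency );
			thread.results = [ [ "ID", "Name" ], [ 1, "Timmy Timmerson" ] ];
			onResults( thread.results );
		}

		thread
			name = "asyncReportWithLock"
			action = "join"
			timeout = waitTimeout
		;

		var isReady = false;

		// Either the closure has already run (use the data, no email) or it has not
		// (switch to email, return NULL). The lock prevents both from happening.
		lock name = lockName type = "exclusive" timeout = 5 {
			if ( state.isDone ) {
				isReady = true;
			} else {
				state.action = "email";
			}
		}

		if ( isReady ) {
			return( cfthread.asyncReportWithLock.results );
		}

		return( nullValue() );

	}

	data = runReportWithLock( 3000, 1000 );
	systemOutput( isNull( data ) ? "Still running." : "Report ready.", true, false );

</cfscript>
```

The trade-off is a bit more ceremony in exchange for a guarantee that the user gets either the download or the email, but never both.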
One thing to consider here is that the request-timeout of the ColdFusion page also affects the CFThread tag. We can always increase the request-timeout inside the CFThread tag; but, I am not sure how that might affect the execution of the Closure, which is bound to the parent request. I'll have to do some more testing of that in Lucee CFML.
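As a rough illustration of increasing the request-timeout inside the thread, a sketch along these lines should work. The thread name (slowReport) and the 300-second value are arbitrary assumptions, and this sketch deliberately says nothing about how the setting interacts with a closure bound to the parent request:

```cfml
<cfscript>

	thread
		name = "slowReport"
		action = "run"
		{

		// ASSUMPTION: 300 seconds is an arbitrary value chosen for illustration. The
		// setting is applied from inside the thread body.
		setting requestTimeout = 300;

		// Stand-in for slow report generation.
		sleep( 1000 );

		thread.isDone = true;
	}

	thread
		name = "slowReport"
		action = "join"
	;

	systemOutput( "Thread finished: #cfthread.slowReport.isDone#", true, false );

</cfscript>
```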
Can we all just stop for a moment and think about how awesome the CFThread tag is in ColdFusion? Think about all the times you've heard people in the web-development world talk about "how challenging" parallel processing is. And now, think about how CFML just removes the vast majority of that complexity with its beautiful asynchronous constructs. Of course, there are still race-conditions and resource-contention to worry about (when it is relevant). But, with ColdFusion's threading, I find that in most cases, it "just works".
Reader Comments
Awesome trick heh? you beat me to blogging about this one....
One thing you have to be careful about is error handling inside the thread, you need to manually catch and handle any errors, as any normal error handling/reporting won't be triggered within a thread.
Any errors inside a thread are logged out to thread.log
@Zac,
Ugg, I had actually intended to have a little addendum about error handling, but I totally forgot by the time I was done writing. You should see the try/catch stuff I usually have around CFThread. I normally break it up into two different functions - one that manages the spawning and then one that does the actual "business" logic.
And sometimes, if I have a Component that does a lot of async things, I'll actually just wrap that all up in some sort of async wrapper like:
runSafelyAsync( "someMethod" );
... where I then have all the try catch stuff that eventually calls:
invoke( variables, methodName );
Of course, every situation is different and there are loads of variations on error-handling. But, it's more or less some variation on this (for me).
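A rough sketch of the two-function split described above might look like the following. Only runSafelyAsync() and the invoke( variables, methodName ) call come from the comment itself; the ReportService component and the generateReport() method are hypothetical names used for illustration:

```cfml
// ReportService.cfc - HYPOTHETICAL component used only to illustrate the pattern.
component {

	/**
	* I manage the spawning of the thread and own ALL of the error handling, since
	* errors inside a CFThread do not reach the normal error handling.
	*/
	public void function runSafelyAsync( required string methodName ) {

		thread
			name = "runSafelyAsync-#createUUID()#"
			action = "run"
			methodName = arguments.methodName
			{

			try {

				invoke( variables, methodName );

			} catch ( any error ) {

				// Swap in whatever logging / alerting the application really uses.
				systemOutput( "Async method [#methodName#] failed: #error.message#", true, true );

			}

		}

	}

	/**
	* I hold the actual "business" logic and know nothing about threads.
	*/
	private void function generateReport() {

		sleep( 1000 );
		systemOutput( "Report generated.", true, false );

	}

}
```

Calling code would then look something like new ReportService().runSafelyAsync( "generateReport" ), which keeps the threading concerns out of the business method entirely.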
@All,
As a quick follow-up, I wanted a quick sanity check to make sure that making a call from a long-running CFThread tag back into a timed-out parent page context couldn't cause Lucee CFML to explode: www.bennadel.com/blog/3903-calling-into-a-timed-out-parent-page-context-from-a-cfthread-tag-in-lucee-cfml-5-3-6-61.htm
"Page Context" is a funky beast in ColdFusion; so, I am sure that I am not thinking of some edge-cases. But, at least for the vast majority of my use-cases, which is just having CFThread invoke methods in the variables scope (typically as part of a ColdFusion Component), this seems to "just work".