Learning ColdFusion 8: CFThread Part III - Set It And Forget It
Now that we have covered the ColdFusion 8 CFThread basics and talked about running parallel threads in a single page, let's look at another parallel thread use case - the "Set It And Forget It" scenario. Unlike our previous example, where we required that all threads finish processing by the end of a single request, we will undoubtedly run into scenarios where we have to put a lot of processing toward an action that will not affect the user experience.
Take, for example, downloading photos. Imagine that we have a web form that allows the user to enter photo URLs into an HTML textarea (one URL per line) and then, on submission, we will download those photos using CFHttp. Traditionally, we would download the photos as the page processes, leaving the end user to stare irritably at a form that is seemingly doing nothing. This scenario assumes that the user cares to know the exact moment that the files are downloaded. But is that usually the case? I would say that most of the time, it is not. The user just wants to know that they are going to be downloaded in a timely manner.
But, even assuming that the user didn't care to know when the photos were done downloading, there is little we could do to make our code more efficient. Sure, there are hacks to get around this, and if you have ColdFusion Enterprise then you can use Gateways. But most of us would rather not use hacks, nor do we have ColdFusion Enterprise. For most of us, there would be no elegant way to handle this. Here is a traditional ColdFusion page that will download the end user's given files:
<!--- Kill extra output. --->
<cfsilent>
<!---
Param the FORM variable that will hold our photo urls.
Remember, each URL is on its own line (separated by
line returns).
--->
<cfparam
name="FORM.photo_url"
type="string"
default=""
/>
<!---
Get a value for the time at which the page started
processing. We will need this to see how long it takes
the page to run.
--->
<cfset intStartTime = GetTickCount() />
<!--- Trim the form field. --->
<cfset FORM.photo_url = FORM.photo_url.Trim() />
<!---
Check to see if the form has been submitted. For
this demo, we will know this if there is a value
in the form field.
--->
<cfif Len( FORM.photo_url )>
<!---
Loop over the URLs. We can treat the text area
as if it were a list of URLs that is using the
line break, line return as the list delimiter.
--->
<cfloop
index="strURL"
list="#FORM.photo_url#"
delimiters="#Chr( 13 )##Chr( 10 )#">
<!---
Now that we have our individual URL, let's
grab the photo binary using CFHttp and store
it directly into a file on the server.
--->
<cfhttp
url="#strURL#"
method="GET"
getasbinary="yes"
path="#ExpandPath( './data/' )#"
file="#GetFileFromPath( strURL )#"
/>
</cfloop>
</cfif>
</cfsilent>
<cfoutput>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<title>ColdFusion 8 - CFThread Demo</title>
</head>
<body>
<h2>
Photo Download
</h2>
<!---
Check to see if the form has been submitted. For
this demo, we will know this if there is a value
in the form field.
--->
<cfif NOT Len( FORM.photo_url )>
<p>
Please enter photo URLs that you would like to
download. Each URL should be on a single line of
the following text area.
</p>
<form
action="#CGI.script_name#"
method="post">
<p>
<textarea
name="photo_url"
cols="70"
rows="20"
>#FORM.photo_url#</textarea>
</p>
<p>
<input type="submit" value="Download Now" />
</p>
</form>
<cfelse>
<p>
Your photos have been downloaded!
</p>
</cfif>
<!--- Output how long it took the page to run. --->
<p>
Page ran in:
#NumberFormat(
((GetTickCount() - intStartTime) / 1000),
",.00"
)#
seconds.
</p>
</body>
</html>
</cfoutput>
Notice that all of our CFHttp calls must finish executing before the results / confirmation page gets displayed to the user. If we run that page a few times, we get the following execution times:
Page ran in: 9.35 seconds.
Page ran in: 10.59 seconds.
Page ran in: 9.34 seconds.
Page ran in: 7.27 seconds.
Page ran in: 8.94 seconds.
We are forcing the user to stare at the web form for a minimum of 7 seconds before they get any sort of feedback. This is totally lame and quite frustrating for the user. But what can we do? It's over 2 megabytes of photos (and frankly, I am amazed it all downloaded in 7 seconds).
However, with ColdFusion 8's CFThread tag, we can now easily and quite elegantly perform what I like to call "Set it and forget it" tasks. These are tasks that we want to launch, but we don't care when they finish and we certainly don't want to make the end user wait for them to finish. These are tasks whose execution time may or may not outlive the current page. Here is the above example modified ever so slightly to use ColdFusion 8's CFThread tag:
<!--- Kill extra output. --->
<cfsilent>
<!---
Param the FORM variable that will hold our photo urls.
Remember, each URL is on its own line (separated by
line returns).
--->
<cfparam
name="FORM.photo_url"
type="string"
default=""
/>
<!---
Get a value for the time at which the page started
processing. We will need this to see how long it takes
the page to run.
--->
<cfset intStartTime = GetTickCount() />
<!--- Trim the form field. --->
<cfset FORM.photo_url = FORM.photo_url.Trim() />
<!---
Check to see if the form has been submitted. For
this demo, we will know this if there is a value
in the form field.
--->
<cfif Len( FORM.photo_url )>
<!---
Loop over the URLs. We can treat the text area
as if it were a list of URLs that is using the
line break, line return as the list delimiter.
--->
<cfloop
index="strURL"
list="#FORM.photo_url#"
delimiters="#Chr( 13 )##Chr( 10 )#">
<!---
Now that we have our individual URL, let's
grab the photo binary using CFHttp and store
it directly into a file on the server.
We are going to launch this CFHttp call in a
new thread using CFThread. We are not going
to wait for this call to finish.
--->
<cfthread
action="run"
name="photo_#GetFileFromPath( strURL )#">
<!--- Save the photo. --->
<cfhttp
url="#strURL#"
method="GET"
getasbinary="yes"
path="#ExpandPath( './data/' )#"
file="#GetFileFromPath( strURL )#"
/>
</cfthread>
</cfloop>
</cfif>
</cfsilent>
<cfoutput>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<title>ColdFusion 8 - CFThread Demo</title>
</head>
<body>
<h2>
Photo Download
</h2>
<!---
Check to see if the form has been submitted. For
this demo, we will know this if there is a value
in the form field.
--->
<cfif NOT Len( FORM.photo_url )>
<p>
Please enter photo URLs that you would like to
download. Each URL should be on a single line of
the following text area.
</p>
<form
action="#CGI.script_name#"
method="post">
<p>
<textarea
name="photo_url"
cols="70"
rows="20"
>#FORM.photo_url#</textarea>
</p>
<p>
<input type="submit" value="Download Now" />
</p>
</form>
<cfelse>
<p>
Your photos are being downloaded right now.
They should be done shortly.
</p>
</cfif>
<!--- Output how long it took the page to run. --->
<p>
Page ran in:
#NumberFormat(
((GetTickCount() - intStartTime) / 1000),
",.00"
)#
seconds.
</p>
</body>
</html>
</cfoutput>
The changes here are minute. Notice that instead of giving the message, "Your photos have been downloaded!", we are returning the message, "Your photos are being downloaded right now. They should be done shortly." We have to do this because the form processing is no longer a reflection of the photo downloads (as it was in the first demo). This is because our CFHttp calls are now being wrapped in CFThread tags for parallel, asynchronous processing. Running the above code a few times, we get the following execution times:
Page ran in: 0.02 seconds.
Page ran in: 0.00 seconds.
Page ran in: 0.00 seconds.
Page ran in: 0.00 seconds.
Page ran in: 0.02 seconds.
The web form itself is now processing instantly, providing the end user with feedback immediately upon submission. The photos might not be done downloading yet, but the user is assured that they should be done soon. And, in all likelihood, they will be done faster in parallel than they were done in series as in the first demo.
While not every page uses tasks that should be considered for a "Set it and forget it" scenario, when they do come up, ColdFusion 8 is certainly going to make dealing with them a piece of cake. Just a note of caution: since these threads get launched in parallel to the current page request, if they crash (ex. the CFHttp exceeds its execution time limit), our main page will not be halted (if it hasn't already finished processing). So, if you are going to run a Set it and forget it style task, be very careful that you have fully tested your code and that you have failure-recovery methods in place (such as CFTry / CFCatch), since child thread failures will not be reported to the end user by default.
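As a rough sketch of the kind of failure recovery mentioned above, the CFHttp call inside the thread could be wrapped in CFTry / CFCatch, with any error written to a log file. The sample URL, the thread name, the `photoURL` and `photoPath` attribute names, and the `photo_download` log file name are all made up for this illustration; none of them come from the demo. The `throwonerror` attribute is added so that a failed request raises an exception for CFCatch to handle:

```cfm
<!--- Placeholder URL for this sketch. --->
<cfset strURL = "http://www.example.com/photos/sample.jpg" />

<cfthread
	action="run"
	name="photoDownload"
	photoURL="#strURL#"
	photoPath="#ExpandPath( './data/' )#">

	<cftry>
		<!--- Save the photo. A failed request will throw. --->
		<cfhttp
			url="#ATTRIBUTES.photoURL#"
			method="GET"
			getasbinary="yes"
			throwonerror="yes"
			path="#ATTRIBUTES.photoPath#"
			file="#GetFileFromPath( ATTRIBUTES.photoURL )#"
			/>

		<!--- The user never sees this error, so log it. --->
		<cfcatch type="any">
			<cflog
				file="photo_download"
				type="error"
				text="Photo download failed for #ATTRIBUTES.photoURL#: #CFCATCH.Message#"
				/>
		</cfcatch>
	</cftry>
</cfthread>
```

Logging is only one option; the catch block could just as easily send an email or write a record to a database, depending on how the failure should be followed up.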
Reader Comments
One of the things I'd like to do is write up a quick UDF just for fire and forget HTTP calls. There have been many times where I wish I had such a thing (like the ping support in blogcfc).
@Ray,
Had some time, thought I would give it a shot:
www.bennadel.com/index.cfm?dax=blog:751.view
Actually, dealing with CFThread was a nice experience because I forgot that you need to pass data to it via the CFThread tag attributes (all this stuff is still sinking in).
Thanks for the posts Ben. I'd never fully grokked threading concepts before and this goes a long way toward clearing things up for me.
@Josh,
Awesome dude, glad to help clear some stuff up.
Some people might find use for the aSyncHTTP lib by Mark Mandel also.
To expand on this example, you can even track which files have been downloaded via the user's Session. Then use AJAX to check which files have already been processed and downloaded.
This would allow you to quickly spit back a page to the user and then you could build a very lightweight page that keeps checking the processed queue showing the progress. Once the queue is finalized, you could stop checking the queue.
-Dan
@Dan,
I am actually working on a related post that should be up shortly.
None of your pages print correctly. Do you have a style sheet or security setting in place?
@Rich,
There is no security issue. The pages have just never been optimized for printing. Many people have brought this up; I suppose it is time that I actually have a better printing solution. I will try to make this a priority.
@Kevin,
Try storing the results of the CFHTTP call into the THREAD scope:
<cfhttp... result="THREAD.HttpRequest" />
Then, after your CFThread tag, join all the threads back to the current page:
<cfthread action="join" />
Then, CFDump out the CFTHREAD scope and see what those results were. At the very least, you will be able to see if there are any errors.
@Ben -
Thanks - OK I think I've narrowed it down to the application I'm running it in for some reason -
when I run the same code in a non fusebox application - it works fine - with the threading -
Within our fusebox app I'm getting a connection error - when I don't cfthread it, it works fine again - doesn't seem to matter if the framework is in 'development' mode or not -
Here's the code and the result of the dump of the cfhttp call:
<cfthread action="run" name="photo_#GetFileFromPath( strURL )#">
<!--- Save the photo. --->
<cfhttp url="#strURL#" method="GET" getasbinary="yes" path="#ExpandPath( './xml/' )#" file="#GetFileFromPath( strURL )#" useragent="Mozilla" result="cfthread.httprequest" />
</cfthread>
<cfthread action="join" />
....
HTTPREQUEST
struct
Charset [empty string]
ErrorDetail [empty string]
Filecontent Connection Failure
Header HTTP/1.1 200 OK Connection: close Expires: Fri, 08 Feb 2008 22:12:07 GMT Date: Tue, 05 Feb 2008 04:36:12 GMT Accept-Ranges: bytes Server: Apache Content-Length: 1010 Cache-Control: max-age=322555 ETag: "3f2-52e56c00" Last-Modified: Wed, 23 Aug 2006 07:30:56 GMT Content-Type: image/gif
Mimetype image/gif
Responseheader
struct
Accept-Ranges bytes
Cache-Control max-age=322555
Connection close
Content-Length 1010
Content-Type image/gif
Date Tue, 05 Feb 2008 04:36:12 GMT
ETag "3f2-52e56c00"
Expires Fri, 08 Feb 2008 22:12:07 GMT
Explanation OK
Http_Version HTTP/1.1
Last-Modified Wed, 23 Aug 2006 07:30:56 GMT
Server Apache
Status_Code 200
Statuscode 200 OK
Text NO
I just tested it in a new install of Fusebox 4.1 on the same instance - even using a different file extension (other than .cfm which we have custom setup) to see if it had something to do with that.
The code worked just FINE in this environment - so I'll have to track down what the issue is specific to our application using cfthread.
Thanks again
@Kevin,
Very interesting. When you figure it out, please let us know. I have no idea what would be causing the connection issues.
Wow - well I think the problem is solved / mystery unveiled - here it is and I think it's a good one -
RUNNING CFTHREAD using a Virtual Directory and ExpandPath to find the path.
1.) We have a virtual directory path on the app I was using to test with - and if I ran the example under a vd of 'j2w' the expandpath would return an incorrect path to be something like a 'c:\websites\xyz\j2w\' - where that doesn't exist - BUT ONLY when it was in a cfthread.
When i remove the CFTHREAD and still executed the page under the virtual directory url, the expandpath would return as 'c:\websites\xyz\' (w/o the virtual directory in there) which is correct, and therefore would save the images off just fine.
It all seems to work just fine now as long as the filepath is not a relative one being defined through expandpath when the URL may be using a virtual directory - (something to do with pageContext.getServletContext() in a cfthread possibly?) -
So I'll take what I know and try to implement threading w/in a cfc that is being managed by ColdSpring and see what other sorts of issues I can drum up.
eek - Thanks
@Kevin,
Good detective work! One way to get around this would be to pass in the path of the directory as an attribute of the tag.
<cfthread
path="#ExpandPath( './xml/' )#" .....>
Then, inside the thread you can refer to that using the ATTRIBUTES scope. This should keep all the paths relative to the primary executing page and cut down on any confusion.
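A fuller sketch of that approach might look something like the following. The `path` attribute is the one from the snippet above; the sample URL, the thread name, and the `photoURL` attribute name are placeholders chosen for illustration only:

```cfm
<!--- Placeholder URL for this sketch. --->
<cfset strURL = "http://www.example.com/images/sample.gif" />

<!---
	ExpandPath() is evaluated here, in the page request,
	before the thread starts running.
--->
<cfthread
	action="run"
	name="photoDownload"
	photoURL="#strURL#"
	path="#ExpandPath( './xml/' )#">

	<!--- Inside the thread, only the ATTRIBUTES scope is used. --->
	<cfhttp
		url="#ATTRIBUTES.photoURL#"
		method="GET"
		getasbinary="yes"
		path="#ATTRIBUTES.path#"
		file="#GetFileFromPath( ATTRIBUTES.photoURL )#"
		/>
</cfthread>
```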
I'm just wondering what happens to so called "Set it and Forget it" threads once they're done? What is their lifespan?
On my local server, I keep hitting the maximum amount of threads and also noticed there was quite a few queued as well. (ie. Server Monitor)
BTW, this has nothing to do with your code sample, I'm just wondering.
@Dan,
I believe the thread just ends and gets added back to the thread pool? I don't really know what happens under the hood.
Coming in late here, but I have just gotten into the CFTHREAD to try and speed up a very slow application (written in CF5 but being upgraded to CF8).
One of the things the application does right now is show statistics in a graph form but the queries (about 20 of them) run really slow. We optimized the queries on the Oracle side and that sped it up dramatically. But the graphing is still slow. So I would like to put that off to the side and bring the graph up once it is done.
So using this example of yours, I can display a page showing the non-graphical stuff really quick, but how can I have it bring up the graph once it is done?
To put it in the context of your graphics demo, how would you show the graphics once they were loaded?
@Don,
Because you want to display the graphic, it becomes a bit more tricky. I am not all that experienced with CFChart, so I am not sure how the graphic actually gets displayed. Are you saving the image to a file and then using the IMG tag? Or is the CFChart just outputting it to the page?
I suppose you could always put the chart itself inside an IFrame or something? Or use AJAX to load the page that displays the CFChart.
CFChart just posts to the page. An iframe? hmmmmm
That is an idea.
@Don,
Or, you could make the IMG tag point to a CFM page that generates the report and then just give that page a long timeout. Of course, if you do that, then you can't return HTML - you'd have to trap the IMG tag that gets generated by the CFChart tag and return the SRC attribute. Not the easiest thing. I'd check into the IFrame first or some sort of AJAX load dealy.
Maybe look into CFDiv for this sort of thing - seems like a great use-case.
Hi Ben,
I'm very (very) late here, but I've found a small issue with your code you'll probably want to correct (as I'm sure people will keep coming here to learn about cfthread for many years to come).
When I tried it out and checked the downloads, I found that it had only downloaded the last picture of the input list.
I realised that this was because the threads were all sharing the same strURL variable from the parent code. By the time *any* of the cfhttps started downloading the respective files, the loop that started each thread had completed and strURL had been set to the value of the URL of the last picture: therefore all the threads downloaded the same picture and saved them with the same file name.
To fix this I added an attribute myURL to the cfthread tag and set it to strURL, like so:
<cfthread action="run" name="photo_#GetFileFromPath( strURL )#" myUrl="#strURL#">
And then replaced references to strURL with references to attributes.myURL, like this:
<cfhttp url="#attributes.myURL#" (...) file="#GetFileFromPath( attributes.myURL )#" />
That worked fine.
I have also tested using the thread scope instead, by setting thread.myURL = strURL at the beginning of each thread and then using that in the cfhttp, but even that assignment was not quick enough: in all threads, thread.myURL was set to the last value of strURL.
Cheers, and let me thank you for your great help to all of us. I wonder if you realise how much lots of Coldfusion developers around the world owe to your thorough explorations and clear explanations. Beyond the basics, the vast majority of what I have learned of Coldfusion comes from your blog.
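For reference, the fix described in the comment above can be sketched as follows. The `myURL` attribute name is the commenter's; the two sample URLs and the `strURLList` variable are placeholders added so the snippet stands on its own. As far as I understand it, each thread receives its own copy of the attribute value at the moment the CFThread tag is executed, so it no longer matters that the loop keeps changing strURL:

```cfm
<!--- Placeholder list of URLs, one per line. --->
<cfset strURLList = "http://www.example.com/a.jpg#Chr( 13 )##Chr( 10 )#http://www.example.com/b.jpg" />

<cfloop
	index="strURL"
	list="#strURLList#"
	delimiters="#Chr( 13 )##Chr( 10 )#">

	<!--- Hand the current URL to the thread as an attribute. --->
	<cfthread
		action="run"
		name="photo_#GetFileFromPath( strURL )#"
		myURL="#strURL#">

		<!--- Save the photo using the thread's own copy. --->
		<cfhttp
			url="#ATTRIBUTES.myURL#"
			method="GET"
			getasbinary="yes"
			path="#ExpandPath( './data/' )#"
			file="#GetFileFromPath( ATTRIBUTES.myURL )#"
			/>
	</cfthread>
</cfloop>
```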