
POST Streaming Multi-Part Form Data From ColdFusion Using Java And Formidable.js


Lately, I've been playing around with the idea of streaming form posts from a ColdFusion server such that the entire contents of the form post never have to be buffered in local memory at any one time. I was able to post file-only content from one ColdFusion server to another; but, when it came to multi-part form data (i.e. form fields), it seemed as if the target ColdFusion server was not able to handle the streaming. So, I thought I would try streaming multi-part form data from ColdFusion to a Node.js server which, typically, is more than happy to handle long-running and/or streaming connections.

Until now, I've never actually POST'd anything to a Node.js server; I've only ever dealt with GET requests that parse query strings and write responses. As such, I had to do a little searching for a form-parsing module. As luck would have it, Felix Geisendorfer recently published a Node.js module - formidable - which deals specifically with parsing incoming form POSTs as they are being streamed over the connection.

Like so many Node.js modules, formidable builds on Node's event-emitter model, allowing callbacks to be bound to certain events that get triggered as the request is parsed. For our demo purposes, we're going to be listening for three of the request-parsing events (sketched briefly just after this list):

  • field - This gets triggered when a single field has been fully parsed out of the incoming request.

  • file - This gets triggered when a single file has been fully parsed out of the incoming request (and saved to the local file system as a TMP file).

  • end - This gets triggered when the incoming form POST has been fully parsed.
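To give a sense of the binding pattern before we get to the full server, here is a minimal sketch of my own (it assumes, like the full demo, that the formidable module has been cloned into a local "formidable" folder):

// A minimal sketch of the formidable event-binding pattern described
// above. This is illustration only - the complete server follows below.
var formidable = require( "./formidable" );

// Each incoming request gets its own IncomingForm instance.
var form = new formidable.IncomingForm();

// Fired once a single form field has been fully parsed.
form.on( "field", function( name, value ){ /* ... */ } );

// Fired once a single file has been fully parsed and saved to disk.
form.on( "file", function( name, file ){ /* ... */ } );

// Fired once the entire form post has been parsed.
form.on( "end", function(){ /* ... */ } );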

Once I cloned the formidable Git repo, setting up the Node.js server was quite easy. The following code creates an HTTP server whose request callback looks for requests posted to the "/upload" URL. These requests are assumed to be POST requests and are passed to a formidable instance. Field and file parsing events are then logged to the console.

Server2.js (Our Node.js Server)

// Include the necessary modules.
var sys = require( "sys" );
var http = require( "http" );

// This module will help us parse multi-part form data as it is
// being streamed in from the ColdFusion server.
var formidable = require( "./formidable" );


// ---------------------------------------------------------- //
// ---------------------------------------------------------- //


// Create an instance of the HTTP server.
var server = http.createServer(
	function( request, response ){


	// Check to see if this request is a form upload.
	if (request.url == "/upload") {


		// Create a new instance of multi-part form parser.
		var form = new formidable.IncomingForm();

		// Point to the TMP directory in the server folder - this is
		// where our files will be uploaded.
		form.uploadDir = require( "path" ).join( __dirname, "tmp" );

		// Listen for field completion.
		form.on(
			"field",
			function( name, value ){

				// Log the field to the console.
				console.log( "FIELD:", name, value );

			}
		);

		// Listen for the file completion.
		form.on(
			"file",
			function( name, file ){

				// Log the file to the console.
				console.log(
					"FILE:",
					name,
					file.filename,
					("[" + file.length + " Bytes]")
				);

			}
		);

		// Listen for the end of the form post (remember, we've been
		// parsing this as it was streaming in).
		form.on(
			"end",
			function(){

				// Write the successful headers.
				response.writeHead(
					200,
					{
						"content-type": "text/plain"
					}
				);

				// Write the content.
				response.end( "Form has been uploaded!" );

			}
		);

		// Signify the new request in the console.
		console.log(
			"-----------------------------------",
			"\n+-- New Multi-Part Form Request --+",
			"\n-----------------------------------"
		);

		// Now that we have our form set-up, let's start parsing the
		// files and fields in the incoming request.
		form.parse( request );


	// This request was NOT an upload.
	} else {


		// Write the failed headers.
		response.writeHead(
			404,
			{
				"content-type": "text/plain"
			}
		);

		// Write the content.
		response.end( "Not Found." );


	}


});


// Point the server to listen to the given port for incoming
// requests.
server.listen( 8080 );


// ---------------------------------------------------------- //
// ---------------------------------------------------------- //


// Write debugging information to the console to indicate that
// the server has been configured and is up and running.
sys.puts( "Server is running on 8080" );

Luckily, all of the heavy lifting is encapsulated inside the formidable Node.js module. This leaves our server quite sparse when it comes to configuration.
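As a side note, before bringing ColdFusion into the mix, it can be handy to poke the /upload end point with a tiny Node.js test client just to confirm that fields are being parsed. The following is only a sketch of mine, not part of the original demo; it assumes a Node.js version that exposes http.request(), and the boundary and field values are arbitrary:

// test-client.js - post a tiny, hand-rolled multi-part form to the
// /upload end point defined above. Illustrative sketch only.
var http = require( "http" );

var boundary = ( "----testboundary" + new Date().getTime() );
var crnl = "\r\n";

// Build a one-field multi-part body by hand; this mirrors the wire
// format that the ColdFusion code below streams out.
var body =
	"--" + boundary + crnl +
	"Content-Disposition: form-data; name=\"title\"" + crnl + crnl +
	"The Bride" + crnl +
	"--" + boundary + "--" + crnl;

// Open the request to the local Node.js server.
var request = http.request(
	{
		host: "localhost",
		port: 8080,
		path: "/upload",
		method: "POST",
		headers: {
			"Content-Type": ( "multipart/form-data; boundary=" + boundary ),
			"Content-Length": Buffer.byteLength( body )
		}
	},
	function( response ){

		// Log the status so we can confirm the 200 response.
		console.log( "Status:", response.statusCode );

	}
);

// Send the body and finish the request.
request.write( body );
request.end();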

OK, so now that we have our target Node.js server configured to parse incoming requests, let's take a look at our ColdFusion code. The following code is very similar to the code I posted previously; the only difference here is that, as we assemble and stream the multi-part form data, we sleep the upload between the field definitions and the file post. In this way, we can see (from the Node.js console) that ColdFusion is truly streaming the POST content without having to buffer it locally.

<!---
	Set up the target URL for posting. In this demo, we will be
	posting to a Node.js server which does have the ability to handle
	multi-part form data in a streaming, non-buffered fashion.
--->
<cfset postUrl = "http://localhost:8080/upload" />

<!---
	Create an instance of our Java URL - This is the object that we
	will use to open the connection to the above location.
--->
<cfset targetUrl = createObject( "java", "java.net.URL" ).init(
	javaCast( "string", postUrl )
	) />

<!---
	Now that we have our URL, let's open a connection to it. This
	will give us access to the input (download) and output (upload)
	streams for the target end point.

	NOTE: This gives us an instance of java.net.URLConnection (or
	one of its sub-classes).
--->
<cfset connection = targetUrl.openConnection() />

<!---
	By default, the connection is only set to gather target content,
	not to POST it. As such, we have to make sure that we turn on
	output (upload) before we access the data streams.
--->
<cfset connection.setDoOutput( javaCast( "boolean", true ) ) />

<!--- Since we are uploading, we have to set the method to POST. --->
<cfset connection.setRequestMethod( javaCast( "string", "POST" ) ) />

<!---
	By default, the connection will locally buffer the data until it
	is ready to be posted in its entirety. We don't want to hold it
	all in memory, however; as such, we need to explicitly turn data
	chunking on. This will allow the connection to flush data to the
	target URL without having to load it all in memory (this is
	perfect for when the size of the data is not known ahead of time).
--->
<cfset connection.setChunkedStreamingMode( javaCast( "int", 10 ) ) />

<!---
	When posting data, the content-type will determine how the
	target server parses the incoming request. If the target server
	is ColdFusion, this is especially critical as it will throw an
	error if it tries to parse this POST as a collection of
	name-value pairs.

	In this case, we WANT it to see the form as multi-part, which
	will be a collection of name-value pairs. In order to delimit
	the parts of the form post, we need to create a boundary identifier.
	This is how the server will know where one value ends and the
	next one starts.

	This needs to be a random string so as not to show up in the
	form data itself (as a false boundary).
--->
<cfset fieldBoundary = ("POST------------------" & getTickCount()) />

<!---
	Set the content type and include the boundary information so the
	server knows how to parse the data.
--->
<cfset connection.setRequestProperty(
	javaCast( "string", "Content-Type" ),
	javaCast( "string", ("multipart/form-data; boundary=" & fieldBoundary) )
	) />


<!---
	Now that we have prepared the connection to the target URL, let's
	get the output stream - this is the UPLOAD stream to which we can
	write data to be posted to the target server.
--->
<cfset uploadStream = connection.getOutputStream() />

<!---
	Before we send the file data, we'll send some simple
	name-value pairs in plain-text format. In order to make it easier
	to write strings to the upload stream, let's wrap it in a Writer.
	This will allow us to write string data rather than just bytes.
--->
<cfset uploadWriter = createObject( "java", "java.io.OutputStreamWriter" ).init(
	uploadStream
	) />

<!---
	Form data makes heavy use of the Carriage Return and New Line
	characters to delimit values.
--->
<cfset crnl = (chr( 13 ) & chr( 10 )) />

<!--- A double break is also used. --->
<cfset crnl2 = (crnl & crnl) />


<!--- Delimit the field. --->
<cfset uploadWriter.write(
	javaCast( "string", ("--" & fieldBoundary & crnl) )
	) />

<!--- Send the title. --->
<cfset uploadWriter.write(
	javaCast(
		"string",
		(
			"Content-Disposition: form-data; name=""title""" &
			crnl2 &
			"The Bride" &
			crnl
		))
	) />


<!--- Delimit the field. --->
<cfset uploadWriter.write(
	javaCast( "string", ("--" & fieldBoundary & crnl) )
	) />

<!--- Send the author. --->
<cfset uploadWriter.write(
	javaCast(
		"string",
		(
			"Content-Disposition: form-data; name=""author""" &
			crnl2 &
			"Julie Garwood" &
			crnl
		))
	) />


<!--- Delimit the field. --->
<cfset uploadWriter.write(
	javaCast( "string", ("--" & fieldBoundary & crnl) )
	) />

<!--- Send the publisher. --->
<cfset uploadWriter.write(
	javaCast(
		"string",
		(
			"Content-Disposition: form-data; name=""publisher""" &
			crnl2 &
			"Pocket Star" &
			crnl
		))
	) />


<!---
	Now that we've written the simple name/value pairs, let's post
	the actual file data as part of the incoming request. This works
	very much in the same way, although we are going to stream the
	local file into the post data.

	Let's open a connection to a local file that we will stream to
	the output a byte at a time.

	NOTE: There are more efficient, buffered ways to read a file
	into memory; however, this is just trying to keep it simple.
--->
<cfset fileInputStream = createObject( "java", "java.io.FileInputStream" ).init(
	javaCast( "string", expandPath( "./data2.txt" ) )
	) />

<!--- Delimit the field. --->
<cfset uploadWriter.write(
	javaCast( "string", ("--" & fieldBoundary & crnl) )
	) />


<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->


<!---
	Now that we have defined the non-file fields and delimited the
	file, let's FLUSH the UPLOAD stream and pause the upload. This
	will allow us to look at the Node.js console to see if the POST
	is truly being streamed (without local buffering).
--->
<cfset uploadWriter.flush() />

<!--- Sleep the upload for a few seconds. --->
<cfset sleep( 5000 ) />


<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->


<!--- Send the file along. --->
<cfset uploadWriter.write(
	javaCast(
		"string",
		(
			"Content-Disposition: form-data; name=""upload""; filename=""the_bride.txt""" &
			crnl &
			"Content-Type: ""text/plain""" &
			crnl2
		))
	) />

<!--- Read the first byte from the file. --->
<cfset nextByte = fileInputStream.read() />

<!---
	Keep reading from the file, one byte at a time, until we hit
	(-1) - the End of File marker for the input stream.
--->
<cfloop condition="(nextByte neq -1)">

	<!--- Write this byte to the output (UPLOAD) stream. --->
	<cfset uploadWriter.write( javaCast( "int", nextByte ) ) />

	<!--- Read the next byte from the file. --->
	<cfset nextByte = fileInputStream.read() />

</cfloop>

<!--- Add the new line to the field value. --->
<cfset uploadWriter.write( javaCast( "string", crnl ) ) />



<!---
	Delimit the end of the post. Notice that the last delimiter has
	a trailing double-dash after it.
--->
<cfset uploadWriter.write(
	javaCast( "string", (crnl & "--" & fieldBoundary & "--" & crnl) )
	) />

<!--- Now that we're done streaming the file, close the stream. --->
<cfset uploadWriter.close() />


<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->
<!--- ----------------------------------------------------- --->


<!---
	At this point, we have completed the UPLOAD portion of the
	request. We could be done; or we could look at the input
	(download) portion of the request in order to view the response
	or the error.
--->
<cfoutput>

	Response:
	#connection.getResponseCode()# -
	#connection.getResponseMessage()#<br />
	<br />

</cfoutput>

<!---
	The input stream is mutually exclusive with the error stream,
	although both can return data. As such, let's try to access
	the input stream... and then use the error stream if there is
	a problem.
--->
<cftry>

	<!--- Try for the input stream. --->
	<cfset downloadStream = connection.getInputStream() />

	<!---
		If the input stream is not available (ie. the server returned
		an error response), then we'll have to use the error output
		as the response stream.
	--->
	<cfcatch>

		<!--- Use the error stream as the download. --->
		<cfset downloadStream = connection.getErrorStream() />

	</cfcatch>

</cftry>


<!---
	At this point, we have either the natural download or the error
	download. In either case, we can start reading the output in
	the same manner.
--->
<cfset responseBuffer = [] />

<!--- Get the first byte. --->
<cfset nextByte = downloadStream.read() />

<!---
	Keep reading from the response stream until we run out of bytes
	(-1). We'll be building up the response buffer a byte at a time
	and then outputting it as a single value.
--->
<cfloop condition="(nextByte neq -1)">

	<!--- Add the byte AS CHAR to the response buffer. --->
	<cfset arrayAppend( responseBuffer, chr( nextByte ) ) />

	<!--- Get the next byte. --->
	<cfset nextByte = downloadStream.read() />

</cfloop>

<!--- Close the response stream. --->
<cfset downloadStream.close() />

<!--- Output the response. --->
<cfoutput>

	Response: #arrayToList( responseBuffer, "" )#

</cfoutput>

I know this is super verbose and exploding with comments; but hopefully, you can follow it fairly well. The key thing to see is that after all of the simple name/value fields have been defined, we are flushing the upload stream and then pausing the ColdFusion thread. This should allow us to see the multi-part form parsing pause on the Node.js side, confirming that ColdFusion is, in fact, streaming the content without buffering it locally.

NOTE: You can see this in effect if you watch the above video.

As a target server, ColdFusion doesn't seem to be able to handle multi-part, streaming form data (for which a content-length value is not known ahead of time); but as the origin of the request, streaming large, complex form posts doesn't seem to be a problem. This can definitely come in handy when posting large binary data from ColdFusion to something like Amazon S3 which, I'm told, can handle streaming, multi-part form data.

Want to use code from this post? Check out the license.

Reader Comments


Ben...am wondering if this post has some ideas that might help my latest conundrum...

I have a client who is a photographer. I built a multipart form that will allow her to upload photos (5 at a time) to her server so that her clients can view them. The issue we are running into is the immense size of the photos and the length of time it is taking to get the photos uploaded. Currently, I am using a combination of cfflush and cfimage's resizing capabilities to make things smaller, but it is still not ideal. Being a photographer, as you may well imagine, she works with huge, high-res pics. What I am discovering is that the file is being uploaded through the HTTP stream prior to hitting my flushing/cfimage work. I came across your code in looking for ways to stream the form submission.

With my long-winded explanation out of the way, do you think that your proof of concept might be the way to go? Any thoughts would be appreciated.

~Clay

By the way...have learned a ton from your site. Thanks.


@Clay,

That's a classic problem. In the past, I've actually gone the "use FTP" route. If your client is willing to use a modern browser, there might be some things you can do with the new File API to make the experience a bit more reasonable. However, if they are large files, there's only so much you can do to get around the issue of bandwidth, which is usually not as available going up as it is coming down.

I wish I had better advice. Is the process failing? Or is it just taking too long for a good user experience?


It is just taking too long for a good user experience. Since my post, I have shown her how to batch resize in PS and I am using cffileupload to allow multiples at the same time with user feedback. I just did this for her and have not had a response yet.


@Clay,

Yeah, sometimes you *have* to ask the client to take on a little of the responsibility. There's only so much you can do to make uploading a more enjoyable experience. Good luck!
