Using BeforeUpload To Generate Per-File Amazon S3 Upload Policies With Plupload
As I've blogged about before, you can use Plupload to upload files directly from the client browser to Amazon S3. But, in order to do so, you have to generate an Amazon S3 upload policy and pass it along with the file upload. Hard-coding this policy at page-load can be problematic because it has to be flexible enough to handle whichever file(s) a user selects and remain valid for the duration of the page's lifetime. As such, I wanted to see if we could use Plupload to generate per-file Amazon S3 upload policies that were short-lived and specific to the file about to be uploaded.
View this project on my GitHub account.
When you use Plupload, you have the opportunity to hook into a large number of events that have to do with the queue, the uploader state, and the file lifecycle. But, the one event that has caught my eye is the "BeforeUpload" event. In the past, I've looked at using the "BeforeUpload" event as a means to set per-file configuration settings. So, it seems possible that we could use this event to generate per-file Amazon S3 upload policies.
By default, the BeforeUpload event and the file upload are synchronous: the BeforeUpload event fires and then the upload of the given file begins immediately. As such, Plupload won't wait around for us to make an AJAX (Asynchronous JavaScript and XML) request to our application server for a per-file upload policy; by the time the AJAX response comes back to the client, the file upload will have already begun.
Luckily, Plupload allows us to override this behavior by returning "false" from the BeforeUpload event handler. When we do this, Plupload essentially pauses the upload processing until we explicitly tell it to start again (by manually triggering the "UploadFile" event). This gives us a window, in the queue processing, during which we can communicate with our application server, generate a per-file Amazon S3 upload policy, and update the Plupload configuration, all before the file upload begins.
Once we take Plupload out of its normal workflow, however, we have to take special precautions around error handling. Generally speaking, when Plupload is processing the queue, it handles errors gracefully. But, it doesn't know about our AJAX request for the per-file upload policy. As such, we have to catch those errors, explicitly clean up the file, and restart the upload process. It's not hard to do (at least in the way I've done it); but, it has to be done.
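Before diving into the full directive, here's the pattern distilled to its essentials. This is a minimal sketch, not the demo code: getUploadPolicy() is a hypothetical promise-based AJAX helper standing in for the demo's actual service.

uploader.bind(
    "BeforeUpload",
    function( uploader, file ) {
        // Hypothetical promise-based AJAX helper (stand-in for the real service).
        getUploadPolicy( file.name ).then(
            function handleResolve( policy ) {
                // Inject the per-file policy settings, then resume this upload.
                uploader.settings.url = policy.formUrl;
                file.status = plupload.UPLOADING;
                uploader.trigger( "UploadFile", file );
            },
            function handleReject( error ) {
                // Plupload doesn't know about our AJAX request; we have to stop
                // the current upload and remove the file so the queue can continue.
                uploader.stop();
                uploader.removeFile( file );
            }
        );
        // Returning false pauses the queue until we trigger "UploadFile" manually.
        return( false );
    }
);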
Ok, let's look at some code. There's a full demo application in my GitHub project; but, I'm only going to show you the Plupload code and the Amazon S3 upload policy generation. First, we'll look at the Plupload code. The part you want to pay attention to is the handleBeforeUpload() function - this is our BeforeUpload event hook where we make a request to the server for a file-specific upload policy:
app.directive(
    "bnImageUploader",
    function( $window, $rootScope, plupload, naturalSort, imagesService ) {

        // I bind the JavaScript events to the scope.
        function link( $scope, element, attributes ) {

            // The uploader has to reference the various elements using IDs. Rather than
            // crudding up the HTML, just insert the values dynamically here.
            element
                .attr( "id", "primaryUploaderContainer" )
                .find( "div.dropzone" )
                    .attr( "id", "primaryUploaderDropzone" )
            ;

            // Instantiate the Plupload uploader.
            var uploader = new plupload.Uploader({

                // For this demo, we're only going to use the html5 runtime. I don't
                // want to have to deal with people who require flash - not this time,
                // I'm tired of it; plus, much of the point of this demo is to work with
                // the drag-n-drop, which isn't available in Flash.
                runtimes: "html5",

                // Upload the image to the API.
                url: "api/index.cfm?action=upload",

                // Set the name of the file field (that contains the upload).
                file_data_name: "file",

                // The container, into which to inject the Input shim.
                container: "primaryUploaderContainer",

                // The ID of the drop-zone element.
                drop_element: "primaryUploaderDropzone",

                // To enable click-to-select-files, you can provide a browse button.
                // We can use the same one as the drop zone.
                browse_button: "primaryUploaderDropzone",

                // We don't have any parameters yet; but, let's create the object now
                // so that we can simply consume it later in the BeforeUpload event.
                multipart_params: {}

            });

            // Initialize the plupload runtime.
            uploader.bind( "Error", handleError );
            uploader.bind( "PostInit", handleInit );
            uploader.bind( "FilesAdded", handleFilesAdded );
            uploader.bind( "QueueChanged", handleQueueChanged );
            uploader.bind( "BeforeUpload", handleBeforeUpload );
            uploader.bind( "UploadProgress", handleUploadProgress );
            uploader.bind( "FileUploaded", handleFileUploaded );
            uploader.bind( "StateChanged", handleStateChanged );
            uploader.init();

            // I provide access to the file list inside of the directive. This can be
            // used to render the items being uploaded.
            $scope.queue = new PublicQueue();

            // Wrap the window instance so we can get easy event binding.
            var win = $( $window );

            // When the window is resized, we'll have to update the dimensions of the
            // input shim.
            win.on( "resize", handleWindowResize );

            // When the scope is destroyed, clean up bindings.
            $scope.$on(
                "$destroy",
                function() {
                    win.off( "resize", handleWindowResize );
                    uploader.destroy();
                }
            );

            // ---
            // PRIVATE METHODS.
            // ---

            // I handle the before upload event where the meta data can be edited right
            // before the upload of a specific file, allowing for per-file settings. If
            // you return FALSE from this event, the upload process will be halted until
            // you trigger it manually.
            function handleBeforeUpload( uploader, file ) {

                // Get references to the runtime settings and multipart form parameters.
                var settings = uploader.settings;
                var params = settings.multipart_params;

                // Save the image to the application server. This will give us access to
                // subsequent information that we need in order to post the image binary
                // up to Amazon S3.
                imagesService.saveImage( file.name ).then(
                    function handleSaveImageResolve( response ) {

                        // Set the actual URL that we're going to POST to (in this case,
                        // it's going to be our Amazon S3 bucket.)
                        settings.url = response.formUrl;

                        // In order to upload directly from the client to Amazon S3, we
                        // need to post form data that lines-up with the generated S3
                        // policy. All the appropriate values were already defined on the
                        // server during the Save action - now, we just need to inject
                        // them into the form post.
                        for ( var key in response.formData ) {
                            if ( response.formData.hasOwnProperty( key ) ) {
                                params[ key ] = response.formData[ key ];
                            }
                        }

                        // Store the image data in the file object - this will make it
                        // available in the FileUploaded event where we'll have both
                        // the image object and the valid S3 pre-signed URL.
                        file.imageResponse = response.image;

                        // Manually change the file status and trigger the upload. At
                        // this point, Plupload will post the actual image binary up to
                        // Amazon S3.
                        file.status = plupload.UPLOADING;
                        uploader.trigger( "UploadFile", file );

                    },
                    function handleSaveImageReject( error ) {

                        // CAUTION: Since we explicitly told Plupload NOT to upload this,
                        // we've kind of put Plupload into a weird state. It will not
                        // handle this error since it doesn't really "know" about this
                        // workflow; as such, we have to clean up after this error in
                        // order for Plupload to start working again.
                        console.error( "Oops! ", error );
                        console.warn( "File being removed from queue:", file.name );

                        // We failed to save the record (before we even tried to upload
                        // the image binary to S3). Something is wrong with this file's
                        // data, but we don't want to halt the entire process. In order
                        // to get back into queue-processing mode we have to stop the
                        // current upload.
                        uploader.stop();

                        // Then, we have to remove the file from the queue (assuming that
                        // a subsequent try won't fix the problem). Due to our event
                        // bindings in the "QueueChanged" event, this will trigger a
                        // restart of the uploading if there are any more files to process.
                        uploader.removeFile( file );

                    }
                );

                // By returning False, we prevent the queue from proceeding with the
                // upload of this file until we manually trigger the "UploadFile" event.
                return( false );

            }

            // I handle errors that occur during initialization or general operation of
            // the Plupload instance.
            function handleError( uploader, error ) {
                console.warn( "Plupload error" );
                console.error( error );
            }

            // I handle the files-added event. This is different than the queue-changed
            // event. At this point, we have an opportunity to reject files from
            // the queue.
            function handleFilesAdded( uploader, files ) {

                // ------------------------------------------------------------------- //
                // BEGIN: JANKY SORTING HACK ----------------------------------------- //

                // This is a real hack; but, the files have actually ALREADY been added
                // to the internal Plupload queue; as such, we need to actually overwrite
                // the files that were just added.
                // If the user selected or dropped multiple files, try to order the files
                // using a natural sort that treats embedded numbers like actual numbers.
                naturalSort( files, "name" );

                var length = files.length;
                var totalLength = uploader.files.length;

                // Rewrite the sort of the newly added files.
                for ( var i = 0 ; i < length ; i++ ) {
                    // Swap the original insert with the sorted insert.
                    uploader.files[ totalLength - length + i ] = files[ i ];
                }

                // END: JANKY SORTING HACK ------------------------------------------- //
                // ------------------------------------------------------------------- //

                // Tell AngularJS that something has changed (the public queue will have
                // been updated at this point).
                $scope.$apply();

            }

            // I handle the file-uploaded event. At this point, the image has been
            // uploaded and thumbnailed - we can now load that image in our uploads list.
            function handleFileUploaded( uploader, file, response ) {
                $scope.$apply(
                    function() {
                        // Broadcast the response from the server that we received during
                        // our previous request to saveImage(). Remember, the FileUploaded
                        // event is only for the successful push of the image up to
                        // Amazon S3 - the actual image object was already saved during
                        // the BeforeUpload event. At that point, the image response was
                        // associated with the file, which is what we're broadcasting.
                        $rootScope.$broadcast( "imageUploaded", file.imageResponse );

                        // Remove the file from the internal queue.
                        uploader.removeFile( file );
                    }
                );
            }

            // I handle the init event. At this point, we will know which runtime has
            // loaded, and whether or not drag-drop functionality is supported.
            function handleInit( uploader, params ) {
                console.log( "Initialization complete." );
                console.log( "Drag-drop supported:", !! uploader.features.dragdrop );
            }

            // I handle the queue changed event. When the queue changes, it gives us an
            // opportunity to programmatically start the upload process. This will be
            // triggered when files are either added to, or programmatically removed
            // from, the list.
            function handleQueueChanged( uploader ) {
                if ( uploader.files.length && isNotUploading() ) {
                    uploader.start();
                }
                $scope.queue.rebuild( uploader.files );
            }

            // I handle the change in state of the uploader.
            function handleStateChanged( uploader ) {
                if ( isUploading() ) {
                    element.addClass( "uploading" );
                } else {
                    element.removeClass( "uploading" );
                }
            }

            // I get called when upload progress is made on the given file.
            // --
            // CAUTION: This may get called one more time after the file has actually
            // been fully uploaded AND the uploaded event has already been called.
            function handleUploadProgress( uploader, file ) {
                $scope.$apply(
                    function() {
                        $scope.queue.updateFile( file );
                    }
                );
            }

            // I handle the resizing of the browser window, which causes a resizing of
            // the input-shim used by the uploader.
            function handleWindowResize( event ) {
                uploader.refresh();
            }

            // I determine if the uploader is currently inactive.
            function isNotUploading() {
                return( uploader.state === plupload.STOPPED );
            }

            // I determine if the uploader is currently uploading a file.
            function isUploading() {
                return( uploader.state === plupload.STARTED );
            }

        }

        // I model the queue of files exposed by the uploader to the child DOM.
        function PublicQueue() {

            // I contain the actual data structure that is exposed to the user.
            var queue = [];

            // I index the currently queued files by ID for easy reference.
            var fileIndex = {};

            // I add the given file to the public queue.
            queue.addFile = function( file ) {
                var item = {
                    id: file.id,
                    name: file.name,
                    size: file.size,
                    loaded: file.loaded,
                    percent: file.percent.toFixed( 0 ),
                    status: file.status,
                    isUploading: ( file.status === plupload.UPLOADING )
                };
                this.push( fileIndex[ item.id ] = item );
            };

            // I rebuild the queue.
            // --
            // NOTE: Currently, the implementation of this doesn't take into account any
            // optimizations for rendering. If you use "track by" in your ng-repeat,
            // though, you should be ok.
            queue.rebuild = function( files ) {
                // Empty the queue.
                this.splice( 0, this.length );

                // Clear the internal index.
                fileIndex = {};

                // Add each file to the queue.
                for ( var i = 0, length = files.length ; i < length ; i++ ) {
                    this.addFile( files[ i ] );
                }
            };

            // I update the percent loaded and state for the given file.
            queue.updateFile = function( file ) {
                // If we can't find this file, then ignore -- this can happen if the
                // progress event is fired AFTER the upload event (which it does
                // sometimes).
                if ( ! fileIndex.hasOwnProperty( file.id ) ) {
                    return;
                }

                var item = fileIndex[ file.id ];
                item.loaded = file.loaded;
                item.percent = file.percent.toFixed( 0 );
                item.status = file.status;
                item.isUploading = ( file.status === plupload.UPLOADING );
            };

            return( queue );

        }

        // Return the directive configuration.
        return({
            link: link,
            restrict: "A",
            scope: true
        });

    }
);
This file is an AngularJS directive that wraps the Plupload implementation; but, the bulk of the code is just Plupload event bindings. If you look at our BeforeUpload hook, you should notice three things: we return "false" to prevent Plupload from starting the upload automatically; we save the image data to the server and get our per-file Amazon S3 upload policy in return; and, we store the resulting image object (the persisted data structure) as a property on the Plupload File object. This last step allows the image data to be easily accessed in the subsequent "FileUploaded" event handler.
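For context, the directive is applied as an attribute and expects a drop-zone element inside it. The demo's template isn't shown in this post; but, markup along these lines (a simplified sketch, not the actual demo markup) would activate it:

<!-- The bn-image-uploader attribute activates the directive. It looks for a
     "div.dropzone" descendant to use as both the drop target and browse button. -->
<div bn-image-uploader>
    <div class="dropzone">
        Drop images here (or click to select files).
    </div>
    <!-- The directive exposes "queue" on its scope for rendering uploads. -->
    <ul>
        <li ng-repeat="file in queue track by file.id">
            {{ file.name }} - {{ file.percent }}%
        </li>
    </ul>
</div>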
In this case, I've chosen to create the "image record" before actually performing the file upload. I suppose you could do this in reverse order; but, since I had to communicate with the server to get the Amazon S3 upload policy, I figured I might as well create the data record as well. Plus, I have plans for a future blog post in which it will be helpful to have the persisted record before the file upload begins.
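The imagesService itself isn't shown above; conceptually, it's just a thin $http wrapper that posts the file name to the server and resolves with the { image, formUrl, formData } payload consumed in the BeforeUpload handler. Here's a rough sketch - the action name, transport details, and response envelope are assumptions, not the demo's actual service:

app.service(
    "imagesService",
    function( $http ) {
        // I save the image record on the server and resolve with the per-file
        // upload policy data ( image, formUrl, formData ).
        this.saveImage = function( name ) {
            return(
                $http.post(
                    "api/index.cfm?action=save", // Hypothetical action name.
                    $.param({ name: name }),     // Form-encoded for the CFML form scope.
                    {
                        headers: {
                            "Content-Type": "application/x-www-form-urlencoded"
                        }
                    }
                ).then(
                    function handleResolve( response ) {
                        return( response.data );
                    }
                )
            );
        };
    }
);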
Now, let's take a look at the server-side API endpoint that actually generates the Amazon S3 upload policy. This is the API that is called in the body of the BeforeUpload event handler:
<cfscript>

    // Require the form fields.
    param name="form.name" type="string";

    // ------------------------------------------------------------------------------- //
    // ------------------------------------------------------------------------------- //

    // This is simply for internal testing so that I could see what would happen when
    // our save-request would fail.
    if ( reFind( "step-3", form.name ) ) {
        throw( type = "App.Forbidden" );
    }

    // ------------------------------------------------------------------------------- //
    // ------------------------------------------------------------------------------- //

    // Since we know the name of the file that is being uploaded to Amazon S3, we can
    // create a pre-signed URL for the S3 object (that will be valid once the image is
    // actually uploaded) and an upload policy that can be used to upload the object.
    // In this case, we can use a completely unique URL since everything is in our
    // control on the server.
    // --
    // NOTE: I am using getTickCount() to help ensure I won't overwrite files with the
    // same name. I am just trying to make the demo a bit more interesting.
    s3Directory = "pluploads/before-upload/upload-#getTickCount()#/";

    // Now that we have the target directory for the upload, we can define our full
    // Amazon S3 object key.
    s3Key = ( s3Directory & form.name );

    // Create a pre-signed URL for the S3 object. This is NOT used for the actual form
    // post - this is used to reference the image after it has been uploaded. Since this
    // will become the URL for our image, we can let it be available for a long time.
    imageUrl = application.s3.getPreSignedUrl(
        s3Key,
        dateConvert( "local2utc", dateAdd( "yyyy", 1, now() ) )
    );

    // Create the policy for the upload. This policy is completely locked down to the
    // current S3 object key. This means that it doesn't expose a security threat for
    // our S3 bucket. Furthermore, since this policy is going to be used right away, we
    // set it to expire very shortly (1 minute in this demo).
    // ---
    // NOTE: We are providing a success_action_status INSTEAD of a success_action_redirect
    // since we don't want the browser to try and redirect once the image is uploaded.
    settings = application.s3.getFormPostSettings(
        dateConvert( "local2utc", dateAdd( "n", 1, now() ) ),
        [
            {
                "acl" = "private"
            },
            {
                "success_action_status" = 201
            },
            [ "starts-with", "$key", s3Key ],
            [ "starts-with", "$Content-Type", "image/" ],
            [ "content-length-range", 0, 10485760 ], // 10mb

            // The following keys are ones that Plupload will inject into the form-post
            // across the various environments. As such, we have to account for them in
            // the policy conditions.
            [ "starts-with", "$Filename", s3Key ],
            [ "starts-with", "$name", "" ]
        ]
    );

    // Now that we have generated our pre-signed image URL and our policy, we can
    // actually add the image to our internal image collection. Of course, we have to
    // accept that the image does NOT yet exist on Amazon S3.
    imageID = application.images.addImage( form.name, imageUrl );

    // Get the full image record.
    image = application.images.getImage( imageID );

    // Prepare the API response. This needs to contain information about the image record
    // we just created, the location to which to post the form data, and all of the form
    // data that we need to match our policy.
    response.data = {
        "image" = {
            "id" = image.id,
            "clientFile" = image.clientFile,
            "imageUrl" = image.imageUrl
        },
        "formUrl" = settings.url,
        "formData" = {
            "acl" = "private",
            "success_action_status" = 201,
            "key" = s3Key,
            "Filename" = s3Key,
            "Content-Type" = "image/#listLast( form.name, '.' )#",
            "AWSAccessKeyId" = application.aws.accessID,
            "policy" = settings.policy,
            "signature" = settings.signature
        }
    };

</cfscript>
As you can see, I am taking the name of the file that we're about to upload and then using that name to generate a file-specific Amazon S3 upload policy. I'm also giving the policy a very short expiration window - one minute. Together, these constraints lock the upload policy down, keeping it perfectly functional while minimizing the security exposure.
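Under the hood, a form-post policy is just a JSON document - the expiration plus the array of conditions - that gets Base64-encoded and then signed with the AWS secret key (HMAC-SHA1 under the classic, pre-Signature-V4 scheme used here). The implementation of the demo's getFormPostSettings() isn't shown in this post; but, conceptually, it does something like the following Node-style JavaScript sketch (for illustration only):

var crypto = require( "crypto" );

// Conceptually generate form-post settings: a signed, Base64-encoded policy.
// NOTE: This is a sketch of the idea, not the demo's actual implementation.
function getFormPostSettings( bucket, secretKey, expiresAt, conditions ) {
    // The policy document is plain JSON: an expiration plus the conditions.
    var policy = JSON.stringify({
        expiration: expiresAt.toISOString(),
        conditions: conditions
    });

    // Base64-encode the policy; then, sign the encoded value with HMAC-SHA1.
    var encodedPolicy = new Buffer( policy, "utf8" ).toString( "base64" );
    var signature = crypto
        .createHmac( "sha1", secretKey )
        .update( encodedPolicy )
        .digest( "base64" )
    ;

    return({
        url: ( "https://" + bucket + ".s3.amazonaws.com/" ),
        policy: encodedPolicy,
        signature: signature
    });
}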
Once the policy is generated, I return both the data record and the policy in the response. The policy data is then injected into the Plupload configuration settings and the upload processing is resumed. At that point, Plupload will post the file to Amazon S3 and trigger the "FileUploaded" event.
I love the idea of offloading Amazon S3 file uploads to the client. This can free up a lot of server-side resources (bandwidth and processing). And, it's great to know that there's a way to do this with Plupload that doesn't require an open-ended upload policy. From both a technical and a security standpoint, this feels like a huge win! I love Plupload.
Reader Comments
Interesting approach - I'm also looking at per file upload URLs with plupload. I had thought I'd make a synchronous ajax request inside the beforeupload function, but I'm not sure if blocking the plupload processing is a great idea. It's good to see this approach remaining asynchronous.