Overriding Content-Type And Content-Disposition Headers In Amazon S3 Pre-Signed URLs
When you upload a file to Amazon S3 (Simple Storage Service), you can provide the content type and a variety of other metadata that Amazon will automatically serve up when the file is later requested. Sometimes, however, that data was never provided, is inaccurate, or simply needs to be tweaked for the current context. Luckily, Amazon S3 allows you to override certain HTTP response header values in your pre-signed (query string authenticated) URLs.
The set of HTTP headers that you can override using these query string parameters is a subset of the headers that Amazon S3 accepts when you upload a file. The HTTP response headers that you can override for the GET responses are:
- response-content-type -> Content-Type
- response-content-language -> Content-Language
- response-expires -> Expires
- response-cache-control -> Cache-Control
- response-content-disposition -> Content-Disposition
- response-content-encoding -> Content-Encoding
In this post, we're going to be overriding the Content-Type and the Content-Disposition. However, this approach should work for all of these HTTP headers.
The tricky part about overriding the HTTP response headers is that all of this information needs to be used when generating the signature for the request (so that the request can't be tampered with after it is signed). For pre-signed URLs, the canonical resource includes the query string, with the HTTP response override keys in lowercase, sorted alphabetically, and with the values left un-encoded. In my demo, I have this logic hard-coded. Meaning, I don't have a method that sorts the keys automatically; rather, they are already in the proper order when I pass the canonical resource into the signature method.
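If you didn't want to hard-code the ordering, a small helper could build this portion of the canonical resource. The following is only a minimal sketch and is not part of the demo: the function name buildOverrideQueryString() and its struct argument are hypothetical. It lowercases the override keys, sorts them alphabetically, and joins the pairs without URL-encoding the values.
<cfscript>
/**
* I build the un-encoded, alphabetically-sorted query string of response-header
* overrides for use in the canonical resource. NOTE: This is a hypothetical
* helper - it is not used by the demo.
*
* @output false
*/
public string function buildOverrideQueryString( required struct overrides ) {
	// Lowercase the keys first so that the sort reflects the final casing.
	var keys = [];
	for ( var key in overrides ) {
		arrayAppend( keys, lcase( key ) );
	}
	arraySort( keys, "text", "asc" );
	// Join the key-value pairs WITHOUT URL-encoding the values.
	var pairs = [];
	for ( var sortedKey in keys ) {
		// ColdFusion struct look-ups are case-insensitive, so the lowercased
		// key still finds the original value.
		arrayAppend( pairs, sortedKey & "=" & overrides[ sortedKey ] );
	}
	return( arrayToList( pairs, "&" ) );
}
overrides = {
	"response-content-type" = "image/jpg",
	"response-content-disposition" = "inline"
};
// Outputs: response-content-disposition=inline&response-content-type=image/jpg
writeOutput( buildOverrideQueryString( overrides ) );
</cfscript>
The result would be appended to the resource, after a "?", before the resource is passed to the signature method.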
In this demo, we're going to upload an image with the generic Content-Type "application/octet-stream". Then, we're going to create a pre-signed URL that overrides the content-type to be "image/jpg" and provides an alternative filename using Content-Disposition.
<cfscript>
/**
* I get the expiration in seconds based on the given expires-at date. This takes
* care of the UTC conversion and expects to receive a date in local time.
*
* @output false
*/
public numeric function getExpirationInSeconds( required date expiresAt ) {
	var localEpoch = dateConvert( "utc2local", "1970/01/01" );
	return( dateDiff( "s", localEpoch, expiresAt ) );
}
/**
* I generate the signature for the given resource which will be available until
* the given expiration date (in seconds).
*
* For GET requests, the contentType is expected to be the empty-string; for PUT
* requests, the contentType is expected to match one of the HTTP request headers.
*
* @output false
*/
public string function generateSignature(
	required string method,
	required string contentType,
	required numeric expirationInSeconds,
	required string resource
) {
	// The string-to-sign is a newline-delimited list of the request values.
	var stringToSignParts = [
		ucase( method ),
		"", // Content-MD5 (not used in this demo).
		contentType,
		expirationInSeconds,
		resource
	];
	var stringToSign = arrayToList( stringToSignParts, chr( 10 ) );
	var signature = hmac( stringToSign, aws.secretKey, "HmacSHA1", "utf-8" );
	// By default, ColdFusion returns the Hmac in Hex; we need to convert it to
	// base64 for usage in the pre-signed URL.
	return(
		binaryEncode( binaryDecode( signature, "hex" ), "base64" )
	);
}
/**
* I encode the given S3 object key for use in a url. Amazon S3 keys have some non-
* standard behavior for encoding - see this Amazon forum thread for more information:
* https://forums.aws.amazon.com/thread.jspa?threadID=55746
*
* @output false
*/
public string function urlEncodeS3Key( required string key ) {
	key = urlEncodedFormat( key, "utf-8" );
	// At this point, we have a key that has been encoded too aggressively by
	// ColdFusion. Now, we have to go through and un-escape the characters that
	// AWS does not expect to be encoded.
	// The following are "unreserved" characters in the RFC 3986 spec for Uniform
	// Resource Identifiers (URIs) - http://tools.ietf.org/html/rfc3986#section-2.3
	key = replace( key, "%2E", ".", "all" );
	key = replace( key, "%2D", "-", "all" );
	key = replace( key, "%5F", "_", "all" );
	key = replace( key, "%7E", "~", "all" );
	// Technically, the "/" characters can be encoded and will work. However, if the
	// bucket name is included in this key, then it will break (since it will bleed
	// into the S3 domain: "s3.amazonaws.com%2fbucket"). As such, I like to unescape
	// the slashes to make the function more flexible. Plus, I think we can all agree
	// that regular slashes make the URLs look nicer.
	key = replace( key, "%2F", "/", "all" );
	// Converting the encoded spaces (%20) to "+" isn't necessary; but, I think it
	// makes for a more attractive URL.
	// --
	// NOTE: That said, it looks like Amazon S3 may always interpret a "+" as a
	// space, which may not be the way other servers work. As such, we are leaving
	// any literal "+" in the key as its encoded hex value, %2B.
	key = replace( key, "%20", "+", "all" );
	return( key );
}
// ------------------------------------------------------ //
// ------------------------------------------------------ //
// Include my AWS credentials (so they are not in the code). Creates the structure:
// * aws.bucket
// * aws.accessID
// * aws.secretKey
include "./credentials.cfm";
// Define the upload location (key) of the file.
key = urlEncodeS3Key( "signed-urls/headers/monkey.jpg" );
// Define the full resource of our key in our bucket.
resource = ( "/" & aws.bucket & "/" & key );
// Define the expiration after which this pre-signed URL is no longer valid (and will
// be rejected by AWS).
expirationInSeconds = getExpirationInSeconds( dateAdd( "n", 30, now() ) );
// Generate the signature for the query-string authentication. Notice that we are
// uploading the file with a generic content type of APPLICATION/OCTET-STREAM.
signature = generateSignature(
	method = "PUT",
	contentType = "application/octet-stream",
	expirationInSeconds = expirationInSeconds,
	resource = resource
);
// Create our pre-signed URL from the various parts.
preSignedUrl = (
"https://s3.amazonaws.com#resource#?AWSAccessKeyId=#aws.accessID#&" &
"Expires=#expirationInSeconds#&" &
"Signature=#urlEncodedFormat( signature )#"
);
// ------------------------------------------------------ //
// ------------------------------------------------------ //
// Now that we have our pre-signed URL, we can use it to upload a file to Amazon S3.
// Note that the pre-signed URL is specific to a given file and expiration date. As
// such, this doesn't grant free-range; but, rather very targeted access based on
// both the resource key and the file type.
uploadRequest = new Http(
	method = "put",
	url = preSignedUrl,
	getAsBinary = "yes"
);
// Notice, again, that we are sending the file up with the generic content-type. We
// will be overriding this response header in the next pre-signed URL.
uploadRequest.addParam(
	type = "header",
	name = "Content-Type",
	value = "application/octet-stream"
);
uploadRequest.addParam(
	type = "body",
	value = fileReadBinary( expandPath( "./monkey.jpg" ) )
);
result = uploadRequest.send();
// ------------------------------------------------------ //
// ------------------------------------------------------ //
// Now that we have the file uploaded on Amazon S3, we are going to provide a pre-
// signed URL for download. However, we *know* that the meta data provided in the
// upload is inaccurate (or simply not appropriate for this context). As such, we
// want to override certain response headers for this request. In this case, we're
// going to override the CONTENT-TYPE and the CONTENT-DISPOSITION response headers.
// -
// The following Response Headers can be overridden:
// * response-content-type -> Content-Type
// * response-content-language -> Content-Language
// * response-expires -> Expires
// * response-cache-control -> Cache-Control
// * response-content-disposition -> Content-Disposition
// * response-content-encoding -> Content-Encoding
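// The content type we want served in the response.
// --
// NOTE: The registered MIME type for JPEG images is "image/jpeg"; the "image/jpg"
// value used here is a common variant that browsers generally tolerate.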
contentType = "image/jpg";
// The file name we want served in the response (used if people Save-As the file).
filename = "awesome monkey.jpg";
// The content disposition is a little tricky since we need to account for both
// standard and UTF-8 character sets. To follow the RFC 5987 standard, we have to
// URL-encode the UTF-8 filename. We need to do this even though we do NOT generally
// want to URL-encode the response-header override values when generating the
// signature. In this case, it's ok to double-encode this part of the value.
contentDisposition = "inline; filename=""#filename#""; filename*=UTF-8''#urlEncodedFormat( filename )#";
// When generating the signature, we have to include the response-header override
// query string values as part of the resource being signed. When building the
// resource we do NOT want to URL-encode these values; but, we will URL encode them
// when building the actual URL.
// --
// NOTE: The order and the casing of the response-header override values is important.
// In order to generate a "Canonicalized Resource", the keys have to be lowercase and
// in alphabetical order.
signature = generateSignature(
	method = "GET",
	contentType = "",
	expirationInSeconds = expirationInSeconds,
	resource = ( resource & "?response-content-disposition=#contentDisposition#&response-content-type=#contentType#" )
);
// Create our pre-signed URL from the various parts. Notice that we are passing
// the response-header override values in the query string. At this point, we want
// to URL-encode the values so the URL doesn't break.
// --
// NOTE: For this URL, the order of the parameters is not important. It only matters
// when you are signing the resource (above). However, the CASING of the response
// header override parameters DOES matter. In this context, they have to be lowercase
// such that "response-content-type" != "Response-Content-Type".
preSignedUrl = (
	"https://s3.amazonaws.com#resource#?AWSAccessKeyId=#aws.accessID#&" &
	"Expires=#expirationInSeconds#&" &
	"response-content-type=#urlEncodedFormat( contentType )#&" &
	"response-content-disposition=#urlEncodedFormat( contentDisposition )#&" &
	"Signature=#urlEncodedFormat( signature )#"
);
</cfscript>
<cfoutput>
<img src="#preSignedUrl#" alt="What an awesome monkey!" />
</cfoutput>
As you can see, toward the bottom, we are providing the "response-content-disposition" and the "response-content-type" parameters in order to override the HTTP response headers for Content-Disposition and Content-Type, respectively. The Content-Disposition is a little trickier because it is partially URL-encoded prior to signing. Typically, you don't want to URL-encode these values prior to signing; but, the UTF-8 portion of the Content-Disposition header specification requires URL-encoding. So, this portion of the value does need to be double-encoded (once before signing, and once in the actual URL).
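To make the double-encoding easier to see, here is a small sketch that isolates just the Content-Disposition value and walks through the two encoding stages. The variable names signedQueryString and urlQueryString are hypothetical (they don't appear in the demo); everything else mirrors the code above.
<cfscript>
filename = "awesome monkey.jpg";
// Stage 1: URL-encode ONLY the RFC 5987 (filename*) portion of the header value.
contentDisposition = "inline; filename=""#filename#""; filename*=UTF-8''#urlEncodedFormat( filename )#";
// Stage 2: The value goes into the resource being signed exactly as-is, which
// means only the filename* portion is encoded at this point.
signedQueryString = "response-content-disposition=#contentDisposition#";
// Stage 3: The whole value is URL-encoded for the actual URL. The filename*
// portion is now encoded twice (each "%" from stage 1 becomes "%25").
urlQueryString = "response-content-disposition=#urlEncodedFormat( contentDisposition )#";
writeOutput( encodeForHtml( signedQueryString ) & "<br />" );
writeOutput( encodeForHtml( urlQueryString ) );
</cfscript>
When Amazon S3 receives the request, it decodes the query string once, which brings the value back to the stage 2 form that was signed.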
When I run the above code and look at the HTTP response headers, I can see that the Content-Type shows up as "image/jpg". And, when I go to save the image, the suggested filename is "awesome monkey.jpg".
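If you'd rather confirm the overrides in code than in the browser's network tools, something like the following sketch should work. It assumes that preSignedUrl still holds the GET URL generated at the end of the demo; the checkRequest, responseHeaders, and overriddenHeaders variables are hypothetical names.
<cfscript>
// ASSUMPTION: preSignedUrl is the pre-signed GET URL generated in the demo above.
checkRequest = new Http(
	method = "get",
	url = preSignedUrl,
	getAsBinary = "yes"
);
// The response headers are exposed as a struct on the result prefix.
responseHeaders = checkRequest.send().getPrefix().responseHeader;
overriddenHeaders = {
	contentType = responseHeaders[ "Content-Type" ],
	contentDisposition = responseHeaders[ "Content-Disposition" ]
};
writeDump( overriddenHeaders );
</cfscript>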
I kept wanting to roll this up into something a bit more encapsulated. But, it was hard to think of a method signature that made sense. I'll have to noodle on that one a bit more.