OWASP Encoder.cfc - A Java Encoder Proxy For ColdFusion
NOTE: ColdFusion 10+ has native support for several encoding methods that are built on top of the OWASP libraries (encodeForHTML(), for example). The following post was mostly a fun exploration of dynamic programming.
The Open Web Application Security Project (OWASP) maintains a project called the Java Encoder. The Java Encoder is a high-performance encoder class that encodes String values for "safe use" in different contexts. In a Java project, this encoder is dead-simple to use. However, in a ColdFusion project, care has to be taken when invoking the methods since the method signatures all require the use of javaCast() in order to ensure java.lang.String inputs.
My OWASP Encoder.cfc project is a light-weight ColdFusion component wrapper for the Java Encoder class that takes care of all of the javaCast() invocation for you. The Encoder.cfc implements all of the same top-level .forCONTEXT() methods. These methods can accept any "simple value" (number, string, date, boolean, etc.), and will javaCast() the input to a String internally, invoke the proxied method, and return the result.
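To make the difference concrete, here is a minimal sketch that compares calling the Java class directly with calling it through the proxy. It assumes the OWASP Java Encoder JAR is on the ColdFusion class path and that the component lives at lib.Encoder (the path used in this post's demo); the userCount variable is hypothetical.

```cfml
<cfscript>
	// Direct access to the Java class, with no ColdFusion wrapper in between.
	owaspEncode = createObject( "java", "org.owasp.encoder.Encode" );

	// Hypothetical non-string value that we want to render safely in HTML.
	userCount = 42;

	// Calling the Java method directly: cast explicitly so that ColdFusion
	// can match the value to the forHtml( String ) signature.
	writeOutput( owaspEncode.forHtml( javaCast( "string", userCount ) ) );

	// Calling through the proxy: the same javaCast() happens internally.
	encoder = new lib.Encoder( owaspEncode );
	writeOutput( encoder.forHtml( userCount ) );
</cfscript>
```

Both calls should produce the same encoded string; the only difference is where the cast takes place.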
Project: See the OWASP Encoder.cfc on my GitHub account.
To see this in action, let's instantiate the proxy and run a "problematic" string value through each encoding method:
<cfscript>
	// Create an instance of our light-weight encode wrapper component, giving
	// it the instance of the OWASP Encode class to proxy.
	encoder = new lib.Encoder( createObject( "java", "org.owasp.encoder.Encode" ) );
	// Setup the input that we are going to test.
	input = "You & me are ""cool"" <Truth />!";
	// Try running the above input through each encoder method.
	arrayEach(
		[
			"forCDATA",
			"forCssString",
			"forCssUrl",
			"forHtml",
			"forHtmlAttribute",
			"forHtmlContent",
			"forHtmlUnquotedAttribute",
			"forJava",
			"forJavaScript",
			"forJavaScriptAttribute",
			"forJavaScriptBlock",
			"forJavaScriptSource",
			"forUri",
			"forUriComponent",
			"forXml",
			"forXmlAttribute",
			"forXmlComment",
			"forXmlContent"
		],
		function( required string methodName ) {
			// Run the value through the OWASP encoder (wrapper).
			var encodedValue = invoke( encoder, methodName, [ input ] );
			// When outputting the value, run the encoded value through the native
			// htmlEditFormat() method so that we can see some of the HTML-specific
			// encodings that were put in place.
			writeOutput(
				methodName & " --- " &
				htmlEditFormat( encodedValue ) &
				"<br />"
			);
		}
	);
</cfscript>
As you can see, the Encoder.cfc takes an instance of the Java Encoder class to proxy. Then it exposes all of the same top-level methods. And, when we run this code, we get the following output:
forCDATA --- You & me are "cool" <Truth />!
forCssString --- You \26 me are \22 cool\22 \3cTruth \2f\3e!
forCssUrl --- You\20\26 \20me\20 are\20\22 cool\22 \20\3cTruth\20\2f\3e!
forHtml --- You &amp; me are &#34;cool&#34; &lt;Truth /&gt;!
forHtmlAttribute --- You &amp; me are &#34;cool&#34; &lt;Truth />!
forHtmlContent --- You &amp; me are "cool" &lt;Truth /&gt;!
forHtmlUnquotedAttribute --- You&#32;&amp;&#32;me&#32;are&#32;&#34;cool&#34;&#32;&lt;Truth&#32;&#47;&gt;!
forJava --- You & me are \"cool\" <Truth />!
forJavaScript --- You \x26 me are \x22cool\x22 <Truth \/>!
forJavaScriptAttribute --- You \x26 me are \x22cool\x22 <Truth />!
forJavaScriptBlock --- You \x26 me are \"cool\" <Truth \/>!
forJavaScriptSource --- You & me are \"cool\" <Truth />!
forUri --- You%20&%20me%20are%20%22cool%22%20%3CTruth%20/%3E!
forUriComponent --- You%20%26%20me%20are%20%22cool%22%20%3CTruth%20%2F%3E%21
forXml --- You &amp; me are &#34;cool&#34; &lt;Truth /&gt;!
forXmlAttribute --- You &amp; me are &#34;cool&#34; &lt;Truth />!
forXmlComment --- You & me are "cool" <Truth />!
forXmlContent --- You &amp; me are "cool" &lt;Truth /&gt;!
What's really cool is that I didn't have to hand-write each of these methods. Since the method signature is exactly the same for each method - it takes a simple value and returns a string - I was able to employ some dynamic ColdFusion programming.
All I did was create a private method that would inspect its own name at runtime using the getFunctionCalledName() function. It would then use the dynamically-bound name to turn around and invoke the appropriate method on the Java Encoder. In the end, all I had to do was write this one method and then point all of the public method references to it:
component
	output = false
	hint = "I provide a simple wrapper to the OWASP Java Encoder that handles proper Java-casting."
	{
	/**
	* I initialize the wrapper around the given OWASP Java class.
	*
	* @owaspEncoder I am the OWASP Java encoder.
	* @output false
	*/
	public any function init( required any owaspEncoder ) {
		encoder = owaspEncoder;
		return( this );
	}
	// ---
	// PUBLIC METHODS.
	// ---
	// All of the .forCONTEXT() Java methods use the same (String)::String signature
	// and can, therefore, all be implemented using the encodeForCurrentContext() private
	// method. The encodeForCurrentContext() method will determine the correct context
	// at runtime when the individual methods are invoked.
	this.forCDATA = encodeForCurrentContext;
	this.forCssString = encodeForCurrentContext;
	this.forCssUrl = encodeForCurrentContext;
	this.forHtml = encodeForCurrentContext;
	this.forHtmlAttribute = encodeForCurrentContext;
	this.forHtmlContent = encodeForCurrentContext;
	this.forHtmlUnquotedAttribute = encodeForCurrentContext;
	this.forJava = encodeForCurrentContext;
	this.forJavaScript = encodeForCurrentContext;
	this.forJavaScriptAttribute = encodeForCurrentContext;
	this.forJavaScriptBlock = encodeForCurrentContext;
	this.forJavaScriptSource = encodeForCurrentContext;
	this.forUri = encodeForCurrentContext;
	this.forUriComponent = encodeForCurrentContext;
	this.forXml = encodeForCurrentContext;
	this.forXmlAttribute = encodeForCurrentContext;
	this.forXmlComment = encodeForCurrentContext;
	this.forXmlContent = encodeForCurrentContext;
	// ---
	// PRIVATE METHODS.
	// ---
	/**
	* I take the given string and encode it for the context as defined by the method
	* name at runtime. The context is being dynamically derived since this simple
	* method signature is being used to power all of the .forCONTEXT() methods.
	*
	* @input I am the string being encoded.
	* @output false
	*/
	private string function encodeForCurrentContext( required string input ) {
		// The name this function was invoked under (e.g. "forHtml") doubles as the
		// name of the Java method to call on the proxied encoder.
		var methodName = getFunctionCalledName();
		var methodArguments = [ javaCast( "string", input ) ];
		return( invoke( encoder, methodName, methodArguments ) );
	}
}
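To isolate the mechanism the component relies on, here is a minimal sketch of getFunctionCalledName() on its own. The function and variable names (reportCalledName, firstAlias, secondAlias) are hypothetical, and the snippet assumes ColdFusion 10 or later, where getFunctionCalledName() is available.

```cfml
<cfscript>
	// A single function body that reports the name it was invoked under.
	function reportCalledName() {
		return( getFunctionCalledName() );
	}

	// Two hypothetical references that point at the same function.
	firstAlias = reportCalledName;
	secondAlias = reportCalledName;

	// Each call reports the reference name used at the call site, not the
	// name the function was originally declared with.
	writeOutput( firstAlias() & "<br />" );
	writeOutput( secondAlias() & "<br />" );
</cfscript>
```

This is the same pattern as the this.forHtml = encodeForCurrentContext assignments above: one implementation, many names, with the name acting as the only varying input.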
I love how dynamic ColdFusion is - so freakin' badass. I thought about trying to implement this using closures that would dynamically bind a function instance to an encoder method name. But, using getFunctionCalledName() felt like a cleaner, easier-to-read solution.
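For comparison, the closure-based alternative might look something like the following. This is only a sketch of the idea, not code from the Encoder.cfc project: the buildEncoderMethod() helper is hypothetical, the method list is abbreviated, and it assumes ColdFusion 10+ closure support.

```cfml
component output = false {

	public any function init( required any owaspEncoder ) {
		variables.encoder = owaspEncoder;

		// Abbreviated list; the full component would name all 18 contexts.
		var methodNames = [ "forHtml", "forHtmlAttribute", "forJavaScript" ];

		for ( var methodName in methodNames ) {
			this[ methodName ] = buildEncoderMethod( methodName );
		}

		return( this );
	}

	// Hypothetical factory: each call creates a closure bound to one method
	// name. Using a separate function (rather than defining the closure inside
	// the loop) gives every closure its own copy of the name.
	private any function buildEncoderMethod( required string methodName ) {
		var target = variables.encoder;

		return(
			function( required string input ) {
				return( invoke( target, methodName, [ javaCast( "string", input ) ] ) );
			}
		);
	}

}
```

The trade-off is that the supported methods become entries in an array instead of a visible list of assignments, and the binding logic needs an extra factory function.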
Reader Comments
@All,
I should quickly mention that ColdFusion 10 introduced a number of native functions that actually implement some of the above methods in ColdFusion code. Examples:
* encodeForHtml()
* encodeForHtmlAttribute()
I haven't personally looked into those yet, but it is next on my list.
Haha, way to build "suspense"! I was wondering why you weren't just using the native CFML functions for this lot, nor even mentioned they existed... until I saw your comment above.
I like your code technique here, but I'd perhaps put your comment caveat at the *top* of the article, cos this would not be a way I recommend ppl do stuff like encodeForHtml(), encodeForJavascript() etc. Obviously the OWASP API accounts for more variations than CFML's implementation does, though.
Cheers dude.
@Adam,
Ha ha, good point - I added a NOTE to the top of the post.
This started because I am working with some code that predates ColdFusion 10 and uses an older version of the OWASP encoding library. I keep running into ColdFusion errors when I try to pass it a Boolean or a Number (since all the Java methods expect Strings). I was most irritated that the implementation we were using wasn't *already* wrapped in a ColdFusion proxy to take care of that stuff. So, this post is mostly a result of that frustration :D
Thanks for writing this! (This was on my TODO list.) I also work with older versions of ColdFusion and prefer to work with custom functions rather than the built-in ones as it makes it easier to add newer features to projects without having to worry about the version of CF used. ColdFusion 8 & 9 both include the jar files for the OWASP ESAPI, but CFTag functionality wasn't introduced by Adobe until CF10.
NOTE: Support for 6 of the 8 native CFESAPI functions was added by CFBackPort:
https://github.com/misterdai/cfbackport
It looks like Adobe only chose to support 8 different ESAPI encodings, but you've exposed 18. I wonder why only 8 are supported in CF10+. Is it possible that other functions are prefixed with something other than "EncodeFor"? (Based on past bugbase submissions, the reason was probably "NOTENOUGHTIME".)
I really prefer the way you integrated it and wondered why an official EncodeFor(Method, Content) function wasn't introduced instead of the myriad of static, uniquely named functions. I plan on using your proxy for all CF8-11+ versions, especially considering OWASP may be updated in the future and introduce something new that Adobe may never choose to implement. Thanks again!
I wanted to point out that the CF method encoders in CF10+ are NOT the same thing as the Java Encoder project. Under the hood, CF uses the ESAPI encoders, not the Java Encoder project. These are 2 separate OWASP projects.
The ESAPI for Java project is not currently active while the Java Encoder project is active. The encoders implemented in both projects are different, perform different levels of encoding, and have different performance. In fact, I wrote a post a while ago about this very topic - http://damonmiller513.blogspot.com/2014/10/dont-use-esapi-encoders-in.html.
@Damon,
Just for others, the above URL didn't link properly because of the (.):
http://damonmiller513.blogspot.com/2014/10/dont-use-esapi-encoders-in.html
That's really interesting information. I honestly don't know all that much about the various OWASP projects, but I did see that the ColdFusion 10 libs directory seems to ship with the 2.0.1 version (esapi-2.0.1.jar). But, now that I'm Googling for it, I do see that the OWASP site says that it is not as well maintained as other projects.
Good to know about the performance implications as well.
@James,
As far as why not all 18 methods were exposed, to be fair, some of the methods are *mostly* redundant. Even in the Java Docs for the Encoder project (which I reproduced in my README.md since they were really hard to find!) it will say things like:
>>> Unless you are interested in saving a few bytes of output or are writing
>>> a framework on top of this library, it is recommend that you use
>>> forJavaScript(String) over this method.
It looks like they also excluded all the XML methods.
Well, I'm glad you found this helpful then :D
Interesting use of getFunctionCalledName() but seems like it'd be easier to just use onMissingMethod()? :)
@Henry,
Great question - I actually considered using the onMissingMethod() at first. But, the one thing I don't like about onMissingMethod() is that I feel like it doesn't self-document at all. While the code could arguably be a bit more concise with onMissingMethod(), I think there's something very comforting about seeing a list of all the methods that are supported by the component. It just makes it a bit more obvious what's going on.
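For readers weighing the two approaches, an onMissingMethod() version of the proxy might look roughly like this. It is a sketch of the alternative being discussed, not the code the project ships, and it assumes every call passes exactly one positional argument.

```cfml
component output = false {

	public any function init( required any owaspEncoder ) {
		variables.encoder = owaspEncoder;
		return( this );
	}

	// ColdFusion calls this for any method that is not defined on the
	// component, passing along the requested name and its arguments.
	public string function onMissingMethod(
		required string missingMethodName,
		required struct missingMethodArguments
		) {

		// Positional arguments are keyed by index; cast the first one to a
		// Java String before handing it to the encoder.
		var methodArguments = [ javaCast( "string", missingMethodArguments[ 1 ] ) ];

		return( invoke( encoder, missingMethodName, methodArguments ) );
	}

}
```

Note that this forwards any method name to the Java class, including typos and names that do not exist on the encoder, which is part of why the explicit list of supported methods is easier to read.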
Ben, thank you for not only flagging this project up but showing how it can be used. It looks really good! I'm at a crossroads deciding if we should use OWASP's WAF on a Linux firewall (a nightmare to get right if you've looked at the complexity of modsec docs!), Foundeo's FuseGuard on CF, or now your new Encoder.cfc.
From experience so far, it's tricky to get a WAF right, with lots of false positives being logged by modsec. I haven't tried your implementation yet but I hope to. Is there a way to wire it in to protect every GET and POST across a whole CF application without manually coding it into each page, particularly on large websites with dozens of pages/forms? On that basis the ability to add exceptions for specific pages, or even form fields, is essential.
@Gary,
I wouldn't try to class the Encoder in with those other projects. FuseGuard and a Web Application Firewall are really trying to protect you from things coming INTO your application boundaries. They check everything (or a lot of things) and can definitely have some false positives. I've seen users blocked from our application because their IP happened to be on some list somewhere and not because they (specifically) were actually trying to do anything malicious.
Encoding data on output is actually more about protecting your users from other users (as opposed to protecting your system). Things like XSS (Cross-Site Scripting) attacks can be prevented, in part, by being aggressive with how output is encoded (to make sure that it cannot inadvertently be executed on the client).
That said, both ColdFusion and OWASP have the concept of "Canonicalization", which does help protect your system against inputs. I am not really up-to-snuff on this stuff, but functions like ColdFusion's canonicalize() can reduce an incoming value to its most basic format, removing any of the embedded encoding that might otherwise throw off validation (and let malicious code slip through).
But, honestly, I don't want to say too much as I'll just end up giving you misinformation. In the end, however, the three things you mention have different responsibilities; I'd be cautious about trying to replace one with the other.
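As a rough illustration of the two responsibilities described above (canonicalizing on the way in, encoding on the way out), here is a small sketch using the native ColdFusion 10+ functions. The rawInput value is a made-up example of a double URL-encoded string, and the flag values passed to canonicalize() are just one possible choice, not a recommendation.

```cfml
<cfscript>
	// Hypothetical incoming value in which "<script>" was URL-encoded twice.
	rawInput = "%253Cscript%253E";

	// Input side: reduce the value to its simplest form before validating it.
	// The second and third arguments are restrictMultiple and restrictMixed;
	// passing false asks for decoding instead of rejection.
	cleanInput = canonicalize( rawInput, false, false );

	// Output side: encode for the context the value is rendered in, so the
	// browser treats it as text and does not execute it.
	writeOutput( encodeForHtml( cleanInput ) );
</cfscript>
```

Neither step replaces a firewall or FuseGuard; they operate at a different layer of the application.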