Added Manual String Serialization To JsonSerializer.cfc ColdFusion Project
A couple of years ago, I created a ColdFusion component, JsonSerializer.cfc, to help me serialize values in ColdFusion. There is already a native function for this, serializeJson(); but going from a case-insensitive language to a case-sensitive specification is fraught with peril. That said, JsonSerializer.cfc was still using serializeJson() internally for simple values. After discovering the serializeJson() bug involving Unicode escape sequences, I've replaced the internal string serialization with a manual approach.
View the JsonSerializer.cfc project on GitHub.
To see this in action, I've put together a simple demo that serializes an input string that is known to cause problems:
<cfscript>
// As of ColdFusion 10.0.14, the sequence u+1234 is incorrectly serialized as the
// value \u1234. This is particularly harmful when you are serializing base64-encoded
// data, such as embedded binary objects.
input = "This sequence [ u+1234 ] causes problems in ColdFusion 10.0.14+.";
// The JsonSerializer.cfc manually serializes the input, which lets it side-step the
// ColdFusion serializeJson() bugs.
serializedInput = new JsonSerializer().serialize( input );
deserializedInput = deserializeJson( serializedInput );
writeOutput( "Input: #input# <br />" );
writeOutput( "Deserialized: #deserializedInput# <br />" );
writeOutput( "Matches: #yesNoFormat( ! compare( input, deserializedInput ) )#" );
</cfscript>
As of ColdFusion 10.0.14, the input sequence "u+1234" is incorrectly encoded as "\u1234". However, using the updated JsonSerializer.cfc, we get the following page output:
Input: This sequence [ u+1234 ] causes problems in ColdFusion 10.0.14+.
Deserialized: This sequence [ u+1234 ] causes problems in ColdFusion 10.0.14+.
Matches: Yes
As you can see, the "u+1234" character sequence went through the serialization life-cycle intact.
The more involvement I have in the JSON (JavaScript Object Notation) serialization process, the more overhead is incurred; I have to assume that anything I do will inherently be slower than the same feature that is natively part of ColdFusion. That said, I've tried very hard to keep this serialization process efficient. Internally, it uses an output buffer instead of function return values. This seems to provide fairly high performance.
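To illustrate the buffer-based idea outside of ColdFusion, here is a minimal JavaScript sketch. It is not the code in JsonSerializer.cfc; the function names (appendJsonString, serializeString) and the use of an array as the buffer are assumptions made for this example. Each character is appended to a shared buffer: quotes, backslashes, and control characters are escaped, and everything else, including a literal "u+1234", passes through untouched.

```javascript
// Two-character escapes defined by the JSON specification.
const SHORT_ESCAPES = {
  '"': '\\"',
  "\\": "\\\\",
  "\b": "\\b",
  "\f": "\\f",
  "\n": "\\n",
  "\r": "\\r",
  "\t": "\\t",
};

// Hypothetical helper: appends the JSON encoding of `value` to `buffer`
// (an array of string chunks) rather than returning a new string.
function appendJsonString(buffer, value) {
  buffer.push('"');
  for (let i = 0; i < value.length; i++) {
    const ch = value[i];
    const code = value.charCodeAt(i);
    if (Object.prototype.hasOwnProperty.call(SHORT_ESCAPES, ch)) {
      buffer.push(SHORT_ESCAPES[ch]);
    } else if (code < 0x20) {
      // Remaining control characters must be written as \u00XX.
      buffer.push("\\u" + code.toString(16).padStart(4, "0"));
    } else {
      // Everything else is copied verbatim; "u+1234" is never rewritten.
      buffer.push(ch);
    }
  }
  buffer.push('"');
}

// Hypothetical convenience wrapper: one buffer, joined once at the end.
function serializeString(value) {
  const buffer = [];
  appendJsonString(buffer, value);
  return buffer.join("");
}

const input = "This sequence [ u+1234 ] causes problems in ColdFusion 10.0.14+.";
const roundTripped = JSON.parse(serializeString(input));
console.log(roundTripped === input); // true
```

The point of passing the buffer around is that nested values can all write into the same buffer, so the final string is assembled once instead of being concatenated at every level of the call stack.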
Reader Comments
That's good stuff!
Have you looked at Jackson (https://github.com/FasterXML/jackson)? I used it to implement a similar thing in Java (then wrapped in a CFC) that serializes to JSON and XML for our REST APIs. I added options for key casing, adding record counts to result sets, date formatters for date types, etc... but I hadn't thought about custom formatters by key name - great idea!
@James,
I haven't seen that one before, I'll have to take a look. But, you mention something that I've been thinking about - "date formatters." In this component, I happen to use TZ formatting for the dates. But, in practice, I find it is often much easier to convert the date to UTC-milliseconds rather than some date string. This way, the consuming client doesn't have to parse the string, but can rather just create a local date object from the given UTC milliseconds.
For example, in JavaScript, where I am consuming an API response that is JSON, I'd just do something like:
var something = new Date( response.dateStamp )
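... where response.dateStamp is a number like 1434023221668 (passed as a string, the Date constructor would try to parse it as a date string rather than treat it as milliseconds).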
So, I'd like to be able to add a flag to the JSON serializer that allows for that kind of formatting.
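As a rough sketch of the consuming side: the Date constructor treats a number as milliseconds since the epoch, but tries to parse a string as a date string, so it is safest to coerce the value first. The toDate helper and the sample payload here are hypothetical, just for illustration.

```javascript
// Hypothetical helper: accepts UTC milliseconds as a number or a numeric
// string and returns a local Date object.
function toDate(dateStamp) {
  return new Date(Number(dateStamp));
}

// Assumed API payload, with the date serialized as UTC milliseconds.
const response = JSON.parse('{"dateStamp":1434023221668}');
const created = toDate(response.dateStamp);

console.log(created.getTime() === response.dateStamp); // true
```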
@Ben,
That's a good point. UTC millis are much easier for JavaScript consumption - I'm using java.text.SimpleDateFormat, which I just discovered doesn't have an option for UTC/epoch millis. But String.format could be used instead - that would provide a lot more flexibility, although the syntax isn't quite as friendly for dates.
@James,
I feel like there's this huge ocean of Java functionality right below the surface, and I've only scratched the surface :)
What about ColdFusion 11? Do you need to do any of this?
@Papichulo,
Good question. ColdFusion 10 and, especially, ColdFusion 11 have made strides in improving JSON handling. Take a look at the docs on serializeJson():
https://wikidocs.adobe.com/wiki/display/coldfusionen/SerializeJSON
Of particular note is the new per-application property:
this.serialization.preserveCaseForStructKey = true
This maintains the case of keys as they are defined, as opposed to the traditional behavior, which converts them all to uppercase.
You can also add a custom serializer if you want to override the native serializer.
I should caveat that I have not played with either of the above. And, I should also mention that there is still a bug in the way serializeJson() handles "u+1234" type notation, which is what I covered in the blog post itself; so, that's a problem with all of the native serialization right now.
Thanks for the question - now I want to go and play with CF11's new JSON stuff - time to turn that theory into practice :D