Ben Nadel at NCDevCon 2016 (Raleigh, NC) with: Dan Wilson

Creating A ColdFusion-Oriented HashCode With Loose Types (Part 2)


The other day, I looked at creating a HashCode-inspired algorithm for ColdFusion such that you could tell if two different data structures were the same, taking into account ColdFusion's loose type system. Only, once I was done, I realized that it didn't fully serve my purposes. Ultimately, I need it for the companion app to my feature flags book; and, in my companion app, the versioning of configuration data needs to be case-sensitive for keys. As such, I needed to update my FusionCode.cfc ColdFusion component to allow for two configuration options:

  • caseSensitiveKeys - I determine if struct keys and column names are kept as-is or canonicalized using ucase(). When enabled, keys are kept as-is, so key and KEY will be considered different.

  • typeCoercion - I determine if ColdFusion's loose decision functions (coercion enabled) or strict, Java-type-based decision functions (coercion disabled) should be used when inspecting a given value. When disabled, false and "no" will be considered different. As will 1 and "1".

In order to make type coercion configurable, I had to bring in some strict decision functions that look at the underlying Java types representing the given ColdFusion values. This isn't an obvious thing to do because—at least in Adobe ColdFusion—there is a mixture of native Java classes and custom coldfusion.runtime.* classes. I received a little help from ChatGPT in coming up with the list of fall-back data-types; so, I'm not entirely sure if it's accurate.

It also seems that ColdFusion sometimes parses numeric values into java.math.BigDecimal instances. So, I added some checks for that too:

component {

	/**
	* I determine if the given value is one of Java's special number types.
	*/
	private boolean function isComplexNumber( required any value ) {

		return (
			isInstanceOf( value, "java.math.BigDecimal" ) ||
			isInstanceOf( value, "java.math.BigInteger" )
		);

	}


	/**
	* I determine if the given value is strictly a Boolean.
	*/
	private boolean function isStrictBoolean( required any value ) {

		return (
			isInstanceOf( value, "java.lang.Boolean" ) ||
			// Fall-back checks for legacy ColdFusion types.
			isInstanceOf( value, "coldfusion.runtime.CFBoolean" )
		);

	}


	/**
	* I determine if the given value is strictly a Date.
	*/
	private boolean function isStrictDate( required any value ) {

		return (
			isInstanceOf( value, "java.util.Date" ) ||
			// Fall-back checks for legacy ColdFusion types.
			isInstanceOf( value, "coldfusion.runtime.OleDateTime" )
		);

	}


	/**
	* I determine if the given value is strictly a numeric type.
	*/
	private boolean function isStrictNumeric( required any value ) {

		// Number is the base class for (among others):
		//
		// - java.lang.Double
		// - java.lang.Float
		// - java.lang.Integer
		// - java.lang.Long
		// - java.lang.Short
		//
		// But, it's unclear as to whether or not it covers all of the custom ColdFusion
		// data types, like "CFDouble". As such, I'm including those here as well.
		return (
			isInstanceOf( value, "java.lang.Number" ) ||
			// Fall-back checks for legacy ColdFusion types.
			isInstanceOf( value, "coldfusion.runtime.CFDouble" ) ||
			isInstanceOf( value, "coldfusion.runtime.CFFloat" ) ||
			isInstanceOf( value, "coldfusion.runtime.CFInteger" ) ||
			isInstanceOf( value, "coldfusion.runtime.CFLong" ) ||
			isInstanceOf( value, "coldfusion.runtime.CFShort" )
		);

	}

}
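To make the strict checks a bit more concrete, here is a minimal sketch in plain Java (not CFML) of the same idea: categorizing a value based only on its underlying Java class. The class name `StrictTypeSketch` and the `classify()` helper are hypothetical, and the Adobe-specific `coldfusion.runtime.*` fall-backs are left out since those classes only exist inside the ColdFusion runtime. One thing the sketch makes visible is that `BigDecimal` and `BigInteger` are themselves subclasses of `java.lang.Number`, so they have to be checked first if they are to be treated separately.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Date;

final class StrictTypeSketch {

	// Hypothetical helper: name the strict category using only the Java class.
	static String classify( Object value ) {

		// BigDecimal and BigInteger also extend java.lang.Number, so they must be
		// checked before the general Number test.
		if ( value instanceof BigDecimal || value instanceof BigInteger ) {
			return "complexNumber";
		}
		if ( value instanceof Number ) {
			return "numeric";
		}
		if ( value instanceof Boolean ) {
			return "boolean";
		}
		if ( value instanceof Date ) {
			return "date";
		}

		// Strings such as "1" or "yes" land here: no coercion is attempted.
		return "other";

	}

	public static void main( String[] args ) {

		System.out.println( classify( 1 ) );                        // numeric
		System.out.println( classify( "1" ) );                      // other
		System.out.println( classify( Boolean.TRUE ) );             // boolean
		System.out.println( classify( "yes" ) );                    // other
		System.out.println( classify( new BigDecimal( "12.0" ) ) ); // complexNumber

	}

}
```

Under these assumptions, `1` and `"1"` fall into different categories, which is the behavior the post describes when type coercion is disabled.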

I also added default settings to the ColdFusion component instantiation that could subsequently be overridden on a per-call basis. The default settings make for a stricter canonicalization (enabling key case-sensitivity and disabling type coercion).

In the following snippet, notice that .deepEquals() and .getFusionCode() both accept an options argument that defaults to the constructor-provided values.

component {

	/**
	* I initialize the component with the given default settings.
	*/
	public void function init(
		boolean caseSensitiveKeys = true,
		boolean typeCoercion = false
		) {

		variables.defaultOptions = {
			// I determine if struct keys should be normalized during hashing. Meaning,
			// should the key `"name"` and the key `"NAME"` be canonicalized as the same
			// key? Or, should they be considered two different keys?
			caseSensitiveKeys: caseSensitiveKeys,
			// I determine if type-coercion should be allowed during hashing. Meaning,
			// should the boolean `true` and the string `"true"` be canonicalized as the
			// same input? Or, should all values be kept in their provided types? This
			// setting DOES NOT apply to low-level Java data types that fall under the
			// same ColdFusion umbrella. Meaning, a Java "int" and a Java "long" are both
			// still native "numeric" values in ColdFusion. As such, they will be
			// canonicalized as the same value during hashing.
			typeCoercion: typeCoercion
		};

		variables.Double = createObject( "java", "java.lang.Double" );

	}

	// ---
	// PUBLIC METHODS.
	// ---

	/**
	* I determine if the two values are equal based on their generated FusionCodes.
	*/
	public boolean function deepEquals(
		any valueA,
		any valueB,
		struct options = defaultOptions
		) {

		var codeA = getFusionCode( arguments?.valueA, options );
		var codeB = getFusionCode( arguments?.valueB, options );

		return ( codeA == codeB );

	}


	/**
	* I calculate the FusionCode for the given value.
	*
	* The FusionCode algorithm creates a CRC-32 checksum and then traverses the given data
	* structure and adds each visited value to the checksum calculation. Since ColdFusion
	* is a loosely typed / dynamically typed language, the FusionCode algorithm has to
	* make some judgement calls. For example, since the Java int `3` and the Java long `3`
	* are both native "numeric" types in ColdFusion, they will both be canonicalized as
	* the same value. However, when it comes to different native ColdFusion types,
	* such as the Boolean value `true` and the quasi-equivalent string value `"YES"`, type
	* coercion will be based on the passed-in options; and, on the order in which types
	* are checked during the traversal.
	*/
	public numeric function getFusionCode(
		any value,
		struct options = defaultOptions
		) {

		var checksum = createObject( "java", "java.util.zip.CRC32" ).init();

		visitValue( coalesceOptions( options ), checksum, arguments?.value );

		return checksum.getValue();

	}

}
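The "defaults, overridden per call" behavior comes down to a copy-then-overwrite merge. As a rough sketch of that merge in plain Java (not CFML), assuming a hypothetical `OptionsSketch` class with a `coalesce()` helper, and keeping in mind that Java map keys are case-sensitive whereas ColdFusion struct keys generally are not:

```java
import java.util.HashMap;
import java.util.Map;

final class OptionsSketch {

	// Mirrors the constructor defaults described above (strict by default).
	static final Map<String, Boolean> DEFAULTS = Map.of(
		"caseSensitiveKeys", true,
		"typeCoercion", false
	);

	// Hypothetical helper: copy the defaults, then let the per-call options
	// overwrite any matching keys. The defaults themselves are never mutated.
	static Map<String, Boolean> coalesce( Map<String, Boolean> overrides ) {

		Map<String, Boolean> merged = new HashMap<>( DEFAULTS );
		merged.putAll( overrides );

		return merged;

	}

	public static void main( String[] args ) {

		Map<String, Boolean> merged = coalesce( Map.of( "typeCoercion", true ) );

		System.out.println( merged.get( "typeCoercion" ) );      // true (overridden)
		System.out.println( merged.get( "caseSensitiveKeys" ) ); // true (default)

	}

}
```

Passing a partial set of options only changes the keys provided; everything else falls back to the defaults.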

Since the options are defined on a per-call basis, not on a per-CFC-instance basis, it means that every visit method had to be updated to both accept and propagate the options during the recursion. But, it also means that the CFC is a bit easier to test since I can just swap-out the options for the same cached CFC instance.

<cfscript>

	fusionCode = new FusionCode(
		/* DEFAULT: caseSensitiveKeys = true , */
		/* DEFAULT: typeCoercion = false */
	);

	// -- Testing key-case sensitivity. -- //

	assertEquals(
		{ caseSensitiveKeys = false },
		{ "foo": "" },
		{ "FOO": "" }
	);
	assertNotEquals(
		{ caseSensitiveKeys = true },
		{ "foo": "" },
		{ "FOO": "" }
	);

	// -- Testing type-coercion. -- //

	settings = { typeCoercion = true };

	assertEquals( settings, 12.0, "12" );
	assertEquals( settings, 1, "1.0" );
	assertEquals( settings, 10000000, "10000000.0" );
	assertNotEquals( settings, 10000000, 10000000.1 );
	assertEquals( settings, true, "yes" );
	assertEquals( settings, "no", false );
	assertEquals( settings, "2024-02-14", createDate( 2024, 2, 14 ) );
	assertEquals( settings, [ javaCast( "null", "" ) ], [ javaCast( "null", "" ) ] );

	settings = { typeCoercion = false };

	assertNotEquals( settings, 10000000, 10000000.1 );
	assertNotEquals( settings, 1, "1" );
	assertNotEquals( settings, 12.2, "12.2" );
	assertNotEquals( settings, true, "yes" );
	assertNotEquals( settings, true, "true" );
	assertNotEquals( settings, 0, false );
	assertNotEquals( settings, "false", false );
	assertNotEquals( settings, "2024-02-14", createDate( 2024, 2, 14 ) );
	assertNotEquals(
		settings,
		[ "a", javaCast( "null", "" ) ],
		[ javaCast( "null", "" ), "a" ]
	);

	// Even when type-coercion is disabled, "complex" numbers should still be coerced into
	// native numbers for the sake of canonicalization. Sometimes, ColdFusion will use
	// these values behind the scenes and we can't control that.
	assertEquals(
		settings,
		createObject( "java", "java.math.BigInteger" ).init( "12" ),
		createObject( "java", "java.math.BigDecimal" ).init( "12.0" )
	);
	assertEquals(
		settings,
		createObject( "java", "java.math.BigDecimal" ).init( 12 ),
		createObject( "java", "java.math.BigDecimal" ).init( 12.0 )
	);
	assertNotEquals(
		settings,
		createObject( "java", "java.math.BigInteger" ).init( "12" ),
		createObject( "java", "java.math.BigDecimal" ).init( "12.1" )
	);
	assertNotEquals(
		settings,
		createObject( "java", "java.math.BigDecimal" ).init( 12 ),
		createObject( "java", "java.math.BigDecimal" ).init( 12.1 )
	);
	assertEquals(
		settings,
		createObject( "java", "java.math.BigInteger" ).init( "12" ),
		12
	);
	assertEquals(
		settings,
		createObject( "java", "java.math.BigDecimal" ).init( "12.0" ),
		12
	);

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I assert that the given values have the SAME FusionCode using the given options.
	*/
	public void function assertEquals(
		required struct options,
		required any valueA,
		required any valueB
		) {

		if ( ! fusionCode.deepEquals( valueA, valueB, options ) ) {

			writeDump(
				label = "Expected True, received False",
				var = [
					options: options,
					valueA: valueA,
					valueB: valueB
				]
			);

		}

	}

	/**
	* I assert that the given values have DIFFERENT FusionCodes using the given options.
	*/
	public void function assertNotEquals(
		required struct options,
		required any valueA,
		required any valueB
		) {

		if ( fusionCode.deepEquals( valueA, valueB, options ) ) {

			writeDump(
				label = "Expected False, received True",
				var = [
					options: options,
					valueA: valueA,
					valueB: valueB
				]
			);

		}

	}

</cfscript>

The above ColdFusion code doesn't output anything because all of the tests pass.
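The BigInteger and BigDecimal assertions above pass even with type coercion disabled because every numeric value is funneled through a double before being stringified and added to the checksum. Here is a minimal sketch of that funnel in plain Java (not CFML). The class name `CanonicalNumberSketch` and its helpers are hypothetical; the wrapper strings are copied from the component's obfuscate() method.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class CanonicalNumberSketch {

	// Wrapper copied from the component's obfuscate() method.
	static final String PREFIX = "[[______";
	static final String SUFFIX = "______]]";

	// Hypothetical helper: reduce any Number to a double, then stringify it.
	static String canonicalize( Number value ) {

		return ( PREFIX + Double.toString( value.doubleValue() ) + SUFFIX );

	}

	// Hypothetical helper: CRC-32 over the UTF-8 bytes of the given string.
	static long checksumOf( String input ) {

		CRC32 checksum = new CRC32();
		checksum.update( input.getBytes( StandardCharsets.UTF_8 ) );

		return checksum.getValue();

	}

	public static void main( String[] args ) {

		String a = canonicalize( new BigInteger( "12" ) );
		String b = canonicalize( new BigDecimal( "12.0" ) );
		String c = canonicalize( 12 );
		String d = canonicalize( new BigDecimal( "12.1" ) );

		System.out.println( a.equals( b ) && b.equals( c ) );        // true
		System.out.println( checksumOf( a ) == checksumOf( c ) );    // true
		System.out.println( a.equals( d ) );                         // false

	}

}
```

Since `12`, `12.0`, and the BigInteger `12` all stringify to the same canonical input, they necessarily produce the same checksum; `12.1` produces a different canonical input.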

Here's the full code for my FusionCode.cfc ColdFusion component:

component
	output = false
	hint = "I provide methods for generating a consistent, repeatable token for a given ColdFusion data structure (akin to Java's hashCode, but with configurable ColdFusion looseness)."
	{

	/**
	* I initialize the component with the given default settings.
	*/
	public void function init(
		boolean caseSensitiveKeys = true,
		boolean typeCoercion = false
		) {

		variables.defaultOptions = {
			// I determine if struct keys should be normalized during hashing. Meaning,
			// should the key `"name"` and the key `"NAME"` be canonicalized as the same
			// key? Or, should they be considered two different keys?
			caseSensitiveKeys: caseSensitiveKeys,
			// I determine if type-coercion should be allowed during hashing. Meaning,
			// should the boolean `true` and the string `"true"` be canonicalized as the
			// same input? Or, should all values be kept in their provided types? This
			// setting DOES NOT apply to low-level Java data types that fall under the
			// same ColdFusion umbrella. Meaning, a Java "int" and a Java "long" are both
			// still native "numeric" values in ColdFusion. As such, they will be
			// canonicalized as the same value during hashing.
			typeCoercion: typeCoercion
		};

		variables.Double = createObject( "java", "java.lang.Double" );

	}

	// ---
	// PUBLIC METHODS.
	// ---

	/**
	* I determine if the two values are equal based on their generated FusionCodes.
	*/
	public boolean function deepEquals(
		any valueA,
		any valueB,
		struct options = defaultOptions
		) {

		var codeA = getFusionCode( arguments?.valueA, options );
		var codeB = getFusionCode( arguments?.valueB, options );

		return ( codeA == codeB );

	}


	/**
	* I calculate the FusionCode for the given value.
	*
	* The FusionCode algorithm creates a CRC-32 checksum and then traverses the given data
	* structure and adds each visited value to the checksum calculation. Since ColdFusion
	* is a loosely typed / dynamically typed language, the FusionCode algorithm has to
	* make some judgement calls. For example, since the Java int `3` and the Java long `3`
	* are both native "numeric" types in ColdFusion, they will both be canonicalized as
	* the same value. However, when it comes to different native ColdFusion types,
	* such as the Boolean value `true` and the quasi-equivalent string value `"YES"`, type
	* coercion will be based on the passed-in options; and, on the order in which types
	* are checked during the traversal.
	*/
	public numeric function getFusionCode(
		any value,
		struct options = defaultOptions
		) {

		var checksum = createObject( "java", "java.util.zip.CRC32" ).init();

		visitValue( coalesceOptions( options ), checksum, arguments?.value );

		return checksum.getValue();

	}

	// ---
	// PRIVATE METHODS.
	// ---

	/**
	* I merge the given options and the default options. The given options takes a higher
	* precedence, overwriting any default options.
	*/
	private struct function coalesceOptions( required struct options ) {

		return defaultOptions.copy().append( options );

	}


	/**
	* I determine if the given value is one of Java's special number types.
	*/
	private boolean function isComplexNumber( required any value ) {

		return (
			isInstanceOf( value, "java.math.BigDecimal" ) ||
			isInstanceOf( value, "java.math.BigInteger" )
		);

	}


	/**
	* I determine if the given value is strictly a Boolean.
	*/
	private boolean function isStrictBoolean( required any value ) {

		return (
			isInstanceOf( value, "java.lang.Boolean" ) ||
			// Fall-back checks for legacy ColdFusion types.
			isInstanceOf( value, "coldfusion.runtime.CFBoolean" )
		);

	}


	/**
	* I determine if the given value is strictly a Date.
	*/
	private boolean function isStrictDate( required any value ) {

		return (
			isInstanceOf( value, "java.util.Date" ) ||
			// Fall-back checks for legacy ColdFusion types.
			isInstanceOf( value, "coldfusion.runtime.OleDateTime" )
		);

	}


	/**
	* I determine if the given value is strictly a numeric type.
	*/
	private boolean function isStrictNumeric( required any value ) {

		// Number is the base class for (among others):
		//
		// - java.lang.Double
		// - java.lang.Float
		// - java.lang.Integer
		// - java.lang.Long
		// - java.lang.Short
		//
		// But, it's unclear as to whether or not it covers all of the custom ColdFusion
		// data types, like "CFDouble". As such, I'm including those here as well.
		return (
			isInstanceOf( value, "java.lang.Number" ) ||
			// Fall-back checks for legacy ColdFusion types.
			isInstanceOf( value, "coldfusion.runtime.CFDouble" ) ||
			isInstanceOf( value, "coldfusion.runtime.CFFloat" ) ||
			isInstanceOf( value, "coldfusion.runtime.CFInteger" ) ||
			isInstanceOf( value, "coldfusion.runtime.CFLong" ) ||
			isInstanceOf( value, "coldfusion.runtime.CFShort" )
		);

	}


	/**
	* I obfuscate the given stringified value so that it doesn't accidentally collide with
	* a string literal. When the visited values are canonicalized, they are often
	* converted to STRING values; and, I need to make sure that the stringified version of
	* a value doesn't match a native string value that might be present in the user-
	* provided data structure.
	*/
	private string function obfuscate( required string value ) {

		return "[[______#value#______]]";

	}


	/**
	* I add the given Boolean value to the checksum.
	*/
	private void function putBoolean(
		required any checksum,
		required boolean value
		) {

		putString( checksum, obfuscate( value ? "true" : "false" ) );

	}


	/**
	* I add the given date value to the checksum.
	*/
	private void function putDate(
		required any checksum,
		required date value
		) {

		putString( checksum, obfuscate( dateTimeFormat( value, "iso" ) ) );

	}


	/**
	* I add the given number value to the checksum.
	*/
	private void function putNumber(
		required any checksum,
		required numeric value
		) {

		putString(
			checksum,
			obfuscate( Double.toString( javaCast( "double", value ) ) )
		);

	}


	/**
	* I add the given string value to the checksum.
	*/
	private void function putString(
		required any checksum,
		required string value
		) {

		checksum.update( charsetDecode( value, "utf-8" ) );

	}


	/**
	* I visit the given array value, recursively visiting each element.
	*/
	private void function visitArray(
		required struct options,
		required any checksum,
		required array value
		) {

		var length = arrayLen( value );

		for ( var i = 1 ; i <= length ; i++ ) {

			putNumber( checksum, i );

			if ( arrayIsDefined( value, i ) ) {

				visitValue( options, checksum, value[ i ] );

			} else {

				visitValue( options, checksum /* , NULL */ );

			}

		}

	}


	/**
	* I visit the given binary value.
	*/
	private void function visitBinary(
		required struct options,
		required any checksum,
		required binary value
		) {

		checksum.update( value );

	}


	/**
	* I visit the given complex number.
	*/
	private void function visitComplexNumber(
		required struct options,
		required any checksum,
		required any value
		) {

		// ColdFusion seems to sometimes parse numeric literals into BigInteger and
		// BigDecimal. Let's convert both of those to DOUBLE. I think that there's a
		// chance that some value truncation may occur here; but, I think it may be edge-
		// case enough to not worry about it.
		putNumber( checksum, value.doubleValue() );

	}


	/**
	* I visit the given Java value.
	*/
	private void function visitJava(
		required struct options,
		required any checksum,
		required any value
		) {

		putNumber( checksum, value.hashCode() );

	}


	/**
	* I visit the given null value.
	*/
	private void function visitNull(
		required struct options,
		required any checksum
		) {

		putString( checksum, obfuscate( "null" ) );

	}


	/**
	* I visit the given query value, recursively visiting each row.
	*/
	private void function visitQuery(
		required struct options,
		required any checksum,
		required query value
		) {

		var columnNames = value.columnList
			.listToArray()
			.sort( "textnocase" )
			.toList( "," )
		;

		if ( options.caseSensitiveKeys ) {

			putString( checksum, columnNames );

		} else {

			putString( checksum, ucase( columnNames ) );

		}

		for ( var i = 1 ; i <= value.recordCount ; i++ ) {

			putNumber( checksum, i );
			visitStruct( options, checksum, queryGetRow( value, i ) );

		}

	}


	/**
	* I visit the given simple value.
	*/
	private void function visitSimpleValue(
		required struct options,
		required any checksum,
		required any value
		) {

		// When it comes to coercing types in ColdFusion, there's no perfect approach. We
		// might come out with a different result depending on the order in which we check
		// the types. For example, the value "1" is both a Numeric type and a Boolean
		// type. And the value "2023-03-03" is both a String type and Date type. These
		// values will be hashed differently depending on which type we check first. As
		// such, I just had to make a decision and try to be consistent. This is certainly
		// not a perfect algorithm.
		if ( options.typeCoercion ) {

			if ( isNumeric( value ) ) {

				putNumber( checksum, value );

			} else if ( isDate( value ) ) {

				putDate( checksum, value );

			} else if ( isBoolean( value ) ) {

				putBoolean( checksum, value );

			} else {

				putString( checksum, value );

			}

		// No type-coercion - strict matches only.
		} else {

			if ( isStrictNumeric( value ) ) {

				putNumber( checksum, value );

			} else if ( isStrictDate( value ) ) {

				putDate( checksum, value );

			} else if ( isStrictBoolean( value ) ) {

				putBoolean( checksum, value );

			} else {

				putString( checksum, value );

			}

		}

	}


	/**
	* I visit the given struct value, recursively visiting each entry.
	*/
	private void function visitStruct(
		required struct options,
		required any checksum,
		required struct value
		) {

		var keys = structKeyArray( value )
			.sort( "textnocase" )
		;

		for ( var key in keys ) {

			if ( options.caseSensitiveKeys ) {

				putString( checksum, key );

			} else {

				putString( checksum, ucase( key ) );

			}

			if ( structKeyExists( value, key ) ) {

				visitValue( options, checksum, value[ key ] );

			} else {

				visitValue( options, checksum /* , NULL */ );

			}

		}

	}


	/**
	* I visit the given xml value.
	*/
	private void function visitXml(
		required struct options,
		required any checksum,
		required xml value
		) {

		// Note: I'm just punting on the case-sensitivity here since I don't use XML. Does
		// anyone use XML anymore?
		putString( checksum, obfuscate( toString( value ) ) );

	}


	/**
	* I visit the given generic value, routing the value to a more specific visit method.
	*
	* Note: This method doesn't check for values that wouldn't otherwise be in a basic data
	* structure. For example, I'm not checking for things like Closures or CFC instances.
	* This is intended to be used with serializable data.
	*/
	private void function visitValue(
		required struct options,
		required any checksum,
		any value
		) {

		if ( isNull( value ) ) {

			visitNull( options, checksum );

		} else if ( isArray( value ) ) {

			visitArray( options, checksum, value );

		} else if ( isStruct( value ) ) {

			visitStruct( options, checksum, value );

		} else if ( isQuery( value ) ) {

			visitQuery( options, checksum, value );

		} else if ( isXmlDoc( value ) ) {

			visitXml( options, checksum, value );

		} else if ( isBinary( value ) ) {

			visitBinary( options, checksum, value );

		} else if ( isComplexNumber( value ) ) {

			visitComplexNumber( options, checksum, value );

		} else if ( isSimpleValue( value ) ) {

			visitSimpleValue( options, checksum, value );

		} else {

			visitJava( options, checksum, value );

		}

	}

}
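The comment in visitComplexNumber() mentions that some truncation may occur when converting to a double. To give a sense of where that edge-case lives, here is a small sketch in plain Java (not CFML), using a hypothetical `DoublePrecisionSketch` class: a double can only represent every integer exactly up to 2^53, so two different BigInteger values just past that point collapse into the same canonical string.

```java
import java.math.BigInteger;

final class DoublePrecisionSketch {

	// Hypothetical helper: the same double-then-stringify step that the
	// component applies to "complex" numbers.
	static String viaDouble( BigInteger value ) {

		return Double.toString( value.doubleValue() );

	}

	public static void main( String[] args ) {

		BigInteger exact = new BigInteger( "9007199254740992" ); // 2^53
		BigInteger next = new BigInteger( "9007199254740993" );  // 2^53 + 1

		// The two BigInteger values are different...
		System.out.println( exact.equals( next ) ); // false

		// ...but both lines below print the same string, since 2^53 + 1 cannot
		// be represented as a double and is rounded to 2^53.
		System.out.println( viaDouble( exact ) );
		System.out.println( viaDouble( next ) );

	}

}
```

In other words, the algorithm would treat these two values as equal. For configuration data, values of this magnitude are likely rare; but, it's worth knowing where the boundary sits.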

This is mostly for my own internal use and exploration. But, maybe there's something in here that you find interesting.

Want to use code from this post? Check out the license.

Ben Nadel