Using Subtraction To Power The Array Sort Comparison Operator In Lucee CFML 5.2.9.31
Yesterday, while looking at generating color histograms using GraphicsMagick and Lucee CFML, I had to sort of an Array of colors based on their frequency distribution within an image. To do this, I created a sort operator that used a single Subtraction expression to calculate the comparison result. I'm not sure that I've ever done this before; so, I wanted to take a moment and document this maths-based approach using Lucee CFML 5.2.9.31.
ASIDE: I'm looking at this in ColdFusion code since that's what I was using yesterday; but, to be clear, this is just the way that sort comparison operators work in most languages. As such, this would also apply to, for example, JavaScript's
Array.prototype.sort
method.
In many places that a .sort()
comparison operator is discussed, it is often demonstrated using -1
, 0
, and 1
as the possible operator outcomes. Meaning, given two items, a
and b
, returning the aforementioned values carries the following connotation:
-1
- Valuea
should be sorted before valueb
.0
- Valuea
and valueb
are equivalent.1
- Valuea
should be sorted after valueb
.
Because of this, I often code my .sort()
comparison operators to explicitly return -1
, 0
, or 1
. For example, if I wanted to sort a collection of strings based on their length, I might do the following:
<cfscript>
values = [
"asdlfkj",
"oweiru",
"aldkfjlakjflajsdfljalsfjlfl",
"xzmcn",
"lakdfjlakjfl",
"mlkjwler",
"adf",
"lkasdjflajdla",
"cvuoixcviou"
];
// Sort the collection of values based on the LENGTH of each value.
values.sort(
( a, b ) => {
var aLength = a.len();
var bLength = b.len();
if ( aLength < bLength ) {
return( -1 );
} else if ( aLength > bLength ) {
return( 1 );
} else {
return( 0 );
}
}
);
dump( values );
</cfscript>
As you can see, I'm comparing the length of each item in the compare-operator function and then explicitly returning one of the static values. And, when we run this ColdFusion code, we get the following browser output:
But, the .sort()
operator isn't limited to these three return values. Really, the .sort()
algorithm is looking for these three general outcomes:
- Less than zero.
- Zero.
- Greater than zero.
It just so happens that -1
is less than zero; and, 1
is greater than zero; that's why those values work in a comparison operator. But, really, there's nothing special about -1
and 1
.
Given this broader perspective, we can greatly simplify our .sort()
comparison function using a single Subtraction expression based on the length of each value:
<cfscript>
values = [
"asdlfkj",
"oweiru",
"aldkfjlakjflajsdfljalsfjlfl",
"xzmcn",
"lakdfjlakjfl",
"mlkjwler",
"adf",
"lkasdjflajdla",
"cvuoixcviou"
];
// Sort the collection of values based on the LENGTH of each value.
values.sort(
( a, b ) => {
return( a.len() - b.len() );
}
);
dump( values );
</cfscript>
If we run this ColdFusion code, we get the same output. Let's look at why, using some example values. Given a
: "12345" and b
: "12345678":
return( "12345".len() - "12345678".len() )
... which gives us:
return( 5 - 8 )
... which evaluates to:
return( -3 )
So, when the length of a
is less than the length of b
we end up with a result that is less than zero. Conversely if we have a
: "12345678" and b
: "12345":
return( "12345678".len() - "12345".len() )
... which gives us:
return( 8 - 5 )
... which evaluates to:
return( 3 )
So, when the length of a
is greater than the length of b
we end up with a result that is greater than zero.
When we stop thinking about the sort comparison in terms of static values and, instead, think about the result as a set of ranges, our logic can become much more concise. Of course, there is always tension between "concise" and "readable". My first example with explicit if
statements is longer; but, it demonstrates clear intent. My second example, on the other hand, is much shorter; but, doesn't really express any intent. As such, I would almost certainly include a comment regarding my intent when using the Maths-based approach:
// Shorter values should be sorted before longer values. return( a.len() - b.len() );
Now, the next engineer to read this code should be able to understand what the .sort()
is doing even if the Maths is not immediately obvious.
PRO TIP: Avoid unnecessarily concise code! Technically speaking, we could make this code even more concise in Lucee CFML (and other languages) by removing the curly-braces of the comparison operator, leaving us with a single-line of code:
values.sort( ( a, b ) => a.len() - b.len() );
This short-hand syntax for fat-arrow functions implicitly returns the result of the subtraction. This is definitely fun for code golf competitions; but, I would recommend that you avoid this level of conciseness in code that you are writing as part of a team.
Historically, I've always through about sort operations in terms of three static outcomes: -1
, 0
, and 1
. However, when we think about sort operations in terms of what they really are: values ranges, we can start to simplify our sort logic. In this case, we can use a single Subtraction expression to sort an array of values by length in Lucee CFML 5.2.9.31.
Want to use code from this post? Check out the license.
Reader Comments
@All,
Ha ha, and of course, MDN (Mozilla Developer Network) calls out this approach right in its documentation:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/sort
Just another case where "Reading the documentation" would reveal all the gems that you discover as you code. :face-palm:
@All,
I ran into an interesting edge-case with the maths based approach to sorting. It turns out, the operator used in the
array.sort()
call must return a value that fits into a javaint
:www.bennadel.com/blog/3795-array-sort-operator-must-return-int-sized-result-in-lucee-cfml-5-3-4-80.htm
I ran into an error with this because I was trying to sort an array based on UTC-milliseconds, which are not
int
friendly.@All,
Yesterday, I learned that there is a
sgn()
function in ColdFusion that will clamp any input down to-1
,0
, and1
:www.bennadel.com/blog/4247-using-sgn-to-clamp-values-in-array-sorting-operations-in-coldfusion.htm
This is perfect for array sorting algorithms. It's kind of like the numeric version of the
compare()
function, which is also great for sorting algorithms (on string values).