The mid() Function Can Safely Go Out-Of-Bounds And Gather Zero Characters In Lucee CFML 5.3.4.77
I'm almost certain that the mid() function in ColdFusion - think the .slice() function in JavaScript - has historically been quite touchy regarding out-of-bounds references and zero-length extractions. At some point, however, this must have changed. I don't know when it changed; but, I accidentally discovered this nicety the other day when I was attempting to canonicalize a URL by its individual components in Lucee CFML. In an attempt to bring my mental model for the mid() function into the modern era, I wanted to quickly demonstrate its lenient behavior in Lucee CFML 5.3.4.77.
The mid() function, as a member method, takes two arguments:
- Start of slice (1-based).
- Length of slice.
As such, calling "hello".mid( 3, 2 ) returns "ll".
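As a minimal sketch of the two arguments in action, here is the same extraction written both as a member method and as the built-in mid() function. The variable name "greeting" is just a placeholder for this illustration:

```cfml
<cfscript>
	// Hypothetical variable name, used only for this illustration.
	greeting = "hello";

	// Member-method form: Start is 1-based, so 3 points at the first "l".
	dump( greeting.mid( 3, 2 ) ); // "ll"

	// Built-in function form: the string is passed as the first argument.
	dump( mid( greeting, 3, 2 ) ); // "ll"
</cfscript>
```

Both calls read two characters beginning at the third character of the string.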
Unless my memory betrays me, I could swear that in the days-of-yore, if the "Start" was out-of-bounds; or, if the "Length" took the slice out-of-bounds; or if the "Length" was zero; then, Adobe ColdFusion would throw an error. It seems, however, that these days, instead of throwing an error, the CFML runtime will [rightfully so] return an empty string.
Let's see this in action:
<cfscript>

	value = "123456789";

	// Test to see if we can go out-of-bounds at the END of the value using both the
	// Start (start out-of-bounds) and the Count (end out-of-bounds) arguments.
	loop index = "i" from = 1 to = 15 {

		dump( value.mid( i, 100 ) );

	}

	// Test to see if we can get ZERO characters both in-bounds and out-of-bounds.
	dump( value.mid( 1, 0 ) );
	dump( value.mid( 100, 0 ) );

</cfscript>
As you can see, we have a String with nine characters in it; and, we're performing two tests on it:
- With the loop, we're testing to see if we can go out-of-bounds on both the "Start" and "Length" arguments.
- With the second block of code, we're testing to see if we can gather a zero-length substring both in-bounds and out-of-bounds.
And, when we run the above ColdFusion code, we get the following output:
As you can see, when we go out-of-bounds using the mid() function, Lucee CFML safely returns either an empty string or a truncated string. It does not throw an error!
NOTE: You can only go out-of-bounds at the "end" of the string - you cannot go out-of-bounds at the "start" of the string. Meaning, you can't call .mid( -2 ) - this will throw an error.
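If a Start value might ever fall below 1, one way to see (and guard against) that failure is to wrap the call in a try/catch. This is only a sketch: the negative Start and the Length of 3 are arbitrary values chosen for illustration, and the wording of the error message is engine-specific, so the code just dumps whatever the runtime reports:

```cfml
<cfscript>
	value = "123456789";

	try {
		// A Start below 1 is out-of-bounds at the "start" of the string.
		dump( value.mid( -2, 3 ) );
	} catch ( any error ) {
		// Per the note above, we expect to land here. The exact message
		// varies by engine, so we simply output what the runtime gives us.
		dump( error.message );
	}
</cfscript>
```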
NOTE: This also works in Adobe ColdFusion 2018.
This is cool because it means we don't have to nit-pick our String offsets if we need to split a string into different parts. Take, as an example, stripping the protocol off of a URL - this is a really trite example, but it illustrates my point:
<cfscript>

	myUrl = "https://";
	protocolMatches = myUrl.reMatchNoCase( "^https?://" );

	if ( protocolMatches.len() ) {

		protocol = protocolMatches[ 1 ];
		// NOTE: When using .mid() to get the "rest" of the URL following the protocol,
		// notice that we don't have to care about starting out-of-bounds; and, we don't
		// have to worry that our Length will go out-of-bounds as well. This will just
		// safely return an empty string (in this case).
		rest = myUrl.mid( ( protocol.len() + 1 ), myUrl.len() );

	} else {

		protocol = "";
		rest = myUrl;

	}

	dump( protocol );
	dump( rest );

</cfscript>
As you can see, in this demo, there is no value that comes after the protocol of the given URL. However, when we call .mid() on the URL to gather the characters that follow the protocol, we don't have to worry about it. This .mid() call simply returns the empty string rather than throwing any errors.
I know it may seem silly to write a whole article about the mid() function. However, a massive part of web development revolves purely around String manipulation. It's important that I have a solid mental model for how functions like mid() work so that I can properly leverage them within my ColdFusion code. And, I'm super excited that mid() can safely go out-of-bounds!
Reader Comments
@All,
I was just poking around in Google, and I found a reference to an earlier blog post of mine (from 2007): www.bennadel.com/blog/585-calling-coldfusion-function-literals-like-you-do-in-javascript.htm
In that post, Sean Corfield comments:
... so, I am not crazy - the mid() function used to barf when you did things like give it zero as a length!