Considering Two-Step ColdFusion Custom Tags That Generate CFML Code In Lucee CFML 5.3.7.48
After realizing that Lucee performs a file-check on every invocation of a <cfimport> tag - which has terrible performance in a containerized context - I started to think about ways to improve the performance of my ColdFusion custom tag DSL for HTML emails. I kept racking my brain, and the one idea that I keep coming back to is generating output using a two-step process: one that compiles the code down into a CFML template; and, one that executes the compiled code, generating the final output. Since this seems to be the only idea that I can come up with, I wanted to experiment with encapsulating both steps inside a single set of ColdFusion custom tags in Lucee CFML 5.3.7.48.
As my north star, the one immutable goal that I have in all of this is that I want the solution to be ColdFusion based. Meaning, I don't want to have to install node or ruby to compile my CFML code. I want turtles all the way down. This should make writing, debugging, and maintaining the generated CFML templates much easier (and faster) for ColdFusion developers.
The complexity with any solution that encompasses both compile-time and run-time behavior is that your templates necessarily have a mixture of syntax that needs to be executed "now" and executed "later". A CFML-based solution is no different. And, it means that we need a way to differentiate the "now" vs. "later" CFML code.
When it comes to ColdFusion tags, I am experimenting with prefixing "runtime" tags with <runtime: such that a normal ColdFusion tag like <cfloop> will be written as <runtime:loop>. And, for template interpolation, I'm using the Angular / Handlebars inspired {{ value }} syntax.
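To make the two conventions concrete, here is a minimal sketch of the translation using the built-in reReplace() function. The firstPass string is a made-up sample, not output from the actual tags; the two patterns mirror the ones that appear later in this post.

```cfml
<cfscript>
	// Hypothetical first-pass output (the kind of text a compile step would capture).
	firstPass = '<runtime:output>{{ encodeForHtml( request.value ) }}</runtime:output>';

	// Swap the "runtime:" prefix for "cf" on both opening and closing tags.
	secondPass = reReplace( firstPass, "(</?)runtime:", "\1cf", "all" );

	// Swap "{{" and "}}" (plus any inner padding) for a single pound-sign. Note that
	// "##" is how a literal pound-sign is escaped inside a CFML string.
	secondPass = reReplace( secondPass, "\{\{\s*|\s*\}\}", "##", "all" );

	// secondPass now holds:
	// <cfoutput>#encodeForHtml( request.value )#</cfoutput>
</cfscript>
```

Nothing in the sample is executed as a tag at this point; the result is just a string of CFML that can be written to disk and run later.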
The biggest mental hurdle comes with ColdFusion custom tag attributes. Since custom tags, in this context, are being used to generate CFML code - not generate output - the attributes passed into the custom tag are no longer "variables", they're "variable names". And, the output that the custom tags generate (which is supposed to be CFML code), uses those variable names to reference runtime values, not compile-time values.
It definitely adds an unfortunate layer of complexity and mental gymnastics to what would have otherwise been a much cleaner solution. But, in the end, the performance gains that accompany that cost are quite significant.
Let's take a look at my first exploration of this concept. I just wanted to create a "wrapper" ColdFusion custom tag that compiles down to CFML code; and, uses a child ColdFusion custom tag to create an encapsulated, reusable "widget":
<!--- Import custom tag libraries. --->
<cfimport prefix="core" taglib="./core/" />
<cfimport prefix="my" taglib="./my/" />

<!--- ------------------------------------------------------------------------------ --->
<!--- ------------------------------------------------------------------------------ --->

<!---
	Because we are using a "TWO-PASS" set of ColdFusion custom tags, the first pass will
	compile the CFML code that will then be executed N-number of times in a SECOND-PASS.
	To differentiate the CFML tags that are doing the compiling from the CFML tags that
	will subsequently get executed, I'm using the prefix, "runtime:". At the end of the
	compile step, all instances of "runtime:" will be replaced with "cf" in order to
	create valid CFML syntax. Additionally, all instances of "{{" and "}}" will be
	replaced with "#" to enable runtime template interpolation.
--->
<core:Content>

	<!--- Becomes "<cfset" --->
	<runtime:set request.values = [ "a", "b", "c" ] />

	<!--- Becomes "<cfoutput" --->
	<runtime:output>

		<!--- Becomes "<cfloop" --->
		<runtime:loop index="request.i" value="request.value" array="{{ request.values }}">
			<my:Item item="request.value" />
		</runtime:loop>

	</runtime:output>

</core:Content>
As you can see, I have a number of <runtime:> tags that get compiled down to <cf> tags. But, I also have a compile-time invocation of a ColdFusion custom tag, <my:Item>. Now, <my:Item> receives request.value, which is a runtime variable. That's why I'm passing it in as a String and not as an interpolated value.
I think you can already see that mixing runtime and compile-time syntax is confusing. But, I just don't think there's any way around it. Generating code is a messy endeavor. I'm just trying to make it a bit more enjoyable.
Let's take a look now at my ColdFusion custom tag, <my:Item>:
<!--- Import custom tag libraries. --->
<cfimport prefix="core" taglib="../core/" />

<!---
	When creating a TWO-PASS custom tag, the "item" being passed-in here is NOT the
	actual item - it's the NAME of the RUNTIME VARIABLE that we want to reference during
	the second-pass (ie, runtime) execution. As such, most attributes will end up being
	STRINGS (unless, of course, they are impacting the compile-time initial pass).
--->
<cfparam name="attributes.item" type="string" />

<!--- ------------------------------------------------------------------------------ --->
<!--- ------------------------------------------------------------------------------ --->

<!---
	In order to make the attributes a little bit easier to work with inside this
	ColdFusion custom tag, let's intercept the output and interpolate "attributes.*"
	expressions. Meaning, let's replace the pattern "attributes.\w+" with the actual
	values that they are storing. This will remove the need to wrap "attributes.item" in
	hash-tags directly within the tag body.
--->
<core:InterpolateAttributes>

	<runtime:if ( attributes.item neq "b" )>

		<p>
			This is my item from within the tag: {{ encodeForHtml( attributes.item ) }}.
		</p>

		<!---
			We can, of course, mix compile-time and run-time functionality. Here, I'm
			using COMPILE-TIME ColdFusion code to include the date/time at which this
			template was compiled (just for exploratory purposes).
		--->
		<cfoutput>
			<!-- Compiled #dateFormat( now() )# at #timeFormat( now() )#. -->
		</cfoutput>

	</runtime:if>

</core:InterpolateAttributes>

<!--- Since this tag only has a START mode, let's make sure no END mode is executed. --->
<cfexit method="tag" />
Again, we have a mixture of compile-time and runtime code. And, I'm also throwing this concept of <core:InterpolateAttributes> into the mix. Since the attributes are [mostly] going to represent runtime values, not compile-time values, they actually need to be interpolated into the generated CFML code. Meaning, you'd have to write things like this:
My value: {{ encodeForHtml( #attributes.value# ) }}
... where I'm wrapping my attributes.value in # markers so that the value of the attribute is what shows up in the output.
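As a minimal sketch of what that compile-time interpolation produces, the snippet below evaluates the same expression inside a cfscript string. The tagAttributes struct is a hypothetical stand-in for the real attributes scope, which only exists inside a custom tag.

```cfml
<cfscript>
	// Hypothetical stand-in for the scope a custom tag receives. At compile-time, the
	// attribute holds the NAME of a runtime variable, not its value.
	tagAttributes = { value: "request.value" };

	// The pound-signs are evaluated now (compile-time); the braces are left alone so
	// that they can be turned into runtime interpolation later.
	generated = "My value: {{ encodeForHtml( #tagAttributes.value# ) }}";

	// generated now holds:
	// My value: {{ encodeForHtml( request.value ) }}
</cfscript>
```

The variable name has been baked into the generated code, but request.value itself has not been read yet; that only happens when the compiled template runs.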
Since I knew this would be the primary gesture of the ColdFusion custom tag output, I created another tag that would allow you to omit the # markers and just have the compile-time step do the interpolation for you (optionally of course). Here's my ColdFusion custom tag that does this:
<cfscript>

	switch ( thistag.executionMode ) {
		case "end":
			thistag.generatedContent = jreReplace(
				thistag.generatedContent,
				"(?i)\b(attributes\.\w+)",
				( $0 ) => {
					return( getVariable( "caller.#$0#" ) );
				}
			);
		break;
	}

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I use Java's Pattern / Matcher libraries to replace matched patterns using the
	* given operator function or closure.
	*
	* @targetText I am the text being scanned.
	* @patternText I am the Java Regular Expression pattern used to locate matches.
	* @operator I am the Function or Closure used to provide the match replacements.
	*/
	public string function jreReplace(
		required string targetText,
		required string patternText,
		required function operator
		) {

		var matcher = createObject( "java", "java.util.regex.Pattern" )
			.compile( patternText )
			.matcher( targetText )
		;
		var buffer = createObject( "java", "java.lang.StringBuffer" ).init();

		// Iterate over each pattern match in the target text.
		while ( matcher.find() ) {

			// When preparing the arguments for the operator, we need to construct an
			// argumentCollection structure in which the argument index is the numeric
			// key of the argument offset. In order to simplify overlaying the pattern
			// group matching over the arguments array, we're simply going to keep an
			// incremented offset every time we add an argument.
			var operatorArguments = {};
			var operatorArgumentOffset = 1; // Will be incremented with each argument.
			var groupCount = matcher.groupCount();

			// NOTE: Calling .group(0) is equivalent to calling .group(), which will
			// return the entire match, not just a capturing group.
			for ( var i = 0 ; i <= groupCount ; i++ ) {
				operatorArguments[ operatorArgumentOffset++ ] = matcher.group( i );
			}

			// Including the match offset and the original content for parity with the
			// JavaScript String.replace() function on which this algorithm is based.
			// --
			// NOTE: We're adding 1 to the offset since ColdFusion starts offsets at 1
			// where as Java starts offsets at 0.
			operatorArguments[ operatorArgumentOffset++ ] = ( matcher.start() + 1 );
			operatorArguments[ operatorArgumentOffset++ ] = targetText;

			var replacement = operator( argumentCollection = operatorArguments );

			// Since the operator is providing the replacement text based on the
			// individual parts found in the match, we are going to assume that any
			// embedded group reference is coincidental and should be consumed as a
			// string literal.
			// --
			// NOTE: In the event the operator doesn't return a value, we'll assume that
			// the intention is to replace the match with nothing (ie, an empty string).
			matcher.appendReplacement(
				buffer,
				matcher.quoteReplacement( replacement ?: "" )
			);

		}

		matcher.appendTail( buffer );

		return( buffer.toString() );

	}

</cfscript>
Most of the code here is just an implementation of an operator-based Regular Expression replacement. Ultimately, all this ColdFusion custom tag is doing is replacing the matches of:
attributes\.\w+
... with the caller-based values:
getVariable( "caller.#$0#" )
So, it's programmatically doing what a #attributes.value# expression would have done for us implicitly. Only, we're getting the same result without having to include the # in our markup.
I'm on the fence as to whether this makes it too confusing. At the very least, it's completely optional; and, if you don't care for this behavior, you can just omit the <core:InterpolateAttributes> custom tag.
Ultimately, the first pass of all this code is just producing CFML code that is then being compiled down into a CFML template that is executed during the second pass. The wrapper tag, <core:Content>, is where the first vs. second pass logic takes place.
The following ColdFusion custom tag works by writing the compiled CFML code to a .cfm template. Then, in the start mode of the tag, it attempts to execute that .cfm template. Once the CFML template is compiled, this is all that ever happens (super fast!). However, if the .cfm template doesn't exist yet, it allows the body of the tag to execute (which is what outputs our CFML code), captures the output, and writes it to disk.
<cfscript>

	switch ( thistag.executionMode ) {
		case "start":

			// CAUTION: The compiled ColdFusion / CFML code will live in the "cache"
			// folder using a generated filename. The generated filename is based on the
			// current callstack. While this may have some processing overhead, it will
			// ultimately cut down on the chances that an engineer forgets to name
			// something uniquely which would create a situation in which different
			// compile-steps may end up overwriting each other accidentally.
			targetTemplate = "./cache/#getGeneratedTemplateFilename()#";

			// Since we have concurrent requests that may be competing for file-access in
			// the same template, we want to add SOME SYNCHRONIZATION around the access.
			// Once the file has been written to disk, this locking should have
			// essentially no overhead.
			lockName = "DynamicTemplate::#targetTemplate#";
			lockTimeout = 30;

			lock
				name = lockName
				type = "readonly"
				timeout = lockTimeout
				{

				// In order to simplify the locking, we're going to use the file system
				// as the source of truth and just OPTIMISTICALLY include the template,
				// catching template-specific errors.
				try {

					include template = targetTemplate;
					exit method = "tag";

				} catch ( "MissingInclude" error ) {

					// If the dynamic CFML template has not yet been generated to disk,
					// let's just swallow the error and allow the END mode execution to
					// generate the file.

				}

			}

		break;
		case "end":

			// It's possible that we have concurrent requests that have all executed
			// the tag body. As such, we need to SYNCHRONIZE access to the file so that
			// only one request ends up actually generating the CFML template.
			lock
				name = lockName
				type = "exclusive"
				timeout = lockTimeout
				{

				if ( ! fileExists( targetTemplate ) ) {

					cfmlCode = thistag.generatedContent;
					// Convert <runtime:> tags to <cf> tags.
					cfmlCode = reReplaceAll( cfmlCode, "(</?)runtime:", "$1cf" );
					// Convert {{}} interpolation to ## interpolation.
					cfmlCode = reReplaceAll( cfmlCode, "(\{\{\s*|\s*\}\})", "##" );
					// Strip-out blank lines.
					cfmlCode = reReplaceAll( cfmlCode, "(?m)^\p{Blank}*(\r\n?|\n)", "" );

					fileWrite( targetTemplate, cfmlCode );

				}

			}

			// At this point, either this request or a concurrent request has caused the
			// dynamic CFML template to be written to disk. As such, all we have to now
			// do is execute it.
			include template = targetTemplate;

			// Clear the output of this tag - it will have been entirely replaced with
			// the execution of the dynamically-generated CFML template above.
			thistag.generatedContent = "";

		break;
	}

	// ------------------------------------------------------------------------------- //
	// ------------------------------------------------------------------------------- //

	/**
	* I generate a unique filename based on the given unique calling context.
	*
	* @template I am the calling template.
	* @lineNumber I am the calling template line-number.
	*/
	public string function generateFilenameForContext(
		required string template,
		required numeric lineNumber
		) {

		var stub = getFileFromPath( template )
			.lcase()
			.reReplace( "\.(cfc|cfml?|lucee)$", "" )
			.reReplace( "[^a-z0-9]+", "-", "all" )
			.reReplace( "-{2,}", "-", "all" )
			.reReplace( "^-+", "" )
		;
		var suffix = hash( template & lineNumber )
			.lcase()
		;

		return( "#stub#-#lineNumber#-#suffix#.cfm" );

	}

	/**
	* I return the filename to be used to generate a unique CFML template for the
	* current invocation.
	*/
	public string function getGeneratedTemplateFilename() {

		// NOTE: Callstack item 1 is this function; item 2 is this tag; and, item 3 will
		// be the calling context.
		var callstack = callstackGet();
		var callingContext = callstack[ 3 ];

		return( generateFilenameForContext( callingContext.template, callingContext.lineNumber ) );

	}

	/**
	* I use Java RegEx patterns to replace all of the occurrences of the given pattern
	* with the given replacement.
	*
	* NOTE: I am using this as a performance improvement over reReplace(), and to get
	* some additional pattern matching functionality.
	*
	* @input I am the content being inspected.
	* @patternText I am the RegEx pattern being matched.
	* @replacementText I am the replacement text for the matched pattern.
	*/
	public string function reReplaceAll(
		required string input,
		required string patternText,
		required string replacementText,
		boolean quoteReplacement = false
		) {

		if ( quoteReplacement ) {

			replacementText = createObject( "java", "java.util.regex.Matcher" )
				.quoteReplacement( replacementText )
			;

		}

		var result = javaCast( "string", input )
			.replaceAll( patternText, replacementText )
		;

		return( result );

	}

</cfscript>
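To make the filename scheme in the tag above a little more concrete, here is a small sketch that runs the same normalization steps against a made-up calling context. The template path and line number are hypothetical; in the real tag they come from callstackGet().

```cfml
<cfscript>
	// Hypothetical calling context (the tag itself reads these from the callstack).
	template = "/app/emails/Welcome Email.cfm";
	lineNumber = 12;

	// Same normalization steps as generateFilenameForContext().
	stub = getFileFromPath( template )
		.lcase()
		.reReplace( "\.(cfc|cfml?|lucee)$", "" )
		.reReplace( "[^a-z0-9]+", "-", "all" )
		.reReplace( "-{2,}", "-", "all" )
		.reReplace( "^-+", "" )
	;
	// stub now holds: welcome-email

	// Hashing the full path plus the line number keeps two same-named templates in
	// different folders (or two tags in one template) from sharing a cache file.
	suffix = hash( template & lineNumber ).lcase();

	filename = "#stub#-#lineNumber#-#suffix#.cfm";
	// filename has the shape: welcome-email-12-<hash>.cfm
</cfscript>
```

The readable stub and line number are there for the humans browsing the cache folder; the hash is what actually provides the uniqueness.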
I'm actually really enjoying that it writes the generated CFML code to disk. This makes it much easier to understand what code is being generated because you can actually go and look at the generated template. For example, here's the CFML template that was generated by my demo:
<cfset request.values = [ "a", "b", "c" ] />
<cfoutput>
	<cfloop index="request.i" value="request.value" array="#request.values#">
		<cfif ( request.value neq "b" )>
			<p>
				This is my item from within the tag: #encodeForHtml( request.value )#.
			</p>
			<!-- Compiled 16-Mar-21 at 07:30 AM. -->
		</cfif>
	</cfloop>
</cfoutput>
Now, with this in place, we can see that the first request of the page is relatively slow as it does the compilation for the CFML code and File IO. But, the subsequent requests are blazing fast!
It's one thing to do this in a stand-alone demo; it's an entirely different thing to do this at scale. I'm going to start to play around with applying this approach to my ColdFusion custom tag DSL for HTML emails. I'll leave the current proof-of-concept (POC) in place and start a separate GitHub repo that rabbit-holes down this experiment.
Reader Comments
Is the Docker issue the only reason you are implementing this two-pass approach?
Maybe you could have two implementations: a Docker two-pass implementation and the original methodology for everything else. It seems like such a shame to write off all the work you had done before you tested it in Docker.
To be honest, there are a lot of CF Developers, like myself, that don't use containers.
@Charles,
Yeah, that's a good point. I think it would have to be maintained as a second set of ColdFusion custom tags as I don't think there would be an easy way to make it work both ways. But, as I start to play around with this more, I may find that a good pattern emerges? We shall see.
@All,
So, I've started to play around with this approach in the context of my HTML Email tags and it's extremely confusing, especially with regard to the HEX colors that get inlined. All of the HEX colors start with #, which then causes an error if they happen to get rendered inside a <runtime:output> tag (since ColdFusion then tries to interpret the color as a variable and can't find the closing pound-sign).
I think I'm going to need to rethink this.
UGGGGG! I wish this wasn't so complicated.
@All,
This is additionally confusing because any given ColdFusion custom tag may accept attributes that are a mixture of compile-time and runtime values. And, it's hard to know which is which at a glance.
@All,
Ok, some exciting news! It turns out that this whole thing may not even be necessary. I just ran some test code in both my local Docker For Mac setup and the production setup. And, it seems that the File IO that affects the <CFImport> tag is 68-times slower locally than it is in production: www.bennadel.com/blog/4010-my-docker-for-mac-file-io-is-68-times-slower-than-it-is-in-production.htm
In fact, in production, the <CFImport> code seemed to be only 2.4-times slower than the same <CFModule> code. Which is basically nothing! Woot woot!
We're back in the game, baby!!