My ColdFusion Color Coding Solution Explanation
After my last post about color coding, Trond of CFSkill.com asked me for an example, so I thought I would just demonstrate how I am putting in the links and then how I am displaying the code in a pop-up window. For starters, I am adding the "Launch code in new window" at run time. This is NOT part of the stored content:
<!--- Output the entry content with code links. --->
#REQUEST.UDFLib.Custom.AddLaunchCodeLinks(
ID = REQUEST.EntryQuery.id,
Content = REQUEST.EntryQuery.content,
DAX = REQUEST.DAX
)#
I have a custom user defined function (UDF), AddLaunchCodeLinks(), that takes the ID of the blog entry, the content of the entry, and a customized framework object (my very own DAX Framework) and runs through the content looking for places to put links. It knows where to put links because all my code samples are in this format:
<!--- Output code sample. --->
<div class="code[fixed]">
<ul>
....
</ul>
</div>
They all are wrapped in a div that has a class of either "code" or "codefixed". So, how do I figure it out? Here is the UDF:
<cffunction
name="AddLaunchCodeLinks"
access="public"
returntype="string"
output="false"
hint="This adds the launch code links to blog entries.">
<!--- Define arguments. --->
<cfargument name="ID" type="numeric" required="true" />
<cfargument name="Content" type="string" required="true" />
<cfargument name="DAX" type="any" required="true" />
<!--- Define the local scope. --->
<cfset var LOCAL = StructNew() />
<!---
Create a pattern for matching the code divs. We will
use this pattern and its matcher to iterate through
code samples in the content text.
--->
<cfset LOCAL.Pattern = CreateObject(
"java",
"java.util.regex.Pattern"
) />
<!---
Compile the pattern (to get the pattern object instance).
When compiling the pattern, we need to have it find
divs that have class 'code' or 'codefixed'. Since this
code is coming out of XStandard, we don't have to worry
about case or anything (it's XHTML compliant - all
lower case).
--->
<cfset LOCAL.Pattern = LOCAL.Pattern.Compile(
"<div class=""code[^""]*"">"
) />
<!---
Get the pattern matcher object so that we can loop
through matching code divs.
--->
<cfset LOCAL.Matcher = LOCAL.Pattern.Matcher(
ARGUMENTS.Content
) />
<!---
Create a response buffer to hold the updated code. We
need to keep adding our updated changes to this buffer
so that we can return fully updated content at the end.
--->
<cfset LOCAL.Output = CreateObject(
"java",
"java.lang.StringBuffer"
).Init(
""
) />
<!---
Keep looping over the content until we cannot find
any more instances of the code div. Each iteration of
this loop will stop at a matching div.
--->
<cfloop condition="LOCAL.Matcher.Find()">
<!---
Get the matching code. This will return the
expression that was matched by the compiled
pattern.
--->
<cfset LOCAL.Match = LOCAL.Matcher.Group() />
<!---
We want to add the "Launch" link to the content
before the code div is displayed. Since we have the
matching code div in our LOCAL.Match variable, all
we have to do is prepend the link code to the
matched content.
When creating this link, we want to pass in the
offset of this code div in the given content.
That way, when we are parsing it out later, we
will know when we have the correct code div.
--->
<cfset LOCAL.Match = (
"<a href=""#ARGUMENTS.DAX.ParentGroup.ToAction( "viewcode", ARGUMENTS.ID )#&start=#LOCAL.Matcher.Start()#"" target=""_blank"" class=""codelauncher""> Launch code in new window »</a>" &
LOCAL.Match
) />
<!---
Add the updated code div and prepended launch link
to the buffer. When appending the replacement, be
sure to escape any backslashes and $ signs so that
they are not evaluated as regular expression escapes
or group references.
The AppendReplacement() method will not only add the
given passed in string (LOCAL.Match) to the output
buffer, it will also append all the content that
was in the content data previous to this matched
group. That is how we end up with the ENTIRE data
set back in the output and not just the matched
groups.
--->
<cfset LOCAL.Matcher.AppendReplacement(
LOCAL.Output,
LOCAL.Match.ReplaceAll(
"\\",
"\\\\"
).ReplaceAll(
"\$",
"\\\$"
)
) />
</cfloop>
<!---
We have run out of matching code divs. Now, we just
need to add the rest of the content to the output
buffer.
--->
<cfset LOCAL.Matcher.AppendTail( LOCAL.Output ) />
<!--- Return the output buffer (convert to string). --->
<cfreturn LOCAL.Output.ToString() />
</cffunction>
Now, as you can see, there should be a "Launch code in new window" link directly above this code sample. The code above is a fairly standard example of how a Java Pattern / Matcher loop works. But, I want to just talk about the link that gets created. It's a bit complicated because it uses my DAX framework to build the link, but really, I just want to touch upon the use of:
LOCAL.Matcher.Start()
LOCAL.Matcher.Start() returns the character index at which the current match begins. When I pop up the code launching window, all I pass to it is the blog entry ID and the offset of the code to parse. That page then runs a similar loop with a modified Pattern compilation and looks for the code div whose match begins at the passed-in "Start" offset. That is how I know which code div I am looking for.
Now, what happens on the pop-up page? Well, there is some validation for blog entry existence and whatnot. But, assuming that I have a valid blog entry in REQUEST.EntryQuery (query object), here is how I attempt to parse out the code:
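To make the offset idea concrete, here is a minimal sketch of the same find / appendReplacement / appendTail loop in plain Java rather than ColdFusion. It is not my actual UDF: the class name LaunchLinks and the baseUrl parameter are hypothetical stand-ins (baseUrl replaces the link that my DAX framework would build), and it leans on Matcher.quoteReplacement(), which needs Java 5 or later, instead of escaping backslashes and $ signs by hand.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LaunchLinks {

    // Same opening-div pattern as the UDF: class "code" or "codefixed".
    private static final Pattern CODE_DIV =
        Pattern.compile("<div class=\"code[^\"]*\">");

    // baseUrl is a hypothetical stand-in for the framework-generated link.
    static String addLinks(String content, String baseUrl) {
        Matcher matcher = CODE_DIV.matcher(content);
        StringBuffer output = new StringBuffer();

        while (matcher.find()) {
            // start() is the offset of this div in the ORIGINAL content,
            // not in the output buffer that is being built up.
            String link = "<a href=\"" + baseUrl + "&start=" + matcher.start()
                + "\" target=\"_blank\" class=\"codelauncher\">"
                + "Launch code in new window &raquo;</a>";

            // quoteReplacement() escapes \ and $ so that neither is read
            // as an escape or a group reference in the replacement text.
            matcher.appendReplacement(
                output, Matcher.quoteReplacement(link + matcher.group()));
        }

        // Copy whatever follows the last match.
        matcher.appendTail(output);
        return output.toString();
    }

    public static void main(String[] args) {
        String content =
            "<p>Intro</p><div class=\"code\"><ul><li>x</li></ul></div>";
        System.out.println(addLinks(content, "index.cfm?action=viewcode&id=1"));
    }
}
```

Because the offset comes from the stored content and not from the rewritten output, the pop-up page can run its own matcher over the same stored content and land on the same numbers.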
<!---
The blog entry is good, now we just have to find the code
snippet that we have selected. Let's set a default snippet
value to use as a flag (check for length later).
--->
<cfset strCodeSnippet = "" />
<!---
Now, let's try and find the matching code snippet using a
java pattern matcher. Create a pattern for matching the
code divs. Unlike the calling page, we want to find NOT ONLY
the start of the div, we want to find the ENTIRE div and
all of its contents (so that we can parse it out).
--->
<cfset objPattern = CreateObject(
"java",
"java.util.regex.Pattern"
) />
<!---
Compile the pattern (to get the pattern object instance).
Notice that we are using the same first group () to find the
div, but then we are using a NON-Greedy search to find the
closing/matching div. In my code, I can make the assumption
that there are no other DIVs inside a code div.
--->
<cfset objPattern = objPattern.Compile(
"(<div class=""code[^""]*"">)([\w\W]+?)(</div>)"
) />
<!---
Get the pattern matcher object so that we can loop through
matching code divs.
--->
<cfset objMatcher = objPattern.Matcher(
REQUEST.EntryQuery.content
) />
<!---
Loop over the content for every instance of the code divs.
Now, since we are looking for a specific offset, we will
hopefully be able to break out of the loop once we find the
div that we are looking for.
--->
<cfloop condition="objMatcher.Find()">
<!---
Check to see if this is the requested offset. Even
though our ultimate regular expression is different, it
starts with the same group and therefore, should match
on the same offsets that the calling page matched.
In this case, REQUEST.Attributes.start is the value of
the URL parameter, start, that was passed in from the
calling page.
--->
<cfif (objMatcher.Start() EQ REQUEST.Attributes.start)>
<!---
We have found the code! Now, extract the code
snippet. If you look at our regular expression,
the code snippet is in between the open / close div
tags, and this is what should be in the second
grouping.
--->
<cfset strCodeSnippet = objMatcher.Group( 2 ) />
<!---
Break out of the loop. We found the desired code
snippet - we don't need to keep checking.
--->
<cfbreak />
</cfif>
</cfloop>
<!---
At this point, we may or may not have a matching code
snippet that we need to format. Check the length of the
snippet to determine if one was found.
--->
<cfif Len( strCodeSnippet )>
<!---
We have a code snippet! This will be in the format of
an unordered list. We have to remove all the list
elements and put in the formatting.
The output will go inside of a set of PRE tags.
Therefore we need to get rid of all unwanted HTML
markup and put in actual white space and line breaks.
First we will start with removing the UL tags.
--->
<cfset strCodeSnippet = strCodeSnippet.ReplaceAll(
"</?ul>",
""
) />
<!--- Remove all non-breaking spaces. --->
<cfset strCodeSnippet = strCodeSnippet.ReplaceAll(
"(&nbsp;|&##160;)",
" "
) />
<!---
Remove all ending LI tags and replace with a line
break.
--->
<cfset strCodeSnippet = strCodeSnippet.ReplaceAll(
"</li>",
"#Chr( 13 )##Chr( 10 )#"
) />
<!---
Now, loop over each LI tab type and replace with the
appropriate tabbing. Each tab class (ex. tab1, tab2,
tab3, etc) represents a number of tabs.
--->
<cfloop index="intTab" from="1" to="10" step="1">
<cfset strCodeSnippet = strCodeSnippet.ReplaceAll(
"<li class=""tab#intTab#"">",
RepeatString( Chr( 9 ), intTab )
) />
</cfloop>
<!---
Even though we just replaced out all the tabbed LIs,
there are going to be LIs that did not have any tabs.
Replace out any LIs that are not tabbed.
--->
<cfset strCodeSnippet = strCodeSnippet.ReplaceAll(
"<li>",
""
) />
<!--- Replace out any break tags and trim the code. --->
<cfset strCodeSnippet = strCodeSnippet.ReplaceAll(
"<br( ?/)?>",
""
).Trim()
/>
<!---
ASSERT: At this point, we should have stripped out all
the tags that we need to. If dumped out now, the code
should "look" the way we want it to (minus the color
coding of course).
--->
<!---
Set up some constants for escaped characters. This just
keeps us from making errors later on.
--->
<cfset strQuoteOpen = Chr( 900 ) />
<cfset strQuoteClose = Chr( 901 ) />
<!---
Escape the attributes. We want to take out all quotes for
the moment so that our color coding is easier. However,
we don't want to lose them, so we are going to replace
them with constants.
--->
<cfset strCodeSnippet = strCodeSnippet.ReplaceAll(
"("".*?"")",
"#strQuoteOpen#$1#strQuoteClose#"
) />
<!--- Add the comment code. --->
<cfset strCodeSnippet = strCodeSnippet.ReplaceAll(
"(<!--[\w\W]*?-->)",
"<span class=""commentcodecolor"">$1</span>"
) />
<!--- Add the script comment code. --->
<cfset strCodeSnippet = strCodeSnippet.ReplaceAll(
"((?<!:)//[^\r\n]*)",
"<span class=""commentcodecolor"">$1</span>"
) />
<!--- Add the cfml code. --->
<cfset strCodeSnippet = strCodeSnippet.ReplaceAll(
"(</?cf[\w\W]+?>)",
"<span class=""cfmlcodecolor"">$1</span>"
) />
<!--- Put the open attribute quotes back in. --->
<cfset strCodeSnippet = strCodeSnippet.ReplaceAll(
strQuoteOpen,
"<span class=""attributecodecolor"">"
) />
<!--- Put the close attribute back in. --->
<cfset strCodeSnippet = strCodeSnippet.ReplaceAll(
strQuoteClose,
"</span>"
) />
</cfif>
The "color coding" portion of the code is probably a dumbed-down version of stuff you have seen in other blog software, such as BlogCFC, but it seems to work for me for the time being. I am sure I will need to update it as I go. This is the same type of code that is doing the color coding in the Skin Spider code viewer. I should probably break this out into a user defined function (UDF) and then reuse it in both places (it's just a matter of time).
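As a rough idea of what that reusable function might look like, here is a sketch in plain Java that strings together the same regular expressions used on the pop-up page. The class and method names (ColorCoder, colorCode) are made up for illustration, and like the page code it does no HTML escaping of its own; it assumes the snippet has already been stripped down to plain text.

```java
class ColorCoder {

    // Placeholder characters, the same code points as Chr( 900 ) and
    // Chr( 901 ) in the page code. They must not occur in the snippet.
    private static final String QUOTE_OPEN = "\u0384";
    private static final String QUOTE_CLOSE = "\u0385";

    static String colorCode(String code) {
        // 1. Mark quoted values with placeholders FIRST, so the quotes in
        //    the span markup added below are never treated as attributes.
        code = code.replaceAll("(\".*?\")", QUOTE_OPEN + "$1" + QUOTE_CLOSE);

        // 2. HTML / CFML comments.
        code = code.replaceAll(
            "(<!--[\\w\\W]*?-->)",
            "<span class=\"commentcodecolor\">$1</span>");

        // 3. Script comments; the lookbehind skips the // in http://.
        code = code.replaceAll(
            "((?<!:)//[^\\r\\n]*)",
            "<span class=\"commentcodecolor\">$1</span>");

        // 4. CFML tags.
        code = code.replaceAll(
            "(</?cf[\\w\\W]+?>)",
            "<span class=\"cfmlcodecolor\">$1</span>");

        // 5. Swap the placeholders for the attribute spans.
        code = code.replace(QUOTE_OPEN, "<span class=\"attributecodecolor\">");
        code = code.replace(QUOTE_CLOSE, "</span>");
        return code;
    }

    public static void main(String[] args) {
        System.out.println(colorCode("<cfset x = \"a\" />"));
    }
}
```

The order matters: if the quoted values were wrapped last, the pattern in step 1 would also match the class="..." attributes of the spans that the earlier steps inserted.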
But there you have it. The key here is really that the beginning of the regular expressions on this page vs. the calling page are the same. That means that as the Java Pattern Matcher loops through the content, the offsets of the matching expressions will be the same. That's how I know where I am.
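Here is a small plain-Java sketch of that offset agreement, under the same assumption I make in my own content: no code div contains another div. The names SnippetFinder, OPEN_DIV, FULL_DIV and findSnippet are hypothetical; only the two regular expressions come from the code above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SnippetFinder {

    // Calling page: matches the opening div only.
    static final Pattern OPEN_DIV =
        Pattern.compile("<div class=\"code[^\"]*\">");

    // Pop-up page: same opening group, then the body, then the closing div.
    static final Pattern FULL_DIV =
        Pattern.compile("(<div class=\"code[^\"]*\">)([\\w\\W]+?)(</div>)");

    // Returns the inner markup of the code div that begins at `start`,
    // or an empty string when no code div begins at that offset.
    static String findSnippet(String content, int start) {
        Matcher matcher = FULL_DIV.matcher(content);
        while (matcher.find()) {
            if (matcher.start() == start) {
                return matcher.group(2);
            }
        }
        return "";
    }

    public static void main(String[] args) {
        String content = "<p>a</p><div class=\"code\"><ul><li>one</li></ul></div>"
            + "<p>b</p><div class=\"codefixed\"><ul><li>two</li></ul></div>";

        // Collect offsets the way the calling page does, then look each
        // one up the way the pop-up page does.
        Matcher open = OPEN_DIV.matcher(content);
        while (open.find()) {
            System.out.println(open.start() + ": " + findSnippet(content, open.start()));
        }
    }
}
```

An offset that does not line up with the start of a code div simply returns an empty string, which plays the same role as the empty strCodeSnippet flag on the pop-up page.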
Reader Comments
Cool :)
Now, since you have two versions of the code, could you not add the line number feature I demonstrated earlier to one of them?
Actually, if I were you, I'd have the color-coded version with the line numbers inside the article (max readability), and a black and white copy'n'paste version in the pop-up.
Keep up all the good work Ben, both with this, the skin spider and other stuff. Your knowledge will be used/abused for the cfskill project once we really get it to take off (it's not dead - I promise - just a bit slow going for the moment).