Ben Nadel at cf.Objective() 2014 (Bloomington, MN) with: Oguz Demirkapi

RELoop ColdFusion Custom Tag Case Study

By
Published in Comments (6)

A while back, I created a ColdFusion custom tag that loops over text using regular expression patterns rather than list delimiters. For each iteration, it returns a variable that contains either the matched string or the matched groups (depending on which tag attributes are used). These values can then be modified and stored back into the original string. The whole point of the ColdFusion custom tag was to mimic the power of passing an anonymous function to JavaScript's string replace() method.

There were not many people who were convinced that it was a good idea, but today on the CF-Talk list, I saw what could be a really convenient use case for it. Here is the problem that Josh Nathanson asked about:

Got a regex challenge...I was able to solve it using an REFind and then REReplace, but I'm wondering if anyone can come with a "one-shot" way to replace without looping.

I need to remove any carriage returns within a quoted string, but not touch them if they are outside quotes. So:

"the quick brown fox \r\n jumps over the \r\n lazy dog" <-- remove the \r\n's

My name is mud \r\n <-- leave this one alone

I'm sure this is probably easy for the regex gurus...

As it turns out, this is not the easiest kind of regular expression to code when you want to take care of it all in one shot, but it is a task that becomes quite easy when you use my RELoop ColdFusion custom tag. Take a look:

<!---
	Build up our test data. This data will have line
	breaks inside and outside of quoted values.
--->
<cfsavecontent variable="strText">

	Hey there, here some text that is not quoted
	that has line breaks in it. Then, here is some
	"quoted text that also
	has some line breaks" in it. Of course, not
	all "quoted text" needs to
	have "line
	breaks" in it; that is only going to happend some
	of the time and we want to be sure not to replace
	out the line breaks that are NOT "within quoted
	values".

</cfsavecontent>


<!---
	Replacing the line breaks directly in the regular
	expression is gonna be a huge pain in the butt, so
	we are gonna do the next best thing - we are gonna
	find all the quoted values and then act on them
	individually.
--->
<cf_reloop
	index="strValue"
	text="#strText#"
	pattern="(""[^""]*"")"
	variable="strText">

	<!---
		Now that we have a quoted value, just replace the line
		breaks and carriage returns. This might be overkill some
		of the time, but it is the easy solution.
	--->
	<cfset strValue = strValue.ReplaceAll(
		JavaCast( "string", "[\r\n]+" ),
		JavaCast( "string", " " )
		) />

</cf_reloop>


<!---
	When we output the new text, we are going to replace the
	newlines / carriage returns with <br /> tags so that we
	can see where the line breaks exist in an HTML context.
--->
<cfoutput>
	<p>
		<!--- Match CRLF, a lone CR, or a lone LF so every break shows up. --->
		#strText.ReplaceAll(
			JavaCast( "string", "\r\n?|\n" ),
			JavaCast( "string", "<br />" )
			)#
	</p>
</cfoutput>

First, I build up a chunk of text that has line breaks in both the quoted and the non-quoted parts. Then, I use RELoop to iterate over all of the quoted values; within the loop, it takes just one simple ReplaceAll() call to clear out the line breaks. There is some overkill here, since the replace also runs on quoted values that don't contain any line breaks; however, I think the time and effort you save by not writing an insane regular expression is worth the overhead of a few extraneous replace calls. Running the above code, we get the following output:

Hey there, here some text that is not quoted
that has line breaks in it. Then, here is some
"quoted text that also has some line breaks" in it. Of course, not
all "quoted text" needs to
have "line breaks" in it; that is only going to happend some
of the time and we want to be sure not to replace
out the line breaks that are NOT "within quoted values".
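
If you don't have the custom tag handy, the same find-then-rewrite idea can be sketched directly against Java's regular expression classes. To be clear, this is only a rough illustration of the technique and not the actual source of RELoop; the variable names and the sample text are made up for the example, and Matcher.QuoteReplacement() assumes a JVM of Java 5 or later:

```cfml
<!--- Hypothetical sample text; Chr( 13 ) & Chr( 10 ) is a CRLF line break. --->
<cfset strCRLF = ( Chr( 13 ) & Chr( 10 ) ) />
<cfset strInput = (
	"Outside" & strCRLF &
	"then ""inside" & strCRLF &
	"quotes"" done."
	) />

<!--- Compile a pattern that matches one quoted value (no escape handling). --->
<cfset objPattern = CreateObject( "java", "java.util.regex.Pattern" ).Compile(
	JavaCast( "string", """[^""]*""" )
	) />
<cfset objMatcher = objPattern.Matcher( JavaCast( "string", strInput ) ) />

<!--- This buffer collects the rewritten text as we walk the matches. --->
<cfset objBuffer = CreateObject( "java", "java.lang.StringBuffer" ).Init() />

<cfloop condition="objMatcher.Find()">

	<!--- Collapse the line breaks inside this one quoted value. --->
	<cfset strValue = objMatcher.Group().ReplaceAll(
		JavaCast( "string", "[\r\n]+" ),
		JavaCast( "string", " " )
		) />

	<!---
		Append the text before the match, plus the new value.
		QuoteReplacement() keeps any $ or \ in the value literal.
	--->
	<cfset objMatcher.AppendReplacement(
		objBuffer,
		objMatcher.QuoteReplacement( JavaCast( "string", strValue ) )
		) />

</cfloop>

<!--- Append whatever text follows the last match. --->
<cfset objMatcher.AppendTail( objBuffer ) />
<cfset strResult = objBuffer.ToString() />
```

In this sketch, strResult keeps the line break after "Outside" and replaces only the break that falls between the quotes with a single space. It works, but it is a lot more ceremony than the tag-based loop.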

It's almost too easy. In my gut, I really feel like this kind of a custom tag is useful, but maybe it's just for a few cases.


Reader Comments

21 Comments

Hey Ben,

Haven't played with your solution to test it, but it appears that you're not taking into account escaped quotes within a quoted string, so for something like the following string:

She said: "If you want to quote something add a "" symbol to start of \r\n
the string,\r\n
and a "" to the end\r\n
of the text you're\r\n
working with."

Just made that up so there is probably a better example out there, but it does seem that the RegExp you're using only matches pairs of quotes, not matching pairs of quotes.

Again, was just thinking about the escaped quote issue and haven't tested it to see if your solution accommodates them already.

15,848 Comments

@Danilo,

My current regular expression does not take into account any concept of escaping. It's tough to do because, depending on the context of the problem, escaping means different things. If you are looking at CSV (comma separated values) data, then an escaped quote would be "". If you were looking at JavaScript data, an escaped quote might look like \".

You can write regular expressions that take into account escaped quotes, and this can be done using what Steve Levithan showed me was called "unrolling the loop":

www.bennadel.com/index.cfm?dax=blog:978.view

But again, you can only do this when you know the way escaping is done and the context of the problem.
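
As a rough sketch of what that looks like, here is one "unrolled" pattern for each of the two conventions mentioned above. These patterns are an illustration written for this comment, not something built into RELoop, and the sample string and variable names are hypothetical:

```cfml
<!--- CSV-style escaping: a doubled quote ("") is a literal quote. --->
<cfset strCsvPattern = """[^""]*(?:""""[^""]*)*""" />

<!--- JavaScript-style escaping: a backslash escapes the next character. --->
<cfset strJsPattern = """[^""\\]*(?:\\.[^""\\]*)*""" />

<!--- Hypothetical sample: one CSV-style value that contains an escaped quote. --->
<cfset strSample = "He said ""add a """" to the start"" and left." />

<!--- Find the first quoted value using the Java Pattern object. --->
<cfset objPattern = CreateObject( "java", "java.util.regex.Pattern" ).Compile(
	JavaCast( "string", strCsvPattern )
	) />
<cfset objMatcher = objPattern.Matcher( JavaCast( "string", strSample ) ) />

<cfif objMatcher.Find()>
	<!--- The match runs from the opening quote to the true closing quote. --->
	<cfset strQuoted = objMatcher.Group() />
</cfif>
```

With the CSV pattern, the doubled quote in the middle is treated as part of the value, so the match covers the whole quoted phrase rather than stopping at the first inner quote. Either pattern string could be passed as the pattern attribute of the tag; which one is right depends entirely on how your data escapes quotes.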

59 Comments

I think this is very useful. In fact, I have run across several problems for which this would have been very useful. Next time I run into one, I will definitely have to download this and try it out.

I really like that it is a custom tag. This allows me to take any type or number of actions within the loop.

What version of CF does it require?

15,848 Comments

@Steve,

I have tested it in ColdFusion MX7. It uses the Java Pattern object behind the scenes (for more powerful and easier iteration), so it needs to be MX of some sort. I haven't tested on MX6, but as long as it can use CreateObject( "java" ) then it should be fine.

Also, you have the option to return a struct of data rather than a simple string, which contains the indexed-group matching, the group count, and the character offset of the match. I tried to make it as flexible as possible. Glad you might find it useful.
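
To give a feel for where that kind of data comes from, here is a minimal sketch that builds a similar struct straight from the Java Matcher. The key names (Text, GroupCount, Offset, Groups), the pattern, and the sample text are assumptions for illustration; the tag's real struct may be laid out differently:

```cfml
<!--- Hypothetical pattern with two capturing groups. --->
<cfset objPattern = CreateObject( "java", "java.util.regex.Pattern" ).Compile(
	JavaCast( "string", "(\w+)@(\w+)" )
	) />
<cfset objMatcher = objPattern.Matcher(
	JavaCast( "string", "mail ben@example now" )
	) />

<cfif objMatcher.Find()>

	<!--- Assumed struct layout; these key names are for illustration only. --->
	<cfset objMatch = StructNew() />
	<cfset objMatch.Text = objMatcher.Group() />
	<cfset objMatch.GroupCount = objMatcher.GroupCount() />

	<!--- Start() is the zero-based character index of the match in Java. --->
	<cfset objMatch.Offset = objMatcher.Start() />

	<!--- Copy each captured group into a one-based ColdFusion array. --->
	<cfset objMatch.Groups = ArrayNew( 1 ) />
	<cfloop index="intGroup" from="1" to="#objMatcher.GroupCount()#">
		<cfset ArrayAppend(
			objMatch.Groups,
			objMatcher.Group( JavaCast( "int", intGroup ) )
			) />
	</cfloop>

</cfif>
```

Note that Java reports offsets starting at zero, while ColdFusion string functions count from one, so an offset taken from the Matcher needs adjusting before it is used with something like Mid().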

21 Comments

@Ben,

Thanks for the link. I was also thinking about the different escape chars for different languages/contexts, plus what happens when you combine the two, like a CF string with an escaped string that has an embedded JavaScript code sample. The mind just goes a little wonky.

Thanks for the link, I'll take a look at it if/when I need such RegExp operations.
