ColdFusion Regular Expressions Match New Line Using "."
Minor note here, but over on CF-Talk, Rick Root just pointed out a flaw in my understanding of ColdFusion regular expressions. I was under the impression that the period "." would NOT match the newline or carriage return characters in a target string. I think this is traditionally how regular expressions work (or maybe not). It turns out that this is dead wrong: in ColdFusion, the "." character DOES match newlines and carriage returns. I am not sure if this changed in MX (7?) or if I just never learned it properly.
Thanks Rick!
My test:
<cfsavecontent variable="strText">
<a href="http://www.google.com"
target="_blank"
>Google</a>
</cfsavecontent>
<!--- Replace tags. --->
<cfoutput>#REReplaceNoCase( strText, "</?[a-z].*?>", "", "ALL" )#</cfoutput>
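For contrast, here is a minimal Java sketch (not ColdFusion) that runs the same pattern twice: once with Java's default dot, which stops at line terminators, and once with `Pattern.DOTALL`, which gives the dot-matches-newline behavior described above. The class and method names (`DotNewlineDemo`, `stripDefault`, `stripDotAll`) are made up for illustration, and the sample string is a trimmed version of the test content.

```java
import java.util.regex.Pattern;

class DotNewlineDemo {
    // The tag-stripping pattern from the test above, unchanged.
    static final String TAG = "</?[a-z].*?>";

    // Java default: "." does not match line terminators.
    static String stripDefault(String text) {
        return Pattern.compile(TAG, Pattern.CASE_INSENSITIVE)
                      .matcher(text)
                      .replaceAll("");
    }

    // DOTALL: "." also matches line terminators.
    static String stripDotAll(String text) {
        return Pattern.compile(TAG, Pattern.CASE_INSENSITIVE | Pattern.DOTALL)
                      .matcher(text)
                      .replaceAll("");
    }

    public static void main(String[] args) {
        String text = "<a href=\"http://www.google.com\"\r\n"
                    + "target=\"_blank\"\r\n"
                    + ">Google</a>";
        System.out.println(stripDefault(text));
        System.out.println(stripDotAll(text));
    }
}
```

With the default flags only the single-line `</a>` is removed, because the opening tag spans three lines and the dot cannot cross them. With `DOTALL` both tags are removed and only `Google` is left.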
Reader Comments
Yep, this has always been the case. From "Ben Forta's Advanced CFMX Application Development" (3rd ed.):
"It's important to understand that in ColdFusion MX, the dot character always matches newlines, which is not always the case with Perl."
Good to know this (despite the fact it says it clearly in the docs, right back to CF5... who RTFMs? ;-)
I just ass-u-me'd that CF followed the practice set by Perl and Java. No. That would be too sensible.
Cheers for the heads-up.
--
Adam
I think the problem is that I read the textbook and learned a lot about ColdFusion before I even learned how to use regular expressions. So, by the time I knew what they were and how to apply them, I was no longer learning from the documentation... I was learning from Googled examples of regular expressions.
Others have said this, but Perl does it the way you stated, Ben. From the Perl docs: ". matches any single character except a newline". And ColdFusion regular expressions are based on the Perl syntax, but apparently they took liberties. :)
I just found out that you can also remove newlines by calling the underlying Java string method:
<cfsavecontent variable="testString">
<a href="http://www.google.com"
target="_blank"
>Google</a>
</cfsavecontent>
<cfoutput>#testString.replaceAll("\r\n", "")#</cfoutput>
Boyan, excellent point. Just a follow-up: that would only work if you don't mind losing other line breaks (outside of tags).
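One way to keep the line breaks outside of tags is to rewrite only the text inside each tag. The following is a rough Java sketch under stated assumptions, not a tested ColdFusion recipe: the class and method names (`TagLineBreaks`, `joinTagLines`) are hypothetical, it needs Java 9 or later for the function-based `replaceAll`, and the naive `<[^>]+>` pattern assumes no attribute value contains a `>`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagLineBreaks {
    // One tag, possibly spanning lines: [^>] matches line breaks too.
    static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Collapse line breaks inside tags only; text between tags is untouched.
    static String joinTagLines(String html) {
        return TAG.matcher(html).replaceAll(
            // A space (not "") keeps adjacent attributes separated.
            match -> Matcher.quoteReplacement(
                match.group().replaceAll("\\r?\\n", " ")));
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://www.google.com\"\r\n"
                    + "target=\"_blank\"\r\n"
                    + ">Google</a>\r\n"
                    + "Next line";
        System.out.println(joinTagLines(html));
    }
}
```

The line break before "Next line" survives because it sits outside any tag, while the two breaks inside the opening `<a ...>` tag become single spaces.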
It hurts my feelings tremendously that ColdFusion does this. I hate using [^\r\n] for staying within lines. And I typically use [\S\s] to match any character if the regex library I'm using doesn't support an explicit "dot matches newline" option. Semi-related sidenote: Non-negated \r\n pairs are unreliable in JavaScript, since Firefox uses only linefeeds (\n) to separate lines in <textarea>s. Instead, I typically use \r?\n, which works cross-browser.
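The three idioms mentioned above can be sketched in Java as follows. This is only an illustration with hypothetical helper names (`restOfLine`, `between`, `lines`); the character classes themselves are standard and behave the same regardless of how a given engine treats the dot.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineAwarePatterns {
    // [^\r\n]* stays on the current line, however "." is configured.
    static String restOfLine(String text, String marker) {
        Matcher m = Pattern.compile(Pattern.quote(marker) + "([^\\r\\n]*)")
                           .matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // [\s\S] matches any character, line breaks included, with no dot-all flag.
    static String between(String text, String open, String close) {
        Matcher m = Pattern.compile(
                Pattern.quote(open) + "([\\s\\S]*?)" + Pattern.quote(close))
            .matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // \r?\n accepts both CRLF and bare LF line endings.
    static String[] lines(String text) {
        return text.split("\\r?\\n");
    }

    public static void main(String[] args) {
        System.out.println(restOfLine("key: one\r\nother: two", "key: "));
        System.out.println(between("<b>x\ny</b>", "<b>", "</b>"));
        System.out.println(lines("a\r\nb\nc").length);
    }
}
```

Being explicit about line breaks in the pattern makes it portable between engines whose dot behaves differently, which is the point of the workaround described above.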
Yeah, I have run into that \r\n thing in JavaScript before. It was driving me CRAZY! I was outputting the ASCII values in ColdFusion to debug, and sure enough it was there, but then when I tested for it in JavaScript, it was NOT there. That was like 30 minutes of my life I won't get back :)