Issues with Back Slashes (\) And Java's Matcher::AppendReplacement() Method
I just talked about the problems using $ signs and AppendReplacement(), but ironically, that post, in and of itself, broke my code. Since my code had a lot of back slash literals in it, the AppendReplacement() method was trying to evaluate them as escapING characters, NOT escapED characters.
To quickly fix this, I updated the code:
// Loop over the matcher while we still have matches.
while ( LOCAL.Matcher.Find() ){
// Get the sample.
LOCAL.Sample = LOCAL.Matcher.Group();
... MANIPULATION of Group Data ...
// Add the sample to the buffer. When appending the
// replacement, be sure to escape any $ signs with
// literal $ signs so that they are not evaluated as
// regular expression groups. Also, escape the \ so
// that they are not evaluated as escaping characters.
LOCAL.Matcher.AppendReplacement(
LOCAL.Results,
LOCAL.Sample.ReplaceAll( "\\", "\\\\" ).ReplaceAll( "\$", "\\\$" )
);
}
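This escapes all back slash literals in my string before it goes through and escapes all dollar sign literals. The order matters: if the dollar signs were escaped first, the back slashes added by that step would then be doubled by the back slash replacement, leaving the dollar signs unescaped again.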
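For anyone doing this in plain Java rather than ColdFusion, here is a rough sketch of the same idea. The class and method names (AppendReplacementDemo, escapeManually, rebuild) are made up for illustration and are not part of my original code. Note that Java string literals treat the back slash as an escape character too, so every back slash has to be written twice in the source. The standard library's Matcher.quoteReplacement() should give the same result as the manual escaping and is probably the simpler choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AppendReplacementDemo {

    // Hypothetical helper mirroring the ColdFusion snippet: double every
    // back slash first, then put a back slash in front of every $ sign.
    // Each back slash is written twice because of Java string literals.
    public static String escapeManually(String value) {
        return value
            .replaceAll("\\\\", "\\\\\\\\")
            .replaceAll("\\$", "\\\\\\$");
    }

    // Hypothetical helper: loop over the matches and append each one back
    // unchanged, escaping it so that $ and \ are treated as literals.
    public static String rebuild(String input, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer results = new StringBuffer();

        while (matcher.find()) {
            String sample = matcher.group();
            // ... manipulation of the sample would go here ...
            matcher.appendReplacement(results, Matcher.quoteReplacement(sample));
        }

        // Append whatever follows the last match.
        matcher.appendTail(results);
        return results.toString();
    }

    public static void main(String[] args) {
        String input = "cost: $5 at C:\\temp";
        System.out.println(escapeManually(input));
        System.out.println(rebuild(input, "\\S+"));
    }
}
```

Without the escaping, a sample such as $5 would be read as a reference to capture group 5, and the back slash in C:\temp would be consumed as an escaping character instead of being written to the output.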
Reader Comments
It should be, in fact, replaceAll( "\\\\", "\\\\\\\\" )
@Vlad,
I am not sure that I understand what you mean? I have tested my code and it works fine.
Maybe I'm missing something, but it can't be compiled as replaceAll("\\","\\\\"); it would give:
<code>
Exception in thread "main" java.util.regex.PatternSyntaxException: Unexpected internal error near index 1
\
 ^
at java.util.regex.Pattern.error(Unknown Source)
at java.util.regex.Pattern.compile(Unknown Source)
at java.util.regex.Pattern.<init>(Unknown Source)
at java.util.regex.Pattern.compile(Unknown Source)
at java.lang.String.replaceAll(Unknown Source)
</code>
It makes sense to me, as you have to escape it twice: once for string and once for regex.
Not sure how it works for you.
@Vlad,
Ahh, I see what's going on here. This is not Java code. This is ColdFusion code (which automatically compiles down to Java). In ColdFusion, there are no special string characters; as such, I only have to escape the value for the RegularExpression, not for the string itself.
Simple miscommunication :)
My bad, I missed that.
I see I'm not the first one to make that mistake in your blog :)
Your pages come up nicely when querying Google for Java issues.
Thanks
@Vlad,
No worries my friend. Heck, I didn't even mention the language in the blog post. That's pretty exciting for me, though, that this kind of stuff comes up for Java searches as well :)
Not to nitpick, but wouldn't '#' be considered a special string character in ColdFusion?
@KingErroneous,
Yeah, # is definitely a special character in ColdFusion, but only in a string that is going to be evaluated. When a # appears in a piece of string data (such as that read from a file or in a FORM post), there is nothing special about it. And, of course, to escape it in ColdFusion, you need to use a double-pound, ##, rather than a back-slash as you might in other languages.