Issues with Dollar Signs ($) And Java's Matcher::AppendReplacement() Method
I was working with regular expressions the other day and, in particular, I was using the Java Pattern and Matcher classes to iterate over the matching expressions in a block of text. Occasionally though, the Matcher would throw an error like:
Group Index out of bounds. There is no group 3.
This was driving me crazy because none of my expressions had any references to a third group, which would be denoted by "$3". I started to comment out my code to narrow down the problem area. It came down to the AppendReplacement() method. The code looked like this:
// Loop over the matcher while we still have matches.
while ( LOCAL.Matcher.Find() ){
    // Get the sample.
    LOCAL.Sample = LOCAL.Matcher.Group();
    ... MANIPULATION of Group Data ...
    // Add the sample to the buffer.
    LOCAL.Matcher.AppendReplacement(
        LOCAL.Results,
        LOCAL.Sample
    );
}
What this did was take a results buffer (a StringBuffer instance) and, for each iteration, update the matched text and then write it back into the result string. For more information about the AppendReplacement() method, check out the Java 2 documentation. When I realized that this was the code throwing the error, it drove me even more crazy because this code doesn't refer to a group at all! How could it have a group out of bounds?
What I totally forgot was that the AppendReplacement() method evaluates group references within the replacement text that is passed in. My problem was that my text (LOCAL.Sample) occasionally had a dollar sign ($) in it. In this case, there was a dollar amount, $3. The AppendReplacement() method was trying to evaluate this as a reference to group 3. To overcome this, I had to escape the dollar signs so that they would be treated as literal dollar signs and not get evaluated.
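To make the failure concrete, here is a minimal sketch in plain Java rather than ColdFusion. The class name, method name, input string, and pattern are made up for illustration; the method names are lowercase because Java, unlike ColdFusion, is case-sensitive. The pattern has no capture groups, yet the "$3" in the matched text is still read as a group reference once it is passed back as the replacement.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarSignDemo {

    // Hypothetical helper: runs a find/appendReplacement loop and passes each
    // match back, unescaped, as the replacement text.
    static String replaceUnescaped(String input, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer results = new StringBuffer();
        while (matcher.find()) {
            String sample = matcher.group();
            // Any "$<digit>" inside sample is parsed as a group reference here.
            matcher.appendReplacement(results, sample);
        }
        matcher.appendTail(results);
        return results.toString();
    }

    public static void main(String[] args) {
        try {
            // The regex \$\d+ matches "$3" and defines no capture groups.
            replaceUnescaped("Lunch cost $3 today.", "\\$\\d+");
        } catch (IndexOutOfBoundsException e) {
            // The exact message text depends on the Java version.
            System.out.println("Failed: " + e.getMessage());
        }
    }
}
```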
To escape the $ signs, I had to do this:
// Loop over the matcher while we still have matches.
while ( LOCAL.Matcher.Find() ){
    // Get the sample.
    LOCAL.Sample = LOCAL.Matcher.Group();
    ... MANIPULATION of Group Data ...
    // Add the sample to the buffer. When appending the
    // replacement, be sure to escape any $ signs with
    // literal $ signs so that they are not evaluated as
    // regular expression groups.
    LOCAL.Matcher.AppendReplacement(
        LOCAL.Results,
        LOCAL.Sample.ReplaceAll( "\$", "\\\$" )
    );
}
This escapes the dollar signs before the sample text is passed back. The escape sequence can be a bit confusing, so let me explain it.
\$
The first argument of the ReplaceAll() method is the regular expression. In a Java regular expression pattern, an unescaped $ sign is a special character (the end-of-input anchor), not a literal. But I am looking for a dollar sign literal. Hence, I have to escape the $ sign using the "\" in the regular expression.
\\\$
The second argument of the ReplaceAll() method is the replacement text. You can view this one as having two parts ( "\\" + "\$" ). When you break it down like this, it's a bit easier to understand. The first part is a literal "\". Since the backslash character "\" is used to escape things in Java replacement text (just as it is in regular expressions), in order to get a literal backslash, we have to escape the backslash itself with another backslash. The second part, "\$", is an escaped "$" sign; it is escaped because an unescaped "$" in replacement text starts a group reference. This gives us the literal $ sign. Combined, they replace every "$" with "\$".
So there you have it. Be sure to escape all dollar signs ($) before you pass any text to the AppendReplacement() method.
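For comparison, here is a rough sketch of the same escape written in plain Java. The class and method names are hypothetical, and the upper-casing stands in for whatever manipulation you do to the sample. The one real difference from the ColdFusion version is that Java string literals treat "\" as an escape character too, so every backslash has to be doubled once more in the source code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeDollarDemo {

    // Prefixes every "$" with a backslash so appendReplacement() reads it as a literal.
    // In Java source each backslash is doubled for the string literal:
    //   "\\$"      is the regex        \$    (a literal dollar sign)
    //   "\\\\\\$"  is the replacement  \\\$  (a literal backslash, then a literal dollar sign)
    // Note: this only handles "$"; backslashes in the text would need the same treatment.
    static String escapeDollars(String text) {
        return text.replaceAll("\\$", "\\\\\\$");
    }

    // Hypothetical helper: upper-cases each match and writes it back into the result.
    static String rebuild(String input, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer results = new StringBuffer();
        while (matcher.find()) {
            String sample = matcher.group().toUpperCase(); // stand-in for real manipulation
            matcher.appendReplacement(results, escapeDollars(sample));
        }
        matcher.appendTail(results);
        return results.toString();
    }

    public static void main(String[] args) {
        // prints "Lunch COST $3 today."
        System.out.println(rebuild("Lunch cost $3 today.", "cost \\$\\d+"));
    }
}
```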
Reader Comments
Actually it would have to be
replace = replace.replaceAll("\\$","\\\\\\$");
@Lawati,
It might need to be that for languages where the standard string uses "\" as an escape value. However, in ColdFusion, "\" has no significant value (and hence I do not need that many in a row).
Neither of those two compiled for me. The IDE warned about them before I tried to compile anyway.
myData = myData.replaceAll("\\$", "\\\\$");
that did compile, but it didn't do anything positive.
I do need a replace all slightly later in the code to be cool with $ signs.
any advice?
here is the code block. it gets a hashtable of keys and values. we loop em. the keys match text to swap out for the values . sometimes the values have $ signs in them.
for (Enumeration e = htData.keys() ; e.hasMoreElements(); ) {
    String myKey = (String)e.nextElement();
    try{
        String myData = (String)htData.get(myKey.toString());
        //added line below and several variations to handle $ sign
        myData = myData.replaceAll("\\$", "\\\\$");
        String myTag = "<"+myKey+"/>";
        //System.out.println("replacing:"+myTag +" with:"+myData);
        sHtml = sHtml.replaceAll(myTag,myData);
    } catch (Exception ex){
        System.out.println("Error Swapping tag: "+myKey+", error:"+ex);
    }
}
help. thanks in advance.
@Neil,
I use the Java methods within ColdFusion. ColdFusion has different string-escaping needs. I am not sure what the particular needs would be off hand without testing it.
This post helped me with the same issue. Couldn't figure out why it was only happening in some cases. Looks like some data I was reading also had a $ in it. Thanks for the help!
@Eric,
Yeah, this was a beast of an issue when I first came across it. I found that it is best to escape "$" and "\" when dealing with the append-replacement data.
Why didn't you just use Matcher.quoteReplacement on the replacement value?
@Rob,
Whoa!! I didn't know about that method! I just looked it up. Brilliant Rob, thanks for the dynamite tip.
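For anyone following along, here is a minimal sketch of what that might look like in plain Java. Matcher.quoteReplacement() is a real static method (available since Java 5) that escapes both "\" and "$" in a replacement string; the class name, helper method, and the bracket-wrapping "manipulation" are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteReplacementDemo {

    // Hypothetical helper: wraps each match in brackets and writes it back,
    // letting quoteReplacement() neutralize any "$" or "\" in the text.
    static String rebuild(String input, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer results = new StringBuffer();
        while (matcher.find()) {
            String sample = "[" + matcher.group() + "]"; // stand-in for real manipulation
            matcher.appendReplacement(results, Matcher.quoteReplacement(sample));
        }
        matcher.appendTail(results);
        return results.toString();
    }

    public static void main(String[] args) {
        // prints "Lunch cost [$3] today."
        System.out.println(rebuild("Lunch cost $3 today.", "\\$\\d+"));
    }
}
```

This removes the need for a hand-written ReplaceAll() escape, and it covers backslashes as well as dollar signs.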
No problem.. only took me about 2 hours of head scratching before I decided to actually read the API myself. :)
@Rob,
Ha ha, I know what you mean. Often times, when I find something cool, I just jump all over it without taking the time to explore the rest of the API to see what it can do :)
@Rob,
Thanks again for bringing this to my attention - it was actually bigger than you thought; it never occurred to me that the Java API documentation that I was using was out of date. I knew, of course, that the Java language has moved on, but since Java is only a small part of what I do, it never occurred to me that the docs might be out of date:
www.bennadel.com/blog/1826-Java-Matcher-s-QuoteReplacement-And-Java-6-vs-Java-1-4-2.htm
You rock!
This is sent from heaven! I fully understood the problem/solution because of your explanation! finally I can stop scratching my head too! thanks!