Syncing %{REQUEST_URI} Behaviors In Apache mod_rewrite And Helicon Ape mod_rewrite
In my local development environment, I run Mac OS X and use Apache mod_rewrite to handle URL rewriting. In my production environment, I run Windows Server 2008 and use Helicon Ape mod_rewrite to handle URL rewriting. While Helicon Ape does a pretty good job of matching Apache's URL rewriting behavior, it differs from Apache in one extremely critical way: handling %{REQUEST_URI} values within an internal redirect.
Imagine that you have an .htaccess file with the given rule:
RewriteRule . index.cfm
This will rewrite every incoming URL to be "index.cfm". I know that's kind of a useless rule; but, now imagine that the user makes a request to the following URL:
/foo/bar
Apache's mod_rewrite would see the %{REQUEST_URI} like this:
- First pass: "/foo/bar"
- [internal redirect to "index.cfm"]
- Second pass: "/index.cfm"
As you can see, Apache correctly changes the value of %{REQUEST_URI} to be "/index.cfm" on the second pass. That is, when you rewrite to a new URL, the internal redirect properly assigns the new URL to the %{REQUEST_URI} variable.
Now, let's take a look at how Helicon Ape's mod_rewrite would see %{REQUEST_URI} in the same request:
- First pass: "/foo/bar"
- [internal redirect to "index.cfm"]
- Second pass: "/foo/bar"
As you can see here, Helicon Ape fails to update the value of %{REQUEST_URI} in accordance with the new, internal URL. Instead, Helicon Ape continues to point to the original REQUEST_URI value associated with the first pass of the rewrite engine.
Since URL rewriting always triggers a subsequent pass on the rule set, failure to update the %{REQUEST_URI} variable on an internal redirect can quickly trigger an infinite loop (or rather, an infinite loop that fails fast).
This problem had me stumped for the better part of a day. And by that, I mean about 6 hours of my time went into trying to debug this problem! After much research and Googling, I confirmed that other Helicon Ape customers were having the same problem; but, I was unable to find any kind of satisfactory solution.
Then, as I was reading the Apache mod_rewrite documentation, I saw this as part of the RewriteCond explanation:
TestString is a string which can contain the following expanded constructs in addition to plain text: RewriteRule backreferences: These are backreferences of the form $N (0 <= N <= 9). $1 to $9 provide access to the grouped parts (in parentheses) of the pattern, from the RewriteRule which is subject to the current set of RewriteCond conditions. $0 provides access to the whole string matched by that pattern.
I have only ever used Server Variables in my RewriteCond statements; but, if I could use the $0 to reference the value against which the RewriteRule was being applied, perhaps I could cut out my dependence on the %{REQUEST_URI} variable (and its updated value in subsequent passes).
I went back to my rule set and looked at this rule (example, not a real rule):
RewriteCond %{REQUEST_URI} !^/index.cfm
RewriteRule . index.cfm [L]
This rule says rewrite every request to "index.cfm" if and only if the request is not already for "index.cfm". In Apache mod_rewrite, this works perfectly since internal redirects update the value of %{REQUEST_URI}; however, in Helicon Ape, which doesn't update the value of %{REQUEST_URI} for internal redirects, you can see how this would quickly cause an infinite loop (the RewriteCond would always pass, because %{REQUEST_URI} never becomes "/index.cfm", so the rule fires again on every pass).
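To make the difference concrete, here is a rough Python sketch of the pass-by-pass logic for the rule above. It is a toy model, not how either engine is actually implemented: the function name, the updates_request_uri flag, and the MAX_PASSES cap are all hypothetical, and it assumes the .htaccess file lives in the document root.

```python
import re

# Hypothetical cap standing in for whatever loop protection the engine has.
MAX_PASSES = 10


def simulate_request_uri_rule(request_path, updates_request_uri):
    """Toy model of:
         RewriteCond %{REQUEST_URI} !^/index.cfm
         RewriteRule . index.cfm [L]
    Returns the pass on which the request settles, or None if it never does.
    """
    request_uri = request_path   # what %{REQUEST_URI} reports
    current = request_path       # the URL the engine is really processing
    for pass_number in range(1, MAX_PASSES + 1):
        # Per-directory context: the leading "/" is stripped before matching.
        local_path = current.lstrip("/")
        rule_matches = re.search(r".", local_path) is not None
        # The "!" negates the test: the condition passes when there is NO match.
        cond_passes = re.search(r"^/index.cfm", request_uri) is None
        if not (rule_matches and cond_passes):
            return pass_number   # nothing was rewritten, so the request settles
        current = "/index.cfm"   # internal redirect, which triggers another pass
        if updates_request_uri:
            request_uri = current  # Apache-style: variable follows the redirect
    return None


print(simulate_request_uri_rule("/foo/bar", updates_request_uri=True))
print(simulate_request_uri_rule("/foo/bar", updates_request_uri=False))
```

With the flag on (the Apache-style behavior described above), the request settles on the second pass. With the flag off (the Helicon Ape behavior I was seeing), the condition keeps passing until the cap is hit.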
To fix this, I changed the "." to a ".+", took off the "/", and replaced %{REQUEST_URI} with "$0":
RewriteCond $0 !^index.cfm
RewriteRule .+ index.cfm [L]
I needed to replace "." with ".+" so that my $0 reference would contain the entire directory-local URI being tested. Then, I had to strip off the leading "/" from my RewriteCond since I was no longer testing the incoming request, but rather the directory-local request. After I made these changes, both Apache mod_rewrite and Helicon Ape mod_rewrite started to behave in a similar manner.
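The same kind of toy model shows why the $0 version no longer cares what %{REQUEST_URI} reports, and why "." had to become ".+". Again, this is only a sketch under assumptions: the function name and MAX_PASSES are hypothetical, and the .htaccess file is assumed to be in the document root.

```python
import re

# Hypothetical cap standing in for whatever loop protection the engine has.
MAX_PASSES = 10


def simulate_backreference_rule(request_path, rule_pattern):
    """Toy model of:
         RewriteCond $0 !^index.cfm
         RewriteRule <rule_pattern> index.cfm [L]
    Returns the pass on which the request settles, or None if it never does.
    """
    current = request_path
    for pass_number in range(1, MAX_PASSES + 1):
        # Per-directory context: the leading "/" is stripped before matching.
        local_path = current.lstrip("/")
        rule_match = re.search(rule_pattern, local_path)
        if rule_match is None:
            return pass_number       # rule did not match, nothing to rewrite
        dollar_zero = rule_match.group(0)  # $0: the text the pattern matched
        if re.search(r"^index.cfm", dollar_zero) is not None:
            return pass_number       # negated condition fails, so no rewrite
        current = "/index.cfm"       # internal redirect, which triggers another pass
    return None


print(simulate_backreference_rule("/foo/bar", r".+"))
print(simulate_backreference_rule("/foo/bar", r"."))
```

With ".+", $0 holds the whole directory-local path ("index.cfm" on the second pass), so the condition stops the rewrite. With ".", $0 only ever holds the first character, so the condition can never see "index.cfm" and the model loops until the cap.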
In the end, I couldn't get the %{REQUEST_URI} to behave the same in both URL rewriting engines; however, I was able to produce similar behavior when I replaced the %{REQUEST_URI} server variable with a reference to the value being tested in each RewriteRule. Not a perfect solution; but, it is one that I am able to live with for now.
Reader Comments
Hey Ben, just noticed a little typo near the top of your article:
"Image that you have an .htaccess file with the given rule:"
Rewriting can get a little tricky with CF; the following seems to work spot on ...
== .htaccess ==
mod_rewrite handles per-directory, .htaccess rewrites differently than server config or virtual host context rewrites ...
http://httpd.apache.org/docs/current/mod/mod_rewrite.html#rewriterule
Per-directory Rewrites
A) "When using the rewrite engine in .htaccess files the per-directory prefix (which always is the same for a specific directory) is automatically removed for the RewriteRule pattern matching and automatically added after any relative (not starting with a slash or protocol name) substitution encounters the end of a rule set."
B) "The removed prefix always ends with a slash, meaning the matching occurs against a string which never has a leading slash. Therefore, a Pattern with ^/ never matches in per-directory context."
@Jordan,
Oh snap, thanks! That's one of those cases where Spell Checker simply doesn't help :D
@Edward,
Yeah, this stuff is definitely wicked tricky. No matter how many times I deal with it, writing a Rewrite Rule is always about trial and error (at least for me).
As for where the rewrite rules exist, I tend to put mine in an .htaccess file in the root of my app. I've not really ever put them in my actual Virtual Host directive in the Apache config, so I haven't had to deal too much with the differences.
@Ben,
I run most of my re-write rules in .htaccess too and typically use vhost conf for domain level directives ...
@Ben,
Snap ... I forgot to add ... Helicon Ape doesn't support [PT]; it's implicit ... moreover, .htaccess re-writes require FollowSymLinks ...
@Edward,
The other day, I was working with a Windows Server 2008 machine running IIS 7. IIS 7 has a URL rewriting module that has an Apache mod_rewrite importer. It worked sort of well, but it didn't understand the (?i) case-insensitive RegEx flag; I had to use IIS 7's own ignore-case regex option instead.
I'll have to check out my new signed-edition of the RegEx Cookbook I got from SL to look into that ;-) ...
FWIW ... I haven't used IIS in quite a while as most of my sites are on CentOS ...