I'm trying to replace all - to +, but both are special characters
There's nothing particularly "special" about these characters, at least not in the context in which they are going to be used. The - (hyphen) is a range specifier when used inside a regex character class; otherwise it's just a hyphen and safe to use unencoded in the query string part of the URL (and we do not need it as a range specifier here). The + (plus) is a regex quantifier, but we don't need to use it in a regex, only in the substitution string (2nd argument to the RewriteRule directive, which is not a regex). The + is an encoded space when used in the query string part of the URL, which I guess you are aware of and is why you need to redirect the request (although since this is part of a URL parameter value, it's also curious why this can't be resolved in the application).
One thing of note, however, is that since + is a URL-encoded space, we do need to use the NE (noescape) flag on the RewriteRule directive to prevent mod_rewrite from URL-encoding the + as %2B (a literal +) in the redirect response. (The same goes for any other %-encoded characters we might capture from the QUERY_STRING - this server variable is not %-decoded.)
RewriteRule ^?s=(.*)\-(.*)$ ?s=$1\+$2 [L,R=301]
The RewriteRule pattern (first argument) matches against the URL-path only, not the query string. So, a RewriteRule directive alone will never match the query string. Incidentally, ? is a special regex character (a 0-or-1 quantifier of the preceding token) and ^ (the preceding token) is not quantifiable, so I would be surprised if this even compiled (a failure to compile results in a 500 Internal Server Error). A literal ? in a regex needs to be backslash-escaped.
As mentioned above, the NE flag would be required here. And the + does not need to be backslash-escaped in the substitution string, since it carries no special meaning here (it's not a regex). Also, due to the relative substitution string (i.e. ?s=$1\+$2), unless you have a RewriteBase defined, this would have resulted in a malformed redirect.
To match against the query string you need an additional condition (RewriteCond directive) and match against the QUERY_STRING server variable. So, as a first attempt, you could do something like this:
# First attempt (inefficient when multiple "-" are present)
RewriteCond %{QUERY_STRING} ^s=([^&]*)-([^&]*)
RewriteRule ^$ /?s=%1+%2 [NE,R=301,L]
Note that the %1 and %2 backreferences (as opposed to $1, etc.) contain the values captured from the preceding CondPattern (RewriteCond directive), rather than the RewriteRule pattern (which in this case just matches the empty string).
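To make the difference between the two kinds of backreference concrete, here is a minimal sketch that uses both in one rule. The /item/ and /view/ paths and the id parameter are hypothetical, and the rule assumes a .htaccess file in the document root (in a server or virtualhost context the pattern would need to match the leading slash).

```apache
# Hypothetical example: /item/books?id=42 -> /view/books/42
# %1 is captured by the RewriteCond (query string),
# $1 is captured by the RewriteRule pattern (URL-path).
RewriteCond %{QUERY_STRING} ^id=(\d+)$
RewriteRule ^item/([a-z]+)$ /view/$1/%1? [R=302,L]
```

The trailing ? on the substitution discards the original query string from the redirect target.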
(NB: You should always test first with 302 - temporary - redirects to avoid potential caching issues and make sure any intermediary caches are cleared before testing.)
I'm assuming s is the only URL parameter. With the above rule, any other URL params that follow will be discarded.
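If other parameters can follow s and need to survive the redirect, one option is to capture the remainder of the query string and append it. This is only a sketch, under the assumption that s is always the first parameter; it still replaces a single hyphen per redirect, and it uses a 302 for testing.

```apache
# Sketch: same as the first attempt, but keeps any parameters after "s".
# Assumes "s" is the first URL parameter.
# %3 holds "&..." (the remaining params) or is empty if there are none.
RewriteCond %{QUERY_STRING} ^s=([^&]*)-([^&]*)(&.*)?$
RewriteRule ^$ /?s=%1+%2%3 [NE,R=302,L]
```

For example, /?s=a-b&page=2 should redirect to /?s=a+b&page=2.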
HOWEVER, the above is very inefficient since it triggers an external redirect for every instance of -. So, your example of /?s=i-need-to-rewrite-this-url would trigger 5 redirects.
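To see where the 5 comes from, here is the chain of redirects traced by hand for that example URL, assuming the first-attempt rule behaves as described. Because the first capturing group is greedy, each pass replaces the rightmost remaining hyphen.

```apache
# Hand-traced redirect chain for the first-attempt rule (one redirect per "-"):
#
#   /?s=i-need-to-rewrite-this-url
#   -> /?s=i-need-to-rewrite-this+url    (redirect 1)
#   -> /?s=i-need-to-rewrite+this+url    (redirect 2)
#   -> /?s=i-need-to+rewrite+this+url    (redirect 3)
#   -> /?s=i-need+to+rewrite+this+url    (redirect 4)
#   -> /?s=i+need+to+rewrite+this+url    (redirect 5)
RewriteCond %{QUERY_STRING} ^s=([^&]*)-([^&]*)
RewriteRule ^$ /?s=%1+%2 [NE,R=301,L]
```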
Solution
Instead of the above you should recursively replace all-but-one - internally and only trigger the external redirect (and replace the last remaining -) once all the others have been replaced. For this we need an additional rewrite to perform the internal replacements. For example:
# Internally replace all but the last "-" with "+" in the URL param
RewriteCond %{QUERY_STRING} ^s=([^&]*-[^&]*){2}
RewriteCond %{QUERY_STRING} ^s=([^&]*)-([^&]*)
RewriteRule ^$ ?s=%1+%2 [N=20]
# Replace the last "-" and redirect
RewriteCond %{QUERY_STRING} ^s=([^&-]*)-([^&-]*)
RewriteRule ^$ /?s=%1+%2 [NE,R=301,L]
The first condition, which checks against the regex ^s=([^&]*-[^&]*){2}, simply determines that there are at least 2 hyphens in the value of the s URL parameter. (Otherwise the rule is skipped and processing goes straight to the second/redirect rule.) The second condition then captures the relevant parts of the query string around the last - (the first capturing group is greedy), which are then used in the following RewriteRule.
The N flag causes the rewrite engine to immediately start over. The 20 sets a limit on the number of iterations (Apache 2.4.8+) - so I'm assuming no more than 20+1 hyphens (the last hyphen is replaced in the second/redirect rule).
Since we know there is only going to be at most 1 hyphen remaining in the URL parameter when the second rule is processed, I added the hyphen to the regex character class as an optimisation, to avoid unnecessary backtracking.
Future
There is a replace function coming to Apache expressions that (as the name suggests) allows you to search/replace chars in a string, but I don't think this has shipped yet in the latest public builds. It would potentially allow you to do something like the following in a single rule:
RewriteCond %{QUERY_STRING} ^s=[^&]*-
RewriteCond expr "replace(%{QUERY_STRING},'-','+') =~ /^s=([^&]+)/"
RewriteRule ^$ /?s=%1 [NE,R=301,L]