Score:1

htaccess redirects with special characters on a specific URL not working

I need some help creating a redirect. I'm trying to replace all - to +, but both are special characters, and I only want to do it when the URL contains ?s=.

This is an example

example.com/?s=i-need-to-rewrite-this-url
example.com/?s=i+need+to+rewrite+this+url

This is what I have. I already tried using () and "" but that didn't work:

RewriteEngine on
RewriteRule ^?s=(.*)\-(.*)$ ?s=$1\+$2 [L,R=301]

I appreciate any help. Thanks.

Score:1

I'm trying to replace all - to +, but both are special characters

There's nothing particularly "special" about these characters. At least, not in the context they are going to be used.

The - (hyphen) is a range specifier when used inside a regex character class, otherwise it's just a hyphen and safe to use unencoded in the query string part of the URL (we do not need to use it in a regex character class).

The + (plus) is a regex quantifier, but we don't need to use it in a regex, only in the substitution string (2nd argument to the RewriteRule directive, which is not a regex). The + is an encoded space when used in the query string part of the URL, which I guess you are aware of and is why you need to redirect the request (although since this is part of a URL parameter value, it's also curious why this can't be resolved in the application).

One thing of note, however, is that since + is a URL encoded space we do need to use the NE (noescape) flag on the RewriteRule directive to prevent mod_rewrite from URL encoding the + as %2B (a literal +) in the redirect response. (The same goes for any other %-encoded characters we might capture from the QUERY_STRING - this server variable is not %-decoded.)
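
As a rough illustration of the + versus %2B distinction, here is a small Python sketch. It uses Python's standard urllib.parse rather than anything in Apache, so it only shows how a typical application decodes the two forms, not what mod_rewrite does internally. The sample query strings are made up for the example.

```python
from urllib.parse import parse_qs

# In a query string, "+" decodes to a space, while "%2B" decodes to a literal plus.
plus_form = parse_qs("s=i+need+this")
encoded_form = parse_qs("s=i%2Bneed%2Bthis")

print(plus_form["s"][0])     # i need this
print(encoded_form["s"][0])  # i+need+this
```

So if the + were sent as %2B in the redirect, the application would most likely see literal plus signs instead of spaces.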

RewriteRule ^?s=(.*)\-(.*)$ ?s=$1\+$2 [L,R=301]

The RewriteRule pattern (first argument) matches against the URL-path only, not the query string. So, a RewriteRule directive alone will never match the query string. Incidentally, ? is a special regex character (0 or 1 quantifier of the preceding token) and ^ (the preceding token) is not quantifiable, so I would be surprised if this even compiled (a regex that fails to compile results in a 500 Internal Server Error). A literal ? in a regex needs to be backslash-escaped.

As mentioned above, the NE flag would be required here. And the + does not need to be backslash-escaped in the substitution string, since it carries no special meaning here (it's not a regex). Also, due to the relative substitution string (i.e. ?s=$1\+$2), unless you have a RewriteBase defined, this would have resulted in a malformed redirect.

To match against the query string you need an additional condition (RewriteCond directive) and match against the QUERY_STRING server variable. So, as a first attempt, you could do something like this:

# First attempt (inefficient when multiple "-" are present)
RewriteCond %{QUERY_STRING} ^s=([^&]*)-([^&]*)
RewriteRule ^$ /?s=%1+%2 [NE,R=301,L]

Note that the %1 and %2 backreferences (as opposed to $1, etc.) contain the values captured from the preceding CondPattern (RewriteCond directive), rather than the RewriteRule pattern (which in this case just matches the empty string).
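
To make the split between the two directives concrete, here is a minimal Python sketch of which part of the request each one tests. It is only a model, assuming the rules live in the .htaccess file in the document root (where the leading slash is stripped from the URL-path before matching). The helper name rewrite_inputs is hypothetical.

```python
from urllib.parse import urlsplit

def rewrite_inputs(url):
    """Return (string tested by the RewriteRule pattern, QUERY_STRING value).

    Assumes a .htaccess file in the document root, where the URL-path is
    matched without its leading slash.
    """
    parts = urlsplit(url)
    rule_input = parts.path.lstrip("/")
    return rule_input, parts.query

rule_input, query_string = rewrite_inputs(
    "http://example.com/?s=i-need-to-rewrite-this-url"
)
print(repr(rule_input))  # ''
print(query_string)      # s=i-need-to-rewrite-this-url
```

The empty first value is why the RewriteRule pattern is ^$, and the second value is what the RewriteCond regex has to match.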

(NB: You should always test first with 302 - temporary - redirects to avoid potential caching issues and make sure any intermediary caches are cleared before testing.)

I'm assuming s is the only URL parameter. With the above rule, any other URL params that follow will be discarded.

HOWEVER, the above is very inefficient since it triggers an external redirect for every instance of -. So, your example of /?s=i-need-to-rewrite-this-url would trigger 5 redirects.
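
The redirect count can be checked with a short Python simulation. This is only a sketch: it assumes Python's re module behaves like Apache's PCRE for this simple pattern, and it models each external redirect as one loop iteration. The function name count_redirects is hypothetical.

```python
import re

# Same regex as the CondPattern in the first attempt.
FIRST_ATTEMPT = re.compile(r"^s=([^&]*)-([^&]*)")

def count_redirects(query_string):
    """Follow simulated redirects until the condition no longer matches."""
    redirects = 0
    while True:
        m = FIRST_ATTEMPT.match(query_string)
        if m is None:
            return query_string, redirects
        # The greedy first group runs up to the LAST hyphen, so each
        # redirect replaces exactly one hyphen.
        query_string = f"s={m.group(1)}+{m.group(2)}"
        redirects += 1

final_qs, hops = count_redirects("s=i-need-to-rewrite-this-url")
print(final_qs)  # s=i+need+to+rewrite+this+url
print(hops)      # 5
```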

Solution

Instead of the above you should recursively replace all-but-one - internally and only trigger the external redirect (and replace the last -) once all the - have been replaced. For this we need an additional rewrite to perform the internal replacements. For example:

# Internally replace all but the last "-" with "+" in the URL param
RewriteCond %{QUERY_STRING} ^s=([^&]*-[^&]*){2}
RewriteCond %{QUERY_STRING} ^s=([^&]*)-([^&]*)
RewriteRule ^$ ?s=%1+%2 [N=20]

# Replace the last "-" and redirect
RewriteCond %{QUERY_STRING} ^s=([^&-]*)-([^&-]*)
RewriteRule ^$ /?s=%1+%2 [NE,R=301,L]

The first condition that checks against the regex ^s=([^&]*-[^&]*){2} simply determines that there are at least 2 hyphens in the value of the s URL parameter. (Otherwise the rule is skipped and goes straight to the second/redirect rule.) The second condition then captures the relevant parts of the query string around the last -, which are then used in the following RewriteRule.

The N flag causes the rewrite engine to immediately start over. The 20 sets a limit on the number of iterations (Apache 2.4.8+), so I'm assuming no more than 20+1 hyphens (the last hyphen is replaced in the second/redirect rule).

Since we know there is only going to be at most 1 hyphen remaining in the URL parameter when the second rule is processed, I added the hyphen to the regex character class as an optimisation, to avoid unnecessary backtracking.
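
The interaction of the two rules can be sketched in Python as follows. This is a simplified model, not how mod_rewrite is implemented: it assumes Python's re matches these patterns the same way PCRE does, it treats each N restart as one loop iteration, and it does not model the N=20 cap. The function name simulate is hypothetical.

```python
import re

AT_LEAST_TWO = re.compile(r"^s=([^&]*-[^&]*){2}")  # first condition of rule 1
LAST_HYPHEN = re.compile(r"^s=([^&]*)-([^&]*)")    # second condition of rule 1
ONLY_HYPHEN = re.compile(r"^s=([^&-]*)-([^&-]*)")  # condition of rule 2

def simulate(query_string):
    """Return (redirect target or None, number of internal passes)."""
    passes = 0
    # Rule 1: internal rewrites while two or more hyphens remain.
    while AT_LEAST_TWO.match(query_string):
        m = LAST_HYPHEN.match(query_string)
        query_string = f"s={m.group(1)}+{m.group(2)}"
        passes += 1
    # Rule 2: replace the one remaining hyphen and redirect once.
    m = ONLY_HYPHEN.match(query_string)
    if m is None:
        return None, passes
    return f"/?s={m.group(1)}+{m.group(2)}", passes

target, passes = simulate("s=i-need-to-rewrite-this-url")
print(target)  # /?s=i+need+to+rewrite+this+url
print(passes)  # 4
```

Under these assumptions the example URL takes four internal passes followed by a single external redirect, instead of five redirects.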


Future

There is a replace function coming to Apache expressions, that (as the name suggests) allows you to search/replace chars in a string, but I don't think this has shipped yet in the latest public builds. But this would potentially allow you to do something like the following in a single rule:

RewriteCond %{QUERY_STRING} ^s=[^&]*-
RewriteCond expr "replace(%{QUERY_STRING},'-','+') =~ /^s=([^&]+)/"
RewriteRule ^$ /?s=%1 [NE,R=301,L]