Score:2

Why does the redirection in .htaccess not work?

sa flag

I have a WordPress website. I want to redirect .php URLs to the ones without the .php suffix. The .htaccess is as follows:

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On


RewriteRule ^(.*)\.php$ "$1" [R=301,L,NC]


RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

# END WordPress

But when I visit https://www.example.com/somepage.php, the page does not display. The following error is shown in the browser:

The page isn’t redirecting properly

    An error occurred during a connection to www.example.com.
    
        This problem can sometimes be caused by disabling or refusing to accept cookies.

And the URL in the address bar becomes https://www.example.com/index.

If I change the rewrite rule to:

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On


RewriteRule ^(.*)\.html$ "$1" [R=301,L,NC]


RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

# END WordPress

and visit https://www.example.com/somepage.html, I am redirected successfully to https://www.example.com/somepage and the webpage is displayed normally. Why?

Score:2
kz flag

Because the redirect is unconditional, you end up redirecting again after the URL has been rewritten to index.php (the WordPress front-controller).

When you request /somepage.php:

  1. You are redirected to /somepage (by the first rule). The redirect response is sent back to the client.
  2. On the second request, /somepage is internally rewritten to /index.php by the last rule. The rewriting engine then starts over (in a directory context)...
  3. /index.php is redirected to /index (by the first rule). The redirect response is sent back to the client.
  4. On the third request /index is internally rewritten to /index.php by the last rewrite. The rewriting engine then starts over...
  5. Goto 3 (stuck in an endless redirect-loop).

In a directory context (like .htaccess) the rewriting engine does not simply make a single pass through the script. It loops until the URL passes through unchanged. (Unless you use the END flag on Apache 2.4, or an external 3xx redirect occurs.)

Changing the rule to remove .html works because the request is internally rewritten to /index.php, which does not end in .html, so the redirect directive (that removes .html) does not match on the second pass.

To resolve this you need to avoid redirecting the rewritten request. You can do this by either:

  • using the END flag (Apache 2.4+) on the last rewrite, instead of L to prevent any further loops of the rewrite engine. Although you should avoid changing the stock WordPress directives (see below), so this may not be the preferred option. This also does not work on Apache 2.2.

  • Or, check for the .php extension against THE_REQUEST server variable (which contains the initial line of the HTTP request headers and does not change when the request is rewritten). For example:

    # Remove ".php" extension on "direct" (not rewritten) requests only
    RewriteCond %{THE_REQUEST} [A-Z]{3,7}\s/[^?]+\.php(?:\?|\s|$) [NC]
    RewriteRule (.+)\.php$ /$1 [R=301,L,NC]
    
  • Or, check the REDIRECT_STATUS environment variable, which is empty on the initial request and set to 200 (as in 200 OK HTTP status) on the first successful rewrite (this is simpler than the rather more complex regex above). For example:

    # Remove ".php" extension on "direct" (not rewritten) requests only
    RewriteCond %{ENV:REDIRECT_STATUS} ^$
    RewriteRule (.+)\.php$ /$1 [R=301,L,NC]
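
For comparison, the first option (the END flag) could look roughly like the following. This is a sketch only: it assumes Apache 2.4+ and edits the stock WordPress directives, which WordPress may later overwrite.

    # Stock WordPress rules, with END in place of L on the final rule (Apache 2.4+ only)
    RewriteBase /
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    # END stops all further rewrite processing, so /index.php is not passed back to the redirect rule
    RewriteRule . /index.php [END]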
    

However, you should not edit the code inside the # BEGIN WordPress section, since WordPress itself tries to maintain this and may overwrite this code later. This rule needs to go before the # BEGIN WordPress comment marker. You do not need to repeat the RewriteEngine On directive that appears later in the file (in the WordPress section).
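
As a sketch of the resulting layout, assuming the REDIRECT_STATUS approach and an otherwise stock WordPress block, the complete .htaccess might look like this:

    # Remove ".php" extension on "direct" (not rewritten) requests only
    RewriteCond %{ENV:REDIRECT_STATUS} ^$
    RewriteRule (.+)\.php$ /$1 [R=301,L,NC]

    # BEGIN WordPress
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>
    # END WordPress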

You will need to clear your browser cache before testing, since the erroneous (permanent) redirect will likely have been cached by the browser. Test first with 302 (temporary) redirects to avoid caching issues.
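
While testing, the same rule can be written with a temporary redirect, which browsers generally do not cache unless told to. A sketch, to be switched back to R=301 once the behaviour is confirmed:

    # Testing variant: temporary (302) redirect instead of a permanent (301) one
    RewriteCond %{ENV:REDIRECT_STATUS} ^$
    RewriteRule (.+)\.php$ /$1 [R=302,L,NC]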

However, this alone does not allow you to access .php files without the .php extension, since the extensionless URL would still need to be internally rewritten back to the .php file.
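
A minimal sketch of such an internal rewrite follows. It is an assumption, not part of the original rules: it presumes the .htaccess file is in the document root and that physical .php files exist there. It would go after the redirect and before the # BEGIN WordPress marker.

    # Internally rewrite an extensionless URL to the matching .php file, but only if that file exists
    RewriteCond %{DOCUMENT_ROOT}/$1.php -f
    RewriteRule (.+) /$1.php [L]

On the next pass the redirect rule is skipped because REDIRECT_STATUS is now set, and the WordPress rules leave the request alone because it maps to an existing file.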

peter avatar
sa flag
During the first request "/somepage.php", after processing "RewriteRule ^(.*)\.php$ "$1" [R=301,L,NC]", does the rewriting engine start over because the rewritten url "/somepage" is not the same as the original one?
kz flag
@peter During the first request the rewrite engine terminates immediately at the redirect; it does not start over. During the _second request_ the rewrite engine "starts over" because the rewritten URL `/index.php` is not the same as the original (`/somepage`). So yes, the rewrite engine "starts over" when the URL is changed during the rewriting process, unless a redirect ends processing first.
peter avatar
sa flag
So how can I tell whether the rewrite engine will start over or not? Both requests rewrite the URL to a different one, so why does the second request cause the start-over when the first does not?
kz flag
@peter If a 3xx (external redirect) response code is set, then the rewrite engine terminates at the end of the current pass and the redirect occurs; there are no additional passes through the rewrite engine in this case. In the rules above, the rewriting process is always terminated by the redirect. If there were no "redirect" (e.g. if you removed the `R` flag from the first rule), the result would be an internal rewrite loop (500 Internal Server Error response to the client).
kz flag
Just to clarify, by "redirect occurs", I mean the 3xx redirect response is sent back to the client (complete with `Location` HTTP response header with the intended redirect target).
kz flag
@peter I've updated my answer to hopefully clarify some points (regarding the redirect) and added an example that uses `THE_REQUEST` to prevent the redirect-loop. Although I still think the example that I initially posted (using the `REDIRECT_STATUS` env var) is the preferred/cleanest solution in this WordPress site.
peter avatar
sa flag
Your answer is very detailed, clear, and helpful. Excellent! Thanks a lot!