If there is no matching #3 DirectoryIndex then continue to #4 CMS
You can't fail "gracefully" with mod_dir's DirectoryIndex and then do something else with the request using mod_rewrite (i.e. route the request to #4, the CMS), because mod_dir is processed too late. So, instead of using DirectoryIndex, we would need to simulate it with mod_rewrite.
However, another (minor) issue here is that the WordPress code block (that, as the comment states, should not be edited manually) needs to be edited to allow requests for filesystem directories to be passed to the CMS.
I'm assuming that any direct requests for a directory should include a trailing slash. For example, if /hello is a physical directory then you should be requesting /hello/ (with a trailing slash). We will append the trailing slash if omitted (which is what mod_dir does by default anyway, but we need to do this manually when overriding DirectoryIndex). We could disable the trailing slash (and make the canonical URL the one without a trailing slash), but this requires additional rewriting.
So, to satisfy your requirements, you could do it like this in the root .htaccess file:
Options -Indexes
# Required for the root directory (eg. the homepage of the CMS)
DirectoryIndex index.html index.htm index.php
RewriteEngine On
# Initially part of the WordPress/CMS block
# (This is just an optimisation)
RewriteRule ^index\.php$ - [L]
# Abort early if a file is requested directly
# (Regardless of whether that file includes a file extension.)
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule . - [L]
# If a directory is requested, which is missing the trailing slash then append it
RewriteCond %{DOCUMENT_ROOT}/$1 -d
RewriteRule ^(.*[^/])$ /$1/ [R=301,L]
# Test if "<url>.html" exists and rewrite if so
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^([^.]*[^/])$ $1.html [L]
# Optimisation: If a directory is not requested then skip the next 3 rules
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . - [S=3]
# Check for "DirectoryIndex" documents in order: index.html, index.htm and index.php
# NB: Directories end in a trailing slash (enforced above)
RewriteCond %{DOCUMENT_ROOT}/$1/index.html -f
RewriteRule ^(.+)/$ $1/index.html [L]
RewriteCond %{DOCUMENT_ROOT}/$1/index.htm -f
RewriteRule ^(.+)/$ $1/index.htm [L]
RewriteCond %{DOCUMENT_ROOT}/$1/index.php -f
RewriteRule ^(.+)/$ $1/index.php [L]
# CMS Fallback...
# But note that the two conditions (filesystem checks) are removed.
# The first one that checks for a "file" is simply not required.
# However, the second check MUST be removed otherwise directories that do not contain a "DirectoryIndex" are not routed to the CMS.
# WordPress...
RewriteRule ^ - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
# CMS / Front-Controller
RewriteRule . /index.php [L]
Additional notes:
As an optimisation, I'm assuming your URLs that map to .html files do not contain dots. This is what you had done, so I assume that's OK. (There is no need to backslash-escape a literal dot when used inside a regex character class.)
I've removed the WordPress comment markers and reduced the WordPress code block to all that's required. One of the RewriteRule directives is moved to the top of the .htaccess file (since this is an optimisation, it doesn't make much sense to have it at the end anymore). You would need to configure WordPress (or your file perms) to prevent WordPress from trying to maintain the .htaccess file (although this could cause issues with plugins).
Passing filesystem directories to the CMS is certainly non-standard. And boilerplate code (front-controller pattern) for most CMSs will explicitly exclude physical directories. However, the added complication here is that you only want directories where the DirectoryIndex document is not present in that directory, to be passed to the CMS.
I like to preserve the possibility for files without a suffix.
The "problem" with not having file extensions on the underlying file is that Apache does not necessarily know how to handle the request and what "Content-Type" header to send (so the browser does not know how to handle the response).
A workaround in this case is to have all extensionless "files" of a specific type in a known subdirectory and force all those requests with the same Content-Type.
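As a rough sketch of that workaround, assuming the extensionless HTML documents all live in a hypothetical /pages subdirectory (the name is only for illustration), a .htaccess file in that directory could force the media type with the core ForceType directive:

```
# /pages/.htaccess
# Assumption: "/pages" is a hypothetical directory that holds only
# extensionless HTML documents.
# Serve every file in this directory (and below) as HTML,
# regardless of its name.
ForceType text/html
```

This applies to every file under that directory, including any that do have an extension, so it only makes sense where all the files really are of the one type. It also relies on the server config allowing this override (AllowOverride FileInfo).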
Note that files and URLs are very different in this respect. URLs without extensions are not a problem.
Aside:
RewriteRule ^([^\.]+)$ $1.html [NC,L]
The problem with this rule is that you are unconditionally applying the .html extension to any URL that does not contain a dot. /a01 is rewritten to /a01.html, which is not a file (so the condition is successful), and /a01 (the URL that WP sees) is not a registered WP URL, so it results in a 404 generated by the CMS/WordPress.