Module mod_rewrite (Version 3.0)

This module is contained in the mod_rewrite.c file, included with Apache 1.2 and later. It provides a rule-based rewriting engine to rewrite requested URLs on the fly. mod_rewrite is not compiled into the server by default. To use mod_rewrite you have to enable the following line in the server build Configuration file:
    Module  rewrite_module   mod_rewrite.o

Summary

This module uses a rule-based rewriting engine (based on a regular-expression parser) to rewrite requested URLs on the fly.

It supports an unlimited number of additional rule conditions (which can operate on many variables, including HTTP headers) for granular matching, and external database lookups (via plain text tables, DBM hash files or external processes) for advanced URL substitution.

It operates on full URLs (including the PATH_INFO part) both in per-server context (httpd.conf) and per-directory context (.htaccess), and can even generate QUERY_STRING parts in the result. The rewritten result can lead to internal sub-processing, external request redirection or internal proxy throughput.

The latest version can be found on
http://www.engelschall.com/sw/mod_rewrite/

Copyright © 1996,1997 The Apache Group, All rights reserved.
Copyright © 1996,1997 Ralf S. Engelschall, All rights reserved.

Written for The Apache Group by

Ralf S. Engelschall
rse@engelschall.com
www.engelschall.com

Directives


Configuration Directives

RewriteEngine

Syntax: RewriteEngine {on,off}
Default: RewriteEngine off
Context: server config, virtual host, per-directory config

The RewriteEngine directive enables or disables the runtime rewriting engine. If it is set to off this module does no runtime processing at all. It does not even update the SCRIPT_URx environment variables.

Use this directive to disable the module instead of commenting out all RewriteRule directives!


RewriteOptions

Syntax: RewriteOptions Option ...
Default: -None-
Context: server config, virtual host, per-directory config

The RewriteOptions directive sets some special options for the current per-server or per-directory configuration. The Option strings can be one of the following:


RewriteLog

Syntax: RewriteLog Filename
Default: -None-
Context: server config, virtual host

The RewriteLog directive sets the name of the file to which the server logs any rewriting actions it performs. If the name does not begin with a slash ('/') then it is assumed to be relative to the Server Root. The directive should occur only once per server config.

Setting Filename to /dev/null is not a recommended way to disable the logging of rewriting actions: although nothing ends up in a logfile, the rewriting engine still creates the logfile output internally. This will slow down the server with no advantage to the administrator! To disable logging, either remove or comment out the RewriteLog directive or use RewriteLogLevel 0!

SECURITY: See the Apache Security Tips document for details on why your security could be compromised if the directory where logfiles are stored is writable by anyone other than the user that starts the server.

Example:

RewriteLog "/usr/local/var/apache/logs/rewrite.log"


RewriteLogLevel

Syntax: RewriteLogLevel Level
Default: RewriteLogLevel 0
Context: server config, virtual host

The RewriteLogLevel directive sets the verbosity level of the rewriting logfile. The default level 0 means no logging, while 9 or more means that practically all actions are logged.

To disable the logging of rewriting actions simply set Level to 0. This disables all rewrite action logs.

Notice: Using a high value for Level will slow down your Apache server dramatically! Use the rewriting logfile only for debugging, or at least keep Level at 2 or below!

Example:

RewriteLogLevel 3


RewriteMap

Syntax: RewriteMap Mapname {txt,dbm,prg}:Filename
Default: not used by default
Context: server config, virtual host

The RewriteMap directive defines an external Rewriting Map which can be used inside rule substitution strings by the mapping-functions to insert/substitute fields through a key lookup.

The Mapname is the name of the map and will be used to specify a mapping-function for the substitution strings of a rewriting rule via

${ Mapname : LookupKey | DefaultValue }
When such a construct occurs, the map Mapname is consulted and the key LookupKey is looked up. If the key is found, the map-function construct is replaced by SubstValue. If the key is not found, it is replaced by DefaultValue.

The Filename must be a valid Unix filepath naming a file in one of the following formats:

  1. Plain Text Format

    This is an ASCII file which contains either blank lines, comment lines (starting with a '#' character) or

    MatchingKey SubstValue
    pairs, one per line. You can create such files either manually, using your favorite editor, or by using the programs mapcollect and mapmerge from the support directory of the mod_rewrite distribution.

    To declare such a map, prefix Filename with a txt: string, as in the following example:

    #
    #   map.real-to-user -- maps realnames to usernames
    #
    
    Ralf.S.Engelschall    rse   # Bastard Operator From Hell 
    Dr.Fred.Klabuster     fred  # Mr. DAU
    

    RewriteMap real-to-user txt:/path/to/file/map.real-to-user
    

  2. DBM Hashfile Format

    This is a binary NDBM format file containing the same contents as the Plain Text Format files. You can create such a file with any NDBM tool or with the dbmmanage program from the support directory of the Apache distribution.

    To declare such a map, prefix Filename with a dbm: string.

  3. Program Format

    This is a Unix executable, not a lookup file. To create it you can use the language of your choice, but the result has to be a runnable Unix executable (i.e. either object code or a script with the magic cookie trick '#!/path/to/interpreter' as the first line).

    This program is started once, at startup of the Apache server, and then communicates with the rewriting engine over its stdin and stdout file handles. For each map-function lookup it receives the key to look up as a newline-terminated string on stdin. It then has to return the looked-up value as a newline-terminated string on stdout, or the four-character string ``NULL'' if it fails (i.e. there is no corresponding value for the given key). A trivial program which implements a 1:1 map (i.e. key == value) could be:

    #!/usr/bin/perl
    $| = 1;
    while (<STDIN>) {
        # ...here any transformations 
        # or lookups should occur...
        print $_;
    }
    

    But be very careful:

    1. ``Keep the program simple, stupid'' (KISS), because if this program hangs it will hang the Apache server when the rule is applied.
    2. Avoid one common mistake: never do buffered I/O on stdout! This will cause a deadlock! Hence the ``$|=1'' in the above example...

    To declare such a map, prefix Filename with a prg: string.
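The stdin/stdout protocol described for the Program Format can also be sketched in Python. This is only an illustration of the protocol, not code from the mod_rewrite distribution: the names TABLE, serve and main are hypothetical, and the table contents are taken from the plain text example above.

```python
#!/usr/bin/env python3
import sys

# Hypothetical lookup table; a real program might consult a database.
TABLE = {"Ralf.S.Engelschall": "rse", "Dr.Fred.Klabuster": "fred"}


def serve(inp, out, table):
    """Answer one lookup per newline-terminated key until input ends."""
    for line in inp:
        key = line.rstrip("\n")
        # Reply with the value, or the string NULL if the key is unknown.
        out.write(table.get(key, "NULL") + "\n")
        # Flush after every reply so the server never waits on a buffer.
        out.flush()


def main():
    # Entry point when the script is installed as a prg: map.
    serve(sys.stdin, sys.stdout, TABLE)
```

To use the sketch as a map program, call main() at the bottom of the script. The explicit flush after each reply plays the same role as ``$|=1'' in the Perl example.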

The RewriteMap directive can occur more than once. For each mapping-function use one RewriteMap directive to declare its rewriting mapfile. While you cannot declare a map in per-directory context it is of course possible to use this map in per-directory context.

For plain text and DBM format files the looked-up keys are cached in-core until the mtime of the mapfile changes or the server does a restart. This way you can have map-functions in rules which are used for every request. This is no problem, because the external lookup only happens once!
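To make the plain text format and the key lookup concrete, here is a minimal Python model. It is a sketch, not the module's parser: parse_text_map and lookup are hypothetical names, and two assumptions are made. First, everything after the second field on a line is ignored, as the trailing remarks in the example mapfile suggest. Second, a missing key with no default yields an empty string.

```python
def parse_text_map(text):
    """Parse 'MatchingKey SubstValue' lines; skip blanks and comments."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank line or comment line
        if len(fields) >= 2:
            # Assumption: anything after the second field is ignored.
            table[fields[0]] = fields[1]
    return table


def lookup(maps, name, key, default=""):
    """Model ${name:key|default}: the value if found, else the default."""
    return maps[name].get(key, default)
```

Parsing the file once and keeping the resulting dictionary in memory mirrors the in-core caching described above: later lookups never touch the file again.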


RewriteBase

Syntax: RewriteBase BaseURL
Default: default is the physical directory path
Context: per-directory config

The RewriteBase directive explicitly sets the base URL for per-directory rewrites. As you will see below, RewriteRule can be used in per-directory config files (.htaccess). There it acts locally, i.e. the local directory prefix is stripped at this stage of processing and your rewriting rules act only on the remainder. At the end the prefix is automatically added back.

When a substitution occurs for a new URL, this module has to re-inject the URL into the server processing. To be able to do this it needs to know what the corresponding URL-prefix or URL-base is. By default this prefix is the corresponding filepath itself. But at most websites URLs are NOT directly related to physical filename paths, so this assumption will usually be wrong! In that case you have to use the RewriteBase directive to specify the correct URL-prefix.

So, if your webserver's URLs are not directly related to physical file paths, you have to use RewriteBase in every .htaccess file where you want to use RewriteRule directives.

Example:

Assume the following per-directory config file:

#
#  /abc/def/.htaccess -- per-dir config file for directory /abc/def
#  Remember: /abc/def is the physical path of /xyz, i.e. the server
#            has, for example, an 'Alias /xyz /abc/def' directive
#

RewriteEngine On

#  let the server know that we are reached via /xyz and not 
#  via the physical path prefix /abc/def
RewriteBase   /xyz

#  now the rewriting rules
RewriteRule   ^oldstuff\.html$  newstuff.html

In the above example, a request to /xyz/oldstuff.html gets correctly rewritten to the physical file /abc/def/newstuff.html.

For the Apache hackers:
The following list gives detailed information about the internal processing steps:

Request:
  /xyz/oldstuff.html

Internal Processing:
  /xyz/oldstuff.html     -> /abc/def/oldstuff.html    (per-server Alias)
  /abc/def/oldstuff.html -> /abc/def/newstuff.html    (per-dir    RewriteRule)
  /abc/def/newstuff.html -> /xyz/newstuff.html        (per-dir    RewriteBase)
  /xyz/newstuff.html     -> /abc/def/newstuff.html    (per-server Alias)

Result:
  /abc/def/newstuff.html
This seems very complicated but is the correct Apache internal processing, because the per-directory rewriting comes too late in the process. So, when it occurs, the (rewritten) request has to be re-injected into the Apache kernel. While this may look like serious overhead, it really isn't: the re-injection happens entirely inside the Apache server, and the same procedure is used by many other operations inside Apache. So you can be sure the design and implementation are correct.
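The four internal processing steps listed above can be traced with a small Python model. This is a sketch under stated assumptions, not Apache's implementation: apply_alias and per_dir_steps are hypothetical helpers, and Python's re module stands in for the server's regex library.

```python
import re


def apply_alias(url, url_prefix, dir_path):
    """Map a URL under url_prefix to a path under dir_path (like Alias)."""
    if url == url_prefix or url.startswith(url_prefix + "/"):
        return dir_path + url[len(url_prefix):]
    return url


def per_dir_steps(url, url_prefix, dir_path, pattern, substitution, base):
    """Return the intermediate values of the four processing steps."""
    steps = []
    # 1. The per-server Alias maps the URL to a physical path.
    filename = apply_alias(url, url_prefix, dir_path)
    steps.append(filename)
    # 2. The directory prefix is stripped; the rule sees only the rest.
    local = filename[len(dir_path) + 1:]
    local = re.sub(pattern, substitution, local, count=1)
    steps.append(dir_path + "/" + local)
    # 3. RewriteBase supplies the URL prefix for re-injection.
    reinjected = base + "/" + local
    steps.append(reinjected)
    # 4. The re-injected URL passes through the Alias once more.
    steps.append(apply_alias(reinjected, url_prefix, dir_path))
    return steps
```

Calling per_dir_steps("/xyz/oldstuff.html", "/xyz", "/abc/def", r"^oldstuff\.html$", "newstuff.html", "/xyz") reproduces the right-hand column of the listing, ending in /abc/def/newstuff.html.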


RewriteCond

Syntax: RewriteCond TestString CondPattern
Default: -None-
Context: server config, virtual host, per-directory config

The RewriteCond directive defines a rule condition. Precede a RewriteRule directive with one or more RewriteCond directives. The following rewriting rule is only used if its pattern matches the current state of the URI AND if these additional conditions apply, too.

TestString is a string which contains server-variables of the form

%{ NAME_OF_VARIABLE }
where NAME_OF_VARIABLE is one of the names in the following list:

HTTP headers:

HTTP_USER_AGENT
HTTP_REFERER
HTTP_COOKIE
HTTP_FORWARDED
HTTP_HOST
HTTP_PROXY_CONNECTION
HTTP_ACCEPT

connection & request:

REMOTE_ADDR
REMOTE_HOST
REMOTE_USER
REMOTE_IDENT
REQUEST_METHOD
SCRIPT_FILENAME
PATH_INFO
QUERY_STRING
AUTH_TYPE

server internals:

DOCUMENT_ROOT
SERVER_ADMIN
SERVER_NAME
SERVER_PORT
SERVER_PROTOCOL
SERVER_SOFTWARE
SERVER_VERSION

system stuff:

TIME_YEAR
TIME_MON
TIME_DAY
TIME_HOUR
TIME_MIN
TIME_SEC
TIME_WDAY

specials:

API_VERSION
THE_REQUEST
REQUEST_URI
REQUEST_FILENAME
IS_SUBREQ

These variables all correspond to the similarly named HTTP MIME-headers, C variables of the Apache server or struct tm fields of the Unix system.

Special Notes:

  1. The variables SCRIPT_FILENAME and REQUEST_FILENAME contain the same value, i.e. the value of the filename field of the internal request_rec structure of the Apache server. The first name is just the commonly known CGI variable name while the second is the consistent counterpart to REQUEST_URI (which contains the value of the uri field of request_rec).

  2. There is the special format: %{ENV:variable} where variable can be any environment variable. This is looked-up via internal Apache structures and (if not found there) via getenv() from the Apache server process.

  3. There is the special format: %{HTTP:header} where header can be any HTTP MIME-header name. This is looked-up from the HTTP request. Example: %{HTTP:Proxy-Connection} is the value of the HTTP header ``Proxy-Connection:''.

  4. There is the special format: %{LA-U:url} for look-aheads like -U. This performs an internal sub-request to look ahead for the final value of url.

  5. There is the special format: %{LA-F:file} for look-aheads like -F. This performs an internal sub-request to look ahead for the final value of file.

CondPattern is the condition pattern: a regular expression which is applied to the current instance of the TestString. That is, TestString is first evaluated and then matched against CondPattern.

Remember: CondPattern is a standard Extended Regular Expression with some additions:

  1. You can precede the pattern string with a '!' character (exclamation mark) to specify a non-matching pattern.

  2. There are some special variants of CondPatterns. Instead of real regular expression strings you can also use one of the following:

    Notice: All of these tests can also be prefixed by a not ('!') character to negate their meaning.

Additionally you can set special flags for CondPattern by appending

[flags]
as the third argument to the RewriteCond directive. Flags is a comma-separated list of the following flags:

Example:

To rewrite the Homepage of a site according to the ``User-Agent:'' header of the request, you can use the following:
RewriteCond  %{HTTP_USER_AGENT}  ^Mozilla.*
RewriteRule  ^/$                 /homepage.max.html  [L]

RewriteCond  %{HTTP_USER_AGENT}  ^Lynx.*
RewriteRule  ^/$                 /homepage.min.html  [L]

RewriteRule  ^/$                 /homepage.std.html  [L]
Interpretation: If you use Netscape Navigator as your browser (which identifies itself as 'Mozilla'), then you get the max homepage, which includes Frames, etc. If you use the Lynx browser (which is Terminal-based), then you get the min homepage, which contains no images, no tables, etc. If you use any other browser you get the standard homepage.
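The interplay of conditions and rules in this example can be modeled in Python. The sketch below is an approximation, not the module's code: RULES, expand_vars, cond_matches and rewrite are hypothetical names, an unset variable is assumed to expand to an empty string, and Python's re module stands in for the server's regex library.

```python
import re

# Each entry: (list of (TestString, CondPattern), Pattern, Substitution).
RULES = [
    ([("%{HTTP_USER_AGENT}", "^Mozilla.*")], "^/$", "/homepage.max.html"),
    ([("%{HTTP_USER_AGENT}", "^Lynx.*")], "^/$", "/homepage.min.html"),
    ([], "^/$", "/homepage.std.html"),
]


def expand_vars(test_string, variables):
    """Replace every %{NAME} with its value (assumed empty if unset)."""
    return re.sub(r"%\{(\w+)\}",
                  lambda m: variables.get(m.group(1), ""), test_string)


def cond_matches(test_string, cond_pattern, variables):
    """Evaluate one condition, honoring a leading '!' as negation."""
    negate = cond_pattern.startswith("!")
    if negate:
        cond_pattern = cond_pattern[1:]
    value = expand_vars(test_string, variables)
    matched = re.search(cond_pattern, value) is not None
    return matched != negate


def rewrite(url, variables):
    """Apply the first rule whose Pattern and conditions all match."""
    for conds, pattern, substitution in RULES:
        if re.search(pattern, url) and all(
                cond_matches(t, p, variables) for t, p in conds):
            return substitution  # every rule carries [L], so stop here
    return url
```

With {"HTTP_USER_AGENT": "Lynx/2.6"} as the variables, rewrite("/", ...) selects the min homepage; any URL other than / is returned unchanged.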


RewriteRule

Syntax: RewriteRule Pattern Substitution
Default: -None-
Context: server config, virtual host, per-directory config

The RewriteRule directive is the real rewriting workhorse. The directive can occur more than once. Each directive then defines one single rewriting rule. The definition order of these rules is important, because this order is used when applying the rules at run-time.

Pattern is a regular expression (System V8 for Apache 1.1.x, POSIX for Apache 1.2.x) which is applied to the current URL. Here ``current'' means the value of the URL when this rule is applied. This may not be the originally requested URL, because any number of earlier rules may already have matched and altered it.

Some hints about the syntax of regular expressions:

^           Start of line
$           End of line
.           Any single character
[chars]     One of chars 
[^chars]    None of chars 

?           0 or 1 of the preceding char
*           0 or N of the preceding char
+           1 or N of the preceding char

\char       escape that specific char 
            (e.g. for specifying the chars ".[]()" etc.)

(string)    Grouping of chars (the Nth group can be used on the RHS with $N)

Additionally the NOT character ('!') is a possible pattern prefix. This gives you the ability to negate a pattern; to say, for instance: ``if the current URL does NOT match this pattern''. This can be used for special cases where it is better to match the negative pattern or as a last default rule.

Notice! When using the NOT character to negate a pattern you cannot have grouped wildcard parts in the pattern. This is impossible because when the pattern does NOT match, there are no contents for the groups. In consequence, if negated patterns are used, you cannot use $N in the substitution string!
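The syntax hints and the negation rule can be tried out in a few lines of Python. This is only an illustration: Python's re module is not the regex library Apache uses, although the constructs listed above behave the same way in it, and match_pattern is a hypothetical helper.

```python
import re


def match_pattern(pattern, url):
    """Return (matched, groups) for a Pattern with an optional '!' prefix."""
    if pattern.startswith("!"):
        # A negated pattern succeeds exactly when the regex fails,
        # so there is never any group content available for $N.
        return re.search(pattern[1:], url) is None, ()
    m = re.search(pattern, url)
    return (m is not None, m.groups() if m else ())
```

For example, match_pattern(r"^/somepath(.*)", "/somepath/pathinfo") reports a match with the single group "/pathinfo", while the negated form r"!^/somepath(.*)" can only ever report a match with no groups.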

Substitution of a rewriting rule is the string which is substituted for (or replaces) the original URL for which Pattern matched. Besides plain text you can use

  1. pattern-group back-references ($N)
  2. server-variables as in rule condition test-strings (%{VARNAME})
  3. mapping-function calls (${mapname:key|default})
Back-references are $N (N=1..9) identifiers which will be replaced by the contents of the Nth group of the matched Pattern. The server-variables are the same as for the TestString of a RewriteCond directive. The mapping-functions come from the RewriteMap directive and are explained there. These three types of variables are expanded in the order of the above list.

As already mentioned above, all the rewriting rules are applied to the Substitution (in the order of definition in the config file). The URL is completely replaced by the Substitution and the rewriting process goes on until there are no more rules (unless explicitly terminated by an L flag - see below).

There is a special substitution string named '-' which means: NO substitution! Sounds silly? No, it is useful to provide rewriting rules which only match some URLs but do no substitution, e.g. in conjunction with the C (chain) flag to be able to have more than one pattern to be applied before a substitution occurs.

Notice: There is a special feature. When you prefix a substitution field with http://thishost[:thisport] then mod_rewrite automatically strips it out. This auto-reduction on implicit external redirect URLs is a useful and important feature when used in combination with a mapping-function which generates the hostname part. Have a look at the first example in the example section below to understand this.

Remember: An unconditional external redirect to your own server will not work with the prefix http://thishost because of this feature. To achieve such a self-redirect, you have to use the R-flag (see below).

Additionally you can set special flags for Substitution by appending

[flags]
as the third argument to the RewriteRule directive. Flags is a comma-separated list of the following flags:

Remember: Never forget that Pattern gets applied to a complete URL in per-server configuration files. But in per-directory configuration files, the per-directory prefix (which always is the same for a specific directory!) gets automatically removed for the pattern matching and automatically added after the substitution has been done. This feature is essential for many sorts of rewriting, because without this prefix stripping you have to match the parent directory which is not always possible.

There is one exception: If a substitution string starts with ``http://'' then the directory prefix will not be added and an external redirect or proxy throughput (if flag P is used!) is forced!

Notice! To enable the rewriting engine for per-directory configuration files you need to set ``RewriteEngine On'' in these files and have ``Options FollowSymLinks'' enabled. If your administrator has disabled override of FollowSymLinks for a user's directory, then you cannot use the rewriting engine. This restriction is needed for security reasons.

Here are all possible substitution combinations and their meanings:

Inside per-server configuration (httpd.conf)
for request ``GET /somepath/pathinfo'':

Given Rule                                      Resulting Substitution
----------------------------------------------  ----------------------------------
^/somepath(.*) otherpath$1                      not supported, because invalid!

^/somepath(.*) otherpath$1  [R]                 not supported, because invalid!

^/somepath(.*) otherpath$1  [P]                 not supported, because invalid!
----------------------------------------------  ----------------------------------
^/somepath(.*) /otherpath$1                     /otherpath/pathinfo

^/somepath(.*) /otherpath$1 [R]                 http://thishost/otherpath/pathinfo
                                                via external redirection

^/somepath(.*) /otherpath$1 [P]                 not supported, because silly!
----------------------------------------------  ----------------------------------
^/somepath(.*) http://thishost/otherpath$1      /otherpath/pathinfo

^/somepath(.*) http://thishost/otherpath$1 [R]  http://thishost/otherpath/pathinfo
                                                via external redirection

^/somepath(.*) http://thishost/otherpath$1 [P]  not supported, because silly!
----------------------------------------------  ----------------------------------
^/somepath(.*) http://otherhost/otherpath$1     http://otherhost/otherpath/pathinfo
                                                via external redirection

^/somepath(.*) http://otherhost/otherpath$1 [R] http://otherhost/otherpath/pathinfo
                                                via external redirection
                                                (the [R] flag is redundant)

^/somepath(.*) http://otherhost/otherpath$1 [P] http://otherhost/otherpath/pathinfo
                                                via internal proxy

Inside per-directory configuration for /somepath
(i.e. file .htaccess in dir /physical/path/to/somepath containing RewriteBase /somepath)
for request ``GET /somepath/localpath/pathinfo'':

Given Rule                                      Resulting Substitution
----------------------------------------------  ----------------------------------
^localpath(.*) otherpath$1                      /somepath/otherpath/pathinfo

^localpath(.*) otherpath$1  [R]                 http://thishost/somepath/otherpath/pathinfo
                                                via external redirection

^localpath(.*) otherpath$1  [P]                 not supported, because silly!
----------------------------------------------  ----------------------------------
^localpath(.*) /otherpath$1                     /otherpath/pathinfo

^localpath(.*) /otherpath$1 [R]                 http://thishost/otherpath/pathinfo
                                                via external redirection

^localpath(.*) /otherpath$1 [P]                 not supported, because silly!
----------------------------------------------  ----------------------------------
^localpath(.*) http://thishost/otherpath$1      /otherpath/pathinfo

^localpath(.*) http://thishost/otherpath$1 [R]  http://thishost/otherpath/pathinfo
                                                via external redirection

^localpath(.*) http://thishost/otherpath$1 [P]  not supported, because silly!
----------------------------------------------  ----------------------------------
^localpath(.*) http://otherhost/otherpath$1     http://otherhost/otherpath/pathinfo
                                                via external redirection

^localpath(.*) http://otherhost/otherpath$1 [R] http://otherhost/otherpath/pathinfo
                                                via external redirection
                                                (the [R] flag is redundant)

^localpath(.*) http://otherhost/otherpath$1 [P] http://otherhost/otherpath/pathinfo
                                                via internal proxy
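The two tables can be condensed into one decision function. The Python sketch below is derived only from the rows above and is not the module's logic: resolve is a hypothetical name, it takes the substitution after $1 has already been expanded, and base stands for the RewriteBase value (None models the per-server case).

```python
def resolve(substitution, flag=None, base=None, this_host="thishost"):
    """Return (result, how) for an expanded substitution string."""
    own = "http://" + this_host
    # Auto-reduction: a leading http://thishost is stripped.
    if substitution.startswith(own + "/"):
        substitution = substitution[len(own):]
    # Any remaining http:// URL points to another host.
    if substitution.startswith("http://"):
        how = "internal proxy" if flag == "P" else "external redirection"
        return substitution, how
    if not substitution.startswith("/"):
        if base is None:
            return None, "not supported"  # relative path, per-server
        substitution = base + "/" + substitution  # per-directory prefix
    if flag == "P":
        return None, "not supported"  # proxying to ourselves
    if flag == "R":
        return own + substitution, "external redirection"
    return substitution, "internal"
```

For instance, resolve("otherpath/pathinfo", "R", base="/somepath") gives http://thishost/somepath/otherpath/pathinfo via external redirection, matching the second row of the per-directory table.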

Example:

We want to rewrite URLs of the form
/Language/~Realname/.../File
into
/u/Username/.../File.Language

We take the rewrite mapfile from above and save it under /anywhere/map.real-to-user. Then we only have to add the following lines to the Apache server configuration file:

RewriteLog   /anywhere/rewrite.log
RewriteMap   real-to-user               txt:/anywhere/map.real-to-user
RewriteRule  ^/([^/]+)/~([^/]+)/(.*)$   /u/${real-to-user:$2|nobody}/$3.$1
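To see what this rule does to a concrete URL, the following Python sketch replays it by hand. It is an illustration only: REAL_TO_USER mirrors the example mapfile, the sample URLs are made up, and Python's re module stands in for the server's regex library.

```python
import re

# Contents of the example mapfile, as a dictionary.
REAL_TO_USER = {"Ralf.S.Engelschall": "rse", "Dr.Fred.Klabuster": "fred"}

# Same pattern as in the RewriteRule: $1 language, $2 realname, $3 rest.
PATTERN = re.compile(r"^/([^/]+)/~([^/]+)/(.*)$")


def rewrite(url):
    """Model /u/${real-to-user:$2|nobody}/$3.$1 for a single URL."""
    m = PATTERN.match(url)
    if m is None:
        return url  # the rule does not apply
    language, realname, rest = m.groups()
    # Map lookup with 'nobody' as the default value.
    username = REAL_TO_USER.get(realname, "nobody")
    return "/u/%s/%s.%s" % (username, rest, language)
```

A request for /en/~Ralf.S.Engelschall/docs/index.html becomes /u/rse/docs/index.html.en, and a realname missing from the map falls back to the user nobody.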