A Deeper Look at mod_rewrite for Apache

Sep 14th in Other by Joseph Pecoraro

When people think of .htaccess configuration the first thing that pops into most people's minds is URL manipulation with mod_rewrite. People typically get frustrated with mod_rewrite's complexity. This tutorial will walk you through everything you need to know for the most common mod_rewrite tasks.

Author: Joseph Pecoraro

My name is Joseph Pecoraro. I'm a web developer and designer from western New York. I am presently attending the great Rochester Institute of Technology to earn my MS in Computer Science by the end of 2009.

When people think of .htaccess configuration, the first thing that comes to mind is usually URL manipulation with mod_rewrite. Opinions on mod_rewrite vary quite a bit. To get a quick feel for what the world thinks of it, I ran a Twitter search on “mod_rewrite” and picked a few results from the front pages at the time I started this article:

mldk: Aargh! .htaccess and mod_rewrite can be such a pain in the ---!

bsterzenbach: Man do I love mod_rewrite. I could work with it the rest of my life and still not master it - so powerful

mikemackay: Still loving the total flexibility of mod_rewrite - coming to the rescue again. Often so overlooked…and easier than you might think too!

hostpc: I hate mod_rewrite. Can’t get this dang application to work properly :(

awanderingmind: Oh Wordpress and Apache, how thou dost vex me. Mod_rewrite be damned!

danielishiding: Why won’t mod_rewrite work! Damn it!

What I noticed is that people clearly recognize the power of mod_rewrite but are often frustrated by the syntax. That’s not surprising considering the front page of Apache’s mod_rewrite documentation says basically the same thing:

“Despite the tons of examples and docs, mod_rewrite is voodoo. Damned cool voodoo, but still voodoo.” - Brian Moore

What a turn off! So, in this article I’m really going to take things down a notch. I’m going to address not only mod_rewrite’s syntax but also provide a workflow that you can use to debug and solve your mod_rewrite problems. I’m also going to give you a few useful real-world examples.

However, before I start I’m going to give a warning. With many subjects, this one in particular, you won’t learn unless you try on your own! That is one of the reasons I’m going to focus on teaching a debug workflow. As usual I’ll show you how to get your system set up if you don’t already have the module loaded. I urge you to work through the examples on your own server, preferably in a test environment. The more experience and success you have, the easier it will be to expand on that knowledge with more advanced examples and applications. Enjoy.

What is mod_rewrite?

mod_rewrite is an Apache module that allows for server-side manipulation of requested URLs. Incoming URLs are checked against a series of rules. Each rule contains a regular expression to detect a particular pattern. If the pattern is found in the URL, and the proper conditions are met, the URL is replaced with a provided substitution string or an action is taken. This process continues until there are no more rules left or the process is explicitly told to stop.

This is summarized in these three points:

  • There is a list of rules that are processed in order.
  • If a rule matches it checks the conditions for that rule.
  • If everything is a go it makes a substitution or action.

Advantages of mod_rewrite

There are some obvious advantages of using a URL rewriting tool like this but there are some things that are probably not as obvious.

The main reason people use mod_rewrite is to transform ugly, cryptic URLs into what are known as “friendly URLs” or “clean URLs.” The new URLs are friendly in more ways than one. They are user friendly, often making it much easier for humans to understand at a glance and possibly manipulate on their own. As an added bonus these URLs are also more search engine friendly. Creating friendly URLs is one search engine optimization technique. URLs are an effective way to describe the content they link to. Take the following example:

Not so friendly: http://example.com/user.php?id=4512
Much friendlier: http://example.com/user/4512/
Even better:     http://example.com/user/Joe/

Not only is the final link easier on the eyes, it’s possible for search engines to extract semantic meaning from it. This basic kind of URL rewriting is one way that mod_rewrite is used. However, as you will see, it can do a lot more than just these simple transformations.

Expanding on the same example, some people claim there are security benefits to having mod_rewrite transform your URLs. Given the same example, imagine the following attack on the user id:

http://example.com/user.php?id=AHHHHHH
http://example.com/user/AHHHHHH/

In the first example the PHP script is explicitly being invoked and must handle the invalid id number. A poorly written script would likely fail and in a more extreme case (in a poorly written web application) bad input could cause data corruption. However, if the user is only ever shown the friendlier URLs they wouldn’t even know that the user.php page existed. They might only know about the friendly URL structure. Trying the same attack in that case would likely fail before it even reaches the PHP script. This is because at the core of mod_rewrite is regular expression pattern matching. In the example case above you would have been expecting a number, for example (\d+), not characters like a-z. The rewrite would have failed as soon as it found letters instead of numbers.

This extra layer of abstraction is nice from a security perspective. You could even prevent direct access to the original PHP scripts if you wanted. However, I am in no way condoning using mod_rewrite as a replacement for the usual security measures. You should always have server-side validation in your scripts.
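
As a minimal sketch of that idea, assuming user ids are purely numeric and an Apache 2.x server where the \d shorthand is available, the rule below only rewrites friendly URLs whose id consists of digits. The file name user.php and the id parameter come from the example above; the rest is illustrative:

```apache
# Enable rewriting for this directory
RewriteEngine on

# Only numeric ids are rewritten:
#   user/4512/    -> user.php?id=4512
#   user/AHHHHHH/ -> pattern does not match, request is left untouched
RewriteRule ^user/(\d+)/?$ user.php?id=$1 [L]
```

A request for user/AHHHHHH/ never reaches user.php through this rule. The script should still validate its input, because user.php?id=AHHHHHH can still be requested directly.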

Enabling mod_rewrite on the Server

Just like enabling .htaccess support, enabling mod_rewrite or any Apache module must be done from the global configuration file (httpd.conf). Just as before, since mod_rewrite usage is so widespread, hosting companies nearly always have it enabled. However, if you suspect that your hosting company does not have it enabled, and we will test for that below, you should contact them and they will likely enable it.

If you rolled your own Apache installation, it’s worth noting that mod_rewrite needs to be included when compiling Apache, as it is not built by default. However, it’s so common that nearly all installation guides, including Apache’s, show how to do so in their examples. Pre-packaged versions generally have it enabled. If you’re reading this there is probably a 99% chance that mod_rewrite is compiled into your Apache, so you can just proceed to the next step.

If you’re the administrator for your webserver and you want to make sure that you load the module, you should look in the httpd.conf file. In the configuration file there will be a large section which just loads a whole bunch of modules. The following line will likely appear in the file. If it is there, great! If it’s commented out, meaning there is a # symbol at the start of the line, then remove the # to be left with:

LoadModule rewrite_module modules/mod_rewrite.so

Older versions of Apache, such as 1.3, may require you to add the following directive after the LoadModule directive.

# Only in Apache 1.3
AddModule mod_rewrite.c

The AddModule directive no longer exists in Apache 2.0 and later, where only the LoadModule directive is required.

If you had to modify the configuration file at all then you will have to restart the web server. As always you should remember to make a backup of the original file in case you need to revert back to it later.

Testing for mod_rewrite

You can test if mod_rewrite is enabled/working in a number of ways. One of the simplest ways is to view the output from PHP’s phpinfo function. Create this very simple PHP page, open it in your browser and search for “mod_rewrite” in the output.

<?php phpinfo(); ?>

mod_rewrite should show up in the “Loaded Modules” section of the page like so:

[Screenshot: Good, mod_rewrite enabled]

However, if you’re not using PHP (although I will for the rest of the tutorial) there are some other ways to check. Apache comes with a number of command line tools. I mentioned the htpasswd tool in the first tutorial for Basic Authentication. You can use other tools like apachectl or httpd to directly test for the module. There are command line switches that allow you to check all of the loaded modules in the existing installation. You can execute the following to get a listing of all of the loaded modules.

 shell> apachectl -t -D DUMP_MODULES 

Here I show the help page for the command. I then run the command and search for “rewrite” in the results and it shows there was a line of output that matched!

[Screenshot: apache test]

Finally, if you are still unsure whether it’s enabled, just give it a shot and see what happens! I’ll go over the syntax later, but here is a very bare bones test to see if it’s working. The following .htaccess file rewrites any request in the given folder to the good.html file. That means if mod_rewrite is working you should see good.html. If mod_rewrite is not working then you will see index.html, which shows a warning.

# Redirect everything in this directory to "good.html"
RewriteEngine on
RewriteRule .* good.html

Here are the Good and Bad result pages:

[Screenshot: Good, mod_rewrite worked]
[Screenshot: Bad, mod_rewrite didn't work]

Inside .htaccess

As always, anything that you can put in a .htaccess file can also be put inside the global configuration file. With mod_rewrite there are small differences depending on which of the two a rule is placed in. Most noticeably:

if you’re putting […] rules in an .htaccess file […] the directory prefix (/) is removed from the REQUEST_URI variable, as all requests are automatically assumed to be relative to the current directory. - Apache Documentation

This is something to keep in mind if you see examples online or if you’re trying an example yourself: beware of the leading slash! I will attempt to clarify this below when we go through some examples together.

Regular Expressions

This tutorial does not intend to teach you regular expressions. For those of you who know regular expressions, be aware that the flavor used by mod_rewrite varies between versions of Apache. Apache 2.0 and later use Perl Compatible Regular Expressions (PCRE). This means that many of the shortcuts you are used to, such as \w meaning [A-Za-z0-9_], \d meaning [0-9], and much more are available. However, my hosting company uses Apache 1.3, where the regular expressions are more limited and these shortcuts do not work.

If you don’t know regular expressions here are some useful tutorials that will bring you up to speed quickly.

And a few references that everyone should know about:

If you haven’t spent the time to learn regular expressions I highly suggest that you take the time to learn them. As is usually the case, they are not as complex as you may think they are. I selected the links above from my years of experience working with regular expressions. I feel that these guides do a very good job of getting the basics across. Regular Expressions are crucial to know if you want to effectively use mod_rewrite, and they are also useful to know for many different aspects of general development, such as “find/replace” in your favorite code editor!

Getting a Feel for it.

Okay, you’ve waited patiently enough, let’s run through a quick example. This is included in the linked source files. Here is the code from the .htaccess file:

# Enable Rewriting
RewriteEngine on

# Rewrite user URLs
#   Input:  user/NAME/
#   Output: user.php?id=NAME
RewriteRule ^user/(\w+)/?$ user.php?id=$1

Before I can explain any of it I have to give you more information about the other files in the directory.

The directory contains an index.php and a user.php file. The index just has some links, of various formats, to the user page. The PHP code is purely for debugging, to show that the page was accessed and what the given “id” parameter contained. Here is the user.php code:

<?php

// Get the username from the URL; fall back to an empty string
// when the page is requested without an "id" parameter
$id = isset($_GET['id']) ? $_GET['id'] : '';

?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
    <title>Simple mod_rewrite example</title>
    <style type="text/css"> .green { color: green; } </style>
</head>
<body>
  <h1>You Are on user.php!</h1>
  <!-- Escape the value before printing it, since it comes straight from the URL -->
  <p>Welcome: <span class="green"><?php echo htmlspecialchars($id, ENT_QUOTES, 'UTF-8'); ?></span></p>
</body>
</html>

This example has a few different parts. First, notice that URL Rewriting must be enabled via the RewriteEngine directive! If your .htaccess file is going to use rewrite rules you should always include this line, otherwise you can’t be sure if it’s enabled or not! As a rule of thumb, always include it and make sure that you only have it once per .htaccess file. The string “on” is case insensitive. So when you see other examples on the net that show “On”, that is equally acceptable.

The first RewriteRule is for handling the user.php page. As the comments indicate we are rewriting the friendly URL into the format of the normal URL. To do that, when the friendly URL comes in as input we are actually transforming it into the standard query string URL. Breaking it down we get:

The Rule:
RewriteRule ^user/(\w+)/?$ user.php?id=$1

Pattern to Match:
^              Beginning of Input
user/          The REQUEST_URI starts with the literal string "user/"
(\w+)          Capture any word characters, put in $1
/?             Optional trailing slash "/"
$              End of Input

Substitute with:
user.php?id=   Literal string to use.
$1             The first (capture) noted above.

Here are some examples and an explanation for each:

User.php
Incoming           Match   Capture   Outgoing           Result
user.php?id=joe    No                user.php?id=joe    Normal
user/joe           Yes     joe       user.php?id=joe    Good
user/joe/          Yes     joe       user.php?id=joe    Good
user/joe/x         No                user/joe/x         Fail

So the first example goes through unaffected by the RewriteRule and works just fine. The second and third examples match the RewriteRule, are rewritten accordingly and end up working just fine. The last example does not match the rule and proceeds untouched. The server doesn’t have a user directory and fails trying to find it. This is as expected because user/joe/x is a bad URL in the first place!

This example was rather easy to understand. However, there were a lot of minute details that I glossed over. To do anything more complex I want to clarify exactly what is happening. In the next section I’m going to walk through exactly what happens and we will take a look at a more complex example that touches on all the core aspects of rewriting.

NOTE: If this example didn’t work for you it’s possible that your Apache or mod_rewrite version does not support PCRE. Try changing ^user/(\w+)/?$ into ^user/([A-Za-z0-9_]+)/?$, which spells out the \w shorthand. If this version works for you then you will have to avoid the regex shortcuts and instead use their longer equivalents (see the Regular Expressions section above).

Flow of Execution in Detail

The flow of execution through the rewrite rules is simple but it’s not exactly straightforward. So, I’m going to break it down in painful detail. It all starts with the user making a request to your server. They type a URL into their browser’s address bar, their browser translates that into an HTTP request to send to the server, and Apache receives that request and parses it into pieces. Here is an example:

[Figure: Full URL Analysis]

Note that whenever I mention one of Apache’s variables I use a weird syntax: %{APACHE_VAR}. That is just because it’s similar to the syntax that mod_rewrite uses to access the variables. However, it is the name inside the braces that is important.

So what part does mod_rewrite deal with? If you’re working inside a .htaccess file, then you’re working with the REQUEST_URI portion but without the leading slash! I mentioned this before, and it’s something that is very confusing for most people when they start out. If you’re working from inside the global configuration file, then you would leave the leading slash in.
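
To make the leading-slash difference concrete, here is a sketch of the user rule from earlier written for each context. It assumes the .htaccess file sits in the site’s document root. The two rules are alternatives that belong in different files, not a pair to be used together:

```apache
# In a .htaccess file (per-directory context) the directory prefix is
# stripped, so the pattern starts at "user/" with no leading slash.
RewriteRule ^user/(\w+)/?$ user.php?id=$1

# In httpd.conf or a <VirtualHost> block (server context) the pattern is
# matched against the full URL-path, so the leading slash is part of it.
RewriteRule ^/user/(\w+)/?$ /user.php?id=$1
```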

To be as specific as possible, buried in the Apache Documentation is this description of the “URL Part” that mod_rewrite acts on:

The Pattern is always a regular expression matched against the URL-Path of the incoming request (the part after the hostname but before any question mark indicating the beginning of a query string). Apache Documentation

To remove any ambiguity, highlighted in gold in these two URLs below is the “URL Part” that mod_rewrite acts on inside a .htaccess file:

[Figure: The Rewrite Portion of the URL]

For the rest of this section I’ll be using these two URLs to describe the flow of execution. I will refer to the first URL as the “green” URL and the second as the “blue” URL. Also I will be using “URL Part” throughout this analysis, meaning the REQUEST_URI without the leading slash.


For readers who want to be 100% technical, the two things that I am calling URLs are, in the terminology of the specifications, URIs. A Uniform Resource Identifier (URI) is the general term for a string that identifies a resource. A Uniform Resource Locator (URL) is the kind of URI that also says how to locate the resource, for example by giving a scheme such as http and a host. This subtle difference has blurred over time such that few people care about it. I will continue to use the term URL because people are more comfortable with it.


So, now we know what the rewrite rules are going to be acting on. Once Apache has parsed the request it translates that to the file it thinks is needed and proceeds to fetch that file. At this point it will traverse directories and encounter the .htaccess files. Assuming the .htaccess file enables the RewriteEngine any RewriteRule could change the URL. A drastic enough change (such as one that points Apache to another directory instead of the original directory it was heading towards) will cause Apache to issue a sub-request and proceed to fetch the new file.

In most cases sub-requests are invisible to you. This implementation detail is not important to know for the majority of the simple rewrites that you will ever write or use. What is more important to know is how Apache processes the rewrite rules inside a .htaccess file.

The rules in a .htaccess file are processed in the order that they appear. Note that each RewriteRule is acting on the “URL Part”, which is similar to the REQUEST_URI. When a rule makes a substitution, the modified “URL Part” will be handed to the next rule. That means that the URL that a rule is processing may have been edited by a previous rule! The URL is continually being updated by each rule that it matches. This is important!

Here is a flow chart that tries to provide a visualization of the generic flow of execution across multiple rules in a .htaccess file:

[Figure: mod_rewrite flow chart]

Note that at the top of the flow chart the value going into the rewrite rules is that “URL Part” and if the substitution is successful the modified part proceeds into the next rule.

I mentioned rewriting conditions earlier but I didn’t go into detail. Each RewriteCond is associated with a single RewriteRule. The conditions appear before the rule they are associated with, but only get evaluated if the rule’s pattern matched. As the flow chart shows, if a rewrite rule’s pattern matches then Apache will check to see if there are any conditions for that rule. If there are no conditions then it will make the substitution and continue. If there are conditions then it will only make the substitution if all of the conditions are true. Let’s visualize this in a concrete example.

The URLs that I’m working with are actually part of the “Profile Example” that I’ve included in the source code download in the “profile_example” directory. This is similar to the previous example with the user.php but it now has a profile.php page, an added rewrite rule, and a condition! Let’s take a look at the code and Apache’s flow of execution through it:

[Figure: Profile Rewrite Rules]

Here there are two rules. Rule #1 is the same as the user example we saw before. Rule #2 is new and notice that it has a Condition. The “URL Part” we have been discussing goes through the rules in order, top to bottom. So it will first go through Rule #1 and then Rule #2.
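
As a sketch of what these two rules might look like, reconstructed from the walk-through in this section rather than taken from the original listing (so the exact patterns are assumptions):

```apache
RewriteEngine on

# Rule #1: rewrite friendly profile URLs
#   Input:  profile/NAME/
#   Output: profile.php?id=NAME
RewriteRule ^profile/(\w+)/?$ profile.php?id=$1

# Rule #2: forbid direct requests for profile.php
# THE_REQUEST holds the original request line, for example
# "GET /profile.php?id=joe HTTP/1.1", and is not changed by Rule #1,
# so the condition is only true when profile.php was asked for directly.
RewriteCond %{THE_REQUEST} profile\.php
RewriteRule ^profile\.php$ - [F]
```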

The key to understanding this example is to first understand the goal. In this example I am going to allow friendly profile URLs but I’m actually going to explicitly forbid access to the PHP page directly. Note, some people might say that this is a bad idea. They might say that as a developer this will make things harder for you to debug. That’s true, I don’t actually recommend doing a trick like this, but it makes for an excellent example! More practical uses for mod_rewrite will show up later in this tutorial.

So, with that in mind let’s see what happens with our green URL. This one we want to be successful.

[Figure: Green URL Execution]

Up at the top you see Apache’s THE_REQUEST variable. I put this at the top because, unlike many of the Apache variables we will deal with, this variable’s value never changes for the duration of the request! That is one of the reasons why Rule #2 uses %{THE_REQUEST}. Underneath THE_REQUEST we see the green “URL Part” going into the first rule:

  • The URL matches the pattern.
  • There are no conditions, so it continues.
  • The substitution is made.
  • There are no flags, so it continues.

After making it through the first rule, the URL has changed. The whole URL has been rewritten to profile.php?id=joe, which Apache then breaks down, updating many of its variables. The ?id=joe portion gets hidden from us and profile.php, the new “URL Part”, continues into the second rule. It is our first encounter with conditions:

  • The URL matches the pattern.
  • There are conditions so we will try the conditions.
  • THE_REQUEST does not contain profile.php so the condition fails.
  • Because a condition failed we ignore the substitution and flags.
  • The URL is unchanged by this rule.

At this point we made it through all the rewrites and the profile.php?id=joe page will be fetched properly.

Here is how the execution looks for the blue URL, the one we want to fail:

[Figure: Blue URL Execution]

Again I put the THE_REQUEST value at the top. The blue “URL Part” enters Rule #1:

  • The URL does not match the pattern.
  • Everything else is ignored and the URL proceeds unchanged.

The first rule was easy. As is often the case, a URL won’t match a rule’s pattern and will proceed untouched. Now it enters Rule #2:

  • The URL matches the pattern.
  • There are conditions so we will try the conditions.
  • THE_REQUEST contains profile.php so the condition passes.
  • All the conditions passed, so we can make the substitution.
  • “-” is a special substitution that means don’t change anything.
  • There are flags on the rule so we process the flags.
  • There is an F flag, which means return a forbidden response.
  • A 403 Forbidden response is sent to the client.

A few things are worth re-iterating. In order for the substitution to work, all of the conditions have to pass. In this case there is only one, and it passes so the substitution happens. Note that - is a special substitution that doesn’t change anything. This is useful when you just want to use flags to do something for you, which is exactly what we want to do in this case.

Here is the familiar table breakdown of example URLs and their responses:

Profile.php
Incoming              Match      Capture   Outgoing              Result
profile.php?id=joe    Yes (#2)             profile.php?id=joe    Forbidden
profile/joe           Yes (#1)   joe       profile.php?id=joe    Good
profile/joe/          Yes (#1)   joe       profile.php?id=joe    Good
profile/joe/x         No                   profile/joe/x         Fail

Syntax

While going over the syntax of RewriteRule and RewriteCond I would suggest that you first download the AddedBytes Cheatsheet. The cheatsheet lists the most useful server variables and flags, has regular expression tips, and even includes a few examples. There is just so much there that it would be difficult to put it inline.

Let’s start with RewriteRule. You can always visit Apache’s Documentation on RewriteRule if you need to do something really specific. Here is my overview:

[Figure: Syntax of RewriteRule]

The cheatsheet shows what types of flags are available. Many tutorials cover these flags in detail, and I’ll go through the flags that I see most commonly used in the examples below.

Here is Apache’s Documentation on RewriteCond and my overview:

[Figure: Syntax of RewriteCond]
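
As a rough summary of the two directives, Apache’s documentation gives their general shape as RewriteRule Pattern Substitution [flags] and RewriteCond TestString CondPattern [flags]. The sketch below lines the parts up in columns; the old/ and new/ paths are placeholders made up for illustration:

```apache
# RewriteCond  TestString     CondPattern   [flags]
RewriteCond    %{HTTP_HOST}   ^www\.        [NC]
# RewriteRule  Pattern        Substitution  [flags]
RewriteRule    ^old/(.*)$     new/$1        [L]
```

The flags are optional on both directives, and the condition applies only to the rule that immediately follows it.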

Debug Workflow

Whenever you’re working with mod_rewrite and creating new rules, always start with a simple, dumbed down version of the rule and work your way up towards the final version. Never try to do everything at once. The same thing applies for conditions. Add rules and conditions one at a time. Test often!

The key concept I am trying to get across with this approach is that this will let you know quickly if a change you made doesn’t work properly or causes something to work incorrectly. When doing too much at once inevitably you will run into an error and you will have to revert all of the changes you made to track down what the problem was. This is a very roller coaster approach and will likely lead to frustration. However, if you’re always steadily advancing, and each step along the way moving to workable checkpoints you’re in much, much better shape.

People often ignore this advice, create a complex rule, and it ends up not working. Hours later they find out the problem was not in the complex portion, but instead it was just a simple mistake in the regular expression that could have been caught much earlier had they carefully constructed the rule like I’ve explained above. The same goes for deconstructing a rule to reverse engineer a problem. This approach will seriously reduce frustration!
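
As a sketch of that workflow, here is how the profile rule from earlier might be built up in steps. Each step is a separate version of the same .htaccess file, tested in the browser before moving on. The intermediate steps are my own illustration, and good.html is assumed to be a simple test page like the one used in the earlier mod_rewrite check:

```apache
RewriteEngine on

# Step 1: confirm the rule fires at all (any profile/ URL shows good.html)
# RewriteRule ^profile/ good.html

# Step 2: point at the real script with a hard-coded id
# RewriteRule ^profile/ profile.php?id=test

# Step 3: add the capture and the optional trailing slash
RewriteRule ^profile/(\w+)/?$ profile.php?id=$1
```

If step 3 breaks, you know the problem is in the capture or the trailing slash, not in whether rewriting works at all.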

In the Examples

In the examples below I will always assume the website’s domain is example.com. This domain name is important because it affects the HTTP_HOST variable as well as specifying a redirect URL to another file on your website. Keep this in mind in case you intend to modify any of the examples for your own website. If so, simply replace “example.com” with your domain. For example, Nettuts would replace “example.com” with “nettuts.com”.

Removing www

This is the most classic rewrite rule. Anyone who comes to your website via http://www.example.com will get a hard redirect to http://example.com, and the Location Bar in their browser will update accordingly.

RewriteEngine on 
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

The rule matches anything, and saves everything as $1. The important part in this example is the condition. The condition checks whether the HTTP_HOST variable is exactly “www.example.com” (the [NC] flag makes the comparison case insensitive). If this condition is true, the rewrite happens:

  • The substitution is a full URL (it starts with http://)
  • The substitution contains $1 which was captured earlier
  • The [R=301] flag redirects the browser to the rewritten URL. This is a hard redirect in the sense that it causes the browser to load the new page and update its Location Bar with the new URL.
  • The [L] flag means that this is the last rule to process; the rewrite engine should stop.

If the incoming URL had been “http://www.example.com/user/index.html” then HTTP_HOST would have been www.example.com and the rewrite would happen creating http://example.com/user/index.html.

If the incoming URL had been “http://example.com/user/index.html” then HTTP_HOST would have been example.com, the condition would have failed and the rewrite engine would proceed with the URL unchanged.
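
If you would rather not hard-code the domain, one possible variation captures the host name in the condition and reuses it through the %1 back-reference, which refers to a group captured by the last matched RewriteCond. This is a sketch rather than part of the original example, and it assumes the site is served over plain http:

```apache
RewriteEngine on

# Capture everything after "www." in the host name as %1
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
# $1 is the path captured by the rule, %1 the host captured by the condition
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
```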

Forbid Hotlinking

Hotlinking, referred to as Inline Linking on Wikipedia, is the term used to describe one site leeching off of another site. Usually one site, the Leecher, will include a link to some media file (let’s say an image or video) that is hosted on another site, the Content Host. In this scenario, the Content Host’s servers are wasting bandwidth serving content to some other website.

For many people, it’s fine if other sites cross link to their content. However, many people would rather prevent hotlinking so as not to pay for the extra bandwidth required to send the content to someone else’s site.

The most common and basic approach to preventing hotlinking is to whitelist a number of websites and block everything else. To find out who is asking for the content from your site you can check the referrer. The Referer header (yes, that is how it is spelled), which mod_rewrite exposes as HTTP_REFERER, is set by the browser or client that is requesting the resource. It is not 100% reliable, since clients can omit or fake it, but it is more than effective at stopping the majority of hotlinking. So, verify that the referrer is in a whitelist of acceptable referrers. If the referrer is not acceptable (blank or someone else’s site) then you can send them a forbidden warning:

# Give Hotlinkers a 403 Forbidden warning.
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^http://example\.net/?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://example\.com/?.*$ [NC]
RewriteRule \.(gif|jpe?g|png|bmp)$ - [F,NC]

Here the RewriteRule is checking for a request of a file with any popular image extension, such as .gif, .png, or .jpg. You could add other extensions to this list if you want to protect .flv, .swf, or other files.

The domains that are allowed to access the content are “example.net” and “example.com”. In either of these two cases one of the Rewrite Conditions will fail and the substitution won’t happen. If any other domain makes an attempt, let’s say “sample.com”, then all the Rewrite Conditions will pass, the substitution will happen, and the [F] forbidden action will trigger.
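
The patterns above are fairly loose: because the slash is optional and followed by .*, a referrer such as http://example.com.evil.test/ would also be let through. A stricter variant, offered as a sketch, anchors the host name, accepts an optional www. prefix and https, and allows requests with an empty referrer. Whether to allow empty referrers is a policy choice and is not what the original example does:

```apache
RewriteEngine on

# Let requests without a Referer header through
RewriteCond %{HTTP_REFERER} !^$
# Allow our own sites, with or without "www." and over http or https
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.(com|net)(/|$) [NC]
# Everything else asking for an image gets a 403 Forbidden
RewriteRule \.(gif|jpe?g|png|bmp)$ - [F,NC]
```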

Give Hotlinkers a Warning Image

The previous example returns a 403 Forbidden warning when someone tries to hotlink content from your server. You can go one step further and send the hotlinker any resource of your choice. For example, you can send a warning image that is just some text stating “hotlinking is not allowed”. This way the other person can realize their mistake and host a copy on their own server. The only change is to actually go through with the rewrite substitution and provide a chosen image instead of the one being requested:

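# Redirect Hotlinkers to "warning.png"
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^http://example\.net/?.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://example\.com/?.*$ [NC]
# Don't redirect requests for the warning image itself (avoids a redirect loop)
RewriteCond %{REQUEST_URI} !/warning\.png$ [NC]
RewriteRule \.(gif|jpe?g|png|bmp)$ http://example.com/warning.png [R,NC]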

Note that this is an example of what I call a “hard” or “external” redirect. The RewriteRule has a URL in the substitution portion and it also has the [R] flag.

Custom 404

One trick that you can do with .htaccess is to check whether the current “URL Part” leads to an actual file or directory on the web server. This is a good way to create a custom 404 “File not Found” page. For example, if a user tries to fetch a page in a particular directory that doesn’t exist you can redirect them to any page you want, such as the index page or a custom 404 page.

# Generic 404 to show the "custom_404.html" page
# If the requested page is not a file or directory
# Silent Redirect: the user's URL bar is unchanged.
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* custom_404.html [L]

This is a great example of mod_rewrite’s file test operators. They are similar to the file tests in bash shell scripts and even Perl scripts. Here the conditions check that the REQUEST_FILENAME is not a file and not a directory. If it is neither, then there is no such file for the request.

If the incoming request filename can’t be found then this loads a “custom404.html” page. Note that there is no [R] flag, so this is a silent redirect, not a hard redirect. The user’s Location Bar will not change, but the contents of the page will be “custom404.html”. Short and simple.
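One caveat with the silent rewrite: since the server ends up serving an ordinary static page, the response will normally go out with a 200 OK status rather than a real 404. If the status code matters to you (for search engines, say), Apache's core ErrorDocument directive is an alternative worth knowing. This is just a minimal sketch, and it assumes "custom_404.html" sits in your document root:

```apache
# Alternative sketch: show the custom page but keep the 404 status code.
# Assumes custom_404.html lives in the document root.
ErrorDocument 404 /custom_404.html
```

Because this directive is not part of mod_rewrite, it needs no RewriteEngine line and no Rewrite Conditions.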

Safety First

If you have mod_rewrite snippets that you use often and want to easily distribute to other servers or environments you may want to be careful. As was mentioned, any invalid directive in a .htaccess file will likely cause internal server errors. So if an environment you move the snippet to doesn’t have mod_rewrite you could temporarily break it.

One solution to this problem is to “check” for the mod_rewrite module. This is possible with any module. Just wrap your mod_rewrite code in an <IfModule> block and you’ll be all set:

<IfModule mod_rewrite.c>

  # Turn on
  RewriteEngine on

  # Always remove www (with a hard redirect)
  RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
  RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

  # Generic 404 for anyplace on the site
  # ...

</IfModule>

Conclusion

I hope that this tutorial proves that mod_rewrite isn’t all that scary, and in fact its quirks and speed bumps can be avoided with careful development practices.


User Comments

  1. PG

    Steven September 14th

    First :-)
    Nice tut!

    ( Reply )
    1. PG

      Ahmad Alfy September 14th

      You’re abusing this site’s resources …
      Grow up!

      ( Reply )
      1. PG

        Omar September 14th

        Especially since it only took three words to get the point across.

      2. PG

        Gav September 15th

        Really good tutorial, always found this the trickiest bit of web dev – but its clearer now thanks again..

        p.s Steven.. you really need to grow up, how old are you??

  2. PG

    claudio September 14th

    A useful insight, thanks.

    ( Reply )
  3. PG

    Furley September 14th

    Great breakdown. Some seriously useful info here.

    ( Reply )
  4. PG

    Joseph Pecoraro September 14th

    For all those that are interested in trying out the example, I have them up on my server at:
    http://bogojoker.com/htaccess/part_2/

    ( Reply )
    1. PG

      Ahmad Alfy September 14th

      Thanks :)

      ( Reply )
    2. PG

      Evan Jones September 14th

      Sweetness. I love when you talk server-side :P

      ( Reply )
  5. PG

    Ahmad Alfy September 14th

    Awesome… mod_rewrite always confused me.
    Now it’s clear that I need to dig more into RegExp to excel at mod_rewrite!

    ( Reply )
  6. PG

    Abderrahmane TJ September 14th

    Hey, I like your tut.

    ( Reply )
  7. PG

    Dhruv Kumar September 14th

    Helped me a lot, Thank you :)

    ( Reply )
  8. PG

    Rishi Patel September 14th

    a lot of tutorial on mod_rewrite………..

    Another nice one…….will look it in detail after some time.

    ( Reply )
  9. PG

    Stoian Kirov September 14th

    The best article EVER!
    Thank you veeeeery much :)

    ( Reply )
  10. PG

    jlapitan September 14th

    very helpful, especially with the www part..

    thank you!!!

    ( Reply )
  11. PG

    HullDO September 14th

    While mod_rewrite can sometimes be tricky, it’s really a lot easier to understand what’s already there. It’s a good addition to any website and a very useful SEO technique.

    ( Reply )
  12. PG

    Juan C Rois September 14th

    Very good tutorial, Thanks

    ( Reply )
  13. PG

    Matthew September 14th

    Really appreciate this one!

    ( Reply )
  14. PG

    neil September 14th

    Cool article!

    ( Reply )
  15. PG

    David Moreen September 14th

    Reading all of this actually hurt my brain! Good stuff.

    ( Reply )
    1. PG

      Diego SA September 14th

      same

      ( Reply )
  16. PG

    ron September 14th

    Voodoo :)

    ( Reply )
  17. PG

    Hitesh September 14th

    Very Simple and to the point.
    Thanks

    ( Reply )
  18. PG

    Nikola Malich September 14th

    This is such a great read… thanks!

    ( Reply )
  19. PG

    Joris September 14th

    Faved o//

    ( Reply )
  20. PG

    genius_advice September 14th

    nettuts got a good deal for this author =P
    very comprehensive. . . hurray for friendly URLs!

    ( Reply )
  21. PG

    adam September 14th

    Really hit the spot. Something that lots of people can benefit from! thanks…

    ( Reply )
  22. PG

    rizq September 14th

    Nice Job !

    ( Reply )
  23. PG

    WebTitan September 14th

    The best mod_rewrite tutorial I’ve ever seen :D I love that :D

    ( Reply )
  24. PG

    Jefferson September 14th

    O melhor tutorial de mod_rewrite disponível na web.
    The best tutorial mod_rewrite available on the web.

    ( Reply )
  25. PG

    Miles Johnson September 14th

    Great article, probably one of the best ive seen on mod_rewrite.

    If you can’t understand this, you best learn how http request/response and headers work.

    http://www.amazon.com/HTTP-Developers-Handbook-Chris-Shiflett/dp/0672324547/ref=sr_1_3?ie=UTF8&s=books&qid=1252973206&sr=8-3

    ( Reply )
  26. PG

    Jamal September 14th

    Great tut, removed the myths about mod_rewrite. I have a question from the user example.

    What if I want to pass a url like http://www.example.com/joe instead of http://www.example.com/user/joe, will it work or does it always have to follow with a function like user/joe.

    Looking to achieve something like twitter vanity urls.

    Cheers!

    ( Reply )
    1. PG

      Joseph Pecoraro September 15th

      It can certainly work at the top level. You would just place a .htaccess file and your appropriate RewriteRules at the top level. With your example I would advise that you then check if the “name” matches an already existing page or directory (for instance if you have an “about”, or “contact” page you would want to load those, not a user profile with that name). It would look something like this:

      # Rewrite user URLs from the Top Level
      # Input: example.com/NAME
      # Output: example.com/user/user.php?id=NAME
      RewriteCond %{REQUEST_FILENAME} !-f
      RewriteCond %{REQUEST_FILENAME} !-d
      RewriteRule ^(\w+)/?$ user/user.php?id=$1

      Cheers.

      ( Reply )
      1. PG

        Jamal September 28th

        @Joseph Thanks,

        “..I would advise that you then check if the “name” matches an already existing page or directory (for instance if you have an “about”, or “contact” page you would want to load those, not a user profile with that name)….”

        Good advice, it never crossed my mind :-)

        I am going to try this out.

  27. PG

    Guillermo Carrion September 14th

    Wow…. you really went deep inside mod_rewrite… great article….

    ( Reply )
  28. PG

    Hasanga September 14th

    Now this is what you call a tutorial. Sweet!
    Keep em coming Joseph!

    ( Reply )
  29. PG

    Brian Temecula September 14th

    Don’t you love it when you get that 500 Error! Apache can be so unforgiving sometimes.

    ( Reply )
  30. PG

    Ignas September 15th

    This one is what I need! I looked info about this and here it is! Thanks a lot!

    ( Reply )
  31. PG

    Daniel September 15th

    Nice tutorial, thanks.

    ( Reply )
  32. PG

    David Horn September 15th

    Wonderful – love seeing these really in depth tutorials. Good stuff.

    ( Reply )
  33. PG

    Tutorial City September 15th

    Absolutely amazing! Thanks for the tutorial ;)

    ( Reply )
  34. PG

    Peace4man September 15th

    Wow thats GREAT man Greate Job !

    ( Reply )
  35. PG

    Thang NH September 16th

    Hi,

    I want to convert a relative path to an absolute path with mod_rewrite, but no success. This is my code:
    RewriteRule ^.(images|css|js)/$ ./(images|css|js)/ [L]
    Its purpose is to convert the images, css, js paths to images, css, js in the root directory.

    ( Reply )
  36. PG

    logicdesign September 16th

    Hi, fantastic article, I’m new to this. Does anyone know if it is possible to have Apache rewrites on a subdomain?

    ( Reply )
  37. PG

    Nick Brown September 16th

    Nettuts has this terrifying ability to ALWAYS post tuts on topics I’m JUST about to work on.

    A few days ago I was about to work on the SEO on my page a bit, and the SEO article came out. Then I wanted to convert all my article pages from things like articles.php?id=13 to ‘multiple-word-subect-here’ with mod_rewrite, and sure enough I check nettuts and here it is.

    Scary.

    ( Reply )
  38. PG

    Davis John September 16th

    Great one! These days, it’s hard to find a tutorial as good, detailed and useful as this one when no one seems to go beyond “the 20 best blah blah”.

    Have a quick question on this. I am not able to get the relative redirects to work on my OSX Leopard box. Like this one:

    RewriteRule ^user/(\w+)/?$ user.php?id=$1

    It resolves to the correct physical path but gives a 404 saying that the file could not be found. It works, however, if I give the full absolute path to user.php (the same path that’s printed in the 404!).

    The directory is located inside a sub-directory in my ~/Sites and I am not using virtual hosts. Any clue?

    ( Reply )
    1. PG

      Davis John September 16th

      While I still don’t know “why”, seems like I had a solution in my question itself. It’s working great if I set DocumentRoot to my ~/Sites directory or create a VirtualHost for that path. Apparently in the user directory setup, apache was not able to map the physical path to the ~username alias.

      Thanks anyway, problem solved!

      ( Reply )
      1. PG

        B2n October 31st

        I have the same problem. Can you tell me how do you fix it ?
        What do you add in httpd.conf and .htaccess ?

  39. PG

    Dustin Lakin September 16th

    Amazing job, This has got to be one of the most useful things to know as a LAMP developer.

    ( Reply )
  40. PG

    Chad September 17th

    Does anyone know if mod_rewrite can clean up .html url structure or is it just for php?

    Example:

    http://www.example.com/about.html -> http://www.example.com/about

    ( Reply )
    1. PG

      David September 22nd

      It will work for any url on any serverside or static language of course!

      ( Reply )
  41. PG

    IgnacioRV September 17th

    Thank you! I really needed a tutorial about mod_rewrite. I wanted to use it for a project but didn’t know where to start… this is very well explained, great work!

    ( Reply )
  42. PG

    Joel September 17th

    Excellent tutorial – that helps a lot.

    Thanks!

    ( Reply )
  43. PG

    Dan September 18th

    Hello!

    Can anyone tell me how could I redirect a domain of a website to a particular file.
    I mean http://domain.website.com -> http://www.website.com/file.php?param=domain.

    I’ve tried with

    RewriteCond %{HTTP_HOST} ^([a-z0-9-]+)\.website\.com$ [NC]
    RewriteCond %{HTTP_HOST} !^www\.website\.com$ [NC]
    RewriteRule ^(.*)$ file.php?param=%2 [L]

    and it doesn’t work.

    I have placed a rule for replacing website.com with http://www.website.com.

    Thanks!

    ( Reply )
  44. PG

    Yogal September 26th

    This is a great tutorial!

    The best thing is, you have explained the inner workings of mod_rewrite. I’ve learned a lot, especially about R flag and how rules are processed (without L flag).

    Too many sites offer only examples, so even if something works you don’t know how.

    For those who have hard time learning Regular Expressions i find this tool priceless : http://www.gskinner.com/RegExr/ (Actually if I remember correctly, Jeffrey Way recommended it in one of his screencasts). With this tool you can see how your regex behaves in real time!

    ( Reply )
  45. PG

    Violet Bliss Dietz September 28th

    Terrific tutorial. I have your first one bookmarked. I’m adding this one. I’ve gone to some lengths at times to avoid having to use mod_rewrite because I couldn’t get it to work correctly. This will help a lot.

    ( Reply )
  46. PG

    Faifas October 11th

    should be

    ( Reply )
  47. PG

    Andri Yudatama October 11th

    This is exactly what i’ve been looking for.. nice tuts..!!

    ( Reply )
  48. PG

    some guy in Vienna October 16th

    very nice – thanks for a good tutorial :)

    ( Reply )
  49. PG

    dee October 19th

    great tutorial thanks

    im having trouble with rewriting hebrew in my urls

    i am able to rewrite the test.php to “test_heb”(inhebrew)

    but able to rewrite “test_heb”/”parameter_heb”
    this is the syntax that works
    RewriteRule ^test/([0-9A-Za-z]+)/?$ /test.php?id=$1 [NC,L]

    ( Reply )
    1. PG

      dee October 19th

      this is the “UNABLE PART”
      but unable to rewrite “test_heb”/”parameter_heb”
      this is the syntax that works
      RewriteRule ^test/([0-9A-Za-z]+)/?$ /test.php?id=$1 [NC,L]

      ( Reply )