Using htaccess Files for Pretty URLs

Continuing our review of htaccess files, today we’ll examine how to use mod_rewrite to create pretty URLs.

Benefits of Formatted URLs

Some claim that pretty URLs help search engine rankings, and the debate there is fierce, but we can all agree that pretty URLs make things easier for our users and add a level of professionalism and polish to any web application. I could go over all the theoretical reasons for this, but I like real-world examples better. Like it or hate it, we all must admit that Twitter is a wildly popular web application, and part of the reason for that is most certainly how it formats URLs. I can tell anyone in the know that my Twitter username is noahhendrix, and they know my profile can easily be found at twitter.com/noahhendrix. This seemingly simple concept can have a big effect on the popularity of your application.

Just to put things in perspective, we can look at another popular social networking website, Facebook. Since the site launched in 2004 the profile system has grown and evolved to better cater to users, but one glaring hole was the URL to a profile. From the time I registered with Facebook my profile was at the URL http://www.facebook.com/profile.php?id=1304880680. That is quite a mouthful, and just recently Facebook appears to have realized that and launched Facebook vanity URLs. Now I can share my Facebook profile by telling people my Facebook username is “noahhendrix”, which they know can be found by going to facebook.com/noahhendrix. While the odds are that we won’t have an application as popular as Facebook, we can still borrow a few pages from their book.

Quick Overview

A quick overview before we dive into code: in today’s tutorial we will go over two slightly different methods of creating pretty URLs using .htaccess files. The difference between the methods is whether Apache or PHP does the heavy lifting of breaking the URL apart for parsing. I want to point out that mod_rewrite tutorials are almost as old as the internet itself, and this is not the first. At the end I will use one of the methods to create a simple application to show how these solutions would look in a real, live website (well, not 100% production quality). The service we will create is a URL shortener that mirrors the functionality of sites like bit.ly, TinyURL, or su.pr. So without any more fluff, let us look at the code.

Using Apache

First, we can place all of our code in Apache .htaccess files. This could look something like this:

  Options +FollowSymLinks
  RewriteEngine On

  RewriteCond %{SCRIPT_FILENAME} !-d
  RewriteCond %{SCRIPT_FILENAME} !-f

  RewriteRule ^users/(\d+)/?$ ./profile.php?id=$1
  RewriteRule ^threads/(\d+)/?$ ./thread.php?id=$1

  RewriteRule ^search/(.*)$ ./search.php?query=$1

Let’s start at the top and work our way down to better understand what is going on here. The first line uses the Options directive to let Apache follow symbolic links (similar to aliases in Mac OS X or shortcuts in Windows). Your host may already have this enabled, but mod_rewrite needs FollowSymLinks (or SymLinksIfOwnerMatch) turned on to work from an .htaccess file, so it is safest to include it. Next we tell Apache we are going to use the rewrite engine. The next two lines are very important: they restrict rewriting to paths that do not actually exist, which prevents a rule from matching example.com/images/logo.png, for example. The first condition, with !-d, skips existing directories, and the second, with !-f, skips existing files. Keep in mind that RewriteCond lines apply only to the first RewriteRule that follows them, so in this file they guard the users rule; repeat them before any other rule that needs the same protection.

The next three lines are the actual URL rewriting commands. Each line creates a rule that tries to match a regular expression pattern against the incoming URL. Regular expressions, at least for me, are a hard set of rules to remember, but I always find it helpful to use this tutorial by Nettuts’ own Jeffrey Way and the tool he recommends. I found it easy to type in sample URLs we want to match and then try to hack together the pattern.

The first argument is the pattern, between the caret and dollar sign. We tell Apache we want URLs asking for the users directory (an artificial directory that doesn’t have to actually exist) followed by a / and any number of digits. The parentheses create a capture group; you can use as many of these as you want, and they serve as variables that we can then transplant into our rewrite. The /? at the end makes a trailing slash optional, so example.com/users/123 is the same as example.com/users/123/, as users would expect.

The second argument is the path we actually want to call; unlike the first, this must be a real file. We tell Apache to look in the current directory for a file called profile.php and send the parameter id=$1 along with it. Remember the capture group earlier? That is where we get the variable $1; capture groups are numbered starting at one. This creates a URL on the server like example.com/profile.php?id=123.
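If you want to try a pattern before putting it in .htaccess, one option is to mimic the match in plain PHP. The sketch below is only an approximation of what mod_rewrite does, the function name preview_rewrite is made up for this example, and it assumes a users pattern with an optional trailing slash:

```php
<?php
// Hypothetical helper: approximates a single RewriteRule with preg_match so
// a pattern can be tried out in plain PHP. It is not how Apache works internally.
function preview_rewrite(string $path): ?string
{
    // A users pattern with an optional trailing slash.
    // '#' is the regex delimiter here so '/' needs no escaping.
    if (preg_match('#^users/(\d+)/?$#', $path, $matches)) {
        // $matches[1] plays the role of $1 in the RewriteRule substitution.
        return 'profile.php?id=' . $matches[1];
    }
    return null; // no match; Apache would leave the URL alone
}

var_dump(preview_rewrite('users/123'));   // matches
var_dump(preview_rewrite('users/123/'));  // matches, trailing slash allowed
var_dump(preview_rewrite('users/abc'));   // no match, letters are not digits
```

The path is passed without a leading slash because that is how a rule in an .htaccess file sees it.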

This method is great for legacy web applications whose existing URL structure makes it hard to rewrite the backend to understand a new URL schema: to the server the URL looks the same, but to the user it looks much nicer.

Using PHP

This next method is great for those who don’t want to hand too much logic to Apache and feel more comfortable in PHP (or similar scripting languages). The concept here is to capture any URL the server receives and push it to a PHP controller page. This gives us more control, but greater complexity at the same time. Your .htaccess file might look something like this:

  Options +FollowSymLinks
  RewriteEngine On

  RewriteCond %{SCRIPT_FILENAME} !-d
  RewriteCond %{SCRIPT_FILENAME} !-f

  RewriteRule ^.*$ ./index.php

Everything is the same as above except the last line, so we will skip to it. Instead of creating a capture group we just tell Apache to grab every URL that isn’t a real file or directory and hand it to index.php. What this means is we can do all of our URL handling in PHP without relying too much on stringent URL paths in .htaccess. Here is what we might do at the top of our index.php file to parse out the URL:

  <?php
    #remove the directory path we don't want
    $request  = str_replace("/envato/pretty/php/", "", $_SERVER['REQUEST_URI']);

    #split the path by '/' (split() was removed in PHP 7, so we use explode())
    $params     = explode("/", $request);
  ?>

The first line is not necessary unless your application doesn’t live at the root directory, like my demos. I am removing the part of the URL that I don’t want PHP to worry about. $_SERVER['REQUEST_URI'] is a global server variable that PHP provides to store the request URL; it generally looks like this:

  /envato/pretty/php/users/query

As you can see, it is basically everything after the domain name (including the query string, if there is one). Next we split the remaining part of the virtual path on the / character, which allows us to grab individual variables. In my example I just printed the $params array out in the body; of course you will want to do something a little more useful.
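For anything beyond a demo, the parsing step probably wants to be a little more defensive. Here is a minimal sketch, assuming a made-up helper called parse_request_path and a $base argument for the directory the application lives in; it drops the query string and ignores empty pieces left by leading or trailing slashes:

```php
<?php
// Hypothetical helper: turns a request URI into path segments.
// $base is the directory the application lives in (an assumption for this
// sketch; leave it as "/" when the application sits at the web root).
function parse_request_path(string $uri, string $base = '/'): array
{
    // Drop the query string, if any ("?page=2" is not part of the path).
    $path = explode('?', $uri, 2)[0];

    // Remove the base directory only when the path really starts with it.
    if ($base !== '' && strpos($path, $base) === 0) {
        $path = substr($path, strlen($base));
    }

    // Split on "/" and discard empty pieces left by leading or trailing slashes.
    $parts = array_filter(explode('/', $path), 'strlen');

    // Re-index so the first segment is always at position 0.
    return array_values($parts);
}

print_r(parse_request_path('/envato/pretty/php/users/query', '/envato/pretty/php/'));
```

Unlike a plain str_replace, this only strips the base directory from the front of the path, so a URL that happens to contain the same text further along is left alone.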

One thing you might do is take the first element of the $params array and include a file by that same name; within that file you can use the second element in the array to execute some code. This might look something like this:

	 <?php
	   #keeps users from requesting any file they want
	   $safe_pages = array("users", "search", "thread");
	   
	   if(in_array($params[0], $safe_pages)) {
	     include($params[0].".php");
	   } else {
	     include("404.php");
	   }
	 ?>

WARNING: The first part of this code is unbelievably important! You absolutely must restrict what pages a user can get so they don’t have the opportunity to print out any page they wish by guessing at file names, like a database configuration file.

Now that we have the soapbox out of the way, let’s move on. Next we check if the requested file is in the $safe_pages array; if it is we include it, otherwise we include a 404 not found page. In the included page you have access to the $params array, and you can grab whatever data from it your application needs.
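If you would rather keep that whitelist check in one place, it can be wrapped in a small function. This is just a sketch; resolve_page is a name I made up, and it returns the file name instead of including it so the logic is easy to test:

```php
<?php
// Hypothetical helper: maps the first URL segment to a file that is safe to include.
function resolve_page(array $params, array $safe_pages): string
{
    // A missing first segment (e.g. a request for "/") falls through to the 404 page.
    $page = $params[0] ?? '';

    // The third argument makes in_array compare strictly, so only exact
    // string matches from the whitelist are accepted.
    if (in_array($page, $safe_pages, true)) {
        return $page . '.php';
    }
    return '404.php';
}

$safe_pages = array('users', 'search', 'thread');
echo resolve_page(array('users', '42'), $safe_pages), "\n";      // a whitelisted page
echo resolve_page(array('../db_config'), $safe_pages), "\n";     // anything else gets the 404 page
```

In index.php you would then write include(resolve_page($params, $safe_pages)); and nothing outside the list can ever be included.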

This is great for those who want a little more control and flexibility. It obviously requires quite a bit of extra code, so it is probably better suited to new projects that won’t need a lot of existing code updated to fit the new URL formats.

A Simple URL Shortener

This last part of the tutorial lets us put the code we went over above to use, and is more or less a “real-life” example. We are going to create a service called shrtr. I made up this name, so any other products with this name are not associated with the code I am posting below. Note: I know this is by far not an original concept; it is only meant as a demonstration of mod_rewrite. First let’s take a look at the database:

This is very straightforward; we have only 4 columns:

  • id: unique identifier used to reference specific rows
  • short: unique string of characters appended to the end of our URL to determine where to redirect
  • url: the URL that the short url redirects to
  • created_at: a simple timestamp so we know when this URL was created

The Basics

Next, let’s go over the six files we need to create for this application:

  • .htaccess: redirects all short urls to serve.php
  • create.php: validates URL, creates shortcode, saves to DB
  • css/style.css: holds some basic styling information
  • db_config.php: store variables for database connections
  • index.php: The face of our application with form for entering URL
  • serve.php: looks up short URL and redirects to actual URL

That is all we need for our basic example. I will not cover index.php or css/style.css in very great detail because they have no PHP and are static files.

# index.php
----
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Makes URLs Shrtr</title>
    <link type="text/css" rel="stylesheet" href="./css/style.css" />
	</head>
	<body>
	 <div id="pagewrap">
  	 <h1>shrt<span class="r">r</span>.me</h1>
  	 
  	 <div class="body">
  	   <form action="./create.php" method="post">
  	   
  	     <span class="instructions">Type your URL here</span>
  	     <input name="url" type="text" />
  	     <input type="submit" value="shrtr" />
  	   
  	   </form>
  	 </div>
	   
	 </div>
	</body>
</html>

The only really interesting thing to note here is that we submit the form, with a field called url, to create.php.

# css/style.css
----
/* reset */
* {
  font-family: Helvetica, sans-serif;
  margin: 0;
  padding: 0;
}

/* site */
html, body { background-color: #008AB8; }
a { color: darkblue; text-decoration: none;}

  #pagewrap {
    margin: 0 auto;
    width: 405px;
  }
  
    h1 {
      color: white;
      margin: 0;
      text-align: center;
      font-size: 100px;
    }
      h1 .r { color: darkblue; }
    
    .body {
      -moz-border-radius: 10px;
      -webkit-border-radius: 10px;
      background-color: white;
      text-align: center;
      padding: 50px;
      height: 80px;
      position: relative;
    }
    
      .body .instructions {
        display: block;
        margin-bottom: 10px;
      }
      .body .back {
        right: 15px;
        top: 10px;
        position: absolute;
      }
      
      .body input[type=text] {
        display: block;
        font-size: 20px;
        margin-bottom: 5px;
        text-align: center;
        padding: 5px;
        height: 20px;
        width: 300px;
      }

That is all very generic, but makes our application a little more presentable.

The last basic file we need to look at is our db_config.php. I created this to abstract some of the database connection information.

# db_config.php
----
<?php

  $database = "DATABASE_NAME";
  $username = "USERNAME";
  $password = "PASSWORD";
  $host     = "localhost";

?>

You need to replace the values with what works for your database. The host is probably localhost, but double check with your hosting provider to make sure. Here is the SQL dump of the url_redirects table, which holds all the information we showed above:

--
-- Table structure for table `url_redirects`
--

CREATE TABLE IF NOT EXISTS `url_redirects` (
  `id` int(11) NOT NULL auto_increment,
  `short` varchar(10) NOT NULL,
  `url` varchar(255) NOT NULL,
  `created_at` timestamp NOT NULL default CURRENT_TIMESTAMP,
  PRIMARY KEY  (`id`),
  KEY `short` (`short`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8;

Creating the Short URL

Next let’s look at the code necessary to create our short URL.

# create.php
----
<?php
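  require("./db_config.php");
  
  # only accept the url field posted by the form on index.php
  $url = isset($_POST['url']) ? trim($_POST['url']) : "";
  
  if(!preg_match("/^[a-zA-Z]+[:\/\/]+[A-Za-z0-9\-_]+\\.+[A-Za-z0-9\.\/%&=\?\-_]+$/i", $url)) {
    $html = "Error: invalid URL";
  } else {
    
    # the old mysql_* functions were removed in PHP 7, so we use mysqli;
    # with reporting off, failures return false instead of throwing exceptions
    mysqli_report(MYSQLI_REPORT_OFF);
    $db = mysqli_connect($host, $username, $password, $database);
    
      $short = substr(md5(time().$url), 0, 5);
    
      # a prepared statement keeps user input out of the SQL string
      $stmt = $db ? mysqli_prepare($db, "INSERT INTO `url_redirects` (`short`, `url`) VALUES (?, ?)") : false;
    
      if($stmt && mysqli_stmt_bind_param($stmt, "ss", $short, $url) && mysqli_stmt_execute($stmt)) {
        $html = "Your short URL is<br />shrtr.me/".$short;
      } else {
        $html = "Error: cannot find database";
      }
    
    if($db) { mysqli_close($db); }
  }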
?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Makes URLs Shrtr</title>
    <link type="text/css" rel="stylesheet" href="./css/style.css" />
	</head>
	<body>
	 <div id="pagewrap">
  	 <h1>shrt<span class="r">r</span>.me</h1>
  	 
  	 <div class="body">
  	   <?= $html ?>
  	   <br /><br />
  	   <span class="back"><a href="./">X</a></span>
  	 </div>
	   
	 </div>
	</body>
</html>

Now we are getting a bit more complex! First we include the database connection variables we created earlier, then we store the URL parameter sent to us by the create form in a variable called $url. Next we do some regular expression magic to check that they actually sent a URL; if not, we store an error. If the user entered a valid URL, we create a connection to the database using the connection variables we included at the top of the page. Next we generate a 5-character code to save to the database, using the substr function to take the first five characters of the md5 hash of the current time() and $url concatenated together. The code looks random, but keep in mind that two different URLs can end up with the same five characters. Then we insert that value into the url_redirects table along with the actual URL, and store a string to present to the user. If it fails to insert the data, we store an error. If you move down into the HTML part of the page, all we do is print out the value of $html, be it error or success. This obviously isn’t the most elegant solution, but it works!
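Since the first five characters of a hash can repeat, one way to deal with it is to check whether a code is already taken and lengthen it until it is free. The sketch below is not part of the shrtr code; make_short_code and the $is_taken callback are names I made up, and in the real application the callback would query the url_redirects table:

```php
<?php
// Hypothetical helper: picks a short code and lengthens it while it is taken.
// $is_taken is any callable that reports whether a code already exists.
function make_short_code(string $url, callable $is_taken, int $min_length = 5): string
{
    // uniqid() adds microsecond-based variation on top of the URL itself.
    $hash = md5(uniqid('', true) . $url);

    // Try 5 characters first, then 6, 7, ... up to the full 32-character hash.
    for ($length = $min_length; $length <= strlen($hash); $length++) {
        $candidate = substr($hash, 0, $length);
        if (!$is_taken($candidate)) {
            return $candidate;
        }
    }
    throw new RuntimeException('Could not find a free short code');
}

// Usage with an in-memory list standing in for the database.
$existing = array();
$code = make_short_code('http://example.com/', function ($c) use (&$existing) {
    return in_array($c, $existing, true);
});
echo strlen($code), "\n"; // 5, since nothing is taken yet
```

Remember that the short column in our table is a varchar(10), so codes longer than ten characters would need a wider column.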

Serving the Short URL

So we have the URL in the database. Now let’s work on serve.php so we can actually translate the short code into a redirect.

# serve.php
----

<?php
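  require("./db_config.php");

  $short = isset($_REQUEST['short']) ? $_REQUEST['short'] : "";
  $html  = "";
  $url   = null;

  # the old mysql_* functions were removed in PHP 7, so we use mysqli;
  # with reporting off, failures return false instead of throwing exceptions
  mysqli_report(MYSQLI_REPORT_OFF);
  $db = mysqli_connect($host, $username, $password, $database);

  if($db) {
    # a prepared statement keeps the short code out of the SQL string
    $stmt = mysqli_prepare($db, "SELECT `url` FROM `url_redirects` WHERE `short` = ? LIMIT 1");

    if($stmt && mysqli_stmt_bind_param($stmt, "s", $short) && mysqli_stmt_execute($stmt)) {
      mysqli_stmt_bind_result($stmt, $url);
      mysqli_stmt_fetch($stmt);
    }

    mysqli_close($db);
  }

  if(!empty($url)) {
    # send a permanent redirect and stop so the HTML below is never printed
    header("Location: ".$url, true, 301);
    exit;
  } else {
    $html = "Error: cannot find short URL";
  }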
?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Makes URLs Shrtr</title>
    <link type="text/css" rel="stylesheet" href="./css/style.css" />
	</head>
	<body>
	 <div id="pagewrap">
  	 <h1>shrt<span class="r">r</span>.me</h1>

  	 <div class="body">
  	   <?= $html ?>
  	   <br /><br />
  	   <span class="back"><a href="./">X</a></span>
  	 </div>

	 </div>
	</body>
</html>

This one is very similar to create.php: we include the database information and store the short code sent to us in a variable called $short. Next we query the database for the URL of that short code. If we get a result we redirect to the URL; if not, we print out an error like before.

As far as PHP goes, that is all we need to do, but at the moment users must share a short URL like http://shrtr.me/serve.php?short=SHORT_CODE, which is not very pretty, is it? Let’s see if we can’t incorporate some mod_rewrite code to make this nicer.

Pretty-ify With HTACCESS

Of the two methods I wrote about at the beginning of the tutorial we will use the Apache one because this application is already created without considering any URL parsing. The code will look something like this:

  Options +FollowSymLinks
  RewriteEngine On

  RewriteCond %{SCRIPT_FILENAME} !-d
  RewriteCond %{SCRIPT_FILENAME} !-f
  RewriteRule ^(\w+)$ ./serve.php?short=$1

Skipping to the RewriteRule, we are directing any traffic that doesn’t already have a real file or directory to serve.php and putting the path (the short code) in the GET variable short. Not too bad! Now go try it out for yourself!

Conclusion

Today we learned a few different ways to utilize mod_rewrite in our application to make our URLs pretty. As always I will be watching over the comments if anybody has trouble, or you can contact me on twitter. Thanks for reading!


  • http://UnleashedEffects.com Ben Williams

    im trying to get this to work on one of my sites but it comes up as a 404 error every time or a blank page.
    I have my example site here.
    http://share-my.info/
    I just copied the code and its not working.
    What should i do.

  • Kasper

    I certainly like this a lot! And I also like the php part alot, but regarding the php.

    Lets say I have some file sructures as
    pages/mainpages and so on.php
    pages/admin/admin pages.php
    pages/error/404 403 405 and so on.php
    pages/party/assign status and so on.php

    How would I with the smartest way pick what file in what folder?
    I suppose making an array with dirs and then specifying them with the url somehow?

    $request = $_SERVER['REQUEST_URI'];
    $params = split("/", $request);
    $safe_pages = array("contact", "home", "prices", "portfolio");
    $dirs = array("errors", "admin");

    if(in_array($params[1], $safe_pages)) {
    include($params[0]."/".$params[1].".php"); // Something like that?
    } else {
    include("404.php");
    }

    Or is there a more smart way to structure all this so also stuff as news/category/newstitle also will be smart implemented??

    Best regards
    Kasper

  • http://www.website-design-leicestershire.co.uk Jake

    Great introduction to htaccess for beginners, creating pretty URLs is very important, especially for SEO, so this article has been very helpful

  • dhie

    bookmarked.
    nice tutorial. thank :-)

  • Michael

    Finally!!

  • Clem

    thanks Noah !

    I’m trying to implement the .htaccess + PHP solution for my website.
    Everything works fine so far except for the fact that when there more than one parameter, the pages don’t seem to load the javascript and css files included in the tag.

    http://www.mysite.com/something —> this works fine
    http://www.mysite.com/something/something_more —> this doesn’t work

    any idea what might be wrong ? (So far I’ve run everything locally using MAMP)

    thank you,
    Clem

  • http://www.dev-hq.co.uk Joe

    Nice Tutorial. Wish i could use it myself, but unfortunately i chose ASP instead of PHP (damn me) XD

    Keep up the good work!

  • http://www.iczalazar.net deviantz

    cool man! thanks for sharing this! keep it up!

  • http://tanelpuhu.com tanel

    good code but…. if you are using substr of md5, you might eventually experience collision. 32 bytes is pretty sure identical but 5 first chars might match.

    i used similar code and this goes like this:

    if(!isset($_GET['id'])){
    //hash
    $hash = md5(time().microtime()+$_SERVER['HTTP_USER_AGENT']);
    //start
    $i = 4;
    while($iq($q);
    $fetch = mysql_fetch_array($res);
    if($fetch[0] == 0){
    $hash = $k2;
    break;
    }
    }
    with this code you’ll get 5 chars “id’s” but if current code exists it will take 6th char also and so on… this few querys to mysql aint very load-”ishh” to database also, i think :)
    keep up the good work ;)

  • http://tanelpuhu.com tanel

    sorry, pasteing went wrongor just my silly touch-pad again :(

    $id = md5(time().microtime()+$_SERVER['HTTP_USER_AGENT']);
    $i = 4;
    while($iq($q);
    $fetch = mysql_fetch_array($res);
    if($fetch[0] == 0){
    $id = $k2;
    break;
    }
    }

  • http://www.rat32.com rat32

    wow great

  • http://www.maine-exista.ro natura

    Hello. I have a problem with a site and i was wondering if you could help me.
    I want to rewrite www[dot]grindul-lupilor[dot]ro/index.php?limba=engleza (I also have four more languages)
    into this: www[dot]grindul-lupilor[dot]ro

    how can i do it?

    thank you in advance

  • http://www.maine-exista.ro natura

    Any idea ? thanks again

  • http://www.wmskins.com Nabeel

    I have setup redirection for nitrogen.wmskins.com to http://www.wmskins.com/index.php?file=minicms/cms&id=1, its working fine. but the url in the address bar also changes, I would like that url i.e. nitrogen.wmskins.com remains same in the address bar. How to achieve that?

  • mak

    hi frnd great tutorial about short url, but i have one doubt is it possible of adding multiple urls at a time and getting the result for each url link respectively. Thank you .

  • http://www.doitbigtickets.com rook

    http://doitbigtickets.com/ResultsGeneral.aspx?kwds=New+York+Yankees

    Above is my site URL and i tried the above .htaccess recommendation but still it does’nt work. Can somebody please help?

  • Pradeep Vanparia

    Very nice..

  • http://arbitrageinvestment.net PRASHANT SHEKHER

    when i am adding Your htaccess file why my index.php file not run.WHY????

    PLZ PLZ PLZ PLZ HELP ME

  • http://www.techbray.com Jaan

    cool article :)

  • http://www.pondicherrytimes.com pondicherry times

    nice article
    post more like this article, its very interesting

  • http://www.ioss.in Abdul Majeed.P

    I searching for replicated site development …..This article make it easy ….Thanks…

  • http://www.zomgxuan.co.cc ZOMGxuan

    I believe that “RewriteRule ^users/(\d+)*$ ./profile.php?id=$1″ is slightly wrong, and instead should be “RewriteRule ^users/(\d+).$ ./profile.php?id=$1″. That is, the * (asterisk) should be changed to a . (period), since * matches 0 or more of the preceding character, and hence will not be able to match an ending forward slash, whereas . matches any character at all. Alternatively, it could be “RewriteRule ^users/(\d+)/?$ ./profile.php?id=$1″, so that a forward slash specifically can be matched optionally.

  • Shavi Levi

    This is quiet an interesting Tuturial, I was looking to use the .htaccess.

  • Kyle

    Hey awesome application you made here. I was reading it and it doesn’t seem like you are able to add custom URLs (http://example.com/hellosir) How would you create that?

    thanks

  • qhost

    can show me your demo version of this tutorial? I try many time in localhost, but same can run and always show error invalid url.

  • Pete

    The redirect with PHP is working fine when there’s one parameter, but when there’s more than one parameter, my page tries to load all my stylesheets and JS files from root/page/, instead of root/. Does anybody know a solution for this?

    • http://puplookup.com Brandon

      Don’t use relative links.

  • http://www.pedroriveros.com Pedro

    I don’t pretend anyone here to think I’m lazy but I’ve been trying this for weeks and still haven’t achieved it. (my helpless attempts have made me even investigate elsewhere, but I know here’s the proper place)
    Can somebody help me attaching all the files?

  • http://www.rugtraders.co.uk Rug Trader

    Very informative and well written article. Bookmarked for future reference :)

  • http://www.rahjv4.tk cute programmer

    well explained! thanks for this post, it really helps me alot as a programmer.

  • Amer

    Thank you very much, awesome tutorial :)

    I have one question for you.
    Let’s say we have more get parameters in url. How then nice url should be formed?
    For example, user wants houses in London which have a garage. So url should be (badly formed):

    http://www.mysite.com?town=London&garage=yes

    What’s the best way for nice forming url. Is this

    http://www.mysite.com/town/London/garage/yes

    So first, third, fifth etc capture group is name of parameter and second, fourth, sixth etc is parameter value. But then url

    http://www.mysite.com?id=100 should be

    http://www.mysite.com/id/100

    Of course, I would do this in index.php, using php (second solution in your tutorial).

    Is this good solution or there is a better way?

    Thans in advance,

    Amer

  • Moose

    Hey Noah Hendrix, your URL tutorial helped me a lot since I am learning PHP, I would like to know how I can add custom links to the shortener. So that if the user enters something into the custom url textbox the site would use it if not it would use the default random text/number combination. Any idea on how to do this?

  • Marcel

    “The next two lines are very, very important it restricts rewriting URLs only to paths that do not actually exists.”

    What?

    Please hire a proofreader.

    • http://powerbreeder.com Brandon

      He is saying that if there actually is a page or file at the given path, the url will not be rewritten… or did you understand what he was saying but just not like the way it was written?

  • Dulare Yadav

    Good Post, I bookmarked it……

  • yli

    any idea how to redirect mydomain.com/anyword/default.html to mydomain.com/key-anyword/listings.html ? thanks

  • Wafel

    It works like a charm, it is a very lightweight and neath application!

  • http://furnigate.com vicos2

    working great thanks, but the array wasn’t work for me so i use

    $uri=$_SERVER['REQUEST_URI'];
    list($var0, $var1, $var2, $var3) = split('[/]', $uri);

    than echo it to :

    echo "
    var1 $var1
    var2 $var2
    var3 $var3 ";

    $var0 < is the localhost or might be yoursite.com, still keep trying on it

    in the other hand when i have folder named "products" than i try to open "localhost/mydir/products" the result is my "products" directory. is there any solutions of this?

    • blue

      create robots.txt

  • http://www.centerkaos.com/ Konveksi Kaos

    very nice tutorial.. but i’m still get error 404 … :(