Tutorial Details
- Topics: HTTP, REST
- Level: Intermediate
- Estimated Completion Time : 15 mins
The HyperText Transfer Protocol (HTTP) is the lifeblood of the Web. It’s used every time you transfer a document or make an AJAX request. Yet HTTP is a relative unknown among web developers. This introduction will demonstrate how the set of design principles known as REST underpins HTTP, and how it allows you to embrace HTTP’s fullest power by building interfaces which can be used from almost any device or operating system.
Why REST
REST is a simple way to organize interactions between independent systems. REST has been growing in popularity since 2005, and inspires the design of services such as the Twitter API. This is because REST allows you to interact with minimal overhead with clients as diverse as mobile phones and other websites. In theory, REST is not tied to the web, but REST is almost always implemented as such, and was inspired by HTTP. As a result, REST can be used wherever HTTP can.
The alternative is building relatively complex conventions on top of HTTP. Often, this takes the shape of entire new XML-based languages. The most illustrious example is SOAP. You have to learn a completely new set of conventions, but you never use HTTP to its fullest power. Because REST has been inspired by HTTP and plays to its strengths, it is the best way to learn how HTTP works.
After an initial overview, we will examine each of the HTTP building blocks: URLs, HTTP verbs and response codes. We’ll also review how to use them in a RESTful way. Along the way, we’ll illustrate the theory with an example application, which simulates the process of keeping track of data related to a company’s clients – through a web interface.
HTTP
HTTP is the protocol that allows for sending documents back and forth on the Web. A protocol is a set of rules that decide what messages can be exchanged and what messages are appropriate replies to others. Another common protocol is POP3, which you may use to fetch email to your hard disk.
In HTTP, there are two different roles: server and client. In general, the client always initiates the conversation; the server replies. HTTP is text based; that is, messages are essentially bits of text, although the message body can also contain other media. Text usage makes it easy to monitor an HTTP exchange.
HTTP messages are made of a header and a body. The body can often remain empty; it contains data you want to transmit over the network in order to use it according to the instructions in the header. The header contains metadata, such as encoding information; but, in the case of a request, it also contains the important HTTP methods. In the REST style, you will find that header data is often more significant than the body.
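To make the header/body split concrete, here is a minimal PHP sketch that takes a raw HTTP message apart. The function name split_http_message() and the sample request are assumptions made up for illustration; real servers and libraries do this parsing for you.

```php
<?php
// Split a raw HTTP message into its start line, header fields, and body.
// The function name and the sample message are illustrative only.
function split_http_message(string $raw): array
{
    // The header section ends at the first blank line (CRLF CRLF).
    $parts = explode("\r\n\r\n", $raw, 2);
    $headerLines = explode("\r\n", $parts[0]);
    $body = $parts[1] ?? '';

    // The first line is the request line (or the status line of a response).
    $startLine = array_shift($headerLines);

    $headers = [];
    foreach ($headerLines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip anything that is not a "Name: value" field
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[trim($name)] = trim($value);
    }

    return ['start_line' => $startLine, 'headers' => $headers, 'body' => $body];
}

$raw = "PUT /clients/robin HTTP/1.1\r\n"
     . "Host: example.com\r\n"
     . "Content-Type: application/json\r\n"
     . "\r\n"
     . '{"address":"Sunset Boulevard"}';

$message = split_http_message($raw);
echo $message['start_line'], "\n";
echo $message['headers']['Host'], "\n";
echo $message['body'], "\n";
```

Note how the method and the resource path live in the first header line, while the body only carries the data.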
Spying HTTP at Work
If you use Firefox with the Firebug extension installed, click on the Net panel, and set it to enabled. You will then have the ability to view the details of the HTTP requests as you surf.
Another helpful way of familiarizing yourself with HTTP is to use a dedicated client, such as cURL.
cURL is a command line tool that is available on all major operating systems.
Once you have cURL installed, type:
curl -v google.com
This will display the complete HTTP conversation. Requests are preceded by >, while responses are preceded by <.
URLs
URLs are how you identify the things that you want to operate on. We say that each URL identifies a resource. These are exactly the same URLs which are assigned to web pages. In fact, a web page is a type of resource. Let’s take a more exotic example, and consider our sample application, which manages the list of a company’s clients:
/clients
will identify all clients, while
/clients/jim
will identify the client, named ‘Jim’, assuming that he is the only one with that name.
In these examples, we do not generally include the hostname in the URL, as it is irrelevant from the standpoint of how the interface is organized. Nevertheless, the hostname is important to ensure that the resource identifier is unique all over the web. We often say you send the request for a resource to a host. The host is included in the header separately from the resource path, which comes right on top of the request header:
GET /clients/jim HTTP/1.1
Host: example.com
Resources are best thought of as nouns. For example, the following is not RESTful:
/clients/add
This is because it uses a URL to describe an action. This is a fairly fundamental point in distinguishing RESTful from non-RESTful systems.
Finally, URLs should be as precise as needed; everything needed to uniquely identify a resource should be in the URL. You should not need to include data identifying the resource in the request. This way, URLs act as a complete map of all the data your application handles.
But how do you specify an action? For example, how do you tell that you want a new client record created instead of retrieved? That is where HTTP verbs come into play.
HTTP Verbs
Each request specifies a certain HTTP verb, or method, in the request header. This is the first all caps word in the request header. For instance,
GET / HTTP/1.1
means the GET method is being used, while
DELETE /clients/anne HTTP/1.1
means the DELETE method is being used.
HTTP verbs tell the server what to do with the data identified by the URL. The request can optionally contain additional information in its body, which might be required to perform the operation – for instance, data you want to store with the resource. You can supply this data in cURL with the -d option.
If you have ever created HTML forms, you’ll be familiar with two of the most important HTTP verbs: GET and POST. But there are far more HTTP verbs available. The most important ones for building a RESTful API are GET, POST, PUT and DELETE. Other methods are available, such as HEAD and OPTIONS, but they are used less often (if you want to know about all other HTTP methods, the official source is the IETF).
GET
GET is the simplest type of HTTP request method; the one browsers use all the time when you click a link or type a URL in the address bar. It instructs the server to transmit the data identified by the URL to the client. Data should never be modified on the server side as a result of a GET request. In this sense, a GET request is read-only, but of course, once the client receives the data, it is free to do any operation with it on its own side – for instance, format it for display.
PUT
A PUT request is used when you want to create or update the resource identified by the URL. For example,
PUT /clients/robin
might create a client called Robin on the server. You will notice that REST is completely backend agnostic; there is nothing in the request that informs the server how the data should be created – just that it should. This allows you to easily swap the backend technology if the need should arise. PUT requests contain the data to use in updating or creating the resource in the body. In cURL, you can add data to the request with the -d switch.
curl -v -X PUT -d "some text" http://localhost:80/clients/robin
DELETE
DELETE should perform the opposite of PUT; it should be used when you want to delete the resource identified by the URL of the request.
curl -v -X DELETE http://localhost:80/clients/anne
This will delete all data associated with the resource identified by /clients/anne.
POST
POST is used when the processing you wish to happen on the server should be repeated each time the POST request is repeated (that is, POST requests are not idempotent; more on that below). In addition, POST requests should cause processing of the request body as a subordinate of the URL you are posting to.
In plain words:
POST /clients/
should not cause the resource at /clients/, itself, to be modified, but a resource whose URL starts with /clients/. For instance, it could append a new client to the list, with an id generated by the server.
/clients/some-unique-id
PUT requests can easily be used in place of POST requests, and vice versa. Some systems use only one; some use POST for create operations and PUT for update operations (since with a PUT request you always supply the complete URL); some even use POST for updates and PUT for creates.
Sometimes POST requests are used to trigger operations on the server that do not fit into the Create/Update/Delete paradigm; but this, however, is beyond the scope of REST. In our example, we stick with PUT all the way.
Classifying HTTP Methods
- Safe and unsafe methods:
  - Safe methods are those that never modify resources. The only safe method, of the four listed above, is GET. The others are unsafe, because they may result in a modification of the resources.
- Idempotent methods:
  - These methods achieve the same result, no matter how many times the request is repeated: they are GET, PUT, and DELETE. The only non-idempotent method is POST. That PUT and DELETE are considered idempotent might be surprising, but it is quite easy to explain: repeating a PUT request with exactly the same body should modify a resource in a way that it remains identical to the one described in the previous PUT request: nothing will change! Similarly, it makes no sense to delete a resource twice. It follows that no matter how many times a PUT or DELETE request is repeated, the result should be the same as if it had been done only once.
Remember: it’s you, the programmer, who ultimately decides what happens when a certain HTTP method is used. There is nothing inherent to HTTP implementations that will automatically cause resources to be created, listed, deleted, or updated. You must be careful to apply the HTTP protocol correctly and enforce these semantics yourself.
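The difference between idempotent and non-idempotent methods can be sketched with a toy in-memory store. The ClientStore class and its method names are hypothetical and exist only for this illustration; they are not part of the example application.

```php
<?php
// A toy in-memory store; the class and method names are illustrative only.
class ClientStore
{
    public $clients = [];
    private $nextId = 1;

    // PUT: the client chooses the URL, so repeating the call changes nothing.
    public function put(string $name, array $data): void
    {
        $this->clients[$name] = $data;
    }

    // DELETE: removing an absent client leaves the store unchanged.
    public function delete(string $name): void
    {
        unset($this->clients[$name]);
    }

    // POST: the server picks a new id each time, so repeats add new entries.
    public function post(array $data): string
    {
        $id = 'client-' . $this->nextId++;
        $this->clients[$id] = $data;
        return $id;
    }
}

$store = new ClientStore();

$store->put('robin', ['address' => 'Sunset Boulevard']);
$store->put('robin', ['address' => 'Sunset Boulevard']);
$afterPut = count($store->clients);    // the repeated PUT added nothing

$store->post(['address' => 'Sunset Boulevard']);
$store->post(['address' => 'Sunset Boulevard']);
$afterPost = count($store->clients);   // each POST added a new entry

$store->delete('robin');
$store->delete('robin');
$afterDelete = count($store->clients); // the second DELETE changed nothing

echo "$afterPut $afterPost $afterDelete\n";
```

Nothing in PHP or HTTP enforces this behaviour; the store behaves this way only because the methods were written to respect the semantics.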
Representations
We can sum up what we have seen so far in the following way: the HTTP client and the HTTP server exchange information about resources identified by URLs.
We say that the request and response contain a representation of the resource. By representation, we mean information, in a certain format, about the state of the resource or how that state should be in the future. Both the header and the body are part of the representation.
The HTTP headers, which contain metadata, are tightly defined by the HTTP spec; they can only contain plain text, and must be formatted in a certain manner.
The body can contain data in any format, and this is where the power of HTTP truly shines. You know that you can send plain text, pictures, HTML, and XML in any human language. Through request metadata or different URLs, you can choose between different representations for the same resource. For example, you might send a webpage to browsers and JSON to applications.
The HTTP response should specify the content type of the body. This is done in the header, in the Content-Type field; for instance:
Content-Type: application/json
For simplicity, our example application only sends JSON back and forth, but the application should be designed in such a way that you can easily change the format of the data, to tailor it for different clients or user preferences.
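As a rough sketch of keeping the format swappable, the helper below builds either a JSON or an HTML representation of the same client and returns the matching Content-Type header line alongside the body. The function name represent() and the 'html' / 'json' format keys are assumptions for illustration only.

```php
<?php
// Hypothetical helper: build a representation of a client in a given format.
// Returns the Content-Type header line and the body as a two-element array.
function represent(array $client, string $format): array
{
    if ($format === 'html') {
        // Escape the value so it is safe to embed in HTML.
        $body = '<p>' . htmlspecialchars($client['address']) . '</p>';
        return ['Content-Type: text/html; charset=utf-8', $body];
    }

    // Default to JSON, as the example application does.
    return ['Content-Type: application/json', json_encode($client)];
}

list($header, $body) = represent(['address' => 'Sunset Boulevard'], 'json');

// In a live script you would call header($header) before echoing the body.
echo $header, "\n", $body, "\n";
```

The resource stays the same; only the representation and the Content-Type that announces it change.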
HTTP Client Libraries
To experiment with the different request methods, you need a client which allows you to decide which method to use. Unfortunately, HTML forms do not fit the bill, as they only allow you to make GET and POST requests. In real life, APIs are accessed programmatically, through a separate client application or through JavaScript in the browser.
This is the reason why, in addition to the server, it is essential to have good HTTP client capabilities available in your programming language of choice.
A very popular choice of HTTP client library is, again, cURL. You have already been familiarized with the cURL command line over the course of this tutorial. cURL includes both a standalone command line program, and a library that can be used from many programming languages. In particular, cURL is very often the HTTP client solution of choice with PHP. Other languages, such as Python, offer more native HTTP client libraries.
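For reference, a PUT request sent through PHP's cURL extension might look like the sketch below. It assumes the cURL extension is installed; the function names put_options() and put_json() are made up for this example, and the URL in the commented-out call assumes the example application is running locally.

```php
<?php
// Build the cURL options for a JSON PUT request (function names are illustrative).
function put_options(array $data): array
{
    return [
        CURLOPT_CUSTOMREQUEST  => 'PUT',                       // like curl -X PUT
        CURLOPT_POSTFIELDS     => json_encode($data),          // like curl -d
        CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
        CURLOPT_RETURNTRANSFER => true,                        // return the body instead of printing it
    ];
}

// Send the request and return the status code together with the response body.
function put_json(string $url, array $data): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, put_options($data));
    $body = curl_exec($ch);                          // false on connection failure
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE); // 0 if no response was received
    return [$status, $body];
}

// Example call, assuming the example application is running locally:
// list($status, $body) = put_json('http://localhost:80/clients/paul', ['address' => 'Sunset Boulevard']);
```

The options mirror the command line switches used earlier: CURLOPT_CUSTOMREQUEST plays the role of -X, and CURLOPT_POSTFIELDS the role of -d.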
Setting up the Example Application
Our example PHP application is extremely barebones. I wanted to expose the low level functionality as much as possible, without any framework magic. I also did not want to use a real API, such as Twitter’s, because such APIs are subject to change unexpectedly, you need to set up authentication, which can be a hassle, and, obviously, you cannot study the implementation.
To run the example application, you will need to install PHP5 and a web server, with some mechanism to run PHP. The installed version must be at least 5.2 to have access to the json_encode() and json_decode() functions.
As for servers, the most common choice is still Apache with mod_php, but you can use anything that you are comfortable with. There is a sample Apache configuration which contains rewrite rules to help you set up the application quickly.
All requests to any URL starting with /clients/ must be routed to our server.php file.
In Apache, you need to enable mod_rewrite and put the supplied mod_rewrite configuration somewhere in your Apache configuration, or your .htaccess file. This way, server.php will answer all requests coming to the server for those URLs. The same must be achieved with Nginx, or whichever alternative server you decide to use.
How the Example Application Works
There are two keys to processing requests the REST way. The first key is to initiate different processing, depending on the HTTP method, even when the URLs are the same. In PHP, there is a variable in the $_SERVER global array, which tells you which method has been used to make the request:
$_SERVER['REQUEST_METHOD']
This variable contains the method name as a string, for instance ‘GET’, ‘PUT’, and so on.
The other key is to know which URL has been requested. To do this, we use another standard PHP variable:
$_SERVER['REQUEST_URI']
This variable contains the URL starting from the first forward slash. For instance, if the host name is ‘example.com’, ‘http://example.com/’ would return ‘/’, while ‘http://example.com/test/’ would return ‘/test/’.
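The routing code in this section works on a $paths array and a $method string without showing where they come from. One plausible way to derive them, offered here as an assumption rather than the application's actual code, is to split the request path into segments. The helper name path_segments() is hypothetical.

```php
<?php
// Hypothetical helper: turn a request URI into a list of path segments.
function path_segments(string $requestUri): array
{
    // Drop any query string, keeping only the path component.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return [];
    }

    // Trim slashes first, so explode() does not yield empty first/last entries.
    $trimmed = trim($path, '/');
    return $trimmed === '' ? [] : explode('/', $trimmed);
}

// In a live request these would come from $_SERVER['REQUEST_URI']
// and $_SERVER['REQUEST_METHOD']; fixed values stand in for them here.
$paths  = path_segments('/clients/jim?format=json');
$method = 'GET';

echo implode(', ', $paths), "\n";
```

With $paths built this way, each array_shift() call peels off the next segment: first the resource name, then the client's name.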
Let’s first try to determine which URL has been called. We only consider URLs starting with ‘clients’. All others are invalid.
$resource = array_shift($paths);
if ($resource == 'clients') {
    $name = array_shift($paths);
    if (empty($name)) {
        $this->handle_base($method);
    } else {
        $this->handle_name($method, $name);
    }
} else {
    // We only handle resources under 'clients'
    header('HTTP/1.1 404 Not Found');
}
We have two possible outcomes:
- the resource is the clients, in which case, we return a complete listing
- there is a further identifier
If there is a further identifier, we assume it is the client’s name, and, again, forward it to a different function, depending on the method. We use a switch statement, which should be avoided in a real application:
switch($method) {
    case 'PUT':
        $this->create_contact($name);
        break;
    case 'DELETE':
        $this->delete_contact($name);
        break;
    case 'GET':
        $this->display_contact($name);
        break;
    default:
        header('HTTP/1.1 405 Method Not Allowed');
        header('Allow: GET, PUT, DELETE');
        break;
}
Response Codes
You might have noticed that the example application uses the PHP header() function, passing some strange looking strings as arguments. The header() function prints the HTTP headers and ensures that they are formatted appropriately. Headers should be the first thing in the response, so you shouldn’t output anything else before you are done with the headers. Sometimes, your HTTP server may be configured to add other headers, in addition to those you specify in your code.
Headers contain all sorts of meta information; for example, the text encoding used in the message body or the MIME type of the body’s content. In this case, we are explicitly specifying the HTTP response codes. HTTP response codes standardize a way of informing the client about the result of its request. By default, PHP returns a 200 response code, which means that the response is successful.
The server should return the most appropriate HTTP response code; this way, the client can attempt to repair its errors, assuming there are any. Most people are familiar with the common 404 Not Found response code; however, there are a lot more available to fit a wide variety of situations.
Keep in mind that the meaning of an HTTP response code is not extremely precise; this is a consequence of HTTP itself being rather generic. You should attempt to use the response code which most closely matches the situation at hand. That being said, do not worry too much if you cannot find an exact fit.
Here are some HTTP response codes, which are often used with REST:
200 OK
This response code indicates that the request was successful.
201 Created
This indicates the request was successful and a resource was created. It is used to confirm success of a PUT or POST request.
400 Bad Request
The request was malformed. This happens especially with POST and PUT requests, when the data does not pass validation, or is in the wrong format.
404 Not Found
This response indicates that the required resource could not be found. This is generally returned to all requests which point to a URL with no corresponding resource.
401 Unauthorized
This error indicates that you need to perform authentication before accessing the resource.
405 Method Not Allowed
The HTTP method used is not supported for this resource.
409 Conflict
This indicates a conflict. For instance, you are using a PUT request to create the same resource twice.
500 Internal Server Error
When all else fails; generally, a 500 response is used when processing fails due to unanticipated circumstances on the server side, which causes the server to error out.
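To show how these codes might be chosen in practice, here is a small sketch that picks a status for a PUT request meant to create a client. The function name status_for_put() and the order of the checks are assumptions for illustration; the example application may decide differently.

```php
<?php
// Hypothetical decision function for a PUT request that creates a client.
// $clients maps existing client names to their stored data.
function status_for_put(array $clients, string $name, string $rawBody): string
{
    if (isset($clients[$name])) {
        return '409 Conflict';      // the resource already exists
    }
    if (!is_array(json_decode($rawBody, true))) {
        return '400 Bad Request';   // the body is not a JSON object or array
    }
    return '201 Created';
}

// Placeholder data: only the key matters for this sketch.
$clients = ['anne' => []];

echo status_for_put($clients, 'anne', '{}'), "\n";
echo status_for_put($clients, 'paul', 'not json'), "\n";
echo status_for_put($clients, 'paul', '{"address":"Sunset Boulevard"}'), "\n";

// In a live script the result would be sent with:
// header('HTTP/1.1 ' . $status);
```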
Exercising the Example Application
Let’s begin by simply fetching information from the application. We want the details of the client, ‘jim’, so let’s send a simple GET request to the URL for this resource:
curl -v http://localhost:80/clients/jim
This will display the complete message headers. The last line in the response will be the message body; in this case, it will be JSON containing Jim’s address (remember that omitting a method name will result in a GET request; also replace localhost:80 with the server name and port you are using).
Next, we can obtain the information for all clients at once:
curl -v http://localhost:80/clients/
To create a new client, named Paul…
curl -v -X PUT http://localhost:80/clients/paul -d '{"address":"Sunset Boulevard" }'
and you will receive the list of all clients now containing Paul as a confirmation.
Finally, to delete a client…
curl -v -X DELETE http://localhost:80/clients/anne
and you will find that the returned JSON no longer contains any data about Anne.
If you try to retrieve a non-existing client, for example:
curl -v http://localhost:80/clients/jerry
you will obtain a 404 error, while, if you try to create a client which already exists:
curl -v -X PUT http://localhost:80/clients/anne
you will receive a 409 error, instead.
Conclusion
It’s important to remember that HTTP was conceived to communicate between systems which share nothing but an understanding of the protocol. In general, the fewer assumptions beyond HTTP you make, the better: this allows the widest range of programs and devices to access your API.
I used PHP in this tutorial, because it is most likely the language most familiar to Nettuts+ readers. That being said, PHP, although designed for the web, is probably not the best language to use when working in a REST way, as it handles PUT requests in a completely different fashion than GET and POST requests. One of the most popular PHP REST libraries is the one included in the Zend Framework.
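To illustrate the difference: PHP fills $_POST for POST form submissions, but the body of a PUT request has to be read from the php://input stream yourself. A minimal sketch follows, with decode_body() as a hypothetical helper name and a fixed string standing in for the live request body.

```php
<?php
// Hypothetical helper: decode a JSON request body, whatever the HTTP method.
function decode_body(string $raw)
{
    $data = json_decode($raw, true);
    // json_decode() returns null when the input is not valid JSON.
    return is_array($data) ? $data : null;
}

// In a live request the raw body comes from the php://input stream:
// $raw = file_get_contents('php://input');
$raw = '{"address":"Sunset Boulevard"}';   // stand-in for this sketch

$client = decode_body($raw);
echo $client['address'], "\n";
```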
Beyond PHP, you might consider the following:
- The various Ruby frameworks (Rails and Sinatra)
- There’s excellent REST support in Python. Plain Django and WebOb, or Werkzeug should work
- node.js has excellent support for REST
Among the applications which attempt to adhere to REST principles, the classic example is the Atom Publishing Protocol, though it’s honestly not used that much in practice. For a modern application, which is built on the philosophy of using HTTP to the fullest, refer to Apache CouchDB.
Have fun.

Fantastic summation. As a still learning web developer, I only knew what REST was supposed to accomplish, not how. This is a fantastic (as I said before) beginning. Great work!
Great tutorial!
I needed to modify the rewrite rule to make it work on my server.
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* server.php/$0 [L]
Very nice indeed!
Really helpful for beginners to catch some knowledge in a glimpse…
Great beginners guide! Even great for more advanced users to brush up on. Thanks!
Many thanks!!!
Very very nice! I think this could be useful not only for beginners ;) Thanks!
Hi!
Nice article, but you could write something about security too. For example how do you check if user got permissions to GET/PUT/DELETE etc.
Asking that is like learning how the wheel works and asking your tutor about the engine. My point is that it’s relevant, but completely outside of the article’s scope. It is important to learn, but maybe from someone else?
However, you should read up on OAuth (http://oauth.net/) for authentication. You’ll find great documentation on the web page. If that doesn’t suit your application, you should investigate the usage of unique API keys.
Permissions are yet another story, and very dependent on your specific application. If you’re using PHP you might be interested in Zend Framework’s (http://framework.zend.com) ACL (Access Control List) component. It might be a bit much to wrap your head around, but will be rewarding in the long term.
On a general security note: You should, like all data submitted by the client, assume guilt until proven innocent. I.e.: All data should be properly validated, sanitized and escaped.
Hope this points you in the right direction!
Hm, maybe I was not clear. I know how to handle authentication on a normal webpage. But when you make a REST request with curl for example, you can’t just submit a login form before sending the request. My question was more about how to handle 3rd person authentication. If you have a publicly available REST service, how do you handle authentication? How does the service’s server know that it’s really you? Do you simply use htpasswd authentication and make a request to something like http://user:password@server/ or what is the best practice?
That depends on the service in question. If the API is supposed to act on behalf of the user, like Facebook’s Graph API (most of the methods at least), a good choice would be to implement OAuth. You should look into the differences between OAuth 1 and 2, and prepare (if not implement) for the future.
Basically, 3rd party developers register their application in return for a unique key. This key is used to make the initial call to the API, and trigger the approval process for the user. In return you’ll get an access token, which subsequent calls will be based on.
There is a bit more to it, so I suggest you read the documentation thoroughly. But it’s easy to understand, and there are a lot of great tools that will simplify the implementation.
This is a response to redman above. You don’t need to use .htpasswd depending on your language. PHP is perfectly capable of outputting standard HTTP headers and thus triggering standard HTTP based authentication. I do this in inKWell (http://www.github.com/dotink/inkwell) on the inkwell site branch; it allows me to use the same code accessing the same database as I do for a standard login page, just a different method of user input.
This should remove the need for a separate htpasswd database or for re-generating such a thing, and give CURL or any HTTP compliant library/software the ability to authenticate.
Nice tutorial.
But for a beginner to understand the code, if you can define more clearly how you got the
$paths variable. I know it’s from $_SERVER['REQUEST_URI'], but again you are using array_shift, which means the $_SERVER['REQUEST_URI'] should be split into an array.
But really great article to understand all the pieces.
Thanks
Tanmay
In most (RESTful) situations, you would only have to split the uri into segments by using $path = explode('/', $_SERVER['REQUEST_URI']);
Don’t have to make it harder than that :)
Be careful with explode. It will split the URL into an array where the first value is blank because of the leading slash.
i.e. /client/jim/ becomes
[0] = ''
[1] = 'client'
[2] = 'jim'
[3] = ''
Great article! Thanks for sharing.
Regards,
George
Great article, definitely some good information I was never aware of, and I wouldn’t generally call myself a beginner. That said, I’ll probably really sound like one now: in the article, you say “We use a switch statement, which should be avoided in a real application” Why would you want to avoid a switch statement, especially in that situation? It seems like the most reasonable method for determining what the request is.
I went maybe a bit over the top there. Here there aren’t that many possible choices (they are limited to HTTP methods), but switches can get cumbersome, especially if they are nested or mixed with if/else.
When you need to choose between different functions based on some sort of key, it is often recommended to build some kind of lookup mechanism. Basically, you ‘store the functions in an array’, then you just call the function corresponding to the key you get, which is just one line, and you can maintain the functions and the keys separately from where they are called. Both things help with maintenance. See
http://www.php.net/manual/en/functions.variable-functions.php as a starting point.
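As a rough sketch of that lookup idea, using PHP's variable functions: the three handler functions below are stand-ins for the application's create_contact(), delete_contact() and display_contact() methods, and dispatch() is a hypothetical name. They return strings here instead of producing output, purely to keep the example small.

```php
<?php
// Stand-in handlers; in the article these are methods on the server object.
function create_contact($name)  { return "created $name"; }
function delete_contact($name)  { return "deleted $name"; }
function display_contact($name) { return "displayed $name"; }

// Lookup table: HTTP method => name of the function that handles it.
$handlers = [
    'PUT'    => 'create_contact',
    'DELETE' => 'delete_contact',
    'GET'    => 'display_contact',
];

function dispatch(array $handlers, $method, $name)
{
    if (!isset($handlers[$method])) {
        // Mirrors the default branch of the switch: 405 plus an Allow list.
        return '405 Method Not Allowed; Allow: ' . implode(', ', array_keys($handlers));
    }
    // Variable function call: PHP calls the function named by the string.
    return $handlers[$method]($name);
}

echo dispatch($handlers, 'GET', 'jim'), "\n";
echo dispatch($handlers, 'POST', 'jim'), "\n";
```

Adding support for another method then means adding one entry to the table, and the Allow list stays in sync automatically.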
I am lost! It’s too complicated man!
You guys are mind readers….I was just looking into REST last night. Great tutorial
Surprisingly few web developers know how HTTP requests work, although many use REST-type data transactions each day. Until just a few weeks ago, I was guilty of employing pre-made scripts to make use of cURL such as the standard Authorize.net scripts. This and other articles really make a great effort at enlightening those who may be unfamiliar, in a simple and easy-to-follow way.
Thanks!
thanks for the article as a beginner they don’t understand but great tutorial
regards neo
Thanks for this tutorial, it’s nice.
I just want to add this particularly useful link from Stack Overflow that helped me a lot to understand the difference between PUT and POST. Nice post. http://stackoverflow.com/questions/107390/whats-the-difference-between-a-post-and-a-put-http-request
Great tut. Thanks!
Perhaps I’m missing something, but the creation of an extensively RESTful API is in many ways additional work on top of creating a web interface, since HTML forms do not support the PUT/DELETE methods, or anything aside from GET or POST. That is why we tend to get one or the other with parameters like ?action=delete
It’s nice for creating a web API, but it only relates to the architecture of served web pages to a certain degree for this reason.
Excellent post!
Really liked it.
Some typos I found while reading:
“The HTTP response should specify the content type of the body. This is done in the header, in the Content-Type field; for instance:
1. Content/Type: application/json ” -> Shouldn’t be “Content-Type: [...]“?
And in “Setting up Aplication Example”:
“In Apache, you need to enable mod_rewrite and put the supplied mod_rewrite configuration somewhere in your Apache configuration, or your .htacess file” -> Missed a ‘c’ in .htaccess.
And I repeat, EXCELLENT article, big thanks for sharing!
Great tutorial! I am about to implement either SOAP or REST. You convinced me to at least dive a bit deeper into REST. Well explained and laid out.
Really helpful for beginners, also gave me some ideas.
Many Thanks!
Thanks for the tutorial.
Unfortunately, I cannot get it to work. I’m using OS X 10.6 (Snow Leopard). I have Apache running and I can bring up the info.php page. I created an .htaccess file and put these lines in it:
RewriteEngine on
RewriteRule ^/.* /server.php
I also tried Omar’s alternative above:
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* server.php/$0 [L]
I saved the .htaccess file to both the root folder (localhost) and the clients folder. I put server.php in the clients folder.
When I type this into Terminal: curl -v http://localhost:80/clients/jim
I get this:
* About to connect() to localhost port 80 (#0)
* Trying 127.0.0.1… connected
* Connected to localhost (127.0.0.1) port 80 (#0)
> GET /clients/jim HTTP/1.1
> User-Agent: curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8r zlib/1.2.3
> Host: localhost
> Accept: */*
>
< HTTP/1.1 404 Not Found
< Date: Mon, 15 Aug 2011 03:03:48 GMT
< Server: Apache/2.2.17 (Unix) mod_ssl/2.2.17 OpenSSL/0.9.8r DAV/2 PHP/5.3.4
< Content-Length: 209
< Content-Type: text/html; charset=iso-8859-1
<
404 Not Found
Not Found
The requested URL /clients/jim was not found on this server.
* Connection #0 to host localhost left intact
* Closing connection #0
Does anyone have any suggestions?
Thanks!
Hal
I did finally get it to work. Yeah!
I created an .htaccess file in my clients folder with these lines and used Omar’s rewrite code (above). Thanks Omar!
One example still doesn’t work quite right:
curl -v -X PUT http://localhost:80/clients/paul -d '{"address":"Sunset Boulevard" }
When I try this one in Terminal, it doesn’t execute — it just adds a new line with a single “>” prompt.
From there I can ^C to exit, or enter one of the other examples, which then returns a 400 error.
Did anyone get that one to work?