Getting Started with CouchDB

Tutorial Details
  • Program: CouchDB
  • Version (if applicable): 1.0.2
  • Difficulty: Easy
  • Estimated Completion Time: 45 Minutes

NoSQL has been one of the most talked-about topics over the past couple of months. This tutorial will introduce you to CouchDB, a NoSQL implementation, and teach you how to get started with the platform.


What is NoSQL?

NoSQL (“not only SQL”) is a movement towards data stores that do not use the relational model; document stores such as CouchDB are one family within it. The fundamental shift is in the way data is stored. For example, to store an invoice in an RDBMS, you would distill the information into tables and then use a server-side language to transform that data back into real-life objects. In NoSQL, on the other hand, you just store the invoice. NoSQL is schema free, which means you don’t need to design your tables and structure up front: you can simply start storing new values.

Continuing the invoice example, some invoices may include a VAT number and some may not. In an RDBMS, you’d first need to tell your table to accept a VAT number and then that the value may be null. In NoSQL, however, you can just store invoices with or without a VAT number, because there is no schema. Keep in mind that NoSQL is not a silver bullet. If your data is truly relational, sticking with your RDBMS would be the right choice.
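To make the invoice example concrete, here is a minimal sketch of two invoice documents that could sit side by side in the same database. The field names and values are illustrative assumptions, not a structure CouchDB requires.

```javascript
// Two hypothetical invoices stored side by side in the same database.
// Field names are illustrative assumptions, not a required CouchDB schema.
const invoiceWithVat = {
  type: "invoice",
  customer: "Acme Ltd",
  total: 120.0,
  vatNumber: "GB123456789"
};

const invoiceWithoutVat = {
  type: "invoice",
  customer: "Jane Smith",
  total: 45.5
};

// Code that reads invoices has to tolerate the missing field itself,
// because no schema enforces its presence.
function describeVat(invoice) {
  return invoice.vatNumber ? "VAT no. " + invoice.vatNumber : "no VAT number";
}

console.log(describeVat(invoiceWithVat));
console.log(describeVat(invoiceWithoutVat));
```

The trade-off is that the responsibility for handling a missing field moves from the database schema into your application code.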


Querying NoSQL Databases

Many NoSQL databases, CouchDB included, use map/reduce to query and index the database. In an RDBMS, you run a query that joins multiple tables together to create a pool of data, and the query then produces a resultset, a subset of the overall data. With map/reduce, you create a ‘view’, which is similar to a resultset: it is a subset of the overall data.

Map essentially extracts data, and reduce aggregates it. The more familiar you are with RDBMS, the more difficult grasping map/reduce is likely to be. Map/reduce has a benefit over SQL queries in that the task can be distributed among multiple nodes, which is much harder to do with a typical RDBMS query. Adding a new record to the database also does not always mean the map/reduce task has to be completely rerun.
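The idea can be sketched in plain JavaScript. This is a toy, in-memory imitation under made-up data, not how CouchDB implements views; the documents and the runMapReduce helper are assumptions for illustration only.

```javascript
// A toy, in-memory imitation of map/reduce. This is not CouchDB's
// implementation; the documents and function names are made up.
const docs = [
  { type: "invoice", customer: "Acme Ltd", total: 120 },
  { type: "invoice", customer: "Acme Ltd", total: 30 },
  { type: "invoice", customer: "Jane Smith", total: 45 },
  { type: "note", text: "not an invoice" }
];

// Map: look at one document at a time and emit zero or more key/value rows.
function mapInvoice(doc, emit) {
  if (doc.type === "invoice") {
    emit(doc.customer, doc.total);
  }
}

// Reduce: collapse the values emitted for one key into a single value.
function reduceTotals(values) {
  return values.reduce((sum, value) => sum + value, 0);
}

// Run the map over every document, group the rows by key, then reduce each group.
function runMapReduce(allDocs, map, reduce) {
  const grouped = {};
  for (const doc of allDocs) {
    map(doc, (key, value) => {
      if (!grouped[key]) grouped[key] = [];
      grouped[key].push(value);
    });
  }
  const result = {};
  for (const key of Object.keys(grouped)) {
    result[key] = reduce(grouped[key]);
  }
  return result;
}

const totals = runMapReduce(docs, mapInvoice, reduceTotals);
console.log(totals);
```

Because the map step only ever looks at one document at a time, the work can be split across nodes, and a newly added document only needs to be mapped once rather than forcing everything to be recomputed.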


Introducing CouchDB

A few facts about CouchDB that you should know:

  • CouchDB is a JSON document-oriented database written in Erlang.
  • It is a highly concurrent database designed to be easily replicable, horizontally, across numerous devices and be fault tolerant.
  • It is part of the NoSQL generation of databases.
  • It is an open source Apache Software Foundation project.
  • It allows applications to store JSON documents via its RESTful interface.
  • It makes use of map/reduce to index and query the database.

Major Benefits of CouchDB

  • JSON Documents – Everything stored in CouchDB boils down to a JSON document.
  • RESTful Interface – From creation to replication to data insertion, every management and data task in CouchDB can be done via HTTP.
  • N-Master Replication – You can make use of any number of ‘masters’, making for some very interesting replication topologies.
  • Built for Offline – CouchDB can replicate to devices (like Android phones) that can go offline and handle data sync for you when the device is back online.
  • Replication Filters – You can filter precisely the data you wish to replicate to different nodes.

Putting It All Together

CouchDB allows you to write a client-side application that talks directly to the Couch without the need for a server-side middle layer, significantly reducing development time. You can handle growing demand by adding more replication nodes. CouchDB also allows you to replicate the database to your client, and with filters you could even replicate only that specific user’s data.

Having the database stored locally means your client side application can run with almost no latency. CouchDB will handle the replication to the cloud for you. Your users could access their invoices on their mobile phone and make changes with no noticeable latency, all whilst being offline. When a connection is present and usable, CouchDB will automatically replicate those changes to your cloud CouchDB.

CouchDB is a database designed to run on the internet of today for today’s desktop-like applications and the connected devices through which we access the internet.


Step 1 – Installing CouchDB

The easiest way to get CouchDB up and running on your system is to head to CouchOne and download a CouchDB distribution for your OS (OS X in my case). Download the zip, extract it and drop CouchDBX into your Applications folder (instructions for other operating systems are on CouchOne).

Finally, open CouchDBX.


Step 2 – Welcome to Futon

After CouchDB has started, you should see the Futon control panel in the CouchDBX application. In case you can’t, you can access Futon via your browser. Looking at the log, CouchDBX tells us CouchDB was started at http://127.0.0.1:5984/ (may be different on your system). Open a browser and go to http://127.0.0.1:5984/_utils/ and you should see Futon.

Throughout the rest of this tutorial I will be using Futon in Firefox. I’ll also have Firebug and the console view open to see all the HTTP requests Futon is sending behind the scenes. This is useful as your application can do everything Futon is doing. Let’s go ahead and create a database called mycouchshop.
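If you watch the console while creating the database, you should see a PUT request to the database URL, which is how CouchDB’s HTTP API creates databases. The sketch below builds the same request from your own code; buildCreateDatabaseRequest is a hypothetical helper name, and the base URL is the one used in this tutorial.

```javascript
// Base URL from this tutorial; yours may use a different port.
const COUCH_URL = "http://127.0.0.1:5984";

// Build the request that creates a database. buildCreateDatabaseRequest
// is a hypothetical helper name, not part of CouchDB or jquery.couch.js.
function buildCreateDatabaseRequest(baseUrl, dbName) {
  return {
    url: baseUrl + "/" + encodeURIComponent(dbName),
    method: "PUT"
  };
}

const createDbRequest = buildCreateDatabaseRequest(COUCH_URL, "mycouchshop");
console.log(createDbRequest.method + " " + createDbRequest.url);

// To actually send it (needs a running CouchDB and a runtime with fetch):
// fetch(createDbRequest.url, { method: createDbRequest.method })
//   .then((response) => response.json())
//   .then(console.log);
```

Keeping the request as plain data makes it easy to inspect before sending, and the same shape works with any HTTP client.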

CouchDB jQuery Plugin

Futon is actually using a jQuery plugin to interact with CouchDB. You can view that plugin at http://127.0.0.1:5984/_utils/script/jquery.couch.js (bear in mind your port may be different). This gives you a great example of interacting with CouchDB.


Step 3 – Users in CouchDB

CouchDB, by default, is completely open, giving every user admin rights to the instance and all its databases. This is great for development but obviously bad for production. Let’s go ahead and set up an admin. In the bottom right, you will see “Welcome to Admin Party! Everyone is admin! Fix this”.

Go ahead and click “Fix this” and give yourself a username and password. This creates an admin account and gives anonymous users access to read and write operations on all the databases, but no configuration privileges.

More on Users

Users in CouchDB can be a little confusing to grasp initially, especially if you’re used to creating a single user for your entire application and then managing users yourself within a users table (not the MySQL users table). In CouchDB, it would be unwise to create a single super user and have that user do all the reads and writes, because if your app is client-side then this super user’s credentials will be in plain sight in your JavaScript source code.

CouchDB has user creation and authentication baked in. You can create users with the jQuery plugin using $.couch.signup(). These essentially become the users of your system. Users are just JSON documents like everything else, so you can store any additional attributes you wish, such as an email address. You can then use roles (CouchDB’s equivalent of groups) to control what each user has access to. For example, you can create a database that the user can write to and then give them a role with read access to the other databases as required.


Step 4 – Creating a Product Document

Now let’s create our first document using Futon through the following steps:

  1. Open the mycouchshop database.
  2. Click “New Document”.
  3. Click “Add Field” to begin adding data to the JSON document. Notice how an ID is pre-filled for you; I would highly advise not changing this. Add the key “name” with the value “Nettuts CouchDB Tutorial One”.
  4. Make sure you click the tick next to each attribute to save it.
  5. Click “Save Document”.

Go up a level, back to the database, and you should see one document listed with the previous ID as the key and a value beginning with {rev: . This is the JSON document you just created.


Step 5 – Updating a Document

CouchDB is an append only database — new updates are appended to the database and do not overwrite the old version. Each new update to a JSON document with a pre-existing ID will add a new revision. This is what the automatically inserted revision key signifies. Follow the steps below to see this in action:

  • Viewing the contents of the mycouchshop database, click the only record visible.
  • Add another attribute with the key “type” and the value “product”.
  • Hit “Save Document”.

After hitting save, a new revision key should be visible starting with the number 2. Going back a level to the mycouchshop database view, you will still see just one document: the latest revision of our product document.
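The revision check behind this behaviour can be sketched with a toy in-memory store. This is an illustration under simplifying assumptions, not CouchDB’s storage engine: real revision IDs look like “2-” followed by a hash, and the “-toy” suffix and createStore helper below are made up.

```javascript
// A toy in-memory model of CouchDB-style revision checks. Real revision
// IDs look like "2-<hash>"; the "-toy" suffix here is a stand-in.
function createStore() {
  const docs = {};
  return {
    save(doc) {
      const current = docs[doc._id];
      // Updating an existing document requires its latest _rev.
      if (current && current._rev !== doc._rev) {
        return { error: "conflict" };
      }
      // The number before the dash counts how many times the document was saved.
      const generation = current ? parseInt(current._rev, 10) + 1 : 1;
      const saved = Object.assign({}, doc, { _rev: generation + "-toy" });
      docs[doc._id] = saved;
      return { ok: true, id: saved._id, rev: saved._rev };
    }
  };
}

const store = createStore();
const first = store.save({ _id: "product-1", name: "Nettuts CouchDB Tutorial One" });
const second = store.save({
  _id: "product-1",
  _rev: first.rev,
  name: "Nettuts CouchDB Tutorial One",
  type: "product"
});
// Reusing the old revision is rejected, much like CouchDB's 409 Conflict.
const stale = store.save({ _id: "product-1", _rev: first.rev, price: 1.75 });

console.log(first.rev, second.rev, stale.error);
```

Futon handles the revision for you when you save, but when you update a document over HTTP yourself you need to send the current revision along with the new contents.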

Revisions

While CouchDB uses revisions internally, try not to lean on them too much. Old revisions can be cleaned out through Futon quite easily, and the mechanism is not designed to be used as a revision control system. CouchDB uses revisions as part of its replication functionality.


Step 6 – Creating a Document Using cURL

I’ve already mentioned that CouchDB uses a RESTful interface, and the eagle-eyed reader will have noticed Futon using it via the console in Firebug. In case you didn’t, let’s prove this by inserting a document using cURL via the Terminal.

First, let’s create a JSON document with the contents below and save it to the desktop as person.json.

	{
		"forename":	"Gavin",
		"surname":	"Cooper",
		"type":		"person"
	}

Next, open the terminal, change to the correct directory and perform the insert:

	cd ~/Desktop/
	curl -X POST http://127.0.0.1:5984/mycouchshop/ -d @person.json -H "Content-Type: application/json"

CouchDB should return a JSON document similar to the one below.

{"ok":true,"id":"c6e2f3d7f8d0c91ce7938e9c0800131c","rev":"1-abadd48a09c270047658dbc38dc8a892"}

This is the ID and revision number of the inserted document. CouchDB follows the RESTful convention and thus:

  • POST – creates a new record (CouchDB generates the ID)
  • GET – reads records
  • PUT – creates or updates a record at a known ID
  • DELETE – deletes a record
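The same insert can be prepared from JavaScript. The sketch below mirrors the cURL command from this step; buildInsertRequest is a hypothetical helper name, and sending the request assumes a running CouchDB plus a runtime that provides fetch.

```javascript
// buildInsertRequest is a hypothetical helper; the URL and headers mirror
// the curl command used in this step.
function buildInsertRequest(dbUrl, doc) {
  return {
    url: dbUrl,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(doc)
    }
  };
}

const person = { forename: "Gavin", surname: "Cooper", type: "person" };
const insertRequest = buildInsertRequest("http://127.0.0.1:5984/mycouchshop/", person);
console.log(insertRequest.options.method, insertRequest.url);

// With CouchDB running, the request could be sent like this:
// fetch(insertRequest.url, insertRequest.options)
//   .then((response) => response.json())
//   .then((result) => console.log(result.id, result.rev));
```

The JSON that comes back has the same ok, id and rev fields shown in the cURL response above.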

Step 7 – Viewing All Documents

We can further verify our insert by viewing all the documents in our mycouchshop database by executing curl -X GET http://127.0.0.1:5984/mycouchshop/_all_docs.
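The response is a JSON object with a rows array, one row per document. Here is a small sketch of reading it; the sample body is illustrative, with the first row reusing the ID and revision from Step 6 and the second row using invented placeholder values.

```javascript
// An illustrative response body; the first row reuses the ID and revision
// returned in Step 6, the second row's values are invented placeholders.
const sampleResponse = JSON.stringify({
  total_rows: 2,
  offset: 0,
  rows: [
    {
      id: "c6e2f3d7f8d0c91ce7938e9c0800131c",
      key: "c6e2f3d7f8d0c91ce7938e9c0800131c",
      value: { rev: "1-abadd48a09c270047658dbc38dc8a892" }
    },
    {
      id: "example-product-id",
      key: "example-product-id",
      value: { rev: "2-examplerevision" }
    }
  ]
});

// Pull out just the ID and current revision of each document.
function listDocuments(body) {
  return JSON.parse(body).rows.map((row) => ({ id: row.id, rev: row.value.rev }));
}

const listed = listDocuments(sampleResponse);
listed.forEach((entry) => console.log(entry.id + " @ " + entry.rev));
```

Note that the listing carries only IDs and revisions, not the document contents, which is part of why a view is the better tool for real queries.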


Step 8 – Creating a Simple Map Function

Viewing all documents is not very useful in practical terms. It would be more useful to view only the product documents. Follow the steps below to achieve this:

  • Within Futon, click on the view drop down and select “Temporary View”.
  • This is the map/reduce editor within Futon. Copy the code below into the map function.

    	function (doc) {
    		if (doc.type === "product" && doc.name) {
    			emit(doc.name, doc);
    		}
    	}

  • Click run and you should see the single product we added previously.
  • Go ahead and make this view permanent by saving it, using products as both the design document name and the view name (the URL below assumes these names).

After creating this simple map function, we can now request this view and see its contents over HTTP using the following command curl -X GET http://127.0.0.1:5984/mycouchshop/_design/products/_view/products.
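Saving a view stores it inside a design document, which is itself just another JSON document in the database. The sketch below shows roughly what that document contains, assuming the design document and the view were both named products as in the URL above.

```javascript
// The map function from this step, kept as a real function so it can be
// stringified; CouchDB stores view code as strings inside the design document.
const productsMap = function (doc) {
  if (doc.type === "product" && doc.name) {
    emit(doc.name, doc);
  }
};

// Assumes the view was saved with design document name "products" and
// view name "products", matching the URL used above.
const designDoc = {
  _id: "_design/products",
  language: "javascript",
  views: {
    products: { map: productsMap.toString() }
  }
};

console.log(JSON.stringify(designDoc, null, 2));
```

The path in the view URL follows directly from this structure: the design document’s ID, then _view, then the view’s name.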

A small thing to notice is how we get the document’s ID and revision by default.


Step 9 – Performing a Reduce

To perform a useful reduce, let’s add another product to our database and add a price attribute with the value of 1.75 to our first product.

	{
		"name":		"My Product",
		"price":	2.99,
		"type":		"product"
	}

For our new view, we will include a reduce as well as a map. First, we need a map function, defined as below.

	function (doc) {
		if (doc.type === "product" && doc.price) {
			// The document ID lives in _id; doc.id would be undefined.
			emit(doc._id, doc.price);
		}
	}

The above map function simply checks whether the input document is a product and has a price. If these conditions are met, the product’s price is emitted. The reduce function is below.

	function (keys, prices) {
		return sum(prices);
	}

The above function takes the prices and returns their total using sum(), a helper that CouchDB makes available to view functions. Make sure you check the reduce option in the top right of the results table, as you may otherwise be unable to see the results of the reduce. You may need to do a hard refresh of the page to see the reduce option.
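To see what the map and reduce produce together, the sketch below runs them locally against the two products from this step. The emit and sum functions here are local stand-ins for what CouchDB’s view server provides, and the _id values are placeholders.

```javascript
// Local stand-ins for what the view server provides; in CouchDB, emit()
// and sum() are supplied for you.
const emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}
function sum(values) {
  return values.reduce((total, value) => total + value, 0);
}

// The map and reduce functions from this step.
const mapPrice = function (doc) {
  if (doc.type === "product" && doc.price) {
    emit(doc._id, doc.price);
  }
};
const reducePrice = function (keys, prices) {
  return sum(prices);
};

// The two products from this step plus the person document, which the
// map function should ignore. The _id values are placeholders.
const shopDocs = [
  { _id: "product-a", name: "Nettuts CouchDB Tutorial One", price: 1.75, type: "product" },
  { _id: "product-b", name: "My Product", price: 2.99, type: "product" },
  { _id: "person-a", forename: "Gavin", surname: "Cooper", type: "person" }
];

shopDocs.forEach(mapPrice);
const priceTotal = reducePrice(
  emitted.map((row) => row.key),
  emitted.map((row) => row.value)
);
console.log(priceTotal.toFixed(2));
```

When you request a view that has a reduce function over HTTP, CouchDB returns the reduced value by default; adding ?reduce=false to the view URL returns the individual map rows instead.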


Conclusion

In this tutorial, we took a brief but focused look at CouchDB. We saw the potential power of CouchDB and how easy it is to get started. I’m sure you have plenty of questions at this point so feel free to chime in below. Thank you so much for reading!

  • L2L

    Nicely done. I meddled with it with expressjs and node, I prefer mongodb though.

  • http://newarts.at Drazen Mokic

    Do you have any idea how CouchDB performs under high traffic? What’s its limit?

  • Dels

    Interesting, a NoSQL implementation should help a lot. It reminds me of the days when I used flat structured files as database records, or XML for that…

    Too bad RDBMSs and the SQL language have poisoned me for years and I’m addicted to them, LOL

  • http://mileonemedia.com Joe Cianflone

    This looks pretty interesting, but I still don’t fully get it. On small sites…is this worth it? Do you only see the benefits on large sites, like Twitter, Digg and Facebook…I feel like I need a beginner primer on NoSQL

    • Nick Sanzone

      I absolutely agree with this. I would love to see a tutorial on NoSQL and what exactly it is / its benefits.

    • Jonas

      “On small sites…is this worth it?”

      No – not in my opinion. You can easily use NoSQL on smaller sites, but you’ll benefit more from using relational databases such as MySQL and PostgreSQL. Check my comment further below.

  • Mike Miller

    Couch is unique in that it scales down to desktop and mobile, while also scaling up to clusters/datacenter (with bigcouch). It can be pretty sweet to write the same code and have it run on massively different scales.

  • Jacen

    Fantastic! I would love to see a tutorial implementing CouchDB with a practical application. Maybe in conjunction with codeigniter?

    Speaking of which, CI 2.0 is released. Are there going to be any articles covering the changes?

    • andy

      I second that!

    • Brad

      Same here!

      • jhonson

        also here !

  • http://www.thedevelopertuts.com Bratu Sebastian

    This is great for social websites and fast experimentation. Imagine making a browser game with this …

    Great tutorial, although a bit too complicated for a first tutorial

  • http://mikhailkozlov.com Mikhail

    I wonder how much data you need to have before using NoSQL actually pays off? I know everybody talks about Twitter and Facebook, but what are the chances that your next portfolio website will have that many visitors and that they will all interact?

    I think it would be interesting to see how NoSQL is used for storing somewhat complex related data. Just to see how it works without LEFT/RIGHT JOINs.

  • Haider

    Excellent read. May I request for a quick tutorial on MongoDB?

  • http://butenas.com Ignas

    Cool! CouchDB could be really useful in some projects. I know a little, but I also got some interesting points here. Thanks!

  • travis

    Thanks for the detailed tutorial. I’d like to see more about the acl control over this.

  • David Savage

    Nice tutorial! Very easy to follow and was packed with info….hope the next one comes soon!

    So are all NoSQL implementations like this (just essentially single files)? Maybe I’m having a hard time wrapping my head around the whole concept…

  • G

    I’m still a bit confused as to when someone would use NoSQL and when RDBMS.

    • Kevin

      If you deal with extremely large datasets that will typically require master/slave or another replication strategy, then NoSQL databases are easy to set up in these configurations – much harder to accomplish in an RDBMS like MySQL.

      That said, there is nothing stopping you using a noSQL db for applications that don’t require replication too. It will not harm or hinder performance. In fact, some people use them as a caching layer as the read/writes are so fast.

  • http://www.queueinspiration.com queueinspiration

    I did not know about CouchDB before,
    but with this introduction and great tutorial, I am getting excited.

    Thanks very much for sharing.

  • Ali Baba

    Great “Getting Started with CouchDB”. It’s a little bit hard to grasp the concept of NoSQL after working with RDBMS for years.
    I would like to see a better example, like a website with CouchDB on the back-end.

  • http://gavincoop.co.uk/ Gavin Cooper

    Thanks for all the constructive comments.

    @Drazen CouchDB is designed for easy horizontal scalability and to handle enormous amounts of traffic. Jan, one of the developers of CouchDB, ran a crud read benchmark and got 2,500 simultaneous connections on a 2GHz Athlon using 9.8MB of RAM. Check out the case studies on couch.io http://www.couch.io/case-studies

    @Joe it can be worth it on smaller sites. The sheer simplicity of saving and updating objects means you only need to write the bit that matters: your business logic.

    I will see what I can do about a more practical example of using CouchDB on an application. Like I said in the tut it’s difficult to get your head around implementing if you’ve been using RDBMS for a while.

    • Hamrath

      I guess, a nice practical example would be a blog with articles & comments. Probably one of the easiest things to show people, what you can do with NoSQL.

      I’m working with MongoDB on a event manager like webapp and I love it.

  • http://www.iamkumaran.com/ Muthu Kumaran

    Interesting, thanks!

  • http://www.shaneparkerphoto.com Shane Parker

    You definitely have my interest piqued, but like the others said, I’m still confused and would really appreciate some real-world examples of CouchDB in action. While this tutorial introduces me to CouchDB, it doesn’t really get me started with it.

  • Jonas

    I’m worried that newcomers to the web development scene might not grasp that NoSQL is simply not suited for most web projects other than those that 1) require massive scalability (think Twitter) and 2) have a very simple data structure.

    Much of the NoSQL hype seems to build on an idea that “MySQL doesn’t scale.” However, MySQL is used for websites that get more hits than the vast majority of sites (Facebook still uses MySQL for most of its stuff), and scaling it is to some extent mostly a case of knowing what you’re doing in terms of server setups.

    In a few years the dust will have settled and NoSQL will not be surrounded by its present hype. That’s when – hopefully – people will start using the right tool for the right job. And that’s when relational databases such as MySQL and PostgreSQL will get a comeback :)

    If you want to see why NoSQL databases aren’t suited for many tasks, take a look at this example of how to build a Twitter clone in Redis (another NoSQL server): http://redis.io/topics/twitter-clone. It’s absolutely ridiculous how much more code you have to make just to get the performance enhancements. And sure, no NoSQL database is the same, so it might be easier to achieve the same thing in, say, CouchDB or MongoDB, but my overall impression is that if you want to be able to do advanced database queries without manually building indexing tables, and if you want to be able to combine different pieces of data in a multitude of ways, you’re better off with relational databases.

    • http://mikhailkozlov.com Mikhail Kozlov

      This is very true. Right now all the hype is about scale. How many web developers created websites that have twitter traffic? How many websites generate gigabytes of data a day and accessed from every corner of the planet? I think we can count them all on two hands.

      From my personal experience I can tell that even in high-traffic websites (airline ticket sales) the RDBMS is rarely a bottleneck (not in my experience, at least).

      No doubt there is a place for NoSQL, we just need to find where.

      • Jan Harmsen

        Well, anybody bashing CouchDB, claiming it is not suitable for most web development projects without giving detailed reasons makes me quite sceptical.

        1. Fortunately nobody is forced to use CouchDB. It is fine if you stick with your relational databases as long as you’re happy with what you get from RDBMS. There is freedom of choice and I am very happy I found CouchDB about a year ago because it makes my life as a developer much simpler in many ways. Having worked with both db types I would never go back to a relational database for projects that are document-centric.

        2. The unique selling point why I am working with CouchDB is the simplicity of its architecture (data structure, integrated webserver, API) combined with its robustness, scalability / replication and availability on many platforms.

        3. The often quoted weak points of CouchDB (not having dynamic queries / paying a huge price for dynamic queries and no query-chains) are only weak points if you can’t come up with an application architecture that fits into CouchDB’s restrictions.
        I’m at the point where I actually appreciate having to live with these restrictions because it forces me to think before building, and thinking before building is always a good thing. I guess some developers don’t like that and rather start building immediately.

        4. Since using CouchDB for document-centric projects I have a lot less headaches that were caused by RDBMS like e.g. migration-headaches, so I know I have found the right place for CouchDB ;-)

        Last but not least: no doubt, relational databases will keep their place in the future as well.

    • http://gavincoop.co.uk/ Gavin Cooper

      Agreed. I think as long as NoSQL is introduced as not only SQL, newcomers will understand it as a solution for certain needs. It’s not a silver bullet by any means.

      MySQL does scale as you’ve said, but nowhere near as simply and flexibly as CouchDB. After all, the CouchDB developers have done a lot of work to provide N-Master replication, and their focus on offline replication makes for very interesting structures.

      Whilst Facebook do still use MySQL, unless you work for them it’s difficult to assess to what lengths they use MySQL. The last video I watched suggested they used heavy sharding (per user) and that their cache hit rate is above 90%. That makes it significantly easier to add nodes, and with a memcache hit rate like that, speed is less of an issue as memcache does have horizontal scalability. If you shard to this extent you arguably remove many of the relational benefits you get with MySQL.

      If you’ve got any more references to their implementation I’d be keen to have a look :)

      Video Ref: http://www.infoq.com/presentations/Facebook-Software-Stack

  • Ross

    Great introduction. I would love to take this tutorial to the next level and use it in a practical application.

  • http://www.webentwickler-oase.de Philipp

    I would also like to see a more advanced tutorial; this is great for starting with CouchDB and understanding the basics.
    Right now I am trying to develop a web app with CouchDB and it’s really hard if you have only worked with SQL.

  • Tal

    NoSQL/CouchDB seems very like Lotus Notes to me.
    It might have its place for some things but for others not so much.

  • v-Light

    I would be glad to see a CouchApp tutorial and/or maybe a MongoDB tut. Or just create a premium tut where you cover the NoSQL topic

  • http://webkicks.dotink.org Matthew J. Sahagian

    To me the only thing that makes CouchDB interesting is the fact that its protocol is HTTP. That being said, does anyone know a really good PHP library that does *NOT* use CURL in order to implement common database APIs?

    The major thing I see lacking in CouchDB is a standardized method of identifying collections of any sort. MongoDB has a built-in idea of collections, so even if your documents are dissimilar in some ways, you can still group them easily. This allows for a somewhat clean translation to ORM styles with Active Records and Record Set paradigms… in order to achieve a Record Set with CouchDB you have to essentially bank on a custom attribute such as .type (as used in this tutorial). While this seems common, there’s no guarantee other developers won’t use .collection or whatever, and thus you can’t reliably use such a library in place or on an existing site.

    At the end of the day it appears that even some of the simplest information which is used on my websites tends to be relational if the model is expanded properly, and feel free to call me an old man, but I think schema and the schema’s enforcement of certain baseline data integrity is a downright good idea to have in place.

    I don’t see anything drastically different in these databases from early key value store type databases such as Berkeley DB, other than that they are playing more on modern web technologies for their architecture and access / presentation: JSON / Javascript, HTTP, etc. RDBM did not come about simply because there weren’t enough language bindings or ways to make these earlier database models accessible. They came about to solve real problems with data modeling in a concise and fairly standard way. Tell me what happens if I want to switch from CouchDB to MongoDB?

  • http://mikhailkozlov.com Mikhail

    Here is a quick look into Schema Design in MongoDB http://www.10gen.com/video/mongosv2010/schemadesign.

  • Superman

    Can we work on windows xp with couchDB?

  • Regis

    Excellent introduction! Short and sweet. It’s the first time I understand the map/redux thingy!

  • Siddhant Sahu

    I am completely new to CouchDB. I have never used it before, so can anyone suggest where I can get a video tutorial? Please answer as soon as possible; I am in a hurry.

  • http://lujoyglamour.es JJ

    Great tutorial, but I miss the curl command line for doing the reduce bit.

  • http://dev.hasenj.org hasen

    Thank you very much! Really good introduction!

  • stefan

    Nice tutorial and introduction. Thanks.

  • Donald

    I just installed Couchbase and can access http://localhost:8091. Then what?
    I cannot access http://localhost:5984/_utils. Don’t know what to do next.
    thx for any help!

  • manu

    Thanks!!
    but please tell me how can i read db values through html page and print it.

  • Arusarka Haldar

    Great tutorial for a beginner! Thanks

  • Shital Bhabad Ghule

    hello sir, i want more number of tutorial with increasing complexity in each tutorial.