Getting [On|Off] the CouchDB?
After some recent experimenting with CouchDB, I have conflicting feelings about whether to get on or stay off the couch. I am intrigued by the notion of a schema-free, document-based database, and by how applications could be built to leverage such a technology. I also have reservations about its benefits in my current work environment. Most of the relational data storage I’ve dealt with has been handled by Spring/Hibernate and now Grails/Hibernate. Does CouchDB buy anything that isn’t already easy to do with a tool like GORM? Let me attempt to dissect some of the pros and cons of CouchDB.
PROS
Probably the feature most often mentioned by anyone talking about CouchDB is that it’s schema-free. That means each document has its own structure, and everything is stored as a key-value pair, where the value is typically a document but can also be a base type. This is pretty cool when describing real-world objects, as one can be more descriptive and hierarchical in storing the data. Also, updating your application’s domain model is much like evolution: as soon as you need something added to the model, just add it.
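As a rough illustration of what schema-free means in practice, here is a minimal Python sketch with two documents of the same kind but different shapes. The document IDs, field names, and the `publisher_city` helper are all made up for the example; nothing here talks to a real CouchDB.

```python
import json

# Two hypothetical documents stored side by side; neither follows a declared schema.
book = {
    "_id": "book-001",
    "type": "book",
    "title": "An Example Title",
    "authors": ["A. Writer", "B. Editor"],
}

# A later document gains a nested field without any migration step.
book_v2 = {
    "_id": "book-002",
    "type": "book",
    "title": "Another Title",
    "authors": ["C. Author"],
    "publisher": {"name": "Example Press", "city": "Springfield"},
}


def publisher_city(doc):
    """Read an optional nested field, tolerating documents that predate it."""
    return doc.get("publisher", {}).get("city")


for doc in (book, book_v2):
    print(json.dumps(doc, sort_keys=True), "->", publisher_city(doc))
```

The flip side is that the application code, not the database, has to cope with older documents that lack the newer fields.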
CouchDB’s RESTful HTTP service implementation is a good fit for the many applications being created for today’s web. Send HTTP requests to the service, from JavaScript for example, and get JSON-formatted data in return. This is great, since many front-end technologies can parse JSON, and the format itself is even human-readable if need be.
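To make the request/response idea concrete, here is a small Python sketch that builds the HTTP requests for storing and fetching a document. It assumes a CouchDB instance on its default port on localhost; the database name `blog` and the document ID are hypothetical. The requests are only constructed, not sent, so the snippet runs without a server.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:5984"  # assumption: a local CouchDB on its default port
DB_NAME = "blog"                    # hypothetical database name


def put_document(doc_id, doc):
    """Build (but do not send) a PUT request that creates or updates a document."""
    body = json.dumps(doc).encode("utf-8")
    return Request(
        f"{BASE_URL}/{DB_NAME}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def get_document(doc_id):
    """Build (but do not send) a GET request that fetches a document as JSON."""
    return Request(f"{BASE_URL}/{DB_NAME}/{doc_id}", method="GET")


req = put_document("first-post", {"title": "Hello", "tags": ["couchdb"]})
print(req.get_method(), req.full_url)
```

With a server running, passing either request to `urllib.request.urlopen` would send it, and the response body would come back as JSON.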
There is support for interacting with CouchDB data offline. One can access and edit data while not connected, and when back online can merge and update changes made while offline. This kind of support out of the box makes CouchDB a prime candidate for web-based applications like blogs, wikis, bug reporting, CRM, etc.
Another quality of CouchDB is its MapReduce implementation for handling large data sets. I’m going to redirect you to the author’s notes for this one: MapReduce and Google’s definition.
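For readers who want a feel for the idea before following those links, here is a toy Python sketch of the map and reduce steps over a handful of documents. It only imitates the concept: CouchDB’s own view functions are normally written in JavaScript, and the documents and the `run_view`, `map_tags`, and `reduce_count` names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical documents; note that the comment has no tags at all.
docs = [
    {"_id": "a", "type": "post", "tags": ["couchdb", "database"]},
    {"_id": "b", "type": "post", "tags": ["couchdb"]},
    {"_id": "c", "type": "comment"},
]


def map_tags(doc):
    """Map step: yield one (key, value) pair per tag; untagged documents yield nothing."""
    for tag in doc.get("tags", []):
        yield tag, 1


def reduce_count(values):
    """Reduce step: collapse all values emitted for one key into a total."""
    return sum(values)


def run_view(documents, map_fn, reduce_fn):
    """Group the mapped pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


tag_counts = run_view(docs, map_tags, reduce_count)
print(tag_counts)
```

Because the map step looks at one document at a time, the work can in principle be spread across many documents or machines, which is what makes the approach attractive for large data sets.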
CONS
Transaction management is non-existent. Relational database management systems (RDBMS) provide mechanisms for things like transaction management, unique constraints, and locking. CouchDB is based on being ‘eventually consistent’ and ‘available’ (see the CAP Theorem section for more), which, although great for many things, doesn’t lend itself to transaction management.
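In place of locks, CouchDB tracks a revision (`_rev`) on each document and rejects an update that carries a stale one with an HTTP 409 conflict. The toy Python sketch below imitates that check in memory. The `TinyStore` class and its integer revisions are hypothetical simplifications; real CouchDB revisions are opaque strings, and nothing here contacts a server.

```python
class ConflictError(Exception):
    """Raised when a write carries a stale revision (CouchDB answers 409 in that case)."""


class TinyStore:
    """Hypothetical in-memory stand-in for a document store with revision checks."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, doc, rev=None):
        """Store doc only if rev matches the currently stored revision."""
        current = self._docs.get(doc_id)
        current_rev = current["_rev"] if current else None
        if rev != current_rev:
            raise ConflictError(f"{doc_id}: expected rev {current_rev}, got {rev}")
        new_rev = (current_rev or 0) + 1
        self._docs[doc_id] = {**doc, "_id": doc_id, "_rev": new_rev}
        return new_rev


store = TinyStore()
rev1 = store.put("ticket-1", {"status": "open"})
rev2 = store.put("ticket-1", {"status": "closed"}, rev=rev1)

# A second writer still holding rev1 is turned away instead of silently overwriting.
try:
    store.put("ticket-1", {"status": "reopened"}, rev=rev1)
    stale_write_rejected = False
except ConflictError:
    stale_write_rejected = True

print(rev1, rev2, stale_write_rejected)
```

This check applies to one document at a time, which is not the same as the multi-statement transactions an RDBMS offers.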
Data security is also a concern. A somewhat old article describes this issue quite well: Planned Security Model For CouchDB. The CouchDB team has been working on security and validation, but that work is still early in its development.
Less important, but still worth mentioning: personally, I don’t find the out-of-the-box UI for CouchDB all that useful. It feels clunky for navigating your key/document pairs, and too feature-light for managing the database.
Conclusion
If you’re looking to create a document-based web application, then this may be a worthwhile technology to use. CouchDB has a number of very impressive features. However, for now at least, I believe I’ll be staying off the couch.
Aside:
When testing out CouchDB I went through the process of manually installing the application. I had to resolve the dependencies and deal with compilation errors on my own. I kind of wish I’d found this site earlier. Follow that link if you’re a Mac OS X user and want an installer for CouchDB.
For me, this was pretty much the only useful tutorial on getting started with CouchDB: Programming CouchDB with Javascript.
Resources:
- http://couchdb.apache.org/
- http://books.couchdb.org/relax/
- http://www.eflorenzano.com/blog/tag/couchdb/
- http://labs.google.com/papers/mapreduce.html
- http://damienkatz.net/2008/02/incremental_map.html
- http://jan.prima.de/~jan/plok/archives/142-CouchDBX-Revival.html
- http://www.automatthew.com/2008/01/planned-security-model-for-couchdb.html
- http://jchrisa.net/drl/_design/sofa/_show/post/couchdb_edge__security_and_vali
- http://jan.prima.de/~jan/plok/archives/108-Programming-CouchDB-with-Javascript.html
- http://www.infoq.com/CouchDB
- http://www.infoq.com/news/2007/11/the-rdbms-is-not-enough
- http://www.infoq.com/news/2008/11/Database-Martin-Fowler
- http://code.google.com/p/couchdb-lounge/wiki/SettingUpTwoCouchInstances
Tags: couchdb, database, MapReduce, schema-free

July 1st, 2009 at 15:59:53
Hi,
one possible metric to look at is lines of code. ORMs are traditionally heavy here. Fewer LOC means fewer bugs, since bugs per LOC is pretty constant.
E.g. (no bashing) CouchDB has less LOC than Rails’ ActiveRecord (pears and beers comparison, of course).
Cheers
Jan
–
July 3rd, 2009 at 09:14:31
It may be worth checking out M/DB:X, which is a different take on the idea of a JSON database: it converts to and from XML and stores the data as persistent XML DOMs. The XML DOMs can be modified, manipulated and searched in the XML domain and then returned as JSON strings. And with a security mechanism modelled on SimpleDB, it answers your issues regarding security. See http://www.mgateway.com/mdbx.html if you’d like to know more.