The hyping of the NoSQL foo

by Adam Ferrari


Last week I discussed my impressions of the NoSQL movement, and because it’s such a fast-moving topic, I wanted to round up some interesting developments since then.

My main point in my last post was that NoSQL technology should not be defined diffusely as any of the new generation of databases. NoSQL is primarily motivated by the needs of large, widely distributed web applications. In other words, the main NoSQL technical innovations are focused on elastic, horizontal scalability and higher availability, both gained by trading away strict transactional data consistency. Given this primary motivation, I argued that it’s just plain confusing to lump in other technologies like Endeca’s search and analysis engine. Even though they share some important traits, like non-relational data models, it would be a shame to draw focus away from other great innovation happening with new databases.

Even in the short time since I wrote that post, I’ve seen some nice corroboration of this view. For example, I had the opportunity to attend the NoSQL Live conference last week, which was hosted by 10gen, who sponsor the MongoDB project. Dwight Merriman, CEO of 10gen, kicked off the event with a brief discussion of the question, “what is NoSQL?” I really admired the concreteness and specificity of the answer he gave; paraphrasing, he basically described NoSQL as systems that make big horizontal scaling feasible by avoiding the complexity involved in handling requirements like joins and heavy-duty transaction support. The focus in his definition is on big distributed systems, not on the non-tabular data aspect, highlighting once again why NoSQL is a relatively misleading name for this area of technology.
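To make that definition a bit more concrete, here is a minimal sketch of my own (not anything shown at the conference, and not how MongoDB or any particular product is implemented). It assumes a toy key-value store whose data is split across a few in-memory "nodes" by hashing the key. The class name ShardedStore and its methods are hypothetical. Single-key reads and writes touch exactly one node, which is what makes adding nodes straightforward, while a join-like query across keys has to visit every node.

```python
import hashlib


class ShardedStore:
    """Toy key-value store partitioned across in-memory 'nodes' (hypothetical)."""

    def __init__(self, num_shards):
        # Each dict stands in for a separate machine.
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_for(self, key):
        # Stable hash of the key, reduced to a shard index.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, value):
        # A single-key write touches exactly one shard.
        self.shards[self._shard_for(key)][key] = value

    def get(self, key):
        # A single-key read touches exactly one shard.
        return self.shards[self._shard_for(key)].get(key)

    def scan(self, predicate):
        # A cross-key (join-like) query must visit every shard.
        return [
            (key, value)
            for shard in self.shards
            for key, value in shard.items()
            if predicate(key, value)
        ]


store = ShardedStore(num_shards=4)
store.put("user:1", {"name": "Ann", "city": "Boston"})
store.put("user:2", {"name": "Bob", "city": "Austin"})

print(store.get("user:1"))
print(store.scan(lambda key, value: value["city"] == "Boston"))
```

The simple modulo placement used here is only for illustration: changing the number of shards would move most keys, so real systems often rely on schemes such as consistent hashing instead. The sketch also ignores replication and consistency entirely.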

Curt Monash took up this naming issue in his recent “naming of the foo” post, one of three that he published in quick succession about NoSQL. In that post he suggested “High-Volume Simple Processing” (HVSP, in the spirit of OLTP) as an alternative. Again, the focus is primarily on scale and distribution (the “simple” part is where the distribution lives). And Monash also takes up the issue of keeping the NoSQL definition focused:

Systems I’m leaving out of the HVSP and hence also NoSQL categories include… non-SQL data stores that don’t meet the HVSP criteria. Dave Kellogg stretches things when he claims that MarkLogic is a NoSQL system.

(Echoing a friendly debate I had with Dave in the comments of my last post.)

It’s no surprise that there’s continuing confusion on this point. In another moment I enjoyed at NoSQL Live, Tim Anglade invoked the Gartner Hype Cycle to characterize the current frenzy of activity around NoSQL. He correctly pointed out that although individual projects may be farther along, the category as a whole is in the infamous “Peak of Inflated Expectations,” and experiencing the commensurate level of chaos. But given the number and enthusiasm of the attendees, and the discussion of important topics, it seems like a good chaos with a Plateau of Productivity ahead.

Posted on March 22, 2010 at 11:47 am
In: databases
