When you talk to many people in the same domain, you either get totally confused or begin to see some commonality in their views and thus some light. Every vendor has its claims about its technologies and products. Some try to emphasize their merits and downplay their demerits. That is understandable. I sat down with Bob Wiederhold of Couchbase at a recent NoSQL conference and asked about the company and its products. Bob was very frank about their products and the status of their progress.
Bob Wiederhold, President and Chief Executive Officer, Couchbase
Couchbase was formed about a year and a half ago (February 2011) by merging Membase (based in Mountain View, CA) and CouchOne (based in Oakland, CA). Bob came from the Membase side, and the new Couchbase is located in Mountain View. CouchOne was behind Apache CouchDB, an open source database written in Erlang (itself open source) and released under the Apache License 2.0. Most of the original key developers and committers for Apache CouchDB (including Damien Katz) moved from CouchOne to Couchbase. They still contribute to Apache CouchDB, but most of their effort is now focused on Couchbase 2.0, a separate open source project that is also licensed under the Apache License 2.0 and is being implemented mostly in C. The reason given is speed: Erlang is a functional programming language, and C is considered better suited to performance-critical code. I’ve dealt with many programming languages in the past but have never touched Erlang. Bob emphasized that while Couchbase is heavily influenced by Apache CouchDB, it is a completely separate open source project. Bob told me that the merger went very smoothly and that the company is now about 100 people strong.
I told Bob that I am confused by the NoSQL market, and he shared his view of it. It is interesting to hear different people’s views on the market. Of course, they do not agree 100% on its current state, but taken together the different views give me a pretty good perspective. He first distinguished the operational engine from the analytics engine, as below. The analytics side is Hadoop and its distributions, such as those from Cloudera, Hortonworks, and MapR. Note that Couchbase is a partner of Cloudera.
Then he expanded the NoSQL area according to technology and placed NoSQL players in each category. I will not discuss each category in detail. Those who want more detail can find it here. Wikipedia classifies the NoSQL categories in a much finer way. For example, it lists several subcategories for the key-value camp, and it distinguishes the graph-based databases from the object-based ones. By the way, at the conference I took a four-hour crash course given by Dan McCreary of Kelley-McCreary & Associates. It was a good tutorial, and if you have the chance, it is worth spending a half day in his class. I also found a whitepaper by Couchbase, Navigating the Transition From Relational to NoSQL Database Technology, useful. It describes document-based technology in comparison with the relational database.
The current version of Couchbase (1.8) is in the key-value camp. With the 2.0 release, however, it will become a fully document-based database. Each camp has its merits and shortcomings. Will one category dominate the others, with all the technologies consolidated into one? As for what will happen to this market, Bob thinks the following.
He thinks the key-value and the document-based databases will merge, and the merged area will be the biggest of the three new areas. The other two areas will not go away but will remain something of a niche market. The document-based solution is powerful, as it can store a document, such as an entire website, as a blob (in JSON format) and retrieve it. For this, JSON is becoming the de facto standard over XML; Couchbase also uses JSON. There are proponents of both JSON and XML. In the Web environment, JSON is far more suitable, but XML has its own areas of application. There are a few tools for converting JSON to XML and vice versa.
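To make the JSON and XML point concrete, here is a minimal sketch in Python using only the standard library. It is not a Couchbase API and not one of the conversion tools mentioned above; the `page` document, the `to_xml` helper, and the `item` tag used for list entries are all assumptions made up for illustration.

```python
import json
import xml.etree.ElementTree as ET


def to_xml(tag, value):
    """Recursively convert a JSON-compatible value into an XML element.

    Hypothetical helper: dicts become nested elements, lists become
    repeated <item> elements, and scalars become element text.
    """
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            elem.append(to_xml(key, child))
    elif isinstance(value, list):
        for item in value:
            elem.append(to_xml("item", item))
    else:
        elem.text = str(value)
    return elem


# A made-up document, as a document store might keep it under one key.
page = {"title": "Home", "tags": ["nosql", "json"], "views": 42}

# Serialize to a JSON string (the stored "blob") and read it back.
doc = json.dumps(page)
restored = json.loads(doc)

# Convert the same document to XML for a consumer that expects it.
xml_text = ET.tostring(to_xml("page", restored), encoding="unicode")
print(xml_text)
```

The JSON form maps directly onto the dictionaries and lists a Web application already works with, while the XML form needs a convention for lists and types, which is one reason the two formats suit different areas.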
As for the competition, Bob was very frank in analyzing Couchbase against other players in the document camp, as in the following table. Checkmark size indicates how strong and complete an attribute is. Well, the size is somewhat arbitrary and just indicates relative competency. Bob said that Couchbase has put a lot of emphasis on performance, scalability, and always-on features (thus, big checkmarks) with less focus on ease of development (thus, a smaller checkmark). He also added that with the 2.0 release, ease of development will improve significantly since this is the point at which they become a document database. He said that his competition has put a lot of emphasis on ease of development but needs to work on other features.
Couchbase moves to focus on ease of development
Competition moves to other features
He said that although ease of development requires a lot of expertise, other things, like performance, are very hard to improve. He told me Couchbase has a big advantage in that it can consistently provide sub-millisecond latencies for reads and writes, often 1/3 to 1/10 the latencies of other solutions. In addition, Couchbase can provide throughput per server that is often 2-4x higher than competing solutions (see http://bit.ly/NKJkVH and http://bit.ly/Qulb4R). The consistent low latency assures very responsive applications, and the higher throughput per server means you need to buy less hardware and software than with competing solutions.
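The hardware argument is simple arithmetic, sketched below in Python. The target load and the per-server throughput figures are invented for illustration and are not measurements of Couchbase or of any other product; only the 2-4x range comes from Bob's claim.

```python
import math


def servers_needed(target_ops_per_sec, ops_per_server):
    """Smallest number of servers that can carry the target load."""
    return math.ceil(target_ops_per_sec / ops_per_server)


# Hypothetical numbers: a 200,000 ops/sec workload, and a baseline
# server that sustains 10,000 ops/sec.
target = 200_000
baseline_per_server = 10_000

baseline = servers_needed(target, baseline_per_server)
# A server with 3x the throughput (within the claimed 2-4x range).
faster = servers_needed(target, 3 * baseline_per_server)

print(baseline, faster)
```

Under these assumed numbers the cluster shrinks from 20 servers to 7, which is where the savings in hardware and per-server software licenses would come from.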
The current application areas that use Couchbase include social gaming, ad and offer targeting, social networking, online business services, e-commerce, cloud data services, and mobile-to-cloud data synchronization. I am interested in applying NoSQL technologies at power utility companies for smart meters and monitoring (such as sensors with SCADA access), which involve many types and speeds of data (from static data, like asset data, to real-time meter-read data), so I wondered how products like Couchbase could be applied. Bob’s view was that as the amount of sensor data and the frequency at which it is gathered increase, having a central database that can keep up with the inflow of data will become a challenge. NoSQL databases that can scale write throughput linearly are a natural fit for capturing the incoming data stream. Techniques like Couchbase Server’s incremental map reduce are well suited to providing real-time aggregation and analytics over the data.
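The idea behind incremental aggregation can be sketched in a few lines of Python. This is a conceptual illustration only, not Couchbase Server's view engine or API; the `IncrementalMeterIndex` class, the `meter_id` and `kwh` field names, and the sample readings are all hypothetical. The point it shows is that each new reading updates a running aggregate, so a query never has to rescan the full history.

```python
from collections import defaultdict


class IncrementalMeterIndex:
    """Keeps per-meter running totals that are updated on every write."""

    def __init__(self):
        # meter_id -> running count and sum of readings
        self._totals = defaultdict(lambda: {"count": 0, "sum": 0.0})

    @staticmethod
    def map_reading(reading):
        """Map step: emit a (key, value) pair for one document."""
        return reading["meter_id"], reading["kwh"]

    def add(self, reading):
        """Fold one new document into the stored aggregate (reduce step)."""
        key, value = self.map_reading(reading)
        entry = self._totals[key]
        entry["count"] += 1
        entry["sum"] += value

    def count(self, meter_id):
        """Number of readings seen for a meter (0 if unknown)."""
        entry = self._totals.get(meter_id)
        return entry["count"] if entry else 0

    def average(self, meter_id):
        """Average reading for a meter, answered from the running totals."""
        entry = self._totals.get(meter_id)
        if entry is None:
            raise KeyError(meter_id)
        return entry["sum"] / entry["count"]


# Made-up meter readings arriving as a stream of JSON-like documents.
index = IncrementalMeterIndex()
for reading in [
    {"meter_id": "m1", "kwh": 1.0},
    {"meter_id": "m2", "kwh": 2.0},
    {"meter_id": "m1", "kwh": 3.0},
]:
    index.add(reading)

print(index.average("m1"), index.count("m2"))
```

Because the cost of each update depends only on the new reading, the aggregate stays current as the write rate grows, which is the property that matters for a continuous meter-read stream.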
I asked him about the ecosystem around each player. He thinks developing an ecosystem is vital for the success of Couchbase. As things stand, the market still seems very confused, but it is expanding rapidly.