NoSQL is hot nowadays. Really hot. When you think of building your next cool application around it, there are dozens of options, such as Accumulo, Cassandra, Coherence, CouchDB, GemFire, Hazelcast, HBase, LevelDB, MarkLogic, Memcached, MongoDB, Neo4j, Riak, Redis, and more. They offer different data models (key-value pairs, wide-column, objects/documents, graph, etc.) and attractive features such as horizontal scalability and high availability. It is always exciting to work with cutting-edge technologies. On the other hand, cutting-edge technologies may turn bleeding-edge if we choose an unsuitable one. Unfortunately, it is pretty hard to choose the “right” NoSQL database. We can easily get lost in the endless feature comparisons. What data model? CP or AP? Synchronous or asynchronous replication? In-memory or durable? ACID or availability? Active-active multi-datacenter support? There are a lot of hard decisions to make. The various (and sometimes unclear) benchmarks with contradictory results may just confuse us further.
For in-depth information on various Big Data technologies, check out my free e-book “Introduction to Big Data”.
We have explored several interesting distributed key-value databases, including HBase, Accumulo, Riak, and Cassandra. Although key-value pairs are very flexible, it is tedious to map them to objects in applications. In this post we will learn about a popular document-oriented database, MongoDB. MongoDB uses JSON-like documents with dynamic schemas, making the integration of data in certain types of applications easier and faster. Beyond lookup by key, MongoDB supports search by value, range queries, and text search. Any field in a document can be indexed (with B-trees, similar to those in an RDBMS).
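To make "search by value" and "range queries" over schemaless documents concrete, here is a minimal in-memory sketch in Python. It is not MongoDB and not how MongoDB is implemented; it only mimics the shape of MongoDB-style query documents with a small subset of comparison operators (`$gt`, `$gte`, `$lt`, `$lte`). The sample documents, the field names, and the helper functions `matches` and `find` are all hypothetical, chosen for illustration.

```python
# Hypothetical sample documents. Note the dynamic schema:
# not every document has the same fields.
people = [
    {"name": "Ann", "age": 31, "city": "Boston"},
    {"name": "Bob", "age": 45, "city": "Austin", "tags": ["admin"]},
    {"name": "Cy", "age": 27},
]

# A small subset of MongoDB-style comparison operators.
OPS = {
    "$gt": lambda value, arg: value > arg,
    "$gte": lambda value, arg: value >= arg,
    "$lt": lambda value, arg: value < arg,
    "$lte": lambda value, arg: value <= arg,
}


def matches(doc, query):
    """Return True if doc satisfies every condition in query."""
    for field, cond in query.items():
        if field not in doc:
            # In this sketch a missing field never matches.
            return False
        value = doc[field]
        if isinstance(cond, dict):
            # Range query: every operator in the condition must hold.
            if not all(OPS[op](value, arg) for op, arg in cond.items()):
                return False
        elif value != cond:
            # Search by value: plain equality on the field.
            return False
    return True


def find(docs, query):
    """Linear scan; a real database would consult an index instead."""
    return [doc for doc in docs if matches(doc, query)]


# Search by value, then a range query on age.
in_boston = find(people, {"city": "Boston"})
thirty_to_fifty = find(people, {"age": {"$gte": 30, "$lt": 50}})
```

With a real MongoDB deployment, query documents of the same shape are passed to the driver (for example, a collection's `find` method in PyMongo), and an index on the queried field lets the server avoid the full scan that this sketch performs.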