MongoDB Facts: Lightning speed aggregation

In my previous post, I demonstrated how fast you can insert 50 million time-event entries with MongoDB. This time we will make use of all that data to fuel our aggregation tests. This is what one time-event entry looks like: { "_id" : ObjectId("529a2a988cccdb538932d31f"), "created_on" : ISODate("2012-05-02T06:08:47.835Z"), "value" : 0.9270193106494844 } Besides the default ...

MongoDB Facts: 80000+ inserts/second on commodity hardware

While experimenting with some time series collections, I needed a large data set to check that our aggregation queries don’t become a bottleneck under increasing data loads. We settled on 50 million documents, since beyond this number we would consider sharding anyway. Each time event looks like this: { "_id" : ObjectId("5298a5a03b3f4220588fe57c"), "created_on" ...

Spring Data MongoDB cascade save on DBRef objects

By default, Spring Data MongoDB does not support cascade operations on objects referenced with the @DBRef annotation. As the reference documentation says: The mapping framework does not handle cascading saves. If you change an Account object that is referenced by a Person object, you must save the Account object separately. Calling save on the Person object will not automatically save the Account objects ...

Optimistic locking retry with MongoDB

In my previous post I talked about the benefit of employing optimistic locking for MongoDB batch processors. As I wrote before, the optimistic locking exception is a recoverable one, as long as we fetch the latest entity, update it, and save it. Because we are using MongoDB, we don’t have to worry about local or XA transactions. In a future ...
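The recoverable nature of that exception can be sketched as a small retry loop. This is only an illustration under stated assumptions, not the code from the post: StaleVersionException and OptimisticRetry are hypothetical names, and the supplied task is assumed to re-fetch the latest entity, apply the change, and save it on every attempt.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical exception standing in for whatever optimistic locking
// failure the persistence layer raises; not a Spring or MongoDB class.
final class StaleVersionException extends RuntimeException {
    StaleVersionException(String message) {
        super(message);
    }
}

// Hypothetical helper: re-runs a task while it fails with a stale version.
final class OptimisticRetry {
    private OptimisticRetry() {
    }

    // The task must fetch the latest entity, update it, and save it each time
    // it is called, so that every attempt works on fresh data.
    static <T> T run(int maxAttempts, Supplier<T> task) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        StaleVersionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (StaleVersionException e) {
                last = e; // recoverable: loop around and try again
            }
        }
        throw last; // every attempt lost the race, so give up
    }
}

final class OptimisticRetryDemo {
    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated task that loses the race twice before succeeding.
        String result = OptimisticRetry.run(3, () -> {
            if (calls.incrementAndGet() < 3) {
                throw new StaleVersionException("stale version");
            }
            return "saved";
        });
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```

Capping the number of attempts keeps a heavily contended entity from retrying forever; any exception other than the stale-version one is deliberately left to propagate.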

MongoDB optimistic locking

When moving from JPA to MongoDB, you start to realize how many JPA features you’ve previously taken for granted. JPA prevents “lost updates” through both pessimistic and optimistic locking. Optimistic locking doesn’t end up locking anything, and it would have been better named optimistic locking-free or optimistic concurrency control, because that’s what it does anyway. So, what does it mean to ...
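To make the idea of optimistic concurrency control concrete, here is a minimal in-memory sketch. It does not use JPA or a MongoDB driver: VersionedDoc and VersionedStore are hypothetical names, and the store is assumed to behave like a collection whose update succeeds only when the stored version still matches the version the writer read.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical versioned document; not a class from JPA or any MongoDB driver.
final class VersionedDoc {
    final String id;
    final String payload;
    final long version;

    VersionedDoc(String id, String payload, long version) {
        this.id = id;
        this.payload = payload;
        this.version = version;
    }
}

// Hypothetical in-memory store standing in for a collection.
final class VersionedStore {
    private final Map<String, VersionedDoc> docs = new HashMap<>();

    synchronized void insert(String id, String payload) {
        docs.put(id, new VersionedDoc(id, payload, 0L));
    }

    synchronized VersionedDoc find(String id) {
        return docs.get(id);
    }

    // Writes only if nobody changed the document since the caller read it.
    // Nothing stays locked between the read and the write; a conflict is
    // detected at write time by comparing versions.
    synchronized boolean update(String id, long expectedVersion, String newPayload) {
        VersionedDoc current = docs.get(id);
        if (current == null || current.version != expectedVersion) {
            return false; // stale read: applying it would cause a lost update
        }
        docs.put(id, new VersionedDoc(id, newPayload, expectedVersion + 1));
        return true;
    }
}

final class OptimisticLockingDemo {
    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.insert("a", "initial");

        // Two writers read the same version of the document.
        VersionedDoc first = store.find("a");
        VersionedDoc second = store.find("a");

        boolean firstWon = store.update(first.id, first.version, "from first");
        boolean secondWon = store.update(second.id, second.version, "from second");

        System.out.println(firstWon + " " + secondWon); // prints "true false"
    }
}
```

The second writer is rejected instead of silently overwriting the first one, which is exactly the lost update the version check is meant to prevent.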

Using Sorted Sets with Jedis API

In the previous post we started looking into the Jedis API, a Java Redis client. In this post we will look into Sorted Sets (zsets). A Sorted Set works like a Set in that it doesn’t allow duplicate values. The big difference is that in a Sorted Set each element has a score, which is used to keep the elements sorted. We can see ...

Crawling the Web with Cassandra and Nutch

So, you want to harvest a massive amount of data from the internet? What better storage mechanism than Cassandra? This is easy to do with Nutch. People often use HBase behind Nutch. This works, but it may not be an ideal solution if you are (or want to be) a Cassandra shop. Fortunately, Nutch 2+ uses the Gora abstraction layer ...

