"Tech Coaster" by Chinthaka Dharmasiri: Learning MongoDB - Event Triggering with Tailable Cursors using Java Driver

I am currently working on a project in which we are designing an interactive task management tool. While working on it, I realized that we need a "push" notification mechanism from the database layer. Why? Because it gives the middleware a very nice way to send the latest updates relevant to a particular item (a task, in my context) to the client side, without the client having to keep polling for new data.

Luckily for me, we were NoSQL db fans and were using MongoDB.

Well, Google came to the rescue. I searched for event triggers in MongoDB. Apparently Mongo did not have a straightforward event triggering mechanism like the one available in MySQL [1].

However, there is a pretty nice workaround for this in MongoDB, called tailable cursors. A tailable cursor lets you set up a query that keeps returning data as and when new documents are inserted into the database. Pretty cool, right?

However, here is the tricky part.

It works only with capped collections in MongoDB. What are capped collections?

Capped collections are fixed-size collections that provide high-speed writes, with the trade-off of being a bit slow to read because data is read sequentially. They are circular collections, i.e. they behave like a FIFO queue once the collection is full: the oldest documents are overwritten to make room for new ones. As rightly said in the documentation, they are ideal for logging and can well be used for versioning. [2]
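To make the FIFO behaviour a bit more concrete, here is a tiny in-memory sketch in plain Java. It is not MongoDB code: CappedBufferSketch is a made-up class, and it caps by document count only to keep things short, whereas a real capped collection is capped by size in bytes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy stand-in for a capped collection (hypothetical, not part of any MongoDB driver). */
final class CappedBufferSketch<T> {
    private final int maxDocuments;                        // assumption: cap by count, not bytes
    private final Deque<T> documents = new ArrayDeque<>();

    CappedBufferSketch(int maxDocuments) {
        if (maxDocuments <= 0) {
            throw new IllegalArgumentException("maxDocuments must be positive");
        }
        this.maxDocuments = maxDocuments;
    }

    /** Appends a document; once the cap is reached, the oldest document is evicted first. */
    void insert(T document) {
        if (documents.size() == maxDocuments) {
            documents.removeFirst();
        }
        documents.addLast(document);
    }

    /** Returns a copy of the documents in insertion ("natural") order. */
    List<T> inNaturalOrder() {
        return new ArrayList<>(documents);
    }
}
```

With a cap of 3, inserting "a", "b", "c" and then "d" leaves "b", "c", "d": the oldest entry is the one that gets pushed out.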

Important: When creating a capped collection, we need to specify its size. Make sure the collection is not oversized, but can still hold documents long enough for your application's requirements before they are evicted.
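A rough way to pick the size is to multiply the average document size by the insert rate and by how long a document must survive. The sketch below does just that; the class name and all the numbers are made up for illustration, so measure your own documents and traffic before trusting any figure.

```java
/** Back-of-the-envelope sizing helper (hypothetical, for illustration only). */
final class CappedSizeEstimate {
    private CappedSizeEstimate() {
    }

    /**
     * Bytes needed so that a document survives for retentionSeconds before eviction,
     * assuming a steady insert rate and a stable average document size.
     */
    static long requiredBytes(long avgDocumentBytes, long insertsPerSecond, long retentionSeconds) {
        // multiplyExact throws ArithmeticException on overflow instead of silently wrapping.
        return Math.multiplyExact(Math.multiplyExact(avgDocumentBytes, insertsPerSecond), retentionSeconds);
    }
}
```

For example, requiredBytes(500, 50, 3600) gives 90,000,000 bytes: 500-byte documents at 50 inserts per second, kept for one hour. It is probably wise to add some headroom on top of whatever this returns.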

Now let's look at a high-level design where we can use tailable cursors in a way similar to pub-sub.

High Level Design : Tailable Cursor Based Pub-Sub System

According to the above design, Listening Channels point tailable cursors at the Log_Collection to retrieve the data they are interested in whenever it becomes available.

Right, now let's see how we can write a bit of code in Java to establish a tailable cursor.

1. Creating a capped collection in Java.

 if (db.collectionExists("item_collection")) {
     // Reuse the collection if it already exists.
     collection = db.getCollection("item_collection");
 } else {
     // Otherwise create it as a capped collection; "size" is given in bytes.
     DBObject options = BasicDBObjectBuilder.start().add("capped", true).add("size", 100000000L).get();
     collection = db.createCollection("item_collection", options);
 }

2. Writing the search query

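     BasicDBObjectBuilder builder = BasicDBObjectBuilder.start();
     // Assuming the intent is the delta: match documents newer than the last revision retrieved.
     builder.push("rev_no").add("$gt", lastRevNo).pop();
     builder.add("type", typeA);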

I am basically looking for documents stored in the collection with the given typeA (assume typeA is a predefined categorical value). I am using RevNo, a global value that helps me keep track of the last retrieved document and get the delta. [3]

     DBObject searchQuery = builder.get();  
     DBObject sortBy = BasicDBObjectBuilder.start("$natural", 1).get();

Here we are requesting that the results be retrieved in the same order in which they are stored in the collection (the $natural operator).
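One way to read "get the delta" is: return only the documents of the wanted type whose revision number is greater than the last one already seen, in stored order. Here is a plain Java sketch of that idea, with an in-memory list standing in for the collection. RevisionDeltaSketch and Item are hypothetical names, and the fields simply mirror rev_no and type from the query.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory illustration of "give me everything newer than the last revision I saw". */
final class RevisionDeltaSketch {

    /** Minimal stand-in for a stored document (hypothetical). */
    static final class Item {
        final long revNo;
        final String type;

        Item(long revNo, String type) {
            this.revNo = revNo;
            this.type = type;
        }
    }

    private long lastRevNo;

    RevisionDeltaSketch(long lastRevNo) {
        this.lastRevNo = lastRevNo;
    }

    long lastRevNo() {
        return lastRevNo;
    }

    /** Returns matching items newer than lastRevNo, in stored order, then advances lastRevNo. */
    List<Item> delta(List<Item> storedInNaturalOrder, String type) {
        List<Item> result = new ArrayList<>();
        long newest = lastRevNo;
        for (Item item : storedInNaturalOrder) {
            if (item.revNo > lastRevNo && item.type.equals(type)) {
                result.add(item);
                newest = Math.max(newest, item.revNo);
            }
        }
        lastRevNo = newest;   // the next call starts after the newest item returned here
        return result;
    }
}
```

Calling delta() twice in a row on the same data returns the new items the first time and nothing the second time, which is the behaviour I want from a listening channel.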

3. Preparing a tailable cursor for a custom query.

     DBCursor cursor = collection  
         .find(searchQuery)  
         .sort(sortBy)  
         .addOption(Bytes.QUERYOPTION_TAILABLE)  
         .addOption(Bytes.QUERYOPTION_AWAITDATA);

4. Retrieving and notifying subscribers.

 while (cursor.hasNext()) {
     // The Java Observer design pattern can come in useful here.
     // Take the next document; converting it into a value object is application-specific.
     DBObject result = cursor.next();
     // publishResults will call notifyObservers() to notify all the subscribers.
     publishResults(result);
 }

Key points:

1. Cost of cursor establishment

Remember that tailable cursors are expensive to establish. But once established, a cursor can be used pretty smoothly to get the latest data through a particular channel.

However, we need to be smart not to break the harmony by regularly disconnecting, i.e. returning from the cursor loop. cursor.hasNext() will block until a result becomes available. That's why I have proposed the observer pattern in Java to notify the subscribers of a particular listening channel whenever data is available.
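Here is a minimal sketch of that idea in plain Java, with no MongoDB involved. A BlockingQueue stands in for the tailable cursor (take() blocks much like a waiting cursor would), and ListeningChannelSketch is a made-up name. I use plain Consumer callbacks rather than java.util.Observable, which newer JDKs mark as deprecated.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** One long-lived loop per channel that fans each new item out to all subscribers. */
final class ListeningChannelSketch<T> {
    private final BlockingQueue<T> source;   // assumption: stands in for the tailable cursor
    private final List<Consumer<T>> subscribers = new CopyOnWriteArrayList<>();

    ListeningChannelSketch(BlockingQueue<T> source) {
        this.source = source;
    }

    void subscribe(Consumer<T> subscriber) {
        subscribers.add(subscriber);
    }

    /** Blocks until one item is available and notifies everyone; returns false if interrupted. */
    boolean publishNext() {
        final T item;
        try {
            item = source.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // preserve the interrupt for the caller
            return false;
        }
        for (Consumer<T> subscriber : subscribers) {
            subscriber.accept(item);
        }
        return true;
    }

    /** Keeps the same source open for the life of the channel instead of reconnecting. */
    void run() {
        while (publishNext()) {
            // keep waiting for the next item
        }
    }
}
```

In a real application run() would live on its own thread per listening channel, so the expensive part (establishing the cursor) happens once and subscribers come and go without touching it.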

2. How does the tailable cursor mechanism work in MongoDB?

This is something I also need to learn more about. For now, my interpretation is that it uses an underlying polling mechanism beneath our implementation. I am looking forward to learning more about it.
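The MongoDB manual compares a tailable cursor to the Unix tail -f command: the cursor stays open after its last result and continues from where it left off when new documents arrive. A toy way to picture the "remembers its position" part in plain Java is shown below. This is only my mental model, not how the server is implemented, and TailPositionSketch is a made-up name.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a reader that remembers how far it has read in an append-only log. */
final class TailPositionSketch<T> {
    private final List<T> log;   // assumption: append-only list standing in for the capped collection
    private int position;        // index of the next entry not yet returned

    TailPositionSketch(List<T> log) {
        this.log = log;
    }

    /** Returns the entries appended since the previous call; empty if nothing is new. */
    List<T> readNew() {
        List<T> fresh = new ArrayList<>(log.subList(position, log.size()));
        position = log.size();
        return fresh;
    }
}
```

The sketch ignores eviction and blocking entirely. It only shows why a reader that keeps its position never has to re-read what it has already seen.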

All in all, for applications where we need to support async requests and long polling, I guess MongoDB tailable cursors do come in handy.

How would tailable cursors perform in production? Next up, I will try to update this post with the results of load testing in my application and how the chatting with capped collections has performed.

Cheers..!!
Chinthaka

[1]. https://dev.mysql.com/doc/refman/5.5/en/triggers.html

[2]. http://docs.mongodb.org/manual/tutorial/use-capped-collections-for-fast-writes-and-reads/

[3]. http://docs.mongodb.org/manual/tutorial/create-an-auto-incrementing-field/
Sunday, April 20, 2014
