Tag Archives: mongodb

Deploy MongoDB using Ansible

I’ve recently had some people ask how I deploy MongoDB. For a while I used their excellent online tool to deploy and monitor my clusters. Unfortunately they changed direction and I couldn’t afford their new tools, so I turned to Ansible. To share the process more easily, I posted a simple example that you […]

Using Java to work with Versioned Data

A few days ago I wrote about how to structure version details in MongoDB. In this and subsequent articles I’m going to present a Java-based approach to working with that revision data. I have published all this work as an open source repository on GitHub. Feel free to fork it: https://github.com/dwatrous/mongodb-revision-objects Design Decisions To […]

Representing Revision Data in MongoDB

The need to track changes to web content and provide for draft or preview functionality is common to many web applications today. In relational databases it has long been common to accomplish this using a recursive relationship within a single table or by splitting the table out and storing version details in a secondary table. […]
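
As a rough illustration of the single-table recursive relationship mentioned above, here is a minimal Python sketch using the standard library's sqlite3 module. The table name `content`, the columns `previous_id`, `status` and `body`, and the sample rows are all hypothetical and chosen only for this example; they are not taken from the original post.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: each row may point at the row it revises
# (a recursive relationship within a single table).
conn.execute(
    """
    CREATE TABLE content (
        id          INTEGER PRIMARY KEY,
        previous_id INTEGER REFERENCES content(id),
        status      TEXT NOT NULL,
        body        TEXT NOT NULL
    )
    """
)
conn.execute(
    "INSERT INTO content (id, previous_id, status, body) "
    "VALUES (1, NULL, 'published', 'First version')"
)
conn.execute(
    "INSERT INTO content (id, previous_id, status, body) "
    "VALUES (2, 1, 'draft', 'Edited version')"
)


def history(connection, row_id):
    """Walk the previous_id chain from the given row back to the first version."""
    rows = []
    while row_id is not None:
        row = connection.execute(
            "SELECT id, previous_id, status, body FROM content WHERE id = ?",
            (row_id,),
        ).fetchone()
        if row is None:  # dangling reference: stop walking
            break
        rows.append(row)
        row_id = row[1]
    return rows


revisions = history(conn, 2)
for revision in revisions:
    print(revision)
```

Here the draft (row 2) points back at the published row it revises, so a preview can read the newest row while the public site keeps serving the published one.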

MongoDB Aggregation for Analytics

I’ve been working on generating analytics based on a collection containing statistical data. My previous attempt involved using Map Reduce in MongoDB. Recall that the data in the statistics collection has this form. { "_id" : ObjectId("5e6877a516832a9c8fe89ca9"), "apikey" : "7e78ed1525b7568c2316576f2b265f55e6848b5830db4e6586283", "request_date" : ISODate("2013-04-05T06:00:24.006Z"), "request_method" : "POST", "document" : { "domain" : "", "validationMethod" : "LICENSE_EXISTS_NOT_EXPIRED", […]

MongoDB Map Reduce for Analytics

I created a RESTful SaaS service that uses MongoDB. Each REST call creates a new record in a statistics collection. In order to implement quotas and provide user analytics, I need to process the statistics collection periodically and generate meaningful analytics specific to each user. This is just the type of problem map […]
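
To make the map and reduce idea concrete, here is a minimal sketch in plain Python rather than MongoDB's own map reduce facility. It counts requests per user key. The `apikey` and `request_method` field names follow the record shape quoted in the aggregation post's excerpt; the sample records and the helper names (`map_record`, `reduce_counts`, `map_reduce`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample records standing in for the statistics collection.
records = [
    {"apikey": "key-a", "request_method": "POST"},
    {"apikey": "key-a", "request_method": "GET"},
    {"apikey": "key-b", "request_method": "POST"},
]


def map_record(record):
    """Map step: emit one (key, value) pair per record, keyed by API key."""
    return record["apikey"], 1


def reduce_counts(values):
    """Reduce step: combine all values emitted for one key into a total."""
    return sum(values)


def map_reduce(items):
    """Group the mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for item in items:
        key, value = map_record(item)
        grouped[key].append(value)
    return {key: reduce_counts(values) for key, values in grouped.items()}


usage = map_reduce(records)
print(usage)
```

A per-key total like this is the kind of figure a quota check could compare against a limit.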

Install SSL Enabled MongoDB Subscriber Build

10gen offers a subscriber build of MongoDB which includes support for SSL communication between nodes in a replica set and between clients and mongod. If the cost of a service subscription is prohibitive, it is possible to build it yourself with SSL enabled. After download, I followed the process below to get it running. For a permanent […]

Big Data Cache Approaches

I’ve had several conversations recently about caching as it relates to big data. As a result of these discussions I wanted to review some details that should be considered when deciding if a cache is necessary and how to cache big data when it is necessary. What is a Cache? The purpose of a cache […]

MongoDB monitoring with mongostat

Another tool for monitoring the performance and health of a MongoDB node is mongostat. You’ll recall that mongotop shows the time in milliseconds that a mongo node spent accessing (reading and writing) a particular collection. mongostat, on the other hand, provides more detailed information about the state of a mongo node, including disk usage, data […]

MongoDB monitoring with mongotop

In the process of tuning the performance of a MongoDB replica set, it’s useful to be able to observe mongod directly, as opposed to inferring what it’s doing by watching the output of top, for example. For that reason MongoDB comes with a utility, mongotop. The output of mongotop indicates the amount of time the […]

MongoDB ReadPreference

MongoDB connections accommodate a ReadPreference, which, in a clustered environment such as a replica set, indicates how to select the best host for a query. One major consideration when setting the read preference is whether or not you can live with eventually consistent reads, since SECONDARY hosts may lag behind the PRIMARY. Some of the options you […]
