Daniel Watrous on Software Engineering

A Collection of Software Problems and Solutions

Posts tagged replica set

Software Engineering

MongoDB monitoring with mongostat

Another tool for monitoring the performance and health of a MongoDB node is mongostat. You’ll recall that mongotop shows the time in milliseconds that a mongo node spent accessing (read and write) a particular collection.

mongostat on the other hand provides more detailed information about the state of a mongo node, including disk usage, data throughput, index misses, locks, etc. However, the data is general to the mongo node and doesn’t indicate which database or collection the status refers to. As you would expect, both utilities, mongotop and mongostat, are required to get a full picture of the state of a node and which databases/collections are most affected.

Here’s some sample output from mongostat for two different servers in the same replica set: a SECONDARY first, followed by the PRIMARY.

insert  query update delete getmore command flushes mapped  vsize    res faults      locked db idx miss %     qr|qw   ar|aw  netIn netOut  conn        set repl       time
    *0    208     *0     *0       0     4|0       0  5.76g  12.1g    87m      0 documents:0.0%          0       0|0     0|0    16k    70k    23 wildcatset  SEC   15:28:29
    *0    197     *0     *0       0     1|0       0  5.76g  12.1g    87m      0 documents:0.0%          0       0|0     0|0    15k    65k    23 wildcatset  SEC   15:28:30
    *0    215     *0     *0       0     3|0       0  5.76g  12.1g    87m      0 documents:0.0%          0       0|0     0|0    16k    71k    23 wildcatset  SEC   15:28:31
    *0    198     *0     *0       0     1|0       0  5.76g  12.1g    87m      0 documents:0.0%          0       0|0     0|0    15k    65k    23 wildcatset  SEC   15:28:32
    *0    227     *0     *0       0     3|0       0  5.76g  12.1g    87m      0 documents:0.0%          0       0|0     0|0    17k    75k    23 wildcatset  SEC   15:28:33
insert  query update delete getmore command flushes mapped  vsize    res faults      locked db idx miss %     qr|qw   ar|aw  netIn netOut  conn        set repl       time
     0    269      0      0       1       2       0  5.56g  11.8g    85m      0 documents:0.0%          0       0|0     0|0    20k    89k    43 wildcatset  PRI   15:28:32
     0    216      0      0       0       2       0  5.56g  11.8g    85m      0 documents:0.0%          0       0|0     0|0    16k    72k    43 wildcatset  PRI   15:28:33
     0    227      0      0       0       3       0  5.56g  11.8g    85m      0 documents:0.0%          0       0|0     0|0    17k    77k    43 wildcatset  PRI   15:28:34
     0    235      0      0       0       2       0  5.56g  11.8g    85m      0 documents:0.1%          0       0|0     0|0    18k    77k    43 wildcatset  PRI   15:28:35
     0    214      0      0       0       2       0  5.56g  11.8g    85m      0 documents:0.0%          0       0|0     0|0    16k    71k    43 wildcatset  PRI   15:28:36
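
To work with this output programmatically, a small parser is enough. The following is a minimal sketch, assuming the 21-column layout shown in the sample above; the class name MongostatLine and its helper methods are hypothetical and not part of any MongoDB tooling. The column names are listed by hand because "locked db" and "idx miss %" contain spaces in the header row.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: pairs each value of a mongostat data row with its column name.
final class MongostatLine {
    // Column order copied from the header row of the sample output.
    static final String[] COLUMNS = {
        "insert", "query", "update", "delete", "getmore", "command", "flushes",
        "mapped", "vsize", "res", "faults", "locked db", "idx miss %",
        "qr|qw", "ar|aw", "netIn", "netOut", "conn", "set", "repl", "time"
    };

    /** Splits one data row on whitespace and maps each value to its column. */
    static Map<String, String> parse(String row) {
        String[] values = row.trim().split("\\s+");
        if (values.length != COLUMNS.length) {
            throw new IllegalArgumentException(
                "expected " + COLUMNS.length + " fields, got " + values.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < COLUMNS.length; i++) {
            fields.put(COLUMNS[i], values[i]);
        }
        return fields;
    }

    /** Reads a counter column as an integer. */
    static int count(Map<String, String> fields, String column) {
        String raw = fields.get(column);
        // On a secondary, "command" is shown as local|replicated; keep the local part.
        String local = raw.split("\\|")[0];
        // A leading '*' marks a replicated operation; drop it to get the number.
        return Integer.parseInt(local.replace("*", ""));
    }

    public static void main(String[] args) {
        String row = "    *0    208     *0     *0       0     4|0       0  5.76g  12.1g"
            + "    87m      0 documents:0.0%          0       0|0     0|0    16k    70k"
            + "    23 wildcatset  SEC   15:28:29";
        Map<String, String> fields = parse(row);
        System.out.println(fields.get("repl") + " answered "
            + count(fields, "query") + " queries at " + fields.get("time"));
    }
}
```

Returning a map keyed by column name keeps the sketch short; a typed class would be a better fit if only a few columns matter.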

MongoDB monitoring with mongotop

In the process of tuning the performance of a MongoDB replica set, it’s useful to be able to observe mongod directly, as opposed to inferring what it’s doing by watching the output of top, for example. For that reason MongoDB comes with a utility, mongotop.

The output of mongotop indicates the amount of time the mongod process spends reading from and writing to each collection during the update interval. I used the following command to run mongotop on an authentication-enabled replica set with a two-second interval.

[watrous@d1t0156g ~]# mongotop -p -u admin 2
connected to: 127.0.0.1
Enter password:
 
                              ns       total        read       write           2013-01-11T23:41:51
                          admin.         0ms         0ms         0ms
            admin.system.indexes         0ms         0ms         0ms
         admin.system.namespaces         0ms         0ms         0ms
              admin.system.users         0ms         0ms         0ms
 coursetracker.system.namespaces         0ms         0ms         0ms
document_queue.system.namespaces         0ms         0ms         0ms

The output doesn’t refresh in place the way top does. Instead, each new sample is appended below the previous one, similar to running tail -f. When I began my experiment I could immediately see the resulting load:

                              ns       total        read       write           2013-01-11T23:41:19
                   documents.nav        60ms        60ms         0ms
               documents.product        53ms        53ms         0ms
                          admin.         0ms         0ms         0ms
            admin.system.indexes         0ms         0ms         0ms
         admin.system.namespaces         0ms         0ms         0ms
              admin.system.users         0ms         0ms         0ms
 coursetracker.system.namespaces         0ms         0ms         0ms
 
                              ns       total        read       write           2013-01-11T23:41:21
                   documents.nav        82ms        82ms         0ms
               documents.product        54ms        54ms         0ms
                          admin.         0ms         0ms         0ms
            admin.system.indexes         0ms         0ms         0ms
         admin.system.namespaces         0ms         0ms         0ms
              admin.system.users         0ms         0ms         0ms
 coursetracker.system.namespaces         0ms         0ms         0ms
 
                              ns       total        read       write           2013-01-11T23:41:23
                   documents.nav        63ms        63ms         0ms
               documents.product        45ms        45ms         0ms
                          admin.         0ms         0ms         0ms
            admin.system.indexes         0ms         0ms         0ms
         admin.system.namespaces         0ms         0ms         0ms
              admin.system.users         0ms         0ms         0ms
 coursetracker.system.namespaces         0ms         0ms         0ms
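
Because each sample is appended rather than refreshed, it can help to filter the output down to the namespaces that actually did work. The following is a minimal sketch, assuming the four-column row layout shown above (ns, total, read, write); the class name MongotopSample and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: extracts the busy namespaces from one mongotop sample.
final class MongotopSample {
    /** One namespace row: milliseconds spent during the sampling interval. */
    static final class Row {
        final String ns;
        final int totalMs;
        final int readMs;
        final int writeMs;

        Row(String ns, int totalMs, int readMs, int writeMs) {
            this.ns = ns;
            this.totalMs = totalMs;
            this.readMs = readMs;
            this.writeMs = writeMs;
        }
    }

    /** Converts a value such as "60ms" to 60. */
    static int millis(String value) {
        return Integer.parseInt(value.replace("ms", ""));
    }

    /** Parses one sample block and returns the rows whose total time is non-zero. */
    static List<Row> busy(String block) {
        List<Row> rows = new ArrayList<>();
        for (String line : block.split("\\R")) {
            String[] parts = line.trim().split("\\s+");
            // Data rows have four columns; the header row has five (it ends with a timestamp).
            if (parts.length != 4 || !parts[1].endsWith("ms")) {
                continue;
            }
            Row row = new Row(parts[0], millis(parts[1]), millis(parts[2]), millis(parts[3]));
            if (row.totalMs > 0) {
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String block =
              "                              ns       total        read       write           2013-01-11T23:41:19\n"
            + "                   documents.nav        60ms        60ms         0ms\n"
            + "               documents.product        53ms        53ms         0ms\n"
            + "                          admin.         0ms         0ms         0ms\n";
        for (Row row : busy(block)) {
            System.out.println(row.ns + " total=" + row.totalMs + "ms read=" + row.readMs + "ms");
        }
    }
}
```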

A related performance utility is mongostat.

Verified load balancing

Before running the experiment I set the ReadPreference to nearest. As a result I expected to see a well-balanced but asymmetrical distribution between nodes in my replica set, with all hosts responding to queries. That’s exactly what I saw.
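
One way to put a number on "well-balanced but asymmetrical" is to compare per-node query counts. The following is a minimal sketch that uses the query column from the mongostat sample in the mongostat post purely as illustrative input. Those figures cover only two nodes and slightly different five-second windows, and they are not necessarily from this experiment. The class name QueryShare is hypothetical.

```java
// Hypothetical helper: computes each node's share of the observed queries.
final class QueryShare {
    /** Adds up the per-second query counts reported for one node. */
    static int sum(int[] perSecond) {
        int total = 0;
        for (int value : perSecond) {
            total += value;
        }
        return total;
    }

    /** Percentage of all observed queries that one node answered. */
    static double sharePercent(int nodeTotal, int clusterTotal) {
        return 100.0 * nodeTotal / clusterTotal;
    }

    public static void main(String[] args) {
        // Query column from the mongostat sample: five one-second rows per node.
        int[] secondary = {208, 197, 215, 198, 227};
        int[] primary = {269, 216, 227, 235, 214};

        int secTotal = sum(secondary);
        int priTotal = sum(primary);
        int all = secTotal + priTotal;

        System.out.printf("SEC %d queries (%.1f%%)%n", secTotal, sharePercent(secTotal, all));
        System.out.printf("PRI %d queries (%.1f%%)%n", priTotal, sharePercent(priTotal, all));
    }
}
```

With these inputs both nodes answer a similar share of the queries, with the primary slightly ahead.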


MongoDB ReadPreference

MongoDB connections accept a ReadPreference, which in a clustered environment such as a replica set tells the driver how to select the host for a query. One major consideration when setting the read preference is whether you can live with eventually consistent reads, since SECONDARY hosts may lag behind the PRIMARY.

The available modes are:

  • primary: All reads go to the primary. This gives the strongest consistency, but concentrates every query on a single host.
  • secondary: Reads are distributed among the secondary nodes, which may lag behind the primary.
  • primaryPreferred: Reads from the primary whenever it is available, and falls back to a secondary otherwise.
  • secondaryPreferred: Reads from a secondary whenever one is available, and falls back to the primary otherwise.
  • nearest: Reads from the member with the lowest latency to the client, whether it is the primary or a secondary.

Here’s a great StackOverflow discussion about ReadPreference.
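
The selection rules for the modes listed above can be sketched in a few lines. This is a simplified illustration and not the driver's implementation: the Mode enum, the Member class, the eligible method, and the host names are all hypothetical. Real drivers typically also accept members within a small latency window of the fastest one for nearest, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of how each read preference mode narrows the candidate hosts.
final class ReadPreferenceSketch {
    enum Mode { PRIMARY, SECONDARY, PRIMARY_PREFERRED, SECONDARY_PREFERRED, NEAREST }

    /** A reachable replica set member with its measured round-trip time. */
    static final class Member {
        final String host;
        final boolean primary;
        final int pingMs;

        Member(String host, boolean primary, int pingMs) {
            this.host = host;
            this.primary = primary;
            this.pingMs = pingMs;
        }
    }

    /** Returns the members that have the requested role. */
    static List<Member> withRole(List<Member> members, boolean primary) {
        List<Member> out = new ArrayList<>();
        for (Member m : members) {
            if (m.primary == primary) {
                out.add(m);
            }
        }
        return out;
    }

    /** Returns the members a query may be sent to under the given mode. */
    static List<Member> eligible(Mode mode, List<Member> available) {
        List<Member> primaries = withRole(available, true);
        List<Member> secondaries = withRole(available, false);
        switch (mode) {
            case PRIMARY:
                return primaries;
            case SECONDARY:
                return secondaries;
            case PRIMARY_PREFERRED:
                return primaries.isEmpty() ? secondaries : primaries;
            case SECONDARY_PREFERRED:
                return secondaries.isEmpty() ? primaries : secondaries;
            case NEAREST: {
                // Simplification: pick only the single lowest-latency member.
                Member best = null;
                for (Member m : available) {
                    if (best == null || m.pingMs < best.pingMs) {
                        best = m;
                    }
                }
                List<Member> nearest = new ArrayList<>();
                if (best != null) {
                    nearest.add(best);
                }
                return nearest;
            }
            default:
                throw new IllegalArgumentException("unknown mode " + mode);
        }
    }

    public static void main(String[] args) {
        List<Member> members = Arrays.asList(
            new Member("node-a", true, 12),
            new Member("node-b", false, 3),
            new Member("node-c", false, 8));
        for (Mode mode : Mode.values()) {
            StringBuilder hosts = new StringBuilder();
            for (Member m : eligible(mode, members)) {
                hosts.append(' ').append(m.host);
            }
            System.out.println(mode + ":" + hosts);
        }
    }
}
```

An empty result for PRIMARY or SECONDARY models the case where no suitable host is reachable and the read cannot be served.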

Integration

Setting the ReadPreference requires use of a MongoOptions object. That can then be used to create the Mongo object.

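// Driver options that route reads to the lowest-latency member.
MongoOptions options = new MongoOptions();
options.setReadPreference(ReadPreference.nearest());
// mongoNodesDBAddresses is expected to be a List<ServerAddress> of replica set seeds.
Mongo mongo = new Mongo(mongoNodesDBAddresses, options);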

MongoDB Using Replica Sets as a Backup

MongoDB implements a form of replication called replica sets. Referring to a replica set, rather than just calling it replication, is a helpful distinction. It becomes more obviously useful when you are introduced to sharding in MongoDB, since each shard should be composed of a set of replicas unique to that shard, but we’ll get to that later.

For now, I want to show you how easy it is to set up and use replica sets. The setup and initial ‘recovery’ of data to all replicas in the set is quite simple. In this video I walk you through the entire process. The video is HD, so be sure to watch it full screen to get all the details.

 

To install MongoDB, see my hands-on introduction.