Increase Munin Resolution to sub-minute

I previously explained how to get one-minute resolution in Munin. Getting sub-minute resolution is trickier, mainly because cron only runs once per minute: data must be generated and cached between cron runs, then collected in bulk when cron fires.
When a single datapoint is collected per cron run, the time at which cron runs is sufficient to timestamp the data in RRD. When multiple datapoints accumulate between cron runs, each datapoint must carry an embedded timestamp so it can be stored correctly in RRD.
For example, the load plugin, which produces this for a collection interval of one minute or longer:

```
load.value 0.00
```
would need to produce output like this for a five-second collection interval:

```
load.value 1426889886:0.00
load.value 1426889891:0.00
load.value 1426889896:0.00
load.value 1426889901:0.00
load.value 1426889906:0.00
load.value 1426889911:0.00
load.value 1426889916:0.00
load.value 1426889921:0.00
load.value 1426889926:0.00
load.value 1426889931:0.00
load.value 1426889936:0.00
load.value 1426889941:0.00
```
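A single timestamped datapoint in this format can be produced with standard tools. Here is a minimal sketch, using `date +%s` for the epoch timestamp rather than the awk `systime()` call used in the plugin later in this post; it is Linux-only because it reads /proc/loadavg:

```shell
# Emit one Munin datapoint with an embedded epoch timestamp (Linux only).
now=$(date +%s)                         # seconds since the epoch
load=$(cut -d' ' -f1 /proc/loadavg)     # 1-minute load average
echo "load.value ${now}:${load}"
```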
Caching mechanism
In one example implementation of a one second collection rate, a datafile is either appended to or replaced using standard Linux file management mechanisms.
This looks a lot like a message queue problem. Each datapoint is published to the queue; the call to fetch acts as the subscriber, pulling all available messages and emptying the queue for the next cache window.
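The publish/drain cycle can be sketched with nothing more than a flat file standing in for the queue; the cache location below is purely illustrative:

```shell
# Flat-file sketch of the cache: producers append, fetch drains.
CACHE=$(mktemp)                               # illustrative cache location

# producer side: append one datapoint per sample
echo "load.value $(date +%s):0.05" >> "$CACHE"
echo "load.value $(date +%s):0.07" >> "$CACHE"

# fetch side: print every cached datapoint, then empty the cache
cat "$CACHE"
: > "$CACHE"
rm -f "$CACHE"
```

The Redis example later in this post replaces the file with a list and RPUSH/LPOP, which avoids the race between reading and truncating.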
Acquire must be long running
In the case of one minute resolution, the data can be generated at the moment it is collected. This means the process run by cron is sufficient to generate the desired data and can die after the data is output. For sub-minute resolution, a separate long running process is required to generate and cache the data. There are a couple of ways to accomplish this.
1. Start a process that runs only until the next cron run. This would be started each time cron fetches the data.
2. Create a daemon process that produces a continuous stream of data.
A possible pitfall with #2 above is that it would continue producing data even if the collection cron was failing. Option #1 results in more total processes being started.
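Option #1 can be implemented with the coreutils `timeout` command, which kills the sampler shortly after the next cron run is due. A minimal sketch of the lifetime calculation (the sampler command itself is left as a comment, since its name would depend on your plugin):

```shell
# Bound the sampler's lifetime to one cron cycle plus a little slack.
cron_cycle=1                                 # minutes between cron runs
lifetime=$(expr $cron_cycle \* 60 + 5)       # one cycle plus 5s of slack
echo "sampler may live for $lifetime seconds"
# launch the sampler in the background; timeout kills it after $lifetime:
#   timeout "$lifetime" ./sampler >/dev/null &
```

If the sampler is still running when its lifetime expires, `timeout` terminates it and reports exit status 124.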
Example using Redis
Redis is a very fast key/value datastore that runs natively on Linux. The following example shows how to use a Bash-based plugin with Redis as the cache between cron runs. On Ubuntu, Redis can be installed with apt-get.
```
sudo apt-get install -y redis-server
```
And here is the plugin.
```bash
#!/bin/bash
# (c) 2015 - Daniel Watrous

update_rate=5          # sampling interval in seconds
cron_cycle=1           # time between cron runs in minutes

pluginfull="$0"        # full name of plugin
plugin="${0##*/}"      # name of plugin
redis_cache="$plugin.cache"

graph="$plugin"
section="system:load"
style="LINE"           # alternatively: style="STACK"

run_acquire() {
    while [ "true" ]
    do
        sleep $update_rate
        datapoint="$(cat /proc/loadavg | awk '{print "load.value " systime() ":" $1}')"
        redis-cli RPUSH "$redis_cache" "$datapoint"
    done
}

# --------------------------------------------------------------------------

run_autoconf() {
    echo "yes"
}

run_config() {
cat << EOF
graph_title $graph
graph_category $section
graph_vlabel System Load
graph_scale no
update_rate $update_rate
graph_data_size custom 1d, 10s for 1w, 1m for 1t, 5m for 1y
load.label load
load.draw $style
EOF
}

run_fetch() {
    timeout_calc=$(expr $cron_cycle \* 60 + 5)
    timeout $timeout_calc $pluginfull acquire >/dev/null &
    while [ "true" ]
    do
        datapoint="$(redis-cli LPOP "$redis_cache")"
        if [ "$datapoint" = "" ]; then
            break
        fi
        echo $datapoint
    done
}

run_${1:-fetch}
exit 0
```
Restart munin-node to find plugin
Before the new plugin will be found and executed, it’s necessary to restart munin-node. If autoconf returns yes, data collection will start automatically.
```
ubuntu@munin-dup:~$ sudo service munin-node restart
munin-node stop/waiting
munin-node start/running, process 4684
```
It’s possible to view the cached values with the LRANGE command without disturbing their collection. Recall that calling fetch removes them from the queue, so leave that call to Munin.
```
ubuntu@munin-dup:~$ redis-cli LRANGE load_dynamic.cache 0 -1
1) "load.value 1426910287:0.13"
2) "load.value 1426910292:0.12"
3) "load.value 1426910297:0.11"
4) "load.value 1426910302:0.10"
```
That’s it. Now you have a Munin plugin with resolution down to the second.