Nagios - Plugin to check CPU temperature

I like to monitor every aspect of my Linux servers, and one of those aspects is the CPU temperature. The most common way to read the CPU temperature on pretty much any Linux distribution is the lm_sensors package. It’s quite simple to install:
Debian/Ubuntu:
apt-get install lm-sensors
CentOS/RedHat/Fedora:
yum install lm_sensors
Suse/OpenSuse:
zypper install lm_sensors
For other distros, check the official documentation to find out which tool is used by default to install packages.
I was able to find a few bash scripts based on lm_sensors, but most of them only took into account the temperature of a single CPU core. My approach is a little different. Most of my servers have 2 CPUs, so I would like to know the temperature of each of them. Also, if a server has, say, 4 CPUs, I want to be able to run the exact same script and get the temperature for each of the 4 CPUs.
Because I was not able to find anything that suited my needs, I decided to write my own Nagios plugin in Python, based on the lm_sensors package and the Python module pysensors. You can find the plugin here: check_cpu_temp.
As I said, you’ll need to install the lm_sensors package and the Python module pysensors (pip install pysensors).
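To illustrate the "one reading per physical CPU" idea, here is a minimal sketch. It is not the plugin's actual code and it does not use pysensors; instead it parses the text printed by the sensors command, assuming the coretemp driver labels each CPU package "Package id N" (or "Physical id N" on older kernels). The function name parse_package_temps and the sample text are made up for this example.

```python
import re

# Matches coretemp package lines such as:
#   Package id 0:  +29.0°C  (high = +80.0°C, crit = +96.0°C)
# Older kernels label the same line "Physical id 0" (assumption stated above).
PACKAGE_LINE = re.compile(
    r"^(?:Package|Physical) id (\d+):\s*\+?(-?\d+(?:\.\d+)?)\s*°C"
)


def parse_package_temps(sensors_output):
    """Return {package_id: temperature_in_celsius} for every CPU package found."""
    temps = {}
    for line in sensors_output.splitlines():
        match = PACKAGE_LINE.match(line.strip())
        if match:
            temps[int(match.group(1))] = float(match.group(2))
    return temps


# Hypothetical sample of `sensors` output for a two-socket server.
SAMPLE = """\
coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +29.0°C  (high = +80.0°C, crit = +96.0°C)
Core 0:        +27.0°C  (high = +80.0°C, crit = +96.0°C)

coretemp-isa-0001
Adapter: ISA adapter
Package id 1:  +33.0°C  (high = +80.0°C, crit = +96.0°C)
Core 0:        +31.0°C  (high = +80.0°C, crit = +96.0°C)
"""

if __name__ == "__main__":
    print(parse_package_temps(SAMPLE))
```

On a real host the input text could come from subprocess.run(["sensors"], capture_output=True, text=True).stdout. Per-core lines ("Core 0", "Core 1", ...) are deliberately ignored, so a 4-socket machine would simply yield four entries.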
As with most programs on Linux, running the plugin with -h (help) displays some basic usage instructions:
./check_cpu_temp -h
usage: check_cpu_temp [-h] [-w WARN] [-c CRIT]

Nagios plugin to check CPU(s) temperature(s)

optional arguments:
  -h, --help            show this help message and exit
  -w WARN, --warn WARN  Check temperature against a custom HIGH value
  -c CRIT, --crit CRIT  Check temperature against a custom CRIT value
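A command-line interface with this shape can be built with Python's standard argparse module. The sketch below is a guess at how such a parser might look, not the plugin's real source; in particular, parsing the thresholds as floats and defaulting them to None are assumptions.

```python
import argparse


def build_parser():
    """Build a parser with the options shown in the help text above."""
    parser = argparse.ArgumentParser(
        prog="check_cpu_temp",
        description="Nagios plugin to check CPU(s) temperature(s)",
    )
    # Assumption: thresholds are floats; None means "use the limits
    # reported by the sensor itself".
    parser.add_argument(
        "-w", "--warn", type=float, default=None,
        help="Check temperature against a custom HIGH value",
    )
    parser.add_argument(
        "-c", "--crit", type=float, default=None,
        help="Check temperature against a custom CRIT value",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.warn, args.crit)
```

argparse adds the -h/--help option automatically, which is why it does not appear in the code.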
Then simply run the plugin:
./check_cpu_temp
The output should be something like this:
OK - CPU(s) temperature(s): 29°C 33°C; high=80.0; crit=96.0
By default, if you run the plugin with no arguments, the HIGH and CRIT temperatures are the default ones for your CPU(s). You can also set your own values with the -w (warning) and -c (critical) arguments. Assuming you want 50 degrees Celsius for HIGH and 60 degrees Celsius for CRIT, the syntax becomes:
./check_cpu_temp -w 50 -c 60
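Nagios decides the state of a service from the plugin's exit code: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The following sketch shows one way the threshold check could map readings to those codes. The function name evaluate and the choice of "greater than or equal" comparisons are assumptions made for illustration; the real plugin may compare differently.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def evaluate(temps, high, crit):
    """Return (exit_code, message) for a list of CPU temperatures in Celsius."""
    if not temps:
        return UNKNOWN, "UNKNOWN - no CPU temperature readings found"

    # The hottest CPU decides the overall state.
    worst = max(temps)
    if worst >= crit:
        code = CRITICAL
    elif worst >= high:
        code = WARNING
    else:
        code = OK

    readings = " ".join("{:.0f}°C".format(t) for t in temps)
    message = "{} - CPU(s) temperature(s): {}; high={}; crit={}".format(
        LABELS[code], readings, float(high), float(crit)
    )
    return code, message


if __name__ == "__main__":
    import sys

    code, message = evaluate([29.0, 33.0], high=50, crit=60)
    print(message)
    sys.exit(code)  # the exit code is what Nagios actually evaluates
```

With -w 50 -c 60, readings of 29°C and 33°C stay OK, a reading of 55°C would produce WARNING, and anything at 60°C or above would produce CRITICAL.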
I mostly use CentOS 7.x on my servers, so the example below is specific to CentOS:
NRPE (nrpe.cfg):
command[check_cpu_temp]=/usr/bin/sudo /usr/lib64/nagios/plugins/check_cpu_temp $ARG1$
Nagios command definition (commands.cfg):
define command {
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$
}
Host service definition:
define service {
    use                     generic-service
    host_name               srv1
    service_description     CPU Temperature
    check_command           check_nrpe!check_cpu_temp!"-w 60 -c 75"
    notifications_enabled   1
}
Enjoy!