
I need to monitor the status of the nagios service, because whenever I change the configuration and apply the new configuration, I find the nagios service stopped. If it is found in the 'stopped' state, it should be started automatically.

I wrote a small shell script and added it to cron, but found that the script would execute every time, even when the service was running. The script is:

#!/bin/bash

service nagios status | grep running

if [ $? -ne 0 ]
then
        service nagios start
fi

Whenever the service is in the 'stopped' state, the output of service nagios status shows "No lock file found in /usr/local/nagios/var/nagios.lock". Should I monitor the /usr/local/nagios/var/nagios.lock file using inotify-tools, or is there some better alternative to this?

    Yes, there are better alternatives. They are called process supervisors. Modern init systems (upstart and systemd) have them built-in. – jordanm Jan 14 '15 at 07:05

1 Answer


is there some better alternative to this?

Yes. Use a proper service manager and junk that /etc/init.d/nagios script. At best, you're using a System 5 rc script in compatibility mode under something like upstart or systemd, in which case you won't be getting the useful service management mechanisms that aren't available in compatibility mode, just as one person at AskFedora did not. At worst, you're running things under System 5 rc, and you don't really have a hope of doing halfway decent service management with that script.

There are plenty of service management systems available. I'm not going to go into the details of installing them, because that's way beyond the scope of this answer. Instead, I'll focus simply on how to get the nagios dæmon up and running within them.

The daemontools family

The daemontools family of service management toolsets includes nosh, runit, s6, perp, daemontools, daemontools-encore, and freedt.

The major thing that you need here is a program that becomes the dæmon. For nagios, this is a 2- or 3-liner. One can mix and match the toolsets. Here are some suitable 2-liners, using several different toolsets:

  • A run file with the nosh toolset:
    #!/bin/nosh
    setuidgid nagios
    nagios
    and a restart file that causes unconditional automatic restart:
    #!/bin/sh
    exec true
    Just for kicks, I've added a pre-built service bundle for nagios to nosh, which will be available in version 1.13. It is pretty much this, with a couple of standard frills such as dependency information.
  • A run file with the runit toolset:
    #!/bin/sh -e
    exec chpst -u nagios nagios
  • A run file with the s6 toolset:
    #!/command/execlineb -P
    s6-setuidgid nagios
    nagios
  • A run file with the daemontools, daemontools-encore, or freedt toolsets:
    #!/bin/sh -e
    exec setuidgid nagios nagios
  • An rc.main file with the perp toolset:
    #!/bin/sh -e
    exec 2>&1
    start() { exec runuid nagios nagios; }
    reset() { exit 0; }
    eval "$1" "$@"

This program to run the individual service is the only thing particular to this service. The rest doesn't vary from service to service. One queries service status the same way across all services, with a command such as svstat. Enabling and disabling automatic startup at bootstrap is a matter of symbolic links. Manually starting and stopping a service is a matter of svc -u and svc -d. And so forth.
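To make that concrete, here is a minimal sketch of laying out a daemontools-style service directory. It works in a scratch location so that it can be tried without touching a live system. SERVICE_ROOT and SCAN_DIR are hypothetical names of mine, not anything the toolsets mandate; the real scan directory depends on the toolset and on how it was installed.

```shell
#!/bin/sh
# Sketch: build a daemontools-style service directory in a scratch location.
# SERVICE_ROOT and SCAN_DIR are hypothetical; adjust them to your toolset's layout.
set -e

SERVICE_ROOT="${SERVICE_ROOT:-$(mktemp -d)}"
SCAN_DIR="${SCAN_DIR:-$SERVICE_ROOT/scan}"
svc_dir="$SERVICE_ROOT/nagios"

mkdir -p "$svc_dir" "$SCAN_DIR"

# The run file is the only service-specific part (daemontools flavour shown).
cat > "$svc_dir/run" <<'EOF'
#!/bin/sh -e
exec setuidgid nagios nagios
EOF
chmod +x "$svc_dir/run"

# Enabling automatic startup is a symbolic link into the scanned directory.
ln -sfn "$svc_dir" "$SCAN_DIR/nagios"

# With a real svscan watching SCAN_DIR, day-to-day control would be:
#   svstat "$SCAN_DIR/nagios"   # query status
#   svc -u "$SCAN_DIR/nagios"   # bring the service up
#   svc -d "$SCAN_DIR/nagios"   # bring the service down
echo "service directory: $svc_dir"
```

Removing the symbolic link again disables automatic startup. Nothing in the service directory itself has to change.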

nosh has several shims if one needs them, such as systemctl status and initctl status. Importantly, it has a service shim, so if you really like service nagios status you can keep using it. ☺ But forget about that whole nasty looking at a lockfile business, and the whole notion of ad-hoc monitoring. You don't deal in anything like that with a proper service manager in place. The service manager does the monitoring, and keeps proper track of the dæmon process.
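As a rough illustration of why no lockfile is needed, and emphatically not how any of these toolsets is actually implemented, the core idea is that the supervisor is the parent of the dæmon and simply waits for it to exit. The function name supervise_sketch and the bounded start count are my own inventions to keep the demonstration finite; a real supervisor loops indefinitely and handles signals, logging, and restart throttling.

```shell
#!/bin/sh
# Conceptual sketch of supervision; real service managers do far more.
# supervise_sketch and its start limit are hypothetical, for illustration only.

supervise_sketch() {
    max_starts=$1; shift
    starts=0
    while [ "$starts" -lt "$max_starts" ]; do
        starts=$((starts + 1))
        # Run the daemon in the foreground as our child and wait for it to exit.
        "$@" || true
        echo "child exited; start count so far: $starts"
    done
}

# Stand-in for a daemon that dies immediately; a real setup would exec the run file.
supervise_sketch 3 false
```

Because the supervisor learns of the exit directly from the kernel, there is no polling interval and no stale status file to misread.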

systemd

There isn't a systemd service unit file for nagios that comes in the box. But a lot of people have already written their own.
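For orientation, here is a minimal sketch of what such a unit might contain, written to a scratch file for review rather than installed. This is my own guess, not any of those published units. The ExecStart paths are assumptions based on the /usr/local/nagios prefix in the question, and the nagios user account is assumed to exist.

```shell
#!/bin/sh
# Sketch: write a candidate nagios.service to a scratch file for review.
# UNIT_FILE, the ExecStart paths, and the nagios account are assumptions.
set -e

UNIT_FILE="${UNIT_FILE:-$(mktemp)}"

cat > "$UNIT_FILE" <<'EOF'
[Unit]
Description=Nagios monitoring daemon
After=network.target

[Service]
# Run in the foreground (no -d option) so systemd tracks the process itself.
User=nagios
ExecStart=/usr/local/nagios/bin/nagios /usr/local/nagios/etc/nagios.cfg
# Restart automatically whenever the process exits.
Restart=always

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $UNIT_FILE"
# After review, one would copy it to /etc/systemd/system/nagios.service, then:
#   systemctl daemon-reload
#   systemctl enable nagios.service
#   systemctl start nagios.service
```

Restart=always is what replaces the cron job: systemd restarts the process whenever it exits, with no lockfile involved.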

systemd doesn't come with a shim service command, but some Linux distributions have one from another source.


Stephen Kitt
JdeBP