A process to check if other processes are running?

Question

I am currently deploying computers in my client's house.

I am running the following scripts:

ngrok (an ssh forward tunneling daemon)
heartbeat.py (a script which sends a heartbeat signal to loggly which confirms my computer is alive)
metrics.py (a script which logs all the environmental data such as temp, disk space to loggly)

So in my tests so far, metrics.py is somewhat unstable (meaning it crashes occasionally).

Is there a package in *NIX which does the following?

1. Check every X seconds whether a process is running
2. If #1 is not true, run it
3. Do this for a list of processes

you appear to be describing init – Jasen Dec 20 '16 at 03:29

score 1 · Answer 1 · answered Dec 20 '16 at 03:48

Much as I dislike systemd, I have to admit it can definitely do that.
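For illustration, here is a rough sketch of what such a systemd service could look like, written as a small shell function that prints the unit file. The function name `print_unit`, the description text, and the path `/opt/monitor/metrics.py` are made up for the example; `Restart=` and `RestartSec=` are standard options of systemd service units.

```shell
# print_unit DESCRIPTION COMMAND
# Prints a minimal systemd service unit that restarts COMMAND whenever it exits.
# The function name and the example path below are hypothetical.
print_unit() {
    cat <<EOF
[Unit]
Description=$1

[Service]
ExecStart=$2
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
}

# Example: a unit for the asker's metrics.py (the install path is an assumption).
print_unit "metrics logger" "/usr/bin/python3 /opt/monitor/metrics.py"
```

The output would typically be saved under /etc/systemd/system/ (for example as metrics.service) and then enabled with systemctl. One unit per script covers the "list of processes" part of the question.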

Not all init systems support automatically restarting failed processes.

However, note that checking whether a process is still "running" is only the most rudimentary health check you can do. It's better if the program's main loop can check for some kind of "are you still alive?" message and reply to it. Then you know it hasn't got stuck in an infinite loop, or stuck waiting for I/O that won't complete.
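A simpler variant of the same idea, sketched below, replaces the request/reply message with a heartbeat file: the program's main loop writes the current Unix time into a file on every iteration, and a separate check treats the program as stuck if that timestamp is too old. This is an assumption about how the scripts could be modified, not something they do today; the function name `heartbeat_is_fresh` is hypothetical.

```shell
# heartbeat_is_fresh FILE MAX_AGE_SECONDS
# Succeeds if FILE contains a Unix timestamp no older than MAX_AGE_SECONDS.
# Assumes the monitored program writes `date +%s` style timestamps to FILE.
heartbeat_is_fresh() {
    hb_file=$1
    hb_max_age=$2
    # A missing or unreadable file counts as "not alive".
    [ -r "$hb_file" ] || return 1
    hb_last=$(cat "$hb_file")
    # Reject empty or non-numeric content.
    case $hb_last in
        ''|*[!0-9]*) return 1 ;;
    esac
    hb_now=$(date +%s)
    [ $((hb_now - hb_last)) -le "$hb_max_age" ]
}
```

A watchdog could call this alongside the "is it running" check and restart the program when either one fails.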

answered Dec 20 '16 at 03:48

DepressedDaniel

I'm not aware of any init that can't restart processes. sysv init (processes defined in /etc/inittab) and upstart both can. – Jasen Dec 20 '16 at 20:24

@Jasen https://engineeringblog.yelp.com/2016/01/dumb-init-an-init-for-docker.html – DepressedDaniel Dec 20 '16 at 20:32

score 0 · Answer 2 · answered Dec 20 '16 at 04:45

A simple script can probably help:

ps -axu | grep '[n]grok' >/dev/null 2>&1 || bash -c "ngrok"

The command above checks whether ngrok is running and, if it is not, starts it. The brackets in the grep pattern keep the grep process itself out of the results.

Add this to your crontab and the check will run periodically.

NOTE:

You may need to add a delay between the check and the restart. An upper retry limit is also needed, in case ngrok hits a critical error and cannot start again.
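One way to get both the delay and the retry limit is a small wrapper that runs the command in the foreground and starts it again when it exits, instead of polling from cron. This is a different technique from the one-liner above, and only a minimal sketch: the function name `supervise` and its argument order are made up for the example, and it treats a zero exit status as an intentional stop.

```shell
# supervise MAX_STARTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND in the foreground. If it exits with a non-zero status, waits
# DELAY_SECONDS and starts it again, giving up after MAX_STARTS starts.
# The function name and argument order are hypothetical.
supervise() {
    sv_max=$1
    sv_delay=$2
    shift 2
    sv_starts=0
    while [ "$sv_starts" -lt "$sv_max" ]; do
        sv_starts=$((sv_starts + 1))
        if "$@"; then
            return 0    # clean exit: treat as an intentional stop
        fi
        echo "supervise: '$1' failed (start $sv_starts of $sv_max)" >&2
        # Pause before the next attempt, but not after the last one.
        if [ "$sv_starts" -lt "$sv_max" ]; then
            sleep "$sv_delay"
        fi
    done
    return 1
}
```

For example, `supervise 5 10 ngrok` would start ngrok up to five times with a ten-second pause between attempts, then return a non-zero status so the caller can log or alert on the failure.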
