How does kill by pgid affect new subprocesses?

Question

I have a fairly complicated application with its own init script; the main service runs as a daemon. I want to stop it quickly and gracefully, so I use this in the stop() function:

# ps pads the pgid with spaces, so strip them before testing and using it
group_id=$(ps -o pgid= -p "$(cat "$pidfile")" | tr -d ' ')
if [ -n "$group_id" ]; then
    kill -- "-$group_id"
    success
fi

This sends the terminate signal to all subprocesses/threads. But what if some of them spawn a new subprocess? It seems like these new subprocesses don't finish their job, but I need them to. If I send the terminate signal only to the parent daemon, all subprocesses finish their jobs successfully, but the shell reports that the service has already been stopped (see my previous topic - How to make init script print "OK" only when all subprocesses are stopped?). I am probably missing something obvious, but I still don't know how to fix this with minimal effort. Any ideas would be appreciated.

score 1 · Accepted Answer · edited Apr 13 '17 at 12:36

Killing by process group ID is atomic. A process won't escape by forking at the wrong time. For this purpose, you can treat kill, fork and setpgid as atomic operations that are executed in some definite order. If a process does if (!fork()) setpgid(), then the child will be killed if kill is executed after fork but before setpgid.

It seems like these new subprocesses don't finish their job

All the subprocesses receive the signal. That's the point of process groups.

If I send the terminate signal only to the parent daemon, all subprocesses finish their jobs successfully, but the shell reports that the service has already been stopped

That's to be expected. kill only sends the signal, and it can return to the script before the process has actually died: the target process might currently be blocking signals, for example. On a multiprocessor machine you might even observe the killed process doing something after kill -9 has returned.
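Because a returning kill says nothing about whether the target has exited, the script has to wait explicitly. Here is a minimal sketch, assuming the script runs as root (so kill -0 fails only when the PID is gone) and that polling once per second is acceptable. wait_for_exit is a hypothetical helper name, not part of any init function library:

```shell
# wait_for_exit PID [TIMEOUT_SECONDS]
# Hypothetical helper: polls until PID no longer exists or the timeout expires.
# Returns 0 if the process is gone, 1 if it is still there after the timeout.
wait_for_exit() {
    we_pid=$1
    we_timeout=${2:-10}
    # kill -0 sends no signal; it only checks whether the PID can be signalled
    while kill -0 "$we_pid" 2>/dev/null; do
        if [ "$we_timeout" -le 0 ]; then
            return 1
        fi
        sleep 1
        we_timeout=$((we_timeout - 1))
    done
    return 0
}
```

Note that a zombie that has not yet been reaped by its parent still counts as existing for kill -0, and that this only tells you about one PID, not about its children.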

If the shell script is checking that the master process has died, that only says that the master process has died. There's no way to wait for a process or process group to die except from the parent of the process or process group leader. There's no convenient way to check that all the processes in a process group have died if you don't set anything up when starting the master process.
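A crude workaround, if nothing was set up in advance, is to poll for remaining members of the group. The sketch below assumes ps accepts -e and -o pgid= (both POSIX options), that awk is available, and that the process group ID is not reused while polling. group_alive and wait_for_group_exit are hypothetical names:

```shell
# group_alive PGID
# Succeeds if at least one process still belongs to process group PGID.
group_alive() {
    ps -e -o pgid= | awk -v g="$1" '$1 == g { found = 1 } END { exit !found }'
}

# wait_for_group_exit PGID [TIMEOUT_SECONDS]
# Returns 0 once the group has no members, 1 if members remain after the timeout.
wait_for_group_exit() {
    wg_pgid=$1
    wg_timeout=${2:-30}
    while group_alive "$wg_pgid"; do
        if [ "$wg_timeout" -le 0 ]; then
            return 1
        fi
        sleep 1
        wg_timeout=$((wg_timeout - 1))
    done
    return 0
}
```

In stop(), calling wait_for_group_exit "$group_id" after the kill, and calling success only if it returns 0, would delay the "OK" until the group is empty. It does not catch processes that moved themselves to another process group.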

The normal way to deal with such situations is to have a throttle command in your daemon, that makes it shut down cleanly (including handling any work done by subprocesses) and only then report that the shutdown is complete and exit. Look at Apache HTTPD for example.

Alternatively, you might be able to detect the process tree if you prepare beforehand. See How to get the pid to the last background app
