I'm trying to understand the need for using nohup with background commands in ssh. My shell is csh on CentOS.

  1. The background command below continues running even after ssh exits. I was expecting this to only happen if nohup was prefixing the command.

    What's the scenario where nohup would be needed?

     ssh host 'sleep 80 >& /dev/null &'
    
  2. I also tried an interactive shell, and the PID of the background job still exists on host after the ssh exits.

     ssh host sleep 80 >& /dev/null & exit
    
  3. I also tried terminating the interactive session with kill -HUP PID rather than exit, and the PID of the background job still exists on host after the ssh exits.

Anything I'm doing wrong?

neo_coder

1 Answer

No, a background process group (job) is NOT killed by default when the session leader (the shell) exits, or when its controlling terminal is torn down.

There are only some special cases when that happens:

(1) The background job is stopped, in which case it will be sent a SIGHUP/SIGCONT pair of signals by the kernel. If the SIGHUP signal is not caught or ignored by a process, the process will terminate.
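The effect of ignoring SIGHUP can be checked locally without any ssh session. The following is a minimal sketch, assuming a POSIX sh and the standalone nohup(1) (not the csh builtin); the variable names are only illustrative. One sleep is started normally and one under nohup, both are sent SIGHUP by hand, and only the unprotected one dies.

```shell
# Start one plain background process and one with SIGHUP ignored via nohup(1).
sleep 30 &
plain=$!
nohup sleep 30 >/dev/null 2>&1 &
protected=$!
sleep 1                      # give nohup time to exec sleep

# Send SIGHUP to both by hand.
kill -HUP "$plain" "$protected"

# The plain sleep is terminated by the signal; in bash and dash, wait
# reports this as 128 + the signal number.
wait "$plain"
plain_status=$?

# The nohup'ed sleep ignores SIGHUP and is still running.
if kill -0 "$protected" 2>/dev/null; then
    protected_alive=yes
else
    protected_alive=no
fi
echo "plain exit status: $plain_status, protected alive: $protected_alive"

kill "$protected"            # clean up with the default SIGTERM
```

If the script itself was started with SIGHUP already ignored (for example under nohup), both processes inherit that and survive.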

The definition of a stopped job is: any job containing a stopped process. A process sleeping on a blocking system call like nanosleep(2) or read(2) is NOT considered stopped.
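The difference between a sleeping and a stopped process shows up in the state column of ps(1). A small sketch, assuming Linux with procps ps (where the state codes are S for interruptible sleep and T for stopped by a job-control signal):

```shell
# A process blocked in a sleep system call is "sleeping", not "stopped".
sleep 30 &
pid=$!
sleep 1
sleeping_state=$(ps -o stat= -p "$pid" | tr -d ' ')

# Only after a stop signal (SIGSTOP, SIGTSTP, ...) does it count as stopped.
kill -STOP "$pid"
sleep 1
stopped_state=$(ps -o stat= -p "$pid" | tr -d ' ')

echo "blocked in sleep: $sleeping_state, after SIGSTOP: $stopped_state"

# Clean up: resume the process, then terminate it.
kill -CONT "$pid"
kill "$pid"
```

Only the second state makes the job a "stopped job" in the sense of case (1).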

(2) A process tries to read or write to the terminal which no longer exists, and exits (of its own volition) because of the errors it gets when trying to do so.

(3) The job is actually a foreground job. The kernel sends a SIGHUP signal to the foreground process group when the session leader / controlling process (i.e. the shell) terminates. The controlling process is itself signaled with SIGHUP when its controlling terminal is torn down, which usually causes it to terminate.

Even commands started with & are actually part of the foreground process group when they're started from a shell with no job control (which in most shells --but not in csh-- is the default when running scripts and subshells).

(4) You're using a shell like bash or zsh, which goes out of its way to send a SIGHUP signal to all its jobs when it is itself signaled with SIGHUP (per point (3) above, the shell being the controlling process), or simply when it exits. The latter is the default in zsh; in bash it is off by default and controlled by the huponexit shopt option.
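To check how a given bash is configured, the option can be queried with shopt. A minimal sketch, assuming bash is installed and no startup file or environment variable has changed the option; `huponexit_state` is just an illustrative variable name:

```shell
# shopt -q prints nothing; its exit status is 0 if the option is on,
# non-zero if it is off.
if bash -c 'shopt -q huponexit'; then
    huponexit_state=on
else
    huponexit_state=off
fi
echo "huponexit is $huponexit_state"
```

Running `shopt -s huponexit` in an interactive login bash turns the behaviour on for that shell.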

The csh shell (either the real csh or tcsh) does not share that bash/zsh behaviour. In tcsh (but NOT in the real csh) you can start a command with the hup builtin in order to have it hup'ed when the shell exits:

tcsh% hup sleep 3600 &
tcsh% exit
$ pgrep sleep
[nothing]

(5) Your init system goes out of its way to clean up any terminated user sessions. In its default configuration, systemd will signal all the processes from a scope with a SIGTERM followed by a SIGKILL after a delay, so nohup will NOT help you with that anyway. Also, systemd's idea of a scope doesn't match a Unix process session, so running a command with setsid(1) will not let it escape the scope, either.

You can probably change systemd's behaviour by moving the KillUserProcesses=yes, KillMode=control-group, KillSignal=SIGTERM and SendSIGKILL=yes options away from their defaults.
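As an illustration of the first of those options, KillUserProcesses is a systemd-logind setting. A sketch of the change, assuming the stock file location (a drop-in under logind.conf.d/ should work as well) and that systemd-logind is restarted afterwards:

```ini
# /etc/systemd/logind.conf  (assumed path)
[Login]
# Do not kill a user's remaining processes when their session ends.
KillUserProcesses=no
```

The other three options are kill settings of the unit or scope itself, not of logind.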

  • Thanks. So, in real csh: what's the scenario where nohup would be needed? Is it not needed for real csh? – neo_coder Apr 07 '20 at 17:57
  • (1) and (3). And (5) with a specially configured systemd. –  Apr 07 '20 at 17:59
  • FWIW, notice that nohup is a builtin in csh, and does NOT do some of the actions that a standalone nohup program may do (like redirect stdout and stderr, and maybe even stdin). It's a perfectly fine idea to read the csh(1) manpage, too. And in general, nohup is not some magic tool -- it just starts a program with SIGHUP ignored. The standalone nohup is also subject to nasty races. –  Apr 07 '20 at 18:38
  • If you find anything dubious in this answer, please point it out directly. Don't let it "float". There were already similar questions like this or this. –  Apr 07 '20 at 18:51
  • Especially this comment applies here too. –  Apr 07 '20 at 18:58
  • Thanks for the pointers! ->For csh, "all processes detached with ‘&’ are effectively nohup'ed." – neo_coder Apr 08 '20 at 19:16
  • For the case (1) you mentioned: I tried that (killing the PID with the HUP signal), which is "3." in my question, but the PID of the background job still exists on host after the ssh exits. This leaves cases (3) and (5) that you mentioned. Is that correct? – neo_coder Apr 08 '20 at 19:19
  • Not really. The "effectively nohupped" from the csh(1) manpage is a bit of hyperbole. If the background process is stopped, it will be hupped (and terminated if it doesn't catch or ignore the HUP signal). To see that in action, run ssh -t localhost csh, and inside the csh, run sleep 3600 & followed by kill -TSTP $!, and then either exit the shell (ignoring the warning about stopped jobs) or break the ssh connection by pressing <Enter>, then ~ and then a period (.). The stopped sleep process will be killed. Hence, case (1) applies to csh, too. –  Apr 08 '20 at 22:09
  • I tried that and can't get it to follow the behavior you're describing. Anything I'm doing wrong? https://www.picpasteplus.com/v.php?i=6c74116315

    Also, I'm concerned about the ssh from the origin host getting killed (machine going down). The background task is still there on the destination host, without explicit nohup.

    – neo_coder Apr 09 '20 at 00:42
  • You didn't actually exit the csh shell. csh warned you that there are stopped jobs and refused to exit. You should enter exit again. Or simply break the ssh connection with ~. at the beginning of the line, as described (no, that won't kill the ssh server ;-)). Instead of ssh you can use script(1), a terminal emulator, a tmux or screen window, etc. –  Apr 09 '20 at 00:45
  • Thank you, exiting with "~." is now showing the behavior you described with the ssh -t.
    For non-interactive ssh sessions like the one I mentioned in "1.": ssh host 'sleep 80 >& /dev/null &'. Is nohup needed?
    Regarding the concern about the ssh from the origin host getting killed (machine going down). The background task is still there on the destination host, without explicit nohup. Does that mean nohup is not needed in this scenario?
    – neo_coder Apr 09 '20 at 01:42
  • @neo_coder yes, it's needed even for non-interactive sessions IF the async/bg process may be stopped and your login shell does job control even when in non-interactive mode. Like csh does ;-) Or, a bit different from your 1. example, if the bg process is not stopped and the shell is NOT interactive and is not doing job control, but you're running ssh with the -t option. –  Apr 09 '20 at 02:52
  • Instead of trying to tell apart all those special cases, just run daemons as daemons (detached from the terminal, in their own session, with their standard fds redirected to /dev/null, etc). It makes NO SENSE to pretend that the background jobs (an abstraction intended for interactive shells), or the shell's & syntax should be used to spawn long-running, non-interactive processes. –  Apr 09 '20 at 02:58
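The "run daemons as daemons" advice from the last comment can be approximated from a shell with setsid(1). The following is a rough sketch, assuming Linux with util-linux setsid and procps ps; `sleep 30` stands in for the real long-running command, and the pidfile is only there so the script can find the process afterwards. Per case (5) above, this does not escape a systemd scope.

```shell
pidfile=$(mktemp)

# setsid(1) runs the command in a new session without a controlling terminal.
# The inner sh records its own PID, then exec replaces it with the real
# command, so the PID stays the same. All standard fds point at /dev/null.
setsid sh -c 'echo $$ > "$1"; exec sleep 30' sh "$pidfile" \
    </dev/null >/dev/null 2>&1 &
sleep 1

daemon_pid=$(cat "$pidfile")
# A session leader's session ID equals its own PID.
daemon_sid=$(ps -o sid= -p "$daemon_pid" | tr -d ' ')
echo "pid=$daemon_pid sid=$daemon_sid"

# Clean up the demonstration process and the temporary file.
kill "$daemon_pid"
rm -f "$pidfile"
```

Over ssh with csh as the remote login shell, the redirections in the quoted command would need csh syntax (`>& /dev/null`) instead.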