When I spawn a background job in Bash 5.1 and immediately send it a signal, the signal seems to get lost. Short demo:

$ cat simple.cc
#include <signal.h>
#include <unistd.h>
#include <stdlib.h>

// Report the signal on stderr and exit.
static void handler(int, siginfo_t *, void *) {
    write(2, "Got SIGINT\n", 11);
    exit(0);
}

int main() {
    struct sigaction act = {};  // zero-initialize so sa_mask is not garbage
    act.sa_flags = SA_SIGINFO | SA_RESTART;
    act.sa_sigaction = handler;
    if (sigaction(SIGINT, &act, nullptr)) { exit(1); }

    while (true) {}  // spin until a signal arrives
}

$ g++ -Wall -Wextra simple.cc -o simple

$ ./simple & kill -SIGINT $!
[1] 540434
$ # nothing happens
$ kill -SIGINT 540434
Got SIGINT

My assumption is that at the time the signal hits, the forked background process is still executing Bash. Bash tries to forward SIGINT to the foreground process, but there is none, and thus SIGINT is dropped.

My questions:

  1. Is this correct? How can I validate this is what actually happens?
  2. Assume I cannot modify the program to be run in the background. How can I make sure it is already active and will process signals? A simple sleep 1 would help here but I'm looking for proper synchronization. Note that I do not care if the intended signal handlers are set up, just that the correct binary runs.

1 Answer

Note: this answer requires /proc.


[…] SIGINT is dropped. […] Is this correct?

Basically yes, although I'm not sure your description is accurate ("forward SIGINT"?). When the main (in your example: interactive) bash forks and the execution of the line moves beyond &, there is a process with PID $! and the signal gets to it. The problem is that this process is initially another bash that is about to exec ./simple. Apparently it is this other bash that receives the signal. It does not exit, though; it goes on to replace itself with ./simple, but by then the signal has already been "used up".

I'm neither a programmer nor a *nix expert. I'm not sure my description is accurate. Even if it's not totally accurate, the rest of this answer may be useful.


How can I validate this is what actually happens?

The time window in which the issue can occur is pretty narrow. Usually it's enough to delay kill to "solve" the problem. In theory, though, any fixed delay, no matter how long, may turn out to be too short in a particular run. You seem to understand this when you write about "proper synchronization".

kill is a builtin in Bash, so it is fast enough to trigger the issue. An external kill (e.g. /bin/kill) can be used instead. It is spawned as a separate process; this takes time, and in my tests it never triggered the issue.
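To check which kill a given shell would run, something like the following sketch may help. The variable names kill_kind and kill_path are made up for illustration, and whether an external kill exists in PATH depends on the system.

```shell
# Ask Bash how it resolves the name "kill".
kill_kind=$(type -t kill)   # "builtin" in Bash
kill_path=$(type -P kill)   # path of an external kill found in PATH, empty if none

echo "kill is a $kill_kind; external candidate: ${kill_path:-none found}"
```

If an external binary is found, invoking it by that path (or via env kill) goes through fork and exec, which is where the extra delay comes from.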

I tried to catch the other bash before it replaces itself with ./simple by examining /proc/$!/exe. Unfortunately ls -l and readlink are too slow because they are not builtins.

A useful builtin is test or its synonym [. My bash is /bin/bash and I can do this:

./simple & [ "/proc/$!/exe" -ef /bin/bash ] && echo gotcha

(While testing this, don't forget to killall simple or so.) See help test in Bash to learn what -ef does.
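As a rough illustration of -ef: it is true when both paths resolve to the same file (same device and inode), and symlinks are followed first. That is why it works on /proc/PID/exe, which is a symlink. The sketch below uses throwaway files in a temporary directory; all file names are made up.

```shell
tmp=$(mktemp -d)
echo data > "$tmp/a"
ln "$tmp/a" "$tmp/hard"      # hard link: same inode as a
ln -s "$tmp/a" "$tmp/soft"   # symlink: resolved before the comparison
cp "$tmp/a" "$tmp/copy"      # separate file with identical contents

[ "$tmp/a" -ef "$tmp/hard" ] && hard=same || hard=different
[ "$tmp/a" -ef "$tmp/soft" ] && soft=same || soft=different
[ "$tmp/a" -ef "$tmp/copy" ] && copy=same || copy=different

echo "hard link: $hard, symlink: $soft, copy: $copy"
rm -r "$tmp"
```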

Let's see how many tests I can perform before bash turns into simple:

./simple & while [ "/proc/$!/exe" -ef /bin/bash ]; do echo a; done

In my tests the number of echoed strings was about five. Initially I tried to count them by piping to wc -l but this introduced delay and the result was 0.
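One way to count the iterations without a pipe is a shell variable, since arithmetic expansion is done by the shell itself and adds no extra process. This is only a sketch: sleep 5 stands in for ./simple, and the comparison uses /proc/$$/exe (the running shell's own executable) instead of a hardcoded /bin/bash. The resulting number varies from run to run.

```shell
sleep 5 &    # stand-in for ./simple
n=0
# Count how many tests succeed while the child is still the shell's binary.
while [ "/proc/$!/exe" -ef "/proc/$$/exe" ]; do n=$((n + 1)); done
echo "tests that still saw the shell binary: $n"
kill "$!"    # clean up the stand-in
```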

The time window when the main code continues and the background process is not yet ./simple is indeed pretty narrow, but it definitely exists.


Assume I cannot modify the program to be run in the background. How can I make sure it is already active and will process signals? […] Note that I do not care if the intended signal handlers are set up, just that the correct binary runs.

Test /proc/$!/exe in a loop. Exit the loop after bash becomes simple. The test doesn't have to be super fast. Mind this:

  • while [ "/proc/$!/exe" -ef /bin/bash ]; do :; done hardcodes /bin/bash.
  • until [ "/proc/$!/exe" -ef ./simple ]; do :; done will loop indefinitely if simple exits fast enough or is never executed (e.g. due to some error).

I think this is a reasonable approach:

./simple & while [ "/proc/$!/exe" -ef "/proc/$$/exe" ]; do :; done; kill -s INT "$!"

In my tests the signal kept getting to simple before the handler was set up; each time the main shell informed me later the job was interrupted. You wrote "I do not care if the intended signal handlers are set up, just that the correct binary runs". So yeah, this is how you can do it.
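The one-liner above could be wrapped in a function along these lines. This is a sketch under assumptions: start_and_signal is a hypothetical name, Linux-style /proc is required, and the loop would keep spinning for as long as the command runs if the command is itself executed by the same shell binary (e.g. a Bash script).

```shell
# Hypothetical helper: run a command in the background, wait until the
# child no longer runs this shell's executable, then send it a signal.
start_and_signal() {
    local sig=$1
    shift
    "$@" &
    local pid=$!
    # The test is false once the child has exec'ed the new binary, and
    # also when /proc/$pid/exe is gone because the child already exited.
    while [ "/proc/$pid/exe" -ef "/proc/$$/exe" ]; do :; done
    kill -s "$sig" "$pid"
}

# Usage (hypothetical): start_and_signal INT ./simple
```

After the call, $! still holds the PID of the background job, so wait "$!" can collect its exit status. Note that in a non-interactive script (job control off) background commands start with SIGINT ignored, so a different signal may be needed there.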