1

I have a sequence of processes in a pipeline:

cat haystack | grep needle | my_process | less

My understanding is that the shell forks and runs execve for each command concurrently.

In my_process, how could I determine if grep and less have fully started (execve'd), assuming I have already determined their PIDs?

ojb
  • 4
    What is that supposed to be good for? – Hauke Laging Jan 17 '18 at 08:09
  • Check for EOF on stdin and trap SIGPIPE for stdout. And, in bash at least, if you want to know the exit status of each command in the most recently executed pipeline, examine the PIPESTATUS array. – cas Jan 17 '18 at 08:37
  • @HaukeLaging for establishing a side channel to pipes in a way that is backwards compatible with other processes – ojb Jan 17 '18 at 09:52
  • 1
    And what is this side-channel actually for? It cannot be for the stated purpose, because overlaying a new process image file is not the same point as the point that a program actually starts doing its main work, let alone the (potentially several) points at which a program like less sets up and tears down its full-screen TUI. – JdeBP Jan 17 '18 at 11:08
  • @JdeBP I need it to determine when the neighbouring process(es) are also an instance of my_process. I don't know how many processes there will be beforehand and cannot modify the shell. I was doing this by checking /proc//exe; however, there are problems when the neighbour is between fork and execve. I was going to use the answer to wait until the execve had completed, since this would indicate that /proc//exe is valid, i.e. that it no longer points to the shell executable. – ojb Jan 17 '18 at 11:28

3 Answers

2

Well, on Linux, you could check /proc/PID/exe:

(p=$BASHPID; /bin/ls -l /proc/$p/exe; exec /bin/ls -l /proc/$p/exe)
... 0 Jan 17 10:34 /proc/17816/exe -> /bin/bash
... 0 Jan 17 10:34 /proc/17816/exe -> /bin/ls

But I can't really see what that's good for. The shell won't read from or write to the pipe before the exec, so the pipeline works seamlessly even if there is a small window of time before the exec. And it really is a small window: I wouldn't be surprised if the exec had already happened by the time you get around to checking what program is running.
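If my_process does need to wait for a neighbour to leave the fork-to-exec window, the /proc/PID/exe check can be polled. The following is only a minimal Linux-only sketch in Python: the function name `wait_for_exe_change`, its parameters, and the timeout values are hypothetical, and it assumes the caller already knows the path of the shell executable the neighbour was forked from.

```python
import os
import time


def wait_for_exe_change(pid, old_exe, timeout=1.0, interval=0.001):
    """Poll /proc/<pid>/exe until it stops pointing at old_exe.

    Returns the new executable path, or None if the timeout expires
    or the process cannot be inspected (gone, or not permitted).
    Hypothetical helper; Linux only.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            exe = os.readlink(f"/proc/{pid}/exe")
        except OSError:
            return None
        if exe != old_exe:
            return exe  # the process has exec'd something else
        time.sleep(interval)
    return None
```

For example, `wait_for_exe_change(neighbour_pid, "/bin/bash")` would return once the neighbour has exec'd something other than /bin/bash. Note that this cannot tell a fork-to-exec window apart from a neighbour that really is a bash process, so a timeout is still needed.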

ilkkachu
2

This is what inheriting open file descriptors is for.

Make a FIFO. Open a close-on-exec write-only file descriptor to it in the parent shell. All of the fork()ed children will inherit it, and then close it when they execve(). Open a read-only file descriptor to it in the process that needs to detect the execve(), or have that process inherit an already-open read-only file descriptor. When the write-only ends are closed by the execve(), the read-only end will return EOF.

To detect individual execve()s, generalize to multiple FIFOs. Indeed, at that point you need not bother with FIFOs at all and can just use a second set of pipes, with their write file descriptors set to close-on-exec.
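The pipe variant can be sketched roughly as follows. This is an illustration in Python, not something built into a shell pipeline: here the Python process plays the role of the parent, the function names are hypothetical, and it relies on `os.pipe()` returning close-on-exec descriptors (Python 3.4 and later). EOF on the read end means the child has either completed its execve() or exited, so the caller still has to tell those two cases apart.

```python
import os


def spawn_with_exec_notifier(argv):
    """Fork and exec argv; return (pid, read_fd).

    read_fd reaches EOF once the child has exec'd successfully or has
    exited, because the only remaining write end is close-on-exec.
    Hypothetical helper for illustration.
    """
    # On Python 3.4+ both ends are non-inheritable, i.e. close-on-exec.
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: write_fd stays open until execvp() succeeds.
        os.close(read_fd)
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # reached only if the exec failed
    # Parent: close our write end so the child's copy is the last one.
    os.close(write_fd)
    return pid, read_fd


def wait_for_exec_eof(read_fd):
    """Block until every write end of the pipe has been closed."""
    while os.read(read_fd, 1):
        pass
    os.close(read_fd)
```

A caller would do `pid, fd = spawn_with_exec_notifier(["sleep", "5"])` followed by `wait_for_exec_eof(fd)`. Using one such pipe per child is what allows individual execve()s to be distinguished.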

Since you have not explained what this is actually for, working out how to build this into what you are actually trying to do is your task alone.

JdeBP
1

On Linux only, and with zsh:

$ zmodload zsh/stat
$ (zstat +link /proc/*/fd/0(e'{[[ $REPLY -ef /proc/self/fd/1 ]] &&
    reply=$REPLY:h:h/exe}')) | cat
/bin/cat

That gives you the path of the executable of process(es) that have the same pipe as that subshell's stdout open on their stdin. So above, at that point, cat had already been executed.
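The same lookup could be done from inside my_process without zsh. Below is a rough Linux-only Python sketch of the idea, with a hypothetical function name: it compares the device and inode numbers of every process's stdin with those of a given descriptor (stdout by default) and reports the executable of each match.

```python
import os


def readers_of_pipe(fd=1):
    """Return {pid: exe_path} for processes whose stdin is the same
    file as fd (by default, our own stdout). Hypothetical helper;
    Linux only.
    """
    target = os.fstat(fd)
    found = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            # stat() follows the /proc/<pid>/fd/0 link to the pipe itself.
            st = os.stat(f"/proc/{entry}/fd/0")
            if (st.st_dev, st.st_ino) == (target.st_dev, target.st_ino):
                found[int(entry)] = os.readlink(f"/proc/{entry}/exe")
        except OSError:
            continue  # process vanished, or we may not inspect it
    return found
```

As with the zsh version, a neighbour that is still between fork and execve shows up with the shell's executable path, so the result may need to be checked again later.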