How do I wait on a program started in another shell

Question

I have a program that does a large amount of work (it takes about 4-5 hours) and that is started by cron when all the data it works with becomes available. Sometimes, while I am waiting for it to finish, I would like another (interactive) program to start as soon as it finishes. The wait call looks promising, but it only waits for children.

The only other method I can conceive of is to create a file from the cron job and then use inotifywait on that file; when the file is deleted, your second process (the one running inotifywait) can start. — slm, Dec 05 '13 at 15:07
I think you could do something along these lines using ipc as well, interprocess communication. man ipc. — slm, Dec 05 '13 at 15:09
Though you aren't watching a service to restart it, God, Monit, or one of the other pkgs mentioned in this Q&A would do the job too: http://unix.stackexchange.com/questions/75785/how-to-set-proper-monitoring-of-my-services-in-a-automated-way-so-that-if-one-c/75811#75811 — slm, Dec 05 '13 at 15:16
@slm can inotify be used on any filesystem for example wait on close_write on /proc/pid/fd/1? — hildred, Dec 05 '13 at 15:21
Not entirely sure, but that might be another way of doing it 8-). There was some Q that I remember where there was some new kernel functionality related to process monitoring similar to the inotifywait (for files). This new feature was for events on process, I thought but many of the Q&A start to run together in my mind 8-). I think the A was provided by me or Gilles. — slm, Dec 05 '13 at 15:22
This was the Q&A I was thinking of. fatrace is the facility but it's meant to watch I/O usage. http://unix.stackexchange.com/questions/86875/determining-specific-file-responsible-for-high-i-o/87290#87290 — slm, Dec 05 '13 at 15:29

Emmanuel · Accepted Answer · 2013-12-19T13:43:57.407

17

I definitely prefer the EDIT #3 solution (see below).

If it's not in the same shell, use a while loop whose condition is ps -p returning true. Put a sleep in the loop to reduce processor usage.

while ps -p <pid> >/dev/null 2>&1
do
   sleep 10
done

Or, if your UNIX supports /proc (HP-UX, for instance, still does not):

while [[ -d /proc/<pid> ]]
do 
    sleep 10
done

If you want a timeout

timeout=6  # 6 iterations of "sleep 10" = 1 minute
while ((timeout > 0)) && ps -p <pid> >/dev/null 2>&1
do
   sleep 10
   ((timeout -= 1))
done
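
The timeout loop can also be wrapped in a function that reports whether the process went away or the timeout expired. This is only a sketch: the name wait_for_pid and its arguments are made up for illustration, and it uses kill -0 instead of ps -p. kill -0 is a shell built-in that sends no signal, but it only succeeds for processes you are allowed to signal.

```shell
# wait_for_pid PID [TIMEOUT_SECONDS] [INTERVAL_SECONDS]   (hypothetical helper)
# Returns 0 once PID no longer exists, 1 if the timeout expires first.
wait_for_pid() {
    wfp_pid=$1
    wfp_left=${2:-60}    # seconds left before giving up
    wfp_step=${3:-10}    # seconds between two checks
    # kill -0 only tests whether the process can be signalled
    while kill -0 "$wfp_pid" 2>/dev/null; do
        [ "$wfp_left" -gt 0 ] || return 1
        sleep "$wfp_step"
        wfp_left=$((wfp_left - wfp_step))
    done
    return 0
}
```

It could then be used as: wait_for_pid <pid> 21600 10 && start_interactive_program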

EDIT #1

There is another way: don't use cron. Use the batch command to queue your jobs.

For instance, you could queue all your jobs once a day. batch can be tuned to allow some parallelism, so that a blocked job does not stall the whole queue (this depends on the operating system).

EDIT #2

Create a fifo in your home directory:

$ mkfifo ~/tata

At the end of your job:

echo "it's done" > ~/tata

At the start of the other job (the one that is waiting):

cat ~/tata

This is not polling; it is good old blocking I/O.
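
To see the blocking behaviour end to end, here is a small self-contained sketch. The temporary directory, the FIFO name job_done and the one-second sleep standing in for the long job are all assumptions made for the demo.

```shell
# Create the FIFO in a scratch directory (stands in for ~/tata)
dir=$(mktemp -d)
fifo=$dir/job_done
mkfifo "$fifo"

# Stand-in for the long job: works for a while, then writes to the FIFO
( sleep 1; echo "it's done" > "$fifo" ) &

# Waiting side: the read blocks, without polling, until the job writes
read -r message < "$fifo"
echo "received: $message"

wait              # collect the background stand-in job
rm -rf "$dir"
```

One thing to keep in mind: opening a FIFO for writing also blocks until some process opens it for reading, so with this scheme the job itself will hang on its last line if nobody is waiting.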

EDIT #3

Using signals:

At the beginning of each script that is waiting:

echo $$ >>~/WeAreStopped
kill -STOP $$

At the end of your long job:

if [[ -f ~/WeAreStopped ]] ; then
    xargs kill -CONT < ~/WeAreStopped
    rm ~/WeAreStopped
fi
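
The whole stop/continue handshake can be tried out in one script. This is a sketch under a few assumptions: temporary files stand in for ~/WeAreStopped, the waiter is a small sh -c one-liner, sleep 1 stands in for the long job, and a read loop with the built-in kill replaces xargs kill so that the demo does not depend on an external kill binary.

```shell
list=$(mktemp)    # stands in for ~/WeAreStopped
out=$(mktemp)     # lets the demo show that the waiter resumed

# Waiting side: register our PID, then suspend ourselves with SIGSTOP
sh -c 'echo "$$" >> "$1"; kill -STOP "$$"; echo resumed > "$2"' waiter "$list" "$out" &

sleep 1           # stand-in for the long job; also lets the waiter stop

# End of the long job: send SIGCONT to every registered waiter
if [ -s "$list" ]; then
    while read -r pid; do
        kill -CONT "$pid"
    done < "$list"
    rm -f "$list"
fi

wait              # the waiter now runs to completion
result=$(cat "$out")
echo "waiter says: $result"
rm -f "$out"
```

Note that there is a small window between registering the PID and actually stopping: if the long job sent SIGCONT in that window, the waiter would stop afterwards and stay stopped. With a job that runs for hours this is unlikely, but it is worth knowing.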

edited Dec 19 '13 at 13:43

answered Dec 05 '13 at 14:36

Emmanuel

4,187

Polling, yuck! Lovely decision: waste processor time or my time for tuning the sleep. There has to be a better answer. – hildred Dec 05 '13 at 14:44
:) Polling is good for you, polling is stateless, polling is reliable. – Emmanuel Dec 05 '13 at 14:51
Polling is slow, polling wastes processor time. – hildred Dec 05 '13 at 14:53
With an exec time of 0.01s the 1440 ps executed during the 4 hours would have consumed 14.4s. Much less than a scheduler managing jobs dependencies :D – Emmanuel Dec 05 '13 at 15:01
I think this is likely the best option: http://stackoverflow.com/questions/1058047/wait-for-any-process-to-finish, even though it's hacky. – slm Dec 05 '13 at 15:04
@slm I'm quite proud of my last edit :) – Emmanuel Dec 05 '13 at 15:27
Nice addition!! – slm Dec 05 '13 at 15:31
@hildred Polling should be ok for you, unless your system is overloaded. – peterph Dec 05 '13 at 16:08
@peterph why do you think this job takes hours? It runs at about 98% cpu – hildred Dec 05 '13 at 16:12
@hildred just edited a multi process solution – Emmanuel Dec 05 '13 at 17:10
With while ps …, beware that there is a small chance that the PID will be reused at exactly the wrong time, causing a false positive. – Gilles 'SO- stop being evil' Dec 05 '13 at 23:23
@Gilles To reduce the chance of that happening, use while ps -p <pid> | grep -q <process name>. Of course, that will not reduce the processor usage. – Emmanuel Dec 06 '13 at 06:43
@hildred normally whether a 10 hour job finishes a minute earlier or later is insignificant - if it's not, you're bound for problems, since such a delay can easily be caused by dozen of other things. If you do poll properly, the overhead is tiny - e.g. use [[ -d /proc/PID ]]. It doesn't incur significant I/O operations (pseudo file system) nor process spawning (shell internal command) apart from sleep. – peterph Dec 18 '13 at 20:42
@peterph I added your suggestion, but why the double [ ] ? – Emmanuel Dec 18 '13 at 21:28
@Emmanuel [[ is a shell built-in (at least in bash), [ is a binary (although some shells will silently translate it to the former). – peterph Dec 19 '13 at 10:20
@peterph Thank you :) I saw that on some OSes [ was linked to test (BSD, I think), but I didn't realize that [[ had to be a built-in. Otherwise it would have been an unknown command. – Emmanuel Dec 19 '13 at 13:50

score 5 · Answer 2 · answered Dec 05 '13 at 15:13

You can modify your cron job to use some flag.

Instead of

2  2 * * *           /path/my_binary

You can use

2  2 * * *           touch /tmp/i_m_running; /path/my_binary; rm /tmp/i_m_running

Then monitor this file from a script, or even manually. If it exists, your program is running; otherwise, feel free to do whatever you want.

The script sample:

while [[ -f /tmp/i_m_running ]] ; do
   sleep 10 ;
done
launch_whatever_you_want

If you would rather not use sleep, you can modify the script and run it from cron every X minutes.

In that case, the script becomes:

[[ -f /tmp/i_m_running ]] && { echo "Too early" ; exit ; }
launch_whatever_you_want

This approach is a little easier, as you don't have to find the PID of the process started by cron.
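
If the one-line crontab entry gets unwieldy, the same flag logic can live in a small wrapper. The sketch below is one possible shape; the function name run_with_flag is made up for illustration. Unlike the one-liner, it also passes the program's exit status through to the caller.

```shell
# run_with_flag FLAG_FILE COMMAND [ARGS...]   (hypothetical helper)
# Creates FLAG_FILE, runs COMMAND, removes FLAG_FILE again and
# returns COMMAND's exit status.
run_with_flag() {
    rwf_flag=$1
    shift
    touch "$rwf_flag"
    "$@"
    rwf_status=$?
    rm -f "$rwf_flag"
    return "$rwf_status"
}
```

A wrapper script called from cron would then run: run_with_flag /tmp/i_m_running /path/my_binary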

score 4 · Answer 3 · edited Apr 13 '17 at 12:36

There's no facility for a process to wait for another process to finish, except for a parent to wait for one of its child processes to finish. If you can, launch the program through a script:

do_large_amount_of_work
start_interactive_program

If you can't do that, for example because you want to start the large amount of work from a cron job but the interactive program from the context of your session, then make the cron job run this:

do_large_amount_of_work
notify_completion

There are several ways to implement notify_completion. Some desktop environments provide a notification mechanism (the question "Open a window on a remote X display (why "Cannot open display")?" may be useful). You can also make one using file change notifications. On Linux, the file change notification facility is inotify.

do_large_amount_of_work
echo $? >/path/to/finished.stamp

To react to the creation of /path/to/finished.stamp:

inotifywait -e close_write -q /path/to/finished.stamp
start_interactive_program

If you can't change the way do_large_amount_of_work is invoked, but you know what file it modifies last, you can use the same mechanism to react when that file is closed. You can also react to other events such as the renaming of a file (see the inotifywait manual for a list of possibilities).
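
The stamp-file approach has two edge cases: the stamp may already exist when you start watching, or it may not exist at all yet (in which case there is nothing to watch). One way around both, sketched below, is to watch the directory and re-check the stamp in a loop. The function name wait_for_stamp, the 5-second re-check interval and the fallback to plain sleep when inotifywait is not installed are all assumptions of this sketch.

```shell
# wait_for_stamp STAMP_FILE   (hypothetical helper)
# Blocks until STAMP_FILE exists and is non-empty, then prints its content.
wait_for_stamp() {
    wfs_stamp=$1
    wfs_dir=$(dirname "$wfs_stamp")
    until [ -s "$wfs_stamp" ]; do
        if command -v inotifywait >/dev/null 2>&1; then
            # Watch the directory, since the stamp may not exist yet.
            # -t 5 makes inotifywait give up after 5 s so the stamp is
            # re-checked in case it appeared before the watch was set up.
            inotifywait -q -e close_write -t 5 "$wfs_dir" >/dev/null 2>&1 || sleep 1
        else
            sleep 1    # no inotify-tools: fall back to polling
        fi
    done
    cat "$wfs_stamp"
}
```

With this in place, the waiting side could be: status=$(wait_for_stamp /path/to/finished.stamp) followed by start_interactive_program.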

score 0 · Answer 4 · answered Jun 22 '19 at 13:12

Have the script triggered by the cron job invoke a normally empty shell script, into which you can insert follow-up tasks when you need them.

It's very similar to Gilles's approach.

cronjob-task.sh contains:

# do_large_amount_of_work

./post-execute.sh

where post-execute.sh is usually empty, unless you see that you need to trigger a follow-up task.

score 0 · Answer 5 · answered Dec 05 '22 at 18:01

0

# Loop while pgrep still prints at least one matching PID
while [ -n "$(pgrep nameoflongprocess)" ]
do
  sleep 1
done
start_script  # placeholder for the follow-up command

pgrep prints nothing if the long process is no longer running. This assumes that only one such process is running at any given time.

answered Dec 05 '22 at 18:01

user3817445

101
