
I have a cron job that gets stuck, and I cannot break out of it. The cron job launches a script that runs the nightly jobs for our application system. The script submits asynchronous jobs to the application system's job scheduler. A few steps must wait for a job to complete before moving on to the next one. To control that, we test the status code returned by the app server: if the status of the job is less than 60, it's still running. Hence the following code (there is no "getstatus" command, but assume it returns a numeric value that is assigned to jobstatus below):

jobstatus=0
while [[ $jobstatus -lt 60 ]]
do
   sleep 5
   jobstatus=`getstatus whatever`
done

I've tried to kill the "sleep" process but it keeps coming back. How do I prevent that and kill it for good?

Jeff Schaller
Leon
  • Does the application print something to standard output? The return code is something else, and is usually found in $? instead. – thrig Aug 21 '17 at 19:55
  • You want to kill the PPID (parent PID) of the sleep 5 (or possibly jobstatus=$(getstatus whatever)) process. – Deathgrip Aug 21 '17 at 19:55
  • The actual problem is either: (a) the jobstatus for that instance is still less than 60, or (b) you actually want to kill one particular "stuck" cron job (whose jobstatus is also returning less than 60). Given that the async job on the app server is still stuck, I'd advise you to check into that, and then let us know if it's (b) that you're after here. – Jeff Schaller Aug 23 '17 at 01:00
  • Leon, if any of the answers satisfied your question, please accept it using the checkmark. Thank you! – Jeff Schaller Aug 27 '17 at 12:06

3 Answers


As Deathgrip mentioned in a comment, you must kill the parent process of sleep, which you can find with some variation of ps | grep sleep. (ps has many implementations, so I won't guess at exactly which flags you need or what your output looks like. Check man ps.)

What's going on? When you kill sleep, it terminates, but that doesn't affect execution of the containing script. The script just moves on to the next line, and the loop keeps looping. There are a couple of ways you could modify the script so that killing sleep would work. One of the simplest is to replace the sleep line with this: sleep 5 || exit.
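
A minimal sketch of that change applied to the loop from the question. `getstatus` is a hypothetical stub standing in for the real status command, `wait_for_job` and `POLL_SECONDS` are names made up for illustration, and the loop runs in a subshell so that `exit` ends only the loop:

```shell
# Hypothetical stand-in for the real status command; always reports "still running".
getstatus() { echo 0; }

# Parentheses make the function body a subshell, so `exit` leaves only this loop.
wait_for_job() (
    jobstatus=0
    while [ "$jobstatus" -lt 60 ]
    do
        # If sleep is killed it returns a non-zero status, and the loop exits with it.
        sleep "${POLL_SECONDS:-5}" || exit
        jobstatus=$(getstatus whatever)
    done
)
```

Calling `wait_for_job` blocks until the status reaches 60 or a `sleep` is killed. In the latter case the function returns the status of the killed `sleep`, so the caller can tell the two outcomes apart.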

Alternatively, you can add this line at the beginning: set -e. (It is a POSIX shell option, so it is not specific to Bash.) This tells the shell to terminate the script if any command within it fails (i.e. returns a non-zero status). But that affects everything in your script and has a number of other pitfalls, so it's best to avoid that solution. (You could add set -e on the line before sleep and disable it with set +e immediately after, but that's getting kind of ridiculous, since you can just use the first solution I mentioned.)

B Layer

If you have a sleep process in your sights, and that sleep process is part of a parent shell script that runs in a loop, such as you have, then you will have to kill that parent process to prevent that (particular) sleep command from being started again.

Take the PID of that sleep process and find the parent process, then kill it:

$ ps -e -opid,comm,args|grep sleep
424241 sleep  sleep 999 # for example
$ ps -oppid= -p SLEEP-PID-HERE-SUCH-AS-424241-ABOVE
424242 # for example
$ kill $(ps -oppid= -p SLEEP-PID-HERE-SUCH-AS-424241-ABOVE)
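
The two steps above could be wrapped into a small helper. This is only a sketch, assuming your `ps` accepts the POSIX `-o ppid=` and `-p` options; the function name `kill_parent_of` is made up for illustration:

```shell
# Hypothetical helper: kill the parent of the given PID (e.g. the PID of a sleep).
kill_parent_of() {
    child_pid=$1
    # "ppid=" suppresses the header line; tr strips the padding that ps adds.
    parent_pid=$(ps -o ppid= -p "$child_pid" | tr -d ' ')
    if [ -z "$parent_pid" ]; then
        echo "no such process: $child_pid" >&2
        return 1
    fi
    # kill sends SIGTERM by default.
    kill "$parent_pid"
}
```

For example, `kill_parent_of 424241` would terminate the looping script that started that `sleep`. The orphaned `sleep` itself simply runs out on its own.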
Jeff Schaller

I think you might want something like this: How to stop the loop bash script in terminal?

In your case, use SIGTERM instead of INT, since with a cron job I'm assuming you're trying to kill it with "kill" rather than Ctrl+C:

jobstatus=0
trap "exit" SIGTERM
while [[ $jobstatus -lt 60 ]]
do
    sleep 5
    jobstatus=`getstatus whatever`
done

This should allow the whole script to be killed, rather than just the sleep commands.
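
One caveat: Bash runs a trap only after the current foreground command finishes, so with a plain sleep 5 the handler may be delayed by up to five seconds. A common workaround, sketched here rather than taken from the answer itself, is to background the `sleep` and `wait` on it, because `wait` returns as soon as a trapped signal arrives. `getstatus` is a hypothetical stub for the real status command, and `poll_until_done` and `POLL_SECONDS` are names made up for illustration:

```shell
# Hypothetical stand-in for the real status command; always reports "still running".
getstatus() { echo 0; }

# Meant to be called from the script that cron runs: the trap's `exit`
# ends the whole script, not just this function.
poll_until_done() {
    sleep_pid=
    # On SIGTERM, stop the pending sleep (if any), then exit with 128 + 15.
    trap '[ -n "$sleep_pid" ] && kill "$sleep_pid" 2>/dev/null; exit 143' TERM
    jobstatus=0
    while [ "$jobstatus" -lt 60 ]
    do
        sleep "${POLL_SECONDS:-5}" &
        sleep_pid=$!
        # Unlike a foreground sleep, wait is interrupted by a trapped signal.
        wait "$sleep_pid"
        jobstatus=$(getstatus whatever)
    done
}
```

With this shape, `kill` on the script's PID ends the loop promptly and also cleans up the pending `sleep`.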

Sorry for making this a separate answer instead of just a comment, but I don't have enough points to leave comments yet.

ison
  • @Leon indicated the script is being executed out of cron. It's not attached to a terminal to enter Ctrl-C from. – Deathgrip Aug 21 '17 at 22:46