1

I have a job dispatcher bash script containing the code below:

for (( i=0; i<$toBeDoneNum; i=i+1 ))
do
    while true
    do
            processNum=`ps aux | grep Checking | wc -l`
            if [ $processNum -lt $maxProcessNum ]; then
                break
            fi
            echo "Too many processes: Max process is $maxProcessNum."
            sleep $sleepSec
    done
    java -classpath ".:./conf:./lib/*" odx.comm.cwv.main.Checking $i
done

I run the script in the background like this:

./dispatcher.sh &

I want to terminate this dispatcher process with kill -9, but I didn't record its PID when I started it. I tried jobs, jobs -l, jobs -r and jobs -s, and nothing showed up. Even fg cannot bring the process to the foreground:

fg
bash: fg: current: no such job

But I think the dispatcher process is still running, because it keeps starting new Java programs (I checked with top and ps -ef). How can I terminate this job dispatcher script?

antzshrek
  • 350

3 Answers

4

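You should be able to kill the script by name using the pkill command. By default pkill matches on the process name, so if the script shows up as bash in ps (for example because it was started as bash dispatcher.sh), use pkill -f dispatcher.sh to match against the full command line instead.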

$ pkill -9 dispatcher.sh

Excerpt from the man page:

pgrep, pkill - look up or signal processes based on name and other 
               attributes

OPTIONS
       -signal
       --signal signal
              Defines the signal to send to each matched process.  Either 
              the numeric or the symbolic signal name can be used.  (pkill 
              only.)

See the man page for pkill for more info.

Finding processes

If you no longer know a process's ID (PID), you can find it in a couple of ways.

pgrep

You can use pgrep to find a process by name.

$ pgrep dispatcher.sh
12345

You can then perform a kill -9 12345.
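If you would rather not reach for SIGKILL straight away, here is a minimal sketch of a gentler sequence: send SIGTERM, wait a few seconds, and only then send SIGKILL. The function name stop_pid and the default grace period of 5 seconds are my own assumptions for illustration, not part of pkill or pgrep.

```shell
# stop_pid is a hypothetical helper: ask a process to exit, then force it.
# Usage: stop_pid PID [GRACE_SECONDS]
stop_pid() {
    pid=$1
    grace=${2:-5}

    # Ask politely first; kill fails if the PID does not exist.
    kill -TERM "$pid" 2>/dev/null || return 1

    # Poll once per second; kill -0 only tests whether the PID is still there.
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done

    # Still there after the grace period: SIGKILL cannot be caught or ignored.
    kill -KILL "$pid" 2>/dev/null
}

# Example (commented out so pasting the sketch has no side effects):
# stop_pid "$(pgrep -o -x dispatcher.sh)" 10
```

Note that stopping the dispatcher this way only stops it from starting new jobs. A Java process it has already started keeps running until it finishes or is killed separately.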

ps

Most people learned to find PIDs using ps. You can look for your process in the output like this.

$ ps -eaf | grep '[d]ispatcher.sh'
saml      2735     1  0 Jan11 ?        00:02:50 dispatcher.sh

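The PID is typically the second column of the output. Wrapping the first letter of the name in square brackets keeps the grep command itself out of the results: the pattern [d]ispatcher.sh still matches dispatcher.sh, but grep's own command line contains the literal text [d]ispatcher.sh, which the pattern does not match. Try it without the square brackets to see what I mean.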
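To see the bracket trick in isolation, here is a small sketch that runs grep over made-up ps lines instead of the real process table. The PIDs and the sample lines are invented for illustration only.

```shell
# Made-up ps lines: the script itself, plus the grep that is searching for it.
script_line='saml 2735 1 0 Jan11 ? 00:02:50 dispatcher.sh'
plain_grep_line='saml 2801 2700 0 10:00 pts/0 00:00:00 grep dispatcher.sh'
bracket_grep_line='saml 2801 2700 0 10:00 pts/0 00:00:00 grep [d]ispatcher.sh'

# Plain pattern: the grep line contains "dispatcher.sh" too, so it is counted.
plain_count=$(printf '%s\n' "$script_line" "$plain_grep_line" | grep -c 'dispatcher.sh')

# Bracket pattern: "[d]ispatcher.sh" as literal text is not matched by the regex.
bracket_count=$(printf '%s\n' "$script_line" "$bracket_grep_line" | grep -c '[d]ispatcher.sh')

echo "plain pattern matches:   $plain_count"    # 2: the script and the grep
echo "bracket pattern matches: $bracket_count"  # 1: only the script
```

The same effect is why a count like ps aux | grep Checking | wc -l comes out one higher than the number of matching processes.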

slm
  • 369,824
  • The thing I cannot realize is that even jobs, pgrep and ps cannot find the dispatcher process. But new jobs are still being assigned. Really strange. – Marcus Thornton Jan 13 '14 at 04:03
  • Looks like you made yourself a little form bomb type of script. Look for while and for processes. – slm Jan 13 '14 at 04:15
  • What do you mean by form bomb? Does my syntax go wrong? – Marcus Thornton Jan 13 '14 at 05:52
  • Sorry that should've said fork bomb. http://unix.stackexchange.com/questions/89003/where-is-the-fork-on-the-fork-bomb – slm Jan 13 '14 at 05:55
  • @MarcusThornton Given the code you've posted, I can't think of anything to add to slm's answer. If you need more help, post the whole script. Also post ps l lines for some of these Java processes, look at the PPID line and see what their parent is. – Gilles 'SO- stop being evil' Jan 13 '14 at 21:55
  • Personally, I think we should educate other users, that e.g. -9 is equivalent to -KILL and should be preferred as it is self-explanatory. Further, I think you should not recommend SIGKILL as the first aid. Rather than that I would post in this order: pkill -HUP, pkill -TERM, pkill -KILL in a non-specific answer. – Vlastimil Burián Apr 14 '17 at 06:32
0

If you can see the Java processes spawned by the dispatcher in the ps output, check the PPID of those processes. If the parent is a bash process, try killing it.

0

Given your full script it seems what you are trying to do is simply to have a given number of jobs running in parallel. Using GNU Parallel it looks like this:

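# -q quotes the command so lib/* is not expanded by the shell
# the original loop runs i from 0 to toBeDoneNum-1, hence the subtraction
seq 0 $((toBeDoneNum - 1)) |
  parallel -j $maxProcessNum -q java -classpath ".:./conf:./lib/*" odx.comm.cwv.main.Checking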
Ole Tange
  • 35,514