
I need to create a while loop that kills a specific process whenever dmesg reports a particular message.

Here is what I have.

#!/bin/bash
while [ 1 ];
do

BUG=$(dmesg | grep "BUG: workqueue lockup" &> /dev/null)

    if [ ! -z "$BUG" ]; then
   killall someprocessname

else
    break
    fi
    done

I don't know if instead of ! -z I should do [ test -n "$BUG" ].

I think with -n it complains about expecting a binary operator.

I don't know if the script will even work, because the BUG lockup halts every process. Still, a few more lines appear in dmesg before the computer gets completely borked, so maybe I can catch the message in time and kill the process.

Rui F Ribeiro
Nobody
    You read the whole dmesg buffer each time, so once the search string has occurred once, you will see it on every iteration and thus run killall on every loop! (In addition to the other things @l0b0 mentioned, such as the lack of sleeping/pacing, etc.) – Olivier Dulac Apr 24 '18 at 06:20

2 Answers


Some issues:

  • You are running this in a busy loop, which will consume as much CPU as it can get. This is one instance where sleeping between iterations could conceivably be justified.
  • However, recent versions of dmesg have a flag to follow the output, so you could rewrite the whole thing as (untested):

    while true
    do
        dmesg --follow | tail --follow --lines=0 | grep --quiet 'BUG: workqueue lockup'
        killall someprocessname
    done
    
  • The code should be indented to be readable.
  • It may look strange, but [ is the same command as test (see help [), so write either [ -n "$BUG" ] or test -n "$BUG", not [ test -n "$BUG" ].
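
The last point can be checked without touching the real kernel log. The following is only a sketch: the sample line is made up, and printf stands in for dmesg. It also shows why a variable filled from a grep whose output is sent to /dev/null is always empty, which appears to be why the original test could never succeed.

```shell
#!/bin/bash
# Hypothetical sample line standing in for real dmesg output.
sample='[  123.456] BUG: workqueue lockup - pool cpus=0'

# Discarding grep's output leaves nothing for $(...) to capture.
# (> /dev/null 2>&1 is the portable spelling of bash's &> /dev/null.)
silenced=$(printf '%s\n' "$sample" | grep 'BUG: workqueue lockup' > /dev/null 2>&1)
captured=$(printf '%s\n' "$sample" | grep 'BUG: workqueue lockup')

# These two tests are equivalent: [ is another name for test.
if [ -n "$captured" ]; then echo "bracket: match"; fi
if test -n "$captured"; then echo "test: match"; fi

# The silenced variable is empty even though the line matched.
if [ -z "$silenced" ]; then echo "silenced: empty"; fi

# Without a variable, grep --quiet reports the match via its exit status.
if printf '%s\n' "$sample" | grep --quiet 'BUG: workqueue lockup'; then
    echo "exit status: match"
fi
```

Testing grep's exit status directly avoids the intermediate variable altogether.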
l0b0
  • Did you mean to add -q so that grep -q 'searchstring' exits from the dmesg --follow pipeline, letting the next line be reached as soon as it sees one occurrence of the search string? Without it, your loop will neither reach the killall nor loop. – Olivier Dulac Apr 24 '18 at 06:24
  • And even with -q, I fear you'll killall a lot if dmesg --follow shows a few lines of dmesg context (and thus shows the previous occurrence(s)), hence my proposed answer as a variant. – Olivier Dulac Apr 24 '18 at 06:29
  • @OlivierDulac The latter issue should be taken care of with the tail. – l0b0 Apr 24 '18 at 20:53
  • What does tail --lines=0 do? I know what it means for any other value. – Joe Apr 28 '18 at 00:53
  • @Joe It's in the man page - with --follow it follows (that is, prints) only lines arriving after the command starts. – l0b0 Apr 29 '18 at 05:32

A variant of @l0b0's answer:

dmesg --follow | awk '
    /BUG: workqueue lockup/ {
        # Runs once per matching line; you could also log to a file here.
        system("killall someprocessname")
    }'

This lets awk do the looping, which has some advantages:

  • It keeps working until the pipeline itself dies.
  • It also does not call more than one killall per occurrence of the search string "BUG: workqueue lockup", which improves upon the other answer.
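
To convince yourself of the one-call-per-occurrence behaviour before pointing this at the real kernel log, a dry run along these lines may help. It is a sketch under assumptions: watch_log is a hypothetical helper name, the input lines are invented, and echo stands in for killall someprocessname.

```shell
#!/bin/bash
# watch_log is a hypothetical helper: it reads log lines on stdin and runs
# the command given as $1 once for every line that matches the pattern.
watch_log() {
    awk -v cmd="$1" '/BUG: workqueue lockup/ { system(cmd) }'
}

# Dry run on made-up lines; two of the three lines match.
result=$(printf '%s\n' \
    'usb 1-1: new high-speed USB device' \
    'BUG: workqueue lockup - pool cpus=0' \
    'BUG: workqueue lockup - pool cpus=1' | watch_log 'echo would-kill')

printf '%s\n' "$result"
```

For live use the same helper would be fed from the kernel log instead, as in dmesg --follow | watch_log 'killall someprocessname'.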

To test: You can put this into a script named thescript, and do nohup thescript &, so that thescript will keep running even after you quit your session.

Once you are satisfied it works, kill it, and then you can (instead of running it each time in a shell with nohup) transform it into a daemon script that you can then have started in your current runlevel.

That is, using another script as a model (you need at least the start, stop and status sections), you can modify thescript appropriately and then place it in /etc/rc.d/init.d, with a symlink to it named Sxxthescript under the appropriate /etc/rc.d/rcN, N being the number of your normal runlevel (see the top lines of who -a for the current runlevel). Add the appropriate Kxxthescript symlinks too, in every (or almost every) runlevel, so that the script is properly killed when switching runlevels.

Or do "the appropriate things" to have it run/stopped via systemd or any equivalent system your distribution uses.
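
For systemd, "the appropriate things" might look roughly like the sketch below. The unit name thescript.service and the path /usr/local/bin/thescript are assumptions for illustration, and the file is written to a temporary directory so the sketch can be run without root.

```shell
#!/bin/bash
# Write a minimal unit file. The unit name and the script path are
# hypothetical; a real install would place the file in /etc/systemd/system.
unit_dir=$(mktemp -d)
unit_file="$unit_dir/thescript.service"

cat > "$unit_file" <<'EOF'
[Unit]
Description=Kill someprocessname on workqueue lockup

[Service]
ExecStart=/usr/local/bin/thescript
Restart=always

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $unit_file"
# After copying the file to /etc/systemd/system (as root):
#   systemctl daemon-reload
#   systemctl enable --now thescript.service
```

Restart=always is there so that systemd starts the watcher again if it exits; drop it if you would rather have it run only once.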

  • @Nobody: I am glad. Don't forget to "accept" (green checkmark) whichever answer seems best in your opinion, unless you feel the question needs to stay open to allow further answers (or modifications of the current answers). – Olivier Dulac Apr 26 '18 at 14:46
  • Both answers are correct; I wish I could select both. – Nobody Apr 27 '18 at 22:06