When GNU grep tries to write its result, it fails with a non-zero exit status because it has nowhere to write the output: the SSH connection is gone.
This means that the if statement is always taking the else branch.
To illustrate this (this is not exactly what's happening in your case, but it shows what happens if GNU grep is unable to write its output):
$ echo 'hello' | grep hello >&- 2>&-
$ echo $?
2
Here we grep for the string that echo produces, but we close both output streams for grep so that it can't write anywhere. As you can see, the exit status of GNU grep is 2 rather than 0.
This is particular to GNU grep; grep on BSD systems won't behave the same:
$ echo 'hello' | grep hello >&- 2>&- # using BSD grep here
$ echo $?
0
To remedy this, make sure that the script does not generate output. You can do this with exec >/dev/null 2>&1. Also, we should be using grep with its -q option, since we're not at all interested in seeing its output (this generally also speeds up grep, as it can stop at the first match instead of parsing the whole file, but in this case it makes very little difference since the file is so small).
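The effect of -q on the if test can be checked with a small sketch. It reuses the closed-stream trick from the illustration above; branch_taken is a hypothetical helper name, and the input string is made up to resemble a ping reply.

```shell
#!/bin/sh
# Sketch: which branch does "if grep ..." take when grep cannot write?
# branch_taken is a hypothetical helper; "$@" carries extra grep options.
branch_taken() {
    # Both output streams are closed for grep only, not for the echo below.
    if echo '64 bytes from myserver.net' | grep "$@" '64 bytes' >&- 2>&-; then
        echo then
    else
        echo else
    fi
}

plain=$(branch_taken)      # GNU grep: the write fails, so "else" is expected
quiet=$(branch_taken -q)   # nothing is written, so the match decides: "then"

printf 'without -q: %s\nwith -q: %s\n' "$plain" "$quiet"
```

With BSD grep both lines should report the then branch, in line with the difference shown above.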
In short:
#!/bin/sh
# redirect all output not redirected elsewhere to /dev/null by default:
exec >/dev/null 2>&1
while true; do
    date >sdown.txt
    ping -c 1 -W 1 myserver.net >pingop.txt
    if ! grep -q "64 bytes" pingop.txt; then
        mutt -s "Server Down!" myemail@address.com <sdown.txt
        break
    fi
    sleep 10
done
You may also use a test on ping directly, removing the need for one of the intermediate files (and also getting rid of the other intermediate file that really only ever contains a datestamp):
#!/bin/sh
exec >/dev/null 2>&1
while true; do
    if ! ping -q -c 1 -W 1 myserver.net; then
        date | mutt -s "Server Down!" myemail@address.com
        break
    fi
    sleep 10
done
In both variations of the script above, I choose to exit the loop upon failure to reach the host, just to minimise the number of emails sent. You could instead replace the break with e.g. sleep 10m if you expect the server to eventually come up again.
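A minimal sketch of that variation, assuming the same host and address as in the scripts above. The helper names host_is_up, next_delay and monitor are hypothetical, and the pause is written as 600 seconds because the m suffix for sleep is not specified by POSIX.

```shell
#!/bin/sh
# Sketch: keep monitoring after an alert instead of leaving the loop.
host=myserver.net               # assumed, as in the scripts above
recipient=myemail@address.com   # assumed, as in the scripts above

# Hypothetical helper: one silent probe, true if the host answered.
host_is_up() {
    ping -q -c 1 -W 1 "$host" >/dev/null 2>&1
}

# Hypothetical helper: print the number of seconds to wait before the
# next probe, sending an email first if the host did not answer.
next_delay() {
    if host_is_up; then
        echo 10
    else
        date | mutt -s "Server Down!" "$recipient" >/dev/null 2>&1
        echo 600    # ten minutes, in plain seconds for portability
    fi
}

# Hypothetical entry point: loop forever, never break.
monitor() {
    while true; do
        sleep "$(next_delay)"
    done
}
```

Ending the script with a call to monitor starts the loop. While the host stays down, this sends roughly one email every ten minutes rather than stopping after the first.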
I've also slightly tweaked the options used with ping, as -i 1 does not make much sense with -c 1.
Shorter (unless you want it to continue sending emails when the host is unreachable):
#!/bin/sh
exec >/dev/null 2>&1
while ping -q -c 1 -W 1 myserver.net; do
    sleep 10
done
date | mutt -s "Server Down!" myemail@address.com
As a cron job running every minute (would continue sending emails every minute if the server continues to be down):
* * * * * ping -q -c 1 -W 1 myserver.net >/dev/null 2>&1 || ( date | mail -s "Server down" myemail@address.com )
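If one email per outage is preferred over one per minute, the cron job could call a small wrapper script instead. The following is only a sketch of one possible approach, using a marker file to remember that an alert has already been sent. The marker path and the helper names host_is_up and check_once are hypothetical; the host and address are assumed to be the same as above.

```shell
#!/bin/sh
# Sketch of a cron wrapper: mail once when the host goes down, stay quiet
# while it remains down, and re-arm once it answers again.
host=myserver.net                       # assumed, as in the examples above
recipient=myemail@address.com           # assumed, as in the examples above
flag=${TMPDIR:-/tmp}/server-down.flag   # hypothetical marker file

# Hypothetical helper: one silent probe, true if the host answered.
host_is_up() {
    ping -q -c 1 -W 1 "$host" >/dev/null 2>&1
}

# Hypothetical helper: a single check, meant to be run once per cron tick.
check_once() {
    if host_is_up; then
        rm -f "$flag"        # host is back: allow a future alert
    elif [ ! -e "$flag" ]; then
        date | mail -s "Server down" "$recipient"
        : >"$flag"           # remember that the alert went out
    fi
}
```

The script would end with a call to check_once, and the crontab entry would then run the script every minute in place of the one-liner.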
[...] : do? It would make sense to me if it were a semicolon ; ... – Ned64 Aug 11 '19 at 12:21
: does nothing. This is what it is designed to do. Here, instead of inverting the test, they use it to do a no-op before else. – Kusalananda Aug 11 '19 at 12:22
[...] -q in grep command! – Baraujo85 Aug 12 '19 at 16:37
[...] if command usage, grep command options, such as -q, which solved my problem, the explanation of colon and semicolon usage, some difference between the operation of grep command in BSD and GNU systems! – Baraujo85 Aug 12 '19 at 16:45