213

I want to write logic in a shell script that retries a command every 15 seconds, up to 5 times, if it fails with "status code=FAIL" due to some issue.

15 Answers

239
for i in 1 2 3 4 5; do command && break || sleep 15; done

Replace "command" with your command. This is assuming that "status code=FAIL" means any non-zero return code.
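
If the command instead reports failure by printing a line such as "status code=FAIL" while still exiting with 0, one option is to wrap it so that the text is turned into an exit code. This is only a sketch: my_job is a hypothetical stand-in for the real command, succeeded is an illustrative helper name, and it assumes the marker appears in the command's output.

```shell
# Hypothetical stand-in for the real command: it reports failure in its
# output but still exits with status 0.
my_job() { echo "status code=FAIL"; }

# succeeded COMMAND [ARGS...]
# Runs the command, passes its output through, and returns non-zero if the
# command itself failed or its output contains the failure marker.
succeeded() {
    if out=$("$@" 2>&1); then rc=0; else rc=$?; fi
    printf '%s\n' "$out"
    [ "$rc" -eq 0 ] || return "$rc"
    case $out in
        *"status code=FAIL"*) return 1 ;;
    esac
    return 0
}

# Usage with the loop above:
#   for i in 1 2 3 4 5; do succeeded my_job && break || sleep 15; done
```

The wrapper captures stdout and stderr together, so the output only appears once the command has finished.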


Variations:

Using the {..} syntax. Works in bash, zsh and ksh93, but not in plain POSIX shells such as dash or BusyBox sh:

for i in {1..5}; do command && break || sleep 15; done

Using seq and passing along the exit code of the failed command:

for i in $(seq 1 5); do command && s=0 && break || s=$? && sleep 15; done; (exit $s)

Same as above, but skipping sleep 15 after the final fail. Since it's better to only define the maximum number of loops once, this is achieved by sleeping at the start of the loop if i > 1:

for i in $(seq 1 5); do [ $i -gt 1 ] && sleep 15; command && s=0 && break || s=$?; done; (exit $s)
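
The last variation can also be wrapped in a reusable function. The following is a minimal POSIX sh sketch; the name retry and the argument order (maximum attempts, delay, command) are choices made for this example, not an existing tool.

```shell
# retry MAX DELAY COMMAND [ARGS...]
# Runs COMMAND up to MAX times, sleeping DELAY seconds between attempts
# (but not after the last one). Returns 0 on the first success, otherwise
# the exit code of the final failed attempt.
retry() {
    max=$1 delay=$2
    shift 2
    i=1
    while :; do
        "$@" && return 0
        s=$?                                # exit code of the failed command
        [ "$i" -ge "$max" ] && return "$s"  # out of attempts: pass the code along
        i=$((i + 1))
        sleep "$delay"
    done
}

# Usage:
#   retry 5 15 command arg1 arg2
```

Because the command is passed as separate arguments and run via "$@", arguments containing spaces survive intact. Pipelines need to be wrapped, for example in sh -c '...'.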
Alexander
  • 13
    Just a note, this works because && and || have equal precedence and are evaluated left to right, so the && part runs before the || – gene_wood Sep 04 '15 at 19:50
  • Shellcheck says: "Note that A && B || C is not if-then-else. C may run when A is true. [SC2015]" – Mausy5043 Dec 29 '18 at 10:25
  • 2
    @Mausy5043, for this case it does not matter, since s=0 is true, and break breaks the loop. – Alexander Mar 02 '19 at 17:30
  • 1
    Super helpful. My VPN used to disconnect every now and then, and it was super frustrating to reconnect it every time. For now I'm using for i in {1..20}; do vpnon; done. There are several issues though, one being that Ctrl+C doesn't work, but I'll figure them out later. – Krishna Sep 07 '21 at 05:53
  • If your vpnon command does not handle ctrl+c, adding "; sleep 3" after "vpnon" gives you three seconds where you can press ctrl+c. – Alexander Sep 07 '21 at 07:41
  • @Alexander: How can I retry a shell script command when any exception occurs (SQL exception, Java exception), or when script execution fails due to pgbouncer etc.? – venkat Jun 08 '23 at 19:14
  • @venkat The program that has the exception needs to exit with an error code other than 0 for the shell to notice. – Alexander Oct 20 '23 at 18:16
168

This script uses a counter n to limit the attempts at the command to five. If the command is successful, break ends the loop.

n=0
until [ "$n" -ge 5 ]
do
   command && break  # substitute your command here
   n=$((n+1)) 
   sleep 15
done
hardfork
suspectus
58
function fail {
  echo "$1" >&2
  exit 1
}

function retry {
  local n=1
  local max=5
  local delay=15
  while true; do
    # "$@" runs the command exactly as passed, preserving its arguments
    "$@" && break || {
      if [[ $n -lt $max ]]; then
        ((n++))
        echo "Command failed. Attempt $n/$max:"
        sleep $delay;
      else
        fail "The command '$*' failed after $n attempts"
      fi
    }
  done
}

Example:

retry ping invalidserver

produces this output:

ping: unknown host invalidserver
Command failed. Attempt 2/5:
ping: unknown host invalidserver
Command failed. Attempt 3/5:
ping: unknown host invalidserver
Command failed. Attempt 4/5:
ping: unknown host invalidserver
Command failed. Attempt 5/5:
ping: unknown host invalidserver
The command 'ping invalidserver' failed after 5 attempts

For a real-world, working example with complex commands, see this script.

28

GNU Parallel has --retries:

parallel --retries 5 --delay 15s ::: ./do_thing.sh

Example:

parallel -t --retries 5 --delay 0.1s 'echo {};exit {}' ::: {0..10}
Ole Tange
  • This doesn't work. --retries is for retries on different machines: "If a job fails, retry it on another computer on which it has not failed. Do this n times. If there are fewer than n computers in --sshlogin GNU parallel will re-use all the computers. This is useful if some jobs fail for no apparent reason (such as network failure)" – James Moore May 08 '20 at 16:31
  • 4
    Have you tested? --retries is both for local and remote jobs. But for remote jobs GNU Parallel tries to retry the job on another server if possible (maybe this job and that server just do not like each other for some unknown reason). – Ole Tange May 08 '20 at 16:45
  • Turns out I was confused by the documentation and by one undocumented feature (at least I didn't see doc) of retries - if you fail, and you have retries turned on, you don't see stderr until the last fatal error. – James Moore May 08 '20 at 18:07
  • 2
    The --help of parallel could be better. In general it's a nicely developed tool, certainly superior to the alternatives with a few minor downsides. (At least since the citation nagscreen is gone, that was a no-go for me.)

    Btw: --delay does not delay only retries; it delays every job. AFAIK there is no functionality yet to delay only on error (that would be useful).

    – John Jan 07 '21 at 04:21
  • @john You should feel free to submit a bug report with a better --help text. https://savannah.gnu.org/bugs/?func=additem&group=parallel Version 20201222 has --delay 123auto which will start out at 123, but adjust the delay up and down depending on whether jobs succeed or fail. – Ole Tange Jan 07 '21 at 12:58
  • @OleTange No offense meant, it seems you are the maintainer ? I don't have the time to make a good bug report.

    I just noticed that --help is very short in comparison to usual tools and doesn't cover the majority of options. The manpage is quite good but it's also huge and as all unix manpages with their 1980 touch not that pleasant to traverse. Usually --help lists a one-liner for each argument and manpage is for deeper study of one option.

    Imho the tool needs --retry-delay, at least in all my cases of parallelization I want no usual delay but a significant delay on error.

    – John Jan 07 '21 at 17:30
14

Here is my favorite one line alias / script

    alias retry='while [ $? -ne 0 ] ; do fc -s ; done'

Then you can do stuff like:

     $ ps -ef | grep "Next Process"
     $ retry

and it will keep running the prior command until it finds "Next Process"

Jeff
14

Here is a function for retrying a command:

function retry()
{
        local n=0
        local try=$1
        # Keep the command as an array so its arguments survive word splitting
        local cmd=("${@:2}")
        [[ $# -le 1 ]] && {
        echo "Usage: $0 <retry_number> <command>"; return 1; }

        until [[ $n -ge $try ]]
        do
                "${cmd[@]}" && break || {
                        echo "Command Fail.."
                        ((n++))
                        echo "retry $n ::"
                        sleep 1;
                        }

        done
}

retry "$@"

Output :

[test@Nagios ~]$ ./retry.sh 3 ping -c1 localhost
PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.207 ms

--- localhost ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.207/0.207/0.207/0.000 ms

[test@Nagios ~]$ ./retry.sh 3 ping -c1 localhostlasjflasd
ping: unknown host localhostlasjflasd
Command Fail..
retry 1 ::
ping: unknown host localhostlasjflasd
Command Fail..
retry 2 ::
ping: unknown host localhostlasjflasd
Command Fail..
retry 3 ::
Rahul Patil
10

I needed to do this many times and the scripting was getting out of hand, so I created a dedicated tool for it called retry.

retry --until=success --times=5 --delay=15 command ...

If you need multiple commands, you can use sh -c, e.g.

retry -- sh -c 'date && false'

Retry is available here: https://github.com/minfrin/retry

Gajus
5

I use this script to retry a given command. Its benefit is that if all retries fail, it preserves the exit code of the last attempt.

#!/usr/bin/env bash

if [ $# -ne 3 ]; then
    echo 'usage: retry <num retries> <wait retry secs> "<command>"'
    exit 1
fi

retries=$1
wait_retry=$2
command=$3

for i in $(seq 1 "$retries"); do
    echo "$command"
    # Intentionally unquoted: the command string is split into words here
    $command
    ret_value=$?
    [ $ret_value -eq 0 ] && break
    echo "> failed with $ret_value, waiting to retry..."
    sleep "$wait_retry"
done

exit $ret_value

It can probably be made simpler.

padilo
5

You can use the loop command, available here, like so:

$ loop './do_thing.sh' --every 15s --until-success --num 5 

Which will do your thing every 15 seconds until it succeeds, for a maximum of five times.

4

See the example below:

n=0
while :
do
        nc -vzw1 localhost 3389
        [[ $? = 0 ]] && break || ((n++))
        (( n >= 5 )) && break

done

This tries to connect to port 3389 on localhost. It retries until five attempts have failed; if an attempt succeeds, it breaks out of the loop.

$? is the exit status of the last command: zero means the command ran successfully, and anything other than zero means it failed.

This seems a little complicated; maybe someone can do it better.

Rahul Patil
1

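A slight modification of the one-liner, with an increasing delay (in seconds) before each attempt. The subscripts assume zsh, where arrays are indexed from 1; in bash, where they are indexed from 0, ${DELAYS[$i]} and ${DELAYS[$i+1]} would need to become ${DELAYS[$i-1]} and ${DELAYS[$i]}: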

 DELAYS=(0 1 3 5); (for i in 1 2 3 4; do sleep ${DELAYS[$i]}; <COMMAND> && break || [ $i -lt 4 ] && echo "retry in ${DELAYS[$i+1]}s"; done)

Example (a misspelled git push is used to force the error):

✗  DELAYS=(0 1 3 5); (for i in 1 2 3 4; do sleep ${DELAYS[$i]}; git pish && break || [ $i -lt 4 ] && echo "retry in ${DELAYS[$i+1]}s"; done)
git: 'pish' is not a git command. See 'git --help'.

The most similar command is
	push
retry in 1s
git: 'pish' is not a git command. See 'git --help'.

The most similar command is
	push
retry in 3s
git: 'pish' is not a git command. See 'git --help'.

The most similar command is
	push
retry in 5s
git: 'pish' is not a git command. See 'git --help'.

The most similar command is
	push

Replacing git pish with the proper git push:


✗  DELAYS=(0 1 3 5); (for i in 1 2 3 4; do sleep ${DELAYS[$i]}; git push && break || [ $i -lt 4 ] && echo "retry in ${DELAYS[$i+1]}s"; done)
Everything up-to-date
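
The same increasing-delay idea can be written without arrays, so that it does not depend on whether the shell indexes arrays from 0 or from 1. A sketch for POSIX shells such as bash and dash; the function name retry_backoff and the space-separated delay list are assumptions made for this example.

```shell
# retry_backoff "DELAYS" COMMAND [ARGS...]
# DELAYS is a space-separated list of seconds to wait before each attempt,
# e.g. "0 1 3 5" means four attempts, the first one immediately.
# Returns 0 on the first success, otherwise the last failure's exit code.
retry_backoff() {
    delays=$1
    shift
    status=1
    first=1
    for d in $delays; do    # unquoted on purpose: split the list into words
        [ "$first" -eq 1 ] || echo "retry in ${d}s" >&2
        first=0
        sleep "$d"
        "$@" && return 0
        status=$?
    done
    return "$status"
}

# Usage:
#   retry_backoff "0 1 3 5" git push
```

Note that zsh does not split unquoted variables by default, so there the loop would need ${=delays} instead of $delays.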
DmitrySemenov
0

Here's a recursive retry function for functional programming purists:

retry() {
  cmd=$1
  try=${2:-15}       # 15 by default
  sleep_time=${3:-3} # 3 seconds by default

  # Show help if a command to retry is not specified.
  [ -z "$1" ] && echo 'Usage: retry cmd [try=15 sleep_time=3]' && return 1

  # The unsuccessful recursion termination condition (if no retries left)
  [ $try -lt 1 ] && echo 'All retries failed.' && return 1

  # The successful recursion termination condition (if the function succeeded)
  $cmd && return 0

  echo "Execution of '$cmd' failed."

  # Inform that all is not lost if at least one more retry is available.
  # $try includes the current attempt, so the number of tries left is $try-1.
  if [ $((try-1)) -gt 0 ]; then
    echo "There are still $((try-1)) retries left."
    echo "Waiting for $sleep_time seconds..." && sleep $sleep_time
  fi

  # Recurse
  retry $cmd $((try-1)) $sleep_time
}

Pass it a command (or a function name) and optionally a number of retries and a sleep duration between retries, like so:

retry some_command_or_fn 5 15 # 5 tries, sleep 15 seconds between each
  • This doesn't work for commands more than one word long: cmd="echo blah blah" ...

    line 10: [: blah: integer expression expected ...

    Neither does it work for pipes, etc.

    – Mercury00 Mar 14 '19 at 18:05
  • I don't think any functional programming purist will touch bash, just saying... – qwr May 09 '22 at 00:38
0

I am answering this question because the existing answers have two shortcomings:

  1. They don't propagate the error code.
  2. Those that call exit errCode bypass certain Bash traps, such as trap somefunc ERR.
COMMAND="SOMECOMMAND"
TOTAL_RETRIES=3

retrycount=0
until [ $retrycount -ge $((TOTAL_RETRIES-1)) ]
do
   $COMMAND && break
   retrycount=$((retrycount+1))
   sleep 1
done

if [ $retrycount -eq $((TOTAL_RETRIES-1)) ]
then
    $COMMAND
fi
0
    # Retries a given command a given number of times and stores its output in the given variable
    # $1 : Command to run : handles both simple and piped commands
    # $2 : Name of the variable that receives the final output of the command (if successful)
    # $3 : Number of attempts [Default 5]
    function retry_function() {
        echo "Command to be executed : $1"
        echo "Final output variable : $2"
        echo "Total trials [Default:5] : $3"
        counter=${3:-5}
        local _my_output_=$2 #make sure passed variable is not same as this
        i=1
        while [ $i -le $counter ]; do
            local my_result=$(eval "$1")
            # this tests if output variable is populated and accordingly retries,
            # Not possible to provide error status/logs(STDIN,STDERR)-owing to subshell execution of command
            # if error logs are needed, execute the same code, outside function in same shell
            if test -z "$my_result"
            then
                echo "Trial[$i/$counter]: Execution failed"
            else
                echo "Trial[$i/$counter]: Successful execution"
                eval $_my_output_="'$my_result'"
                break
            fi
            let i+=1
        done
    }

    retry_function "ping -c 4 google.com | grep \"min/avg/max\" | awk -F\"/\" '{print \$5}'" avg_rtt_time
    echo $avg_rtt_time

 - To pass a lengthy command, pass a function that echoes it, and make sure the function is expanded in a subshell at the appropriate place.
 - A wait time can be added too, just before the increment.
 - For a complex command, you'll have to take care of stringifying it (good luck).
0

This is an old question, but I find myself returning to it often. My use case was a one-liner that retries a command up to n times and can be used with a Kubernetes pod (it will of course also work in a bash script).

TRY=6; until [ $TRY -eq 0 ] || <your command over here> ; do echo $TRY; echo "<output message>"; TRY=$(expr $TRY - 1); sleep 15; done;

The one liner is a bit hard to get your head around, but it can be very helpful.
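
Spelled out over several lines, the one-liner may be easier to follow. The sketch below keeps the same until ... || ... structure but wraps it in a function; the name try_cmd and its arguments are illustrative. Unlike the one-liner, it also reports through its exit status whether the command ever succeeded.

```shell
# try_cmd TRIES DELAY COMMAND [ARGS...]
try_cmd() {
    tries=$1 delay=$2
    shift 2
    # The loop condition is true, and the loop stops, as soon as either the
    # attempt budget is used up or the command succeeds.
    until [ "$tries" -eq 0 ] || "$@"; do
        echo "command failed with $tries attempt(s) remaining before this one" >&2
        tries=$((tries - 1))
        sleep "$delay"
    done
    # tries only reaches 0 when every attempt has failed
    [ "$tries" -gt 0 ]
}

# Usage, matching the one-liner's 6 tries and 15 second delay:
#   try_cmd 6 15 command arg1 arg2
```

The order of the two conditions matters: the counter is checked first, so the command is not run again once the budget is exhausted.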