In a Bash script, I'm trying to store the options I'm using for rsync in a separate variable. This works fine for simple options (like --recursive), but I'm running into problems with --exclude='.*':

$ find source
source
source/.bar
source/foo

$ rsync -rnv --exclude='.*' source/ dest
sending incremental file list
foo

sent 57 bytes  received 19 bytes  152.00 bytes/sec
total size is 0  speedup is 0.00 (DRY RUN)

$ RSYNC_OPTIONS="-rnv --exclude='.*'"

$ rsync $RSYNC_OPTIONS source/ dest
sending incremental file list
.bar
foo

sent 78 bytes  received 22 bytes  200.00 bytes/sec
total size is 0  speedup is 0.00 (DRY RUN)

As you can see, passing --exclude='.*' to rsync "manually" works fine (.bar isn't copied), but it doesn't work when the options are stored in a variable first.

I'm guessing that this is either related to the quotes or the wildcard (or both), but I haven't been able to figure out what exactly is wrong.

3 Answers

In general, it's a bad idea to demote a list of separate items into a single string, whether a list of command line options or a list of pathnames. The reason is that somewhere down the line, you will need to split that string up into separate things again, and this isn't easy to do correctly when the string contains quoting and spacing that is or is not significant for the semantic interpretation of the data in the string.

Using an array instead:

rsync_options=( -rnv --exclude='.*' )

or

rsync_options=( -r -n -v --exclude='.*' )

and later...

rsync "${rsync_options[@]}" source/ target

This way, the quoting of the individual options is maintained (as long as you double quote the expansion of ${rsync_options[@]}). It also allows you to manipulate the individual entries of the array easily, should you need to do so, before calling rsync.
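As a minimal sketch of manipulating the entries before the call, options can be appended to the array one at a time. The `dry_run` variable and the `*.tmp` pattern are made up for illustration, and `printf` stands in for the real rsync call so the resulting arguments can be inspected:

```shell
#!/bin/bash
# Sketch only: dry_run and the '*.tmp' pattern are hypothetical.
rsync_options=( -r -v --exclude='.*' )

dry_run=yes
if [ "$dry_run" = yes ]; then
    rsync_options+=( -n )              # append one more element
fi
rsync_options+=( --exclude='*.tmp' )   # the quotes are removed, the pattern stays intact

# Print one element per line instead of running rsync.
printf '%s\n' "${rsync_options[@]}"
```

Each appended option stays a separate array element, so nothing has to be split apart again later.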

In any POSIX shell, one may use the list of positional parameters for this:

set -- -rnv --exclude='.*'

rsync "$@" source/ target

Again, double quoting the expansion of $@ is critical here.
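The positional parameters can be extended in a similar way. The following is only a sketch: the `*.tmp` pattern is a made-up example, and `printf` is used in place of rsync to show what would be passed:

```shell
# Sketch only: the '*.tmp' pattern is hypothetical.
set -- -rnv --exclude='.*'

# Append another option while keeping the existing ones intact.
set -- "$@" --exclude='*.tmp'

# Print one argument per line instead of running rsync.
printf '%s\n' "$@"
```

Note that `set --` replaces whatever arguments the script itself received, so any original arguments that are still needed have to be saved first.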

The issue is that when you put the two sets of options into a string, the single quotes of the --exclude option's value become part of that value. Hence,

RSYNC_OPTIONS='-rnv --exclude=.*'

would have worked¹... but it's better (as in safer) to use an array or the positional parameters with individually quoted entries. Doing so would also allow you to use things with spaces in them if needed, and avoid having the shell perform filename generation (globbing) on the options.


¹ provided that $IFS is not modified and that there's no file whose name starts with --exclude=. in the current directory, and that the nullglob or failglob shell options are not set.
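One of the caveats in the footnote can be illustrated with a small experiment. This is a sketch under stated assumptions: it runs in an empty scratch directory created with `mktemp -d`, it enables `nullglob` on purpose, and it counts arguments with `set --` instead of calling rsync. The variable names `string_count` and `array_count` are made up for the demonstration:

```shell
#!/bin/bash
# Sketch only: demonstrates the nullglob caveat in an empty scratch directory.
shopt -s nullglob
cd "$(mktemp -d)" || exit 1

RSYNC_OPTIONS='-rnv --exclude=.*'
set -- $RSYNC_OPTIONS               # unquoted on purpose: split, then globbed
string_count=$#

rsync_options=( -rnv --exclude='.*' )
set -- "${rsync_options[@]}"        # quoted: no splitting, no globbing
array_count=$#

echo "string: $string_count argument(s), array: $array_count argument(s)"
```

With `nullglob` set, the unmatched pattern `--exclude=.*` is dropped from the unquoted expansion, so the string variant ends up with one argument while the array variant keeps both.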

Kusalananda

@Kusalananda has already explained the basic problem and how to solve it, and the Bash FAQ entry linked to by @glenn jackmann also provides a lot of useful information. Here's a detailed explanation of what's happening in my problem based on these resources.

We'll use a small script that prints each of its arguments on a separate line to illustrate things (argtest.bash):

#!/bin/bash

for var in "$@"
do
    echo "$var"
done

Passing options "manually":

$ ./argtest.bash -rnv --exclude='.*'
-rnv
--exclude=.*

As expected, the parts -rnv and --exclude='.*' are split into two arguments, as they are separated by unquoted whitespace (this is called word splitting).

Also note that the quotes around .* have been removed: the single quotes tell the shell to pass their content without special interpretation, but the quotes themselves are not passed to the command.

If we now store the options in a variable as a string (as opposed to using an array), then the quotes are not removed:

$ OPTS="--exclude='.*'"

$ ./argtest.bash $OPTS
--exclude='.*'

There are two reasons for this. First, the double quotes used when defining $OPTS prevent special treatment of the single quotes, so the latter are part of the value:

$ echo $OPTS
--exclude='.*'

Second, when we use $OPTS as an argument to a command, quotes are processed before parameter expansion, so the quotes in $OPTS occur "too late" to be treated as quoting.

This means that (in my original problem) rsync uses the exclude pattern '.*' (with quotes!) instead of the pattern .* and therefore excludes files whose names start with a single quote followed by a dot and end with a single quote. Obviously that's not what was intended.
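The difference can be made visible without rsync by comparing the argument that results from the variable with the one that results from typing the option directly. This is a minimal sketch: the names `from_variable` and `typed_directly` are made up, and globbing is switched off with `set -f` so that only the effect of the quotes shows:

```shell
OPTS="--exclude='.*'"

set -f                    # disable globbing so only the quoting effect is visible
set -- $OPTS              # unquoted on purpose: the inner quotes are plain characters
from_variable=$1

set -- --exclude='.*'     # typed directly: the shell removes the quotes
typed_directly=$1
set +f

printf 'from variable:  %s\n' "$from_variable"
printf 'typed directly: %s\n' "$typed_directly"
```

The first line printed still contains the single quotes, the second does not.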

A workaround would have been to omit the double quotes when defining $OPTS:

$ OPTS2=--exclude='.*'

$ ./argtest.bash $OPTS2
--exclude=.*

However, it's a good practice to always quote variable assignments because of subtle differences in more complex cases.

As @Kusalananda noted, not quoting .* would also have worked. I had added the quotes to prevent pattern expansion, but that wasn't strictly necessary in this special case:

$ ./argtest.bash --exclude=.*
--exclude=.*

It turns out that Bash does perform pattern expansion, but the pattern --exclude=.* doesn't match any file, so the pattern is passed on to the command. Compare:

$ touch some_file

$ ./argtest.bash some_*
some_file

$ ./argtest.bash does_not_exit_*
does_not_exit_*

However, not quoting the pattern is dangerous, because if (for whatever reason) there is a file matching --exclude=.*, the pattern gets expanded:

$ touch -- --exclude=.special-filenames-happen

$ ./argtest.bash --exclude=.*
--exclude=.special-filenames-happen

Finally, let's see why using an array prevents my quoting problem (in addition to the other advantages of using arrays to store command arguments).

When defining the array, word splitting and quote handling happen as expected:

$ ARRAY_OPTS=( -rnv --exclude='.*' )

$ echo length of the array: "${#ARRAY_OPTS[@]}"
length of the array: 2

$ echo first element: "${ARRAY_OPTS[0]}"
first element: -rnv

$ echo second element: "${ARRAY_OPTS[1]}"
second element: --exclude=.*

When passing the options to the command, we use the syntax "${ARRAY[@]}", which expands each element of the array into a separate word:

$ ./argtest.bash "${ARRAY_OPTS[@]}"
-rnv
--exclude=.*
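The same mechanism is what keeps option values containing spaces intact. The following sketch uses a made-up directory name (`My Documents`) and a hypothetical helper `count_args` that reports how many arguments it received:

```shell
#!/bin/bash
# Sketch only: count_args and the 'My Documents' pattern are hypothetical.
count_args() { echo "$#"; }

opts=( -rnv --exclude='My Documents' )

quoted=$(count_args "${opts[@]}")     # each element stays one argument
unquoted=$(count_args ${opts[@]})     # unquoted on purpose: split again at the space

echo "quoted: $quoted, unquoted: $unquoted"
```

With the default $IFS, the quoted expansion yields two arguments and the unquoted one yields three, because `--exclude=My Documents` is split at the space.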
    This stuff had me confused for a long time, so a detailed explanation like this is helpful. – Joe Aug 03 '18 at 21:54

When we write functions and shell scripts that take arguments, the arguments are passed in as numerically named variables, e.g. $1, $2, $3.

For example:

bash my_script.sh Hello 42 World

Inside my_script.sh, commands will use $1 to refer to Hello, $2 to 42, and $3 to World.

The variable reference $0 expands to the current script's name, e.g. my_script.sh.
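The same numbering applies inside a function. As a small sketch (the function name `show_args` is made up for illustration):

```shell
# Sketch only: show_args is a hypothetical function.
show_args() {
    echo "first:  $1"
    echo "second: $2"
    echo "third:  $3"
    echo "count:  $#"     # number of arguments passed to the function
}

show_args Hello 42 World
```

Inside the function, $1, $2 and $3 refer to the function's own arguments rather than to those of the script.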

Don't store whole commands in variables.

Keep in mind:

1. Avoid using all-capitals variable names in scripts.

2. Don't use backquotes; use $(...) instead, as it nests better.

if [ $# -ne 2 ]
then
    echo "Usage: $(basename "$0") DIRECTORY BACKUP_DIRECTORY"
    exit 1
fi

directory=$1
backup_directory=$2
current_date=$(date +%Y-%m-%dT%H-%M-%S)
backup_file="${backup_directory}/${current_date}.backup"

tar cv "$directory" | openssl des3 -salt | split -b 1024m - "$backup_file"