7

In my directory I have two files with spaces in their names, foo bar and another file. I also have two files without spaces, file1 and file2.

The following script works:

for f in foo\ bar another\ file; do file "$f"; done

This script also works:

for f in 'foo bar' 'another file'; do file "$f"; done

But the following script doesn't work:

files="foo\ bar another\ file"
for f in $files; do file "$f"; done

Not even this script works:

files="'foo bar' 'another file'"
for f in $files; do file "$f"; done

But, if the files do not contain space, the script works:

files="file1 file2"
for f in $files; do file "$f"; done

Thanks!

Edit

Code snippet of my script:

while getopts "i:a:c:d:f:g:h" arg; do
  case $arg in
    i) files=$OPTARG;;
    # ...
  esac
done

for f in $files; do file "$f"; done

With files without spaces, my script works. But I would like to run the script passing files with spaces as argument in one of these ways:

./script.sh -i "foo\ bar another\ file"
./script.sh -i foo\ bar another\ file
./script.sh -i "'foo bar' 'another file'"
./script.sh -i 'foo bar' 'another file'
Pedro Siqueira

4 Answers

13

If you're using bash you can use an array for this

#!/bin/bash
files=('foo bar' 'another file' file1 'file2')
for f in "${files[@]}"; do file -- "$f"; done

Quoting is required for file names containing whitespace or any other character that is special in the syntax of the shell such as ;, |, &, * and many more.

Double-quoting is also required around expansions in list contexts at least (like "${files[@]}" and "$f" here) for file names containing globbing characters (*, ?, [, and \ in some versions of bash, and more if extglob is enabled) or characters of $IFS (space, tab and newline by default).

It's optional (but I'd recommend it) for file names that contain none of those and no non-ASCII character.

If the list of files comes from the current working directory you can use wildcards as you'd expect, e.g. files=(*f*) to match any file or directory with f in its name, though you'd want shopt -s nullglob beforehand to avoid getting a literal *f* if there's no match. (But then you could probably just use for f in *f*; do...done and avoid the array entirely).
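As a small illustration of the wildcard approach, here is a sketch that fills arrays from glob patterns with nullglob enabled. The scratch directory, the file names (taken from the question) and the nomatch array are only assumptions for the demo, not part of the original script.

```shell
#!/bin/bash
# Work in a scratch directory so the demo does not depend on existing files.
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1
touch -- 'foo bar' 'another file' file1 file2

shopt -s nullglob          # unmatched patterns expand to nothing
files=(*f*)                # one array element per matching name, spaces intact
nomatch=(*.nonexistent)    # stays empty because of nullglob

printf 'matched %d names\n' "${#files[@]}"
for f in "${files[@]}"; do printf '<%s>\n' "$f"; done
printf 'unmatched pattern gave %d names\n' "${#nomatch[@]}"

# Clean up the scratch directory.
cd / && rm -rf -- "$tmpdir"
```

All four names contain an f, so the first array gets four elements, while the second pattern contributes none instead of a literal *.nonexistent.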

The -- marker for file tells it that any subsequent parameter is a filename - even if it starts with a dash.

Read more with man bash (search for Arrays) or just info bash arrays.

Chris Davies
6

There's a bit of a difference between the four script invocations you posted.

./script.sh -i "'foo bar' 'another file'"
./script.sh -i "foo\ bar another\ file"

The above two both pass to the script -i as the first argument, and a single string as the second. In the first one, that's 'foo bar' 'another file', and in the second, it's foo\ bar another\ file. In the shell language, both are valid ways to present the two strings (or filenames) foo bar and another file. But the quote and backslash processing only applies when the strings are on a raw command line, not when they're inside a variable, as they end up in the string.
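To see the effect, here is a minimal sketch that counts how many arguments result in each case. count_args is a hypothetical helper introduced only for this demonstration.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { printf '%s\n' "$#"; }

files="'foo bar' 'another file'"

# Unquoted expansion: split on whitespace, the quote characters stay literal.
unquoted=$(count_args $files)
# Quoted expansion: the whole string is a single argument.
quoted=$(count_args "$files")
# Quotes typed on the command line are processed by the shell parser.
literal=$(count_args 'foo bar' 'another file')

printf 'unquoted: %s, quoted: %s, literal: %s\n' "$unquoted" "$quoted" "$literal"
```

This prints unquoted: 4, quoted: 1, literal: 2. The unquoted variable yields the four words 'foo, bar', 'another and file', which is why the loop in the question fails.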

./script.sh -i foo\ bar another\ file
./script.sh -i 'foo bar' 'another file'

On the other hand, these two pass a total of three arguments: -i, foo bar, and another file.

The difference is somewhat important in that it's much easier to deal safely with distinct arguments. You just need to keep them intact, and don't have to process the quotes and escapes embedded within.

Also, importantly, running something like script ./*.txt will pass the filenames as distinct arguments.

E.g. this would just call file on both files if called as script 'foo bar' another\ file:

#! /bin/sh -
for f in "$@"; do
    file -- "$f"
done

Or, even simpler and more portable, since a for loop without an in list iterates over the positional parameters by default:

#! /bin/sh -
for f do
    file -- "$f"
done

But you have the getopts there, too. And with the filenames as distinct arguments, only the first would appear as the argument to -i. Here, there are basically two common options.

Either have the user use the -i option repeatedly, collecting the filenames to an array, so:

#! /bin/bash -
files=()
while getopts "i:" arg; do
  case $arg in
    (i) files+=("$OPTARG");;
  esac
done

for f in "${files[@]}"; do file -- "$f"; done

and run as script -i "foo bar" -i "another file". (Running script -i file1 file2 would have file2 ignored.) Similarly you could add another array to collect filenames given through another option.

Or, have the option set the "mode" the script works in, and take the filenames as a list distinct from the options. getopts leaves all the arguments intact; you just have to drop the ones it processed with shift. So:

#! /bin/bash -

unset -v mode
while getopts "ie" arg; do
  case $arg in
    (i) mode=i;;
    (e) mode=e;;
    (*) : handle usage error;;
  esac
done
shift "$((OPTIND - 1))"

case "$mode" in
  (i) for f do file -- "$f"; done;;
  (e) echo "do something else with the files";;
  (*) echo "error: mode not specified" >&2;;
esac

and then run it as script -i "foo bar" "another file" or script -i -- -file-starting-with-dash- -other-file- or script -i -- *.

I'm assuming here that you are also doing something else with getopts other than taking the -i in, since otherwise you could just drop the flag entirely. :) But what other options you have somewhat affects what the most sensible (or customary) solution is.

Also if your loop only calls file on the files, you could just run file -- "${files[@]}" or file -- "$@" and skip the loop.


However, if you want to be able to do this:

script -i foo bar -e doo daa

and have the script do one thing for the files foo and bar, and another thing for doo and daa, then that's a bit of a different issue. It can be done, sure, but getopts might not be the tool for that.
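One possible way to do it without getopts is a hand-written loop that remembers the most recent flag and files each following argument under it. The sketch below is only an assumption about how this could look; parse_modes, i_files and e_files are hypothetical names, and it cannot accept a file that is literally called -i or -e.

```shell
#!/bin/bash
# Hypothetical helper: sort operands into two arrays, depending on
# whether the most recent flag seen was -i or -e.
parse_modes() {
  local mode='' arg
  i_files=() e_files=()
  for arg do
    case $arg in
      (-i) mode=i ;;
      (-e) mode=e ;;
      (*)
        case $mode in
          (i) i_files+=("$arg") ;;
          (e) e_files+=("$arg") ;;
          (*) echo "error: operand '$arg' given before -i or -e" >&2
              return 1 ;;
        esac ;;
    esac
  done
}

# Simulated invocation: script -i foo 'foo bar' -e doo daa
parse_modes -i foo 'foo bar' -e doo daa || exit 1
printf 'i: <%s>\n' "${i_files[@]}"
printf 'e: <%s>\n' "${e_files[@]}"
```

Because each operand is stored as its own array element, names with spaces such as foo bar survive intact.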

See also:

And of course:

for the issues with trying to deal with multiple distinct arbitrary strings (filenames) within a single variable.

ilkkachu
5

For your command line parsing, arrange for the pathname operands to always be the last ones on the command line:

./myscript -a -b -c -- 'foo bar' 'another file' file[12]

The parsing of the options would look something like

a_opt=false b_opt=false c_opt=false
while getopts abc opt; do
     case $opt in
         a) a_opt=true ;;
         b) b_opt=true ;;
         c) c_opt=true ;;
         *) echo error >&2; exit 1
    esac
done

shift "$(( OPTIND - 1 ))"

for pathname do
    # process pathname operand "$pathname" here
done

The shift will make sure to shift off the handled options so that the pathname operands are the only things left in the list of positional parameters.
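To check what is left after the shift, here is a small self-contained sketch. It simulates a command line with set --, which is an assumption made only so the example runs without a separate script file, and it handles just the -a and -c options.

```shell
#!/bin/sh
# Simulate the invocation: ./myscript -a -c -- 'foo bar' 'another file'
set -- -a -c -- 'foo bar' 'another file'

a_opt=false c_opt=false
while getopts ac opt; do
  case $opt in
    a) a_opt=true ;;
    c) c_opt=true ;;
    *) echo error >&2; exit 1 ;;
  esac
done

# Drop the parsed options (and the -- marker) from the positional parameters.
shift "$(( OPTIND - 1 ))"

remaining=$#
printf 'a=%s c=%s, %d operands left\n' "$a_opt" "$c_opt" "$remaining"
for pathname do printf '<%s>\n' "$pathname"; done
```

After the shift only the two pathname operands remain, each still a single argument despite the embedded spaces.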

If that's not possible, allow the -i option to be specified multiple times and collect the given arguments in an array each time you come across it in the loop:

pathnames=() a_opt=false b_opt=false c_opt=false
while getopts abci: opt; do
     case $opt in
         a) a_opt=true ;;
         b) b_opt=true ;;
         c) c_opt=true ;;
         i) pathnames+=( "$OPTARG" ) ;;
         *) echo error >&2; exit 1
    esac
done

shift "$(( OPTIND - 1 ))"

for pathname in "${pathnames[@]}"; do
    # process pathname argument "$pathname" here
done

This would be called as

./myscript -a -b -c -i 'foo bar' -i 'another file' -i file1 -i file2
Kusalananda
  • It worked! Thanks! But what if I want to pass a wildcard like ./script -a -b -c -i "*.mp4", and process all files ending ".mp4"? – Pedro Siqueira Apr 08 '21 at 17:59
  • 2
    @PedroSiqueira I would avoid this if possible. You would have to find a safe way to expand that globbing pattern to a list of pathnames, and allow for it to also not be a pattern (or, if you have a separate option for that). One could possibly plug it into find, but that would be fiddly. You could definitely ask a separate question about that, because it's most definitely non-trivial to do safely. – Kusalananda Apr 08 '21 at 18:04
-2

Bash is not too good at dealing with variables with space in them. You have to use the "rename" functionality to help you with this a bit like below:

Use below if you are on a debian like system

sudo apt-get install rename

Use below if you are on a red hat like system

sudo yum install rename

Move into the directory where your files are

cd your_target_directory

Rename all the files whose names contain a space character. After this step, spaces no longer cause problems.

rename 's/ /_/g' *

Proceed without issue from here... enjoy

Chris Davies
  • 5
    Note that the space is just one of several problematic characters that are valid in filenames. Others include tabs, newlines, characters that are part of globbing patterns (such as * and ? etc.), control characters, emojis and other non-latin characters (visible and invisible). It's easier to write code that correctly handles all valid filenames than to rename files that broken code can't handle. – Kusalananda Apr 08 '21 at 20:28