
Is there a good clean conventional way to do this?

For example, if there are any ".gz" files in a certain directory, I want to unzip them. But if there aren't, I don't want to see any error.

If I use gzip -d /mydir/*.gz, I get the error:

gzip: /mydir/*.gz: No such file or directory

If I first shopt -s nullglob and then gzip -d /mydir/*.gz, I get the following:

gzip: compressed data not read from a terminal. Use -f to force decompression.
For help, type: gzip -h

I have one method I know will work, which I'm posting as an answer. I'm wondering if there's a better/cleaner way.

POSIX compatibility is a bonus, but not required.

Wildcard
  • 1
    for bash this answer is very good using compgen command: http://stackoverflow.com/questions/2937407/test-whether-a-glob-has-any-matches-in-bash – AnyDev Jul 26 '16 at 12:52

3 Answers


With bash:

shopt -s nullglob
files=(/mydir/*.gz)
((${#files[@]} == 0)) || gzip -d -- "${files[@]}"

With zsh:

files=(/mydir/*.gz(N))
(($#files == 0)) || gzip -d -- $files

Note that in zsh, without (N), as in pre-Bourne shells, csh or tcsh, the command is not run at all if the glob doesn't match. You'd only need the above to avoid the resulting error message (a "no matches found" error from the shell, as opposed to gzip failing on the unexpanded glob in bash or other Bourne-like shells). You can get the same behaviour in bash with shopt -s failglob.

In zsh, a failing glob is a fatal error that causes the shell process where it's expanded (when not interactive) to exit. You can prevent your script from exiting in that case either by using a subshell or by using zsh's error-catching mechanism ({ try-block; } always { error-catching; }). You could of course also set the nonomatch (to work like sh), nullglob or noglob option, though I wouldn't recommend that:

$ zsh -c 'echo zz*; echo not output'
zsh:1: no matches found: zz*
$ zsh -c '(echo zz*); echo output'
zsh:1: no matches found: zz*
output
$ zsh -c '{echo zz*;} always {TRY_BLOCK_ERROR=0;}; echo output'
zsh:1: no matches found: zz*
output
$ zsh -o nonomatch -c 'echo zz*; echo output'
zz*
output

Note that if the glob is passed to an external command, it's only the process that was forked to execute the command that exits:

$ zsh -c '/bin/echo zz*; echo still output and the previous command exited with $?'
zsh:1: no matches found: zz*
still output and the previous command exited with 1

With ksh93

ksh93 eventually added a mechanism similar to zsh's (N) glob qualifier to avoid having to set a nullglob option globally:

files=(~(N)/mydir/*.gz)
((${#files[@]} == 0)) || gzip -d -- "${files[@]}"

POSIXly

Portably in POSIX sh, where non-matching globs are passed unexpanded with no way to disable that behaviour (the only POSIX glob related option is noglob to disable globbing altogether), the trick is to do something like:

set -- /mydir/[*].gz /mydir/*.gz
case $#$1$2 in
   '2/mydir/[*].gz/mydir/*.gz') : no match;;
   *) shift; gzip -d -- "$@"
esac

The idea being that if /mydir/*.gz doesn't match, then it will expand to itself (/mydir/*.gz). However, it could also expand to that if there was one file actually called /mydir/*.gz, so to differentiate between the cases, we also use the /mydir/[*].gz glob that would also expand to /mydir/*.gz if there was a file called like that.
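
As a minimal sketch, the same test can be wrapped in a function so that the caller's positional parameters are left alone. The name any_gz is hypothetical, and the sketch assumes that no option such as bash's nullglob or failglob is in effect:

```shell
# any_gz DIR: hypothetical helper, succeeds if DIR holds at least one
# non-hidden entry whose name ends in .gz.
# Note: "dir" is a plain global variable here, as POSIX sh has no "local".
any_gz() {
  dir=$1
  # Inside a function, "set --" only changes the function's own parameters.
  set -- "$dir"/[*].gz "$dir"/*.gz
  case $#$1$2 in
    # Both patterns came back unexpanded: nothing matched.
    "2$dir/[*].gz$dir/*.gz") return 1 ;;
  esac
  return 0
}
```

It could then be used as any_gz /mydir && gzip -d -- /mydir/*.gz, at the cost of expanding the glob a second time.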

As that's pretty awkward, you may prefer using find in those cases:

find /mydir/. ! -name . -prune ! -name '.*' \
   -name '*.gz' -type f -exec gzip -d {} +

The ! -name . -prune is to not look into subdirectories (some find implementations have -mindepth 1 -maxdepth 1 as an equivalent). ! -name '.*' is to exclude hidden files like globs do.
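
To check what that expression selects before running gzip on it, one option is to swap the -exec action for -print. A sketch, where list_gz is a hypothetical name and the directory is passed as an argument instead of being hard-coded:

```shell
# list_gz DIR: hypothetical helper that prints the files the find command
# above would hand to gzip, without decompressing anything.
list_gz() {
  find "$1/." ! -name . -prune ! -name '.*' \
    -name '*.gz' -type f -print
}
```

Hidden files, directories whose name ends in .gz, and anything inside subdirectories should be absent from the listing.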

Another benefit is that it still works if the list of files is too big to fit within the limit on the size of arguments to an executed command (find will run several gzip commands if needed to avoid that; ksh93 (with command -x) and zsh (with zargs) also have mechanisms to work around that limit).

Another benefit is that you will get error messages if find cannot read the content of /mydir or can't determine the type of the files (globs would just silently ignore the problem and act as if the corresponding files don't exist).

A small downside is that you lose the exact value of gzip's exit status. However, if any one gzip invocation fails with a non-zero exit status, find will still exit with a non-zero exit status (though not necessarily the same one), so that's good enough for most use cases.

Another benefit is that you can add the -type f to avoid trying to uncompress directories or fifos/devices/sockets... whose name ends in .gz. Except in zsh (*.gz(.) for regular files only), globs cannot filter by file type, so you'd need to do something like:

set --
for f in /mydir/*.gz; do
  [ -f "$f" ] && [ ! -L "$f" ] && set -- "$@" "$f"
done
[ "$#" -eq 0 ] || gzip -d -- "$@"

One way to do this is:

shopt -s nullglob
for f in /mydir/*.gz; do
  gzip -d /mydir/*.gz
  break
done

With nullglob set, the body of the for loop only executes if the glob has an expansion, and the unconditional break statement ensures that the gzip command is executed only once.

It's a bit funny because it uses a for loop as an if, but it works.
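
Where nullglob is not available, a sketch of the same loop-as-if idea is to test the first expanded word instead, since an unmatched glob is left behind as a literal string. Here batch_runs is a hypothetical helper that only counts how often the body would run; the real gzip -d call is left as a comment:

```shell
# batch_runs DIR: hypothetical helper printing how many times the loop body
# runs for DIR (0 when nothing matches, 1 otherwise).
batch_runs() {
  runs=0
  for f in "$1"/*.gz; do
    # Without nullglob an unmatched pattern stays literal, so make sure the
    # first word names something that exists (or is a dangling symlink).
    [ -e "$f" ] || [ -L "$f" ] || break
    runs=$((runs + 1))   # the real script would run: gzip -d -- "$1"/*.gz
    break                # one batch for all files, never one run per file
  done
  echo "$runs"
}
```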

Wildcard

The simplest solution would be

for f in *.gz; do
  test -f "$f" && gzip -d -- "$f"
done

or

for f in *.gz; do
  [ -f "$f" ] && gzip -d -- "$f"
done

or even

for f in *.gz; do
  if [ -f "$f" ]; then
    gzip -d -- "$f"
  fi
done

I'm all for verbosity if it helps explain what the code does. The various shells have compact syntax for doing all sorts of things, but it doesn't beat the above when it comes to documenting what you're doing (and portability). I'm thinking primarily about shell scripting here, and situations in which your scripts may be used and modified by others.
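
One thing the per-file loops above do not do is report failures, because the exit status of a for loop is that of its last iteration. A minimal sketch that remembers the last failure follows; gunzip_each and the status variable are hypothetical names, and the directory is passed as an argument:

```shell
# gunzip_each DIR: hypothetical helper, decompresses every regular *.gz file
# in DIR one at a time and returns the last non-zero gzip status, if any.
gunzip_each() {
  status=0
  for f in "$1"/*.gz; do
    # Also skips the literal pattern left behind when nothing matches.
    [ -f "$f" ] || continue
    # Remember a failure instead of letting later successes hide it.
    gzip -d -- "$f" || status=$?
  done
  return "$status"
}
```

This keeps the readable one-file-at-a-time structure while still letting a calling script notice that something went wrong.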

EDIT:

If you just want to see if the glob has an expansion:

if [ "$(printf '%s' *.gz)" = "*.gz" ] && [ ! -f "*.gz" ]; then
    # From Stéphane Chazelas: "There are no gzipped files that I
    # can tell, or if there's one, it's called *.gz and it is not a
    # regular file (after symlink resolution) or at least I can't tell
    # if it's a regular file or not."

    echo "There are no gzipped files"
else
    echo "There are gzipped files"
fi

See also Stéphane's further comments below regarding exotic file names.

Kusalananda
  • For gunzip this may work, but for some commands, e.g. tar, it's necessary to run the command itself on the entire list of files at once rather than one by one. Although you could use for f in *.gz; do [ -e "$f" ] && gzip -d *.gz; break; done, I suppose. That obviates any need for "nullglob." – Wildcard Jul 26 '16 at 07:11
  • @Wildcard The most common use case for creating a tar archive is to archive a directory rather than individual files. – Kusalananda Jul 26 '16 at 07:56
  • 1
    Note that [ -f "$file" ] tests whether $file is accessible (as in one can do a stat() on it) and is of type regular after symlink resolution, so it will give true for symlinks to regular files, and false for symlinks to inaccessible files or in the case of /mydir/*.gz if you don't have search permission to /mydir – Stéphane Chazelas Jul 26 '16 at 09:53
  • 2
    Strictly speaking, the "There are no gzipped files" should be "There are no gzipped files that I can tell, or if there's one, it's called *.gz and it is not a regular file (after symlink resolution) or at least I can't tell if it's a regular file or not". See the *.gz, [*].gz trick to avoid having to rely on stat(). – Stéphane Chazelas Jul 26 '16 at 10:00
  • 1
    A downside is that it runs one gzip command per file. That also means that if gunzip fails for one file (except the last one), that won't be reflected in the exit status (though in most cases, you'll still get an error message). (a problem that affects find approaches as well) – Stéphane Chazelas Jul 26 '16 at 10:02
  • Note that with Unix compliant echos, "$(echo *.gz)" would expand to *.gz if there was a file called *.gz\canything.gz in the current directory (and possibly others provided they sort after that one). See also \052.gz, *\056gz and so on. More generally, you can't use echo for arbitrary data. – Stéphane Chazelas Jul 26 '16 at 11:03
  • @StéphaneChazelas I hadn't noticed the OB marker on -a and -o, thanks! As for your last comment, I need to stop somewhere. I'll leave a note in the answer instead. – Kusalananda Jul 26 '16 at 11:08
  • 2
    @Kusalananda, or you could follow the POSIX recommendation of using printf instead of echo for arbitrary data. IMO, it's better to encourage best practice. – Stéphane Chazelas Jul 26 '16 at 11:47