The script is meant to compress and archive any files larger than 20 MB in a given directory:

#!/bin/bash

#Variables
BASE=/home/sengh/scripts
DEPTH=1 #how deep to go in the find operation
RUN=0

#Check if the directory is present or not
if [ ! -d $BASE ]
then
    echo "directory does not exist: $BASE"
    exit 1
fi

#Create 'archive' folder if not present
if [ ! -d $BASE/archive ]
then
    mkdir $BASE/archive
fi

#Find the list of files larger than 20 MB
for i in 'find $BASE -maxdepth $DEPTH -type f -size +20M'
do
    if [ $RUN -eq 0 ]
    then
        echo "[$(date "+%Y-%m-%d %H:%M:%S")] archiving $i ==> $BASE/archive"
        gzip $i || exit 1
        mv $i.gz $BASE/archive || exit 1
    fi
done

I'm getting the following output:

[2024-01-26 23:55:06] archiving find $BASE -maxdepth $DEPTH -type f -size +20M ==> /home/sengh/scripts/archive
gzip: invalid option -- 'x'
Try `gzip --help' for more information.

I tried going through the man page of gzip but it didn't help as I'm using no such option of 'x'. Please help.

    Your loop construct is malformed - you are passing literal string find /home/sengh/scripts -maxdepth 1 -type f -size +20M to gzip, where -maxdepth is being parsed as a set of gzip short options – steeldriver Jan 26 '24 at 18:59
    ... but before simply "fixing" the quotes, please read Why is looping over find's output bad practice? – steeldriver Jan 26 '24 at 19:00
  • Also, with archive being under BASE, I expect that after four runs you will see files named something.gz.gz.gz.gz in your archive directory. – zwets Jan 26 '24 at 19:06
    seeing it says "archiving find $BASE -maxdepth ..." from the line that prints the value of $i should be a rather strong hint that your loop doesn't really get any output from find (and probably doesn't even run it). It might be a good idea to also recap how e.g. quotes work in the shell (and perhaps other pages on mywiki.wooledge.org might also help) – ilkkachu Jan 26 '24 at 19:58
    You can find and fix many errors - including the one you're asking about - by pasting your code into https://shellcheck.net/ and reviewing its suggestions – Chris Davies Jan 27 '24 at 11:42

1 Answer

It's easier with zsh:

#! /bin/zsh --
mkdir -p -- ~/scripts/archive || exit

PROMPT4='[%D{%F %T}] ' # for the xtrace output to be prefixed by the
                       # current time in [%F %T] strftime format

set -o xtrace -o noclobber # xtrace provides a trace of each command
                           # being executed. noclobber prevents
                           # redirection clobbering files.
for file (~/scripts/*(ND.LM+20))
  # $file:t is the tail (basename) of the path, so the archive lands
  # in the directory created by mkdir above
  gzip -c -- $file > ~/scripts/archive/$file:t.gz &&
    rm -f -- $file || exit

The (ND.LM+20) part are glob qualifiers, which further qualify the glob expansion:

  • N applies nullglob for that one glob only, so that the glob expands to nothing instead of returning an error if there's no match. You generally want to use it in for loop lists.
  • D applies dotglob for that one glob only to include hidden files (like find does by default)
  • LM+20 is the equivalent of GNU find's -size +20M to restrict the expansion to files whose size, rounded up to an integer number of mebibytes, is strictly greater than 20, so files that are 20971521 bytes Long or longer (20 MiB being 20971520 bytes).
  • . like -type f is to restrict to regular files only, to the exclusion of any other type of file such as symlinks, fifos, directories, devices...
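To make the size threshold concrete, here is a minimal bash sketch, assuming GNU find and GNU truncate are available. The temporary directory and the two file names are made up for the demonstration; the files are sparse, so they take almost no disk space.

```bash
#!/bin/bash
# Assumes GNU find and GNU truncate; directory and file names are hypothetical.
dir=$(mktemp -d) || exit
truncate -s 20971520 -- "$dir/exactly-20MiB"   # 20 * 1024 * 1024 bytes
truncate -s 20971521 -- "$dir/one-byte-more"   # one byte over 20 MiB

# -size +20M rounds each size up to whole MiB, then requires more than 20.
matches=$(find "$dir" -maxdepth 1 -type f -size +20M -printf '%f\n')
printf '%s\n' "$matches"

rm -rf -- "$dir"
```

Only one-byte-more should be listed: the file of exactly 20971520 bytes rounds to 20 MiB, which is not strictly greater than 20.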

Doing the same with the same level of reliability with bash is quite cumbersome, and needs recent versions of GNU utilities.

#! /bin/bash --
mkdir -p -- ~/scripts/archive || exit

PS4='[\D{%F %T}] ' # for the xtrace output to be prefixed by the
                   # current time in [%F %T] strftime format

set -o noclobber # prevents redirection clobbering files.

while IFS= read -rd '' -u3 file; do
  (
    set -o xtrace
    # ${file##*/} strips the directory part so the archive lands
    # in the directory created by mkdir above
    gzip -c -- "$file" > ~/scripts/archive/"${file##*/}".gz &&
      rm -f -- "$file"
  ) 3<&- || exit
done 3< <(
  find -H -files0-from <(printf '%s\0' ~/scripts) \
    -mindepth 1 -maxdepth 1 -type f -size +20M -print0 |
    sort -z
)

As always, you need -- to separate options from non-option arguments to make sure that those non-option arguments are not treated as options if they start with -. However for find, that doesn't help as even after --, if they start with - (or are (, ), !) they are still treated as predicates. With BSD find, it's just a matter of using find -f "$dir" instead of find -- "$dir", but GNU find doesn't support that -f. Since version 4.9 however, it supports -files0-from to pass the list of files from a file.
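As a small illustration of the -- point, the sketch below creates a file deliberately named -maxdepth in a throwaway directory and compresses it. The directory and file name are hypothetical. The ./ prefix passed to find is not one of the options discussed above; it is another common workaround that works for relative paths, shown here as an assumption about what fits your case.

```bash
#!/bin/bash
# Hypothetical demo in a throwaway directory; the file name is chosen on purpose.
dir=$(mktemp -d) || exit
cd -- "$dir" || exit
printf 'some data\n' > ./-maxdepth

# Without --, gzip would try to parse -maxdepth as a bundle of options.
gzip -- -maxdepth

# find does not honour -- for its start points; prefixing a relative
# path with ./ keeps it from being read as a predicate.
found=$(find ./-maxdepth.gz -type f)
printf '%s\n' "$found"

cd / && rm -rf -- "$dir"
```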

Some remarks about your code:

  • in bash, parameter expansions must be quoted or otherwise they undergo split+glob. See When is double-quoting necessary? and many other Q&As on the subject here. shellcheck would also help you spot this kind of beginner mistake.
  • '...' is for strong quotes. 'find ...' is the literal string find .... But even if you replaced these single quotes with backticks (`...`), that would still be the wrong way to go. See Why is looping over find's output bad practice? for details.
  • Doing things like if condition-not-met; then do-something-that-would-make-sure-the-condition-is-met; fi like your if [ ! -d ...]; then mkdir... in general is best avoided as it introduces TOCTOU race conditions. With -p, mkdir creates the directory (and all leading directory components) only if it didn't exist and only fails if the directory is not there after it returns.
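The mkdir -p remark can be checked with a short bash sketch. The base directory here is a temporary stand-in, not the one from the question.

```bash
#!/bin/bash
# base is a hypothetical stand-in for the real base directory.
base=$(mktemp -d) || exit

mkdir -p -- "$base/archive" || exit   # first call creates the directory
mkdir -p -- "$base/archive" || exit   # second call succeeds too, no prior test needed

if [ -d "$base/archive" ]; then
  created=yes
else
  created=no
fi
printf 'archive directory present: %s\n' "$created"
```

There is no window between a check and the creation, because there is no separate check.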

Here, it's the combination of the first and second points that causes gzip to return that error.

The loop iterates over one single value, which is find $BASE -maxdepth.... As you forgot to quote $i in gzip $i, $i is subject to split+glob, and with the default value of $IFS, gzip is passed find, $BASE, -maxdepth... as separate arguments instead of the contents of $i as one argument.
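The difference between the quoted and unquoted expansion can be seen by counting arguments. In this sketch, count_args is a hypothetical helper that only reports how many arguments it received; the string is the one from the question.

```bash
#!/bin/bash
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { printf '%s\n' "$#"; }

# Single quotes keep the text literal: find is not run, $BASE is not expanded.
i='find $BASE -maxdepth $DEPTH -type f -size +20M'

unquoted=$(count_args $i)    # split on whitespace by the default $IFS
quoted=$(count_args "$i")    # passed through as one argument

printf 'unquoted: %s, quoted: %s\n' "$unquoted" "$quoted"
```

This prints unquoted: 8, quoted: 1, so gzip $i hands gzip eight separate words, -maxdepth among them.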

GNU utilities have the misfeature that options are still recognised even after non-option arguments (which makes it all the more important not to forget those -- delimiters), so -maxdepth is treated as options to gzip, more precisely as single-letter options combined together, the same as -m -a -x -d...

-m is a currently undocumented option that tells gzip/gunzip not to preserve the modification time. -a, also undocumented on Unix-like systems and short for --ascii, is only relevant on Microsoft Windows (and its name is quite misleading). -x is not among the supported options, even the undocumented ones, so you get that error.
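How a bundled argument gets walked letter by letter can be sketched with bash's getopts builtin. This is a stand-in parser for illustration, not gzip's actual code: parse_bundle is a hypothetical function that accepts only -m and -a, mirroring the two letters accepted before the failure described above.

```bash
#!/bin/bash
# Illustration only: a stand-in parser, not gzip's real option handling.
# It accepts the single-letter options -m and -a and rejects anything else.
parse_bundle() {
  local OPTIND=1 OPTARG opt
  # The leading ':' puts getopts in silent mode, so we report errors ourselves.
  while getopts ':ma' opt "$@"; do
    case $opt in
      m|a) printf 'accepted -%s\n' "$opt" ;;
      '?') printf "invalid option -- '%s'\n" "$OPTARG"
           return 1 ;;
    esac
  done
}

result=$(parse_bundle -maxdepth) || true
printf '%s\n' "$result"
```

The letters m and a are consumed first, and parsing stops at x, which matches the position of the error in the gzip message.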