I am trying to search for a particular string within zipped files but cannot get the 'xargs' syntax correct.

The files unzip and zip correctly, but xargs ends up searching for nothing (we are looking for emails where TLS failed).

Can anyone give me some pointers about the correct xargs syntax?

for filename in $( ls -1 ${HOST}-mail-2018${1}[0-9][0-9]* )
do
  filetype=${filename##*.}
  case $filetype in
    bz2)
        unzipper="bzip2 -d "
        zipper="bzip2"
        unzfile=${filename%.${filetype}}
        ;;
    gz)
        unzipper="gzip -d "
        zipper="gzip "
        unzfile=${filename%.${filetype}}
        ;;
    xz)
        unzipper="xz -d "
        zipper="xz "
        unzfile=${filename%.${filetype}}
       ;;
    *)
        echo "Unknown compression type for file $filename"
        break
        ;;
   esac
        #  Testing:    echo $unzipper $zipper $unzfile
        echo $unzipper $zipper $filename $unzfile

   eval ${unzipper} ${filename}
   grep 'Cannot .*TLS' ${unzfile} | sed 's/^.*]: //' | sed 's/:.*//' |  xargs fgrep
   eval ${zipper} ${unzfile}
done
exit 0
Rui F Ribeiro
mintster
  • It's not clear why you're using xargs at all. Isn't grep PATTERN | sed ... enough? Also, you shouldn't unzip and re-zip the files; simply gzip -cd file | grep PATTERN or bzip2 -cd file | grep PATTERN (or zgrep, bzgrep) will do. And you shouldn't determine file type from the extension: use type=`file -i filename` then case $type in application/gzip;*) ...;; application/x-bzip2;*) ... esac. –  Oct 16 '18 at 10:02

1 Answer

for filename in "$HOST-mail-2018$1"[0-9][0-9]*; do
    case $filename in
        *.gz)  g=zgrep  ;;
        *.bz2) g=bzgrep ;;
        *.xz)  g=xzgrep ;;
        *) printf 'Unknown filetype for "%s"\n' "$filename" >&2
           exit 1   # or continue or break
    esac

    "$g" 'Cannot .*TLS' "$filename"
done

Each compression tool comes with a corresponding grep tool. These are, for gzip, bzip2 and xz, called zgrep, bzgrep and xzgrep respectively. By using these there is no need to explicitly uncompress and recompress the files.
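As a rough illustration of that point, the sketch below compares zgrep against an explicit gzip -cd | grep pipeline on a throwaway file. The sample log lines and the scratch directory are made up for the demonstration; only the search pattern comes from the question.

```shell
# Build a small gzip-compressed sample log in a scratch directory.
# The log lines are invented for this demonstration.
tmpdir=$(mktemp -d)
printf '%s\n' 'Cannot start TLS: handshake failure' 'delivered OK' \
    > "$tmpdir/sample.log"
gzip "$tmpdir/sample.log"

# Search the compressed file directly with the gzip-aware grep wrapper.
via_zgrep=$(zgrep 'Cannot .*TLS' "$tmpdir/sample.log.gz")

# The explicit pipeline: decompress to stdout, then grep the stream.
via_pipe=$(gzip -cd "$tmpdir/sample.log.gz" | grep 'Cannot .*TLS')

printf 'zgrep:    %s\n' "$via_zgrep"
printf 'pipeline: %s\n' "$via_pipe"

rm -rf "$tmpdir"
```

Both variants leave the compressed file untouched on disk, which is what makes the unzip/grep/re-zip sequence unnecessary.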

The script above picks the correct grep depending on the file suffix. One could arguably use plain grep for any unknown suffix. See below for how to do this without looking at filename suffixes (using the file tool).
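The plain-grep fallback mentioned above could look roughly like this. grep_for_suffix is a hypothetical helper name introduced only for this sketch, and the filenames are made up; the assumption is that a file with an unknown suffix is uncompressed text.

```shell
# grep_for_suffix: hypothetical helper (not part of the original script)
# that prints the name of the grep tool matching a filename suffix.
grep_for_suffix() {
    case $1 in
        *.gz)  echo zgrep  ;;
        *.bz2) echo bzgrep ;;
        *.xz)  echo xzgrep ;;
        *)     echo grep   ;;   # unknown suffix: assume uncompressed text
    esac
}

grep_for_suffix host-mail-20181016.gz   # compressed log
grep_for_suffix host-mail-20181016      # no suffix, falls back to grep
```

Inside the loop this might be used as "$(grep_for_suffix "$filename")" 'Cannot .*TLS' "$filename". Whether falling back silently is preferable to failing loudly depends on how much you trust the file naming.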

Note how we don't need to use ls to loop over the set of files, and that the variable expansions need to be double quoted.
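A small experiment shows why looping over the output of ls is fragile. The filenames below are invented for the demonstration, one of them deliberately containing a space; the sketch assumes nothing beyond a POSIX shell and mktemp.

```shell
# Create two made-up log files in a scratch directory; one name has a space.
oldpwd=$PWD
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch 'host-mail-20181001.gz' 'host-mail-20181002 copy.gz'

# Globbing directly: the shell hands each filename to the loop intact.
glob_count=0
for f in host-mail-2018[0-9][0-9]*; do
    glob_count=$((glob_count + 1))
done

# Parsing ls: the unquoted command substitution is split on whitespace,
# so the name containing a space turns into two separate words.
ls_count=0
for f in $( ls -1 host-mail-2018[0-9][0-9]* ); do
    ls_count=$((ls_count + 1))
done

printf 'glob iterations: %d, ls iterations: %d\n' "$glob_count" "$ls_count"

cd "$oldpwd" || exit 1
rm -rf "$tmpdir"
```

The glob loop runs once per file, while the ls loop runs once per whitespace-separated word, so any later command would be handed names that do not exist.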

I ignored the sed calls in your code as I don't know what their purpose is. I also removed the exit 0 at the end of the code, as it would mask whatever exit status the script would otherwise return after the loop.
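To see the masking effect of a trailing exit 0, here is a minimal sketch that uses sh -c as a stand-in for a script. The path /nonexistent/mail.log is a placeholder chosen so that grep fails.

```shell
# A "script" whose last real command fails, followed by an explicit exit 0.
sh -c 'grep -q TLS /nonexistent/mail.log 2>/dev/null; exit 0'
masked=$?

# The same "script" without the trailing exit 0: the exit status of the
# last command becomes the exit status of the script.
sh -c 'grep -q TLS /nonexistent/mail.log 2>/dev/null'
real=$?

printf 'with exit 0: %d, without: %d\n' "$masked" "$real"
```

With the explicit exit 0 the caller always sees success, so a cron job or wrapper script could not tell that anything went wrong.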


Using the MIME-type of a file to select the correct grep tool:

for filename in "$HOST-mail-2018$1"[0-9][0-9]*; do
    case $( file -b -i "$filename" ) in
        text/plain*)          g=grep   ;;
        application/gzip*|application/x-gzip*) g=zgrep ;;   # either name may be reported
        application/x-bzip2*) g=bzgrep ;;
        application/x-xz*)    g=xzgrep ;;
        *) printf 'Unknown filetype for "%s"\n' "$filename" >&2
           exit 1   # or continue or break
    esac

    "$g" 'Cannot .*TLS' "$filename"
done

This picks the correct grep tool regardless of the filename suffix (as long as the file is of one of the supported filetypes). I've also added plain grep for ordinary text files.
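The comments on this page mention two variations in what file prints: a trailing "; charset=binary" parameter, and application/gzip rather than application/x-gzip. One way to cope with both, sketched here under those assumptions, is to strip everything from the first semicolon and accept either gzip name. pick_grep is a hypothetical helper name used only in this example.

```shell
# pick_grep: hypothetical helper that maps a MIME string, as printed by
# file -b -i, to the name of a grep tool. Returns non-zero if unsupported.
pick_grep() {
    # ${1%%;*} drops an optional "; charset=..." parameter.
    case ${1%%;*} in
        text/plain)                          echo grep   ;;
        application/gzip|application/x-gzip) echo zgrep  ;;
        application/x-bzip2)                 echo bzgrep ;;
        application/x-xz)                    echo xzgrep ;;
        *) return 1 ;;
    esac
}

pick_grep 'application/x-gzip; charset=binary'
pick_grep 'application/gzip'
pick_grep 'text/plain; charset=us-ascii'
```

In the loop, this could be called as g=$( pick_grep "$( file -b -i "$filename" )" ) followed by a check of the exit status before running "$g".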

Kusalananda
  • Thank you - you have made my day - my shell scripting skills are extremely limited and I was trying to butcher another script to get what I needed. What you have provided is much simpler! – mintster Oct 16 '18 at 10:15
  • please don't encourage determining the file type from the extension; that's what file -i is for. –  Oct 16 '18 at 10:19
  • @mosvy Added that now. – Kusalananda Oct 16 '18 at 10:26
  • that should be application/x-gzip*), etc -- at least on my machine, file -bi also appends a ; charset=binary to the output. –  Oct 16 '18 at 10:36
  • @mosvy Not on mine, but ok. – Kusalananda Oct 16 '18 at 10:38
  • it really does on vanilla installations of debian 9.5, centos7 and freebsd-11.2 –  Oct 16 '18 at 10:47