Is it possible to grep inside a bunch of archives with regular expressions on both the names of the files they contain and the contents of those files? I would like to know which files in which archives match the pattern. I'm on OS X, if it matters.

Emre

1 Answer

If you need to use more expressive grep-style file patterns:

tar -OT <(tar -tf /path/to/file.tar | grep 'FILE_PATTERN') -xf /path/to/file.tar \
    | grep 'CONTENT_PATTERN'

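-O writes the extracted file contents to stdout instead of to disk, and -T names a file that lists the members to extract when used in conjunction with -x. Here that list is produced by the inner tar -tf | grep through process substitution.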
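If process substitution is not available (for example under plain sh), the same two-pass idea can be sketched with a temporary list file. The helper name grep_members and its argument order are made up for illustration, and the empty-list guard is an extra precaution that the one-liner above does not have.

```shell
# Hypothetical helper: grep_members ARCHIVE NAME_REGEX CONTENT_REGEX
# Prints the content lines that match CONTENT_REGEX inside members
# whose names match NAME_REGEX.
grep_members() {
    archive=$1
    name_re=$2
    content_re=$3
    list=$(mktemp) || return 1

    # Pass 1: list member names and keep those matching NAME_REGEX.
    tar -tf "$archive" | grep -- "$name_re" > "$list"

    status=1
    if [ -s "$list" ]; then
        # Pass 2: extract only the listed members to stdout and search them.
        tar -xOf "$archive" -T "$list" | grep -- "$content_re"
        status=$?
    fi

    rm -f "$list"
    return "$status"
}

# Demo on a throwaway archive (all paths here are illustrative).
demo=$(mktemp -d)
mkdir "$demo/src"
printf 'alpha\nTODO: fix this\n' > "$demo/src/a.txt"
printf 'TODO: not searched\n' > "$demo/src/b.log"
tar -cf "$demo/demo.tar" -C "$demo" src
grep_members "$demo/demo.tar" '\.txt$' 'TODO'
rm -rf "$demo"
```

As with the one-liner, this reports matching lines only, not the names of the members they came from.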

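If simpler pathname expansion (globbing) is good enough, you can replace the inner tar in the process substitution (<( ... )) with a plain echo. This avoids having to run tar on the file twice. Note that bsdtar (the tar that ships with OS X) treats the listed names as glob patterns by default, whereas GNU tar needs --wildcards for that: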

tar -OT <(echo 'FILE_PATTERN') -xf /path/to/file.tar \
    | grep 'CONTENT_PATTERN'

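If you also want to see the filenames, add the -v flag (personally I would go for -xvf), but then you'll also need to modify CONTENT_PATTERN so that grep lets the filename lines through as well. With -O in use, tar typically writes the verbose listing to stderr, so you will probably need 2>&1 before the pipe, and bsdtar prefixes each name with "x ". I'll leave this as an exercise for the reader...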

It gets a bit tricky, and you'll probably have to use awk for a little more output processing... Each filename is printed on its own line, just like the contents, so unfortunately there's no clear-cut delimiter between the two. Assuming no line of content looks like a filename:

tar ... | awk '/^FILE_AWK_PATTERN$/{f=$0;next}...'

That sets the awk variable f to each new filename as it is encountered and skips to the next line. Then,

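tar ... | awk '...f&&/CONTENT_AWK_PATTERN/{print f;f=""}'

Once we see a matching line, we print f and reset it until the next file is 'encountered', so each file is reported at most once. (The variable is written as plain f: in awk, $f would be a field reference, not the variable.)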
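The filename-tracking logic can be tried without tar by feeding awk a simulated stream. This is only a sketch: the helper name report_matches is invented here, the regexes are passed in with awk -v rather than written inline, and the sample lines stand in for what tar would print.

```shell
# Hypothetical helper: report_matches NAME_AWK_REGEX CONTENT_AWK_REGEX
# Reads a mixed stream of filename lines and content lines on stdin and
# prints each filename whose contents contain a matching line.
report_matches() {
    awk -v name_re="$1" -v content_re="$2" '
        # A filename line: remember it and move on to the next line.
        $0 ~ name_re { f = $0; next }
        # First matching content line for the current file: report the
        # file once, then clear f so later hits in it are not repeated.
        f != "" && $0 ~ content_re { print f; f = "" }
    '
}

# Simulated input: two "files", only the first contains "hello".
printf '%s\n' 'notes/a.txt' 'hello world' 'hello again' \
              'notes/b.txt' 'nothing here' \
    | report_matches '^notes/.*[.]txt$' 'hello'
# prints: notes/a.txt
```

The same caveat applies as in the text: a content line that happens to look like a filename will be mistaken for one.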

Putting it together:

tar -OT <(echo 'FILE_PATTERN') -xvf /path/to/file.tar 2>&1 \
    | awk '/^FILE_AWK_PATTERN$/{f=$0;next};f&&/CONTENT_AWK_PATTERN/{print f;f=""}'
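To cover the "which files in which archives" part across a bunch of archives, one possible sketch is a loop that checks each member separately. This is a different and slower technique than the single-pass awk approach: it runs tar once per candidate member, but it does not depend on telling filenames and contents apart in one stream. The function name grep_archives, the use of grep -E, and the output format are all assumptions made for illustration.

```shell
# Hypothetical helper: grep_archives NAME_REGEX CONTENT_REGEX ARCHIVE...
# Prints "archive: member" for every member whose name matches NAME_REGEX
# and whose contents have at least one line matching CONTENT_REGEX.
grep_archives() {
    name_re=$1
    content_re=$2
    shift 2
    for archive in "$@"; do
        # List the members and keep the names matching NAME_REGEX.
        tar -tf "$archive" | grep -E -- "$name_re" |
        while IFS= read -r member; do
            # Skip directory entries; extracting one would pull in
            # everything below it.
            case $member in
                */) continue ;;
            esac
            # Extract just this member to stdout and test it quietly.
            if tar -xOf "$archive" "$member" | grep -Eq -- "$content_re"; then
                printf '%s: %s\n' "$archive" "$member"
            fi
        done
    done
}

# Demo on a throwaway archive (all paths here are illustrative).
demo=$(mktemp -d)
mkdir "$demo/src"
printf 'alpha\nTODO: fix this\n' > "$demo/src/a.txt"
printf 'beta\n' > "$demo/src/b.txt"
printf 'TODO: wrong extension\n' > "$demo/src/c.log"
tar -cf "$demo/one.tar" -C "$demo" src
grep_archives '\.txt$' 'TODO' "$demo/one.tar"
rm -rf "$demo"
```

Member names containing newlines, or glob characters under bsdtar, would need extra care; for large archives the per-member extraction cost adds up quickly.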
h.j.k.