My answer is similar to @RomanPerekhrest's answer. The main difference is that it takes advantage of the fact that you can get `awk` to process the entire input in one go by setting the record separator (`RS`) to something that will never match anything in the input (e.g. `^$`). In other words, it slurps in the entire file and searches it as if it were a single string.
e.g.
find . -type f -exec \
awk -v RS='^$' '/foo/ && /bar/ && /baz/ { print FILENAME }' {} +
This will list all files beneath the current directory (`.`) that contain ALL of the regular expressions `foo`, `bar`, and `baz`. If you need any or all of the regular expressions to be treated as whole words, surround them with the word-boundary anchors `\<` and `\>`, e.g. `\<foo\>`.
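For instance, a whole-word variant (assuming an `awk` such as GNU `awk` that supports the `\<` and `\>` word-boundary operators) might look like:

find . -type f -exec \
awk -v RS='^$' '/\<foo\>/ && /\<bar\>/ && /\<baz\>/ { print FILENAME }' {} +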
This also runs faster because it doesn't fork `awk` once for every file. Instead, it runs `awk` with as many filename arguments as will fit into the command-line buffer (typically 128 KiB, or 1 or 2 MiB, on modern-ish systems). For example, if `find` discovers 1000 files, it will run `awk` only once instead of 1000 times.
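For comparison, terminating `-exec` with `\;` instead of `+` would fork a separate `awk` for every file found, which is much slower on large directory trees:

find . -type f -exec \
awk -v RS='^$' '/foo/ && /bar/ && /baz/ { print FILENAME }' {} \;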
Note: This requires a version of `awk` that allows `RS` to be a regular expression (e.g. GNU `awk`). See Slurp-mode in awk? for more details and an example of how to implement a limited form of "slurp mode" in other versions of `awk`.
Also note: This will read the entire contents of each file found into memory, one file at a time. For truly enormous files, e.g. log files that are tens of gigabytes or larger, this may exceed available RAM or even RAM+swap. Unlikely as that is, if it happens it can cause serious problems (e.g. on Linux, the kernel's OOM killer will start killing processes if the system runs out of RAM and swap).