
(self-migrated from ask-ubuntu because it's linux-related, not ubuntu, and my os isn't ubuntu)

I'm trying to make a grep that looks like this:

grep -r 2019 | grep -riv FAILED | grep -rl DSL

I want to get filenames (-l) of files containing 2019 in them, AND NOT (-v) containing FAILED AND containing DSL.

Here, only the last grep is executed. I understand it's because of the -r, so each grep greps on all files instead of the previous result. But I can't figure out how to make it work without -r.

Maybe there's another way to use multiple patterns in a single grep, but I haven't found anything that combines "positive" and "negative" matches.

2 Answers


The last grep in the pipeline would be reading from the previous grep (if it hadn't used the -r option, see later), so it would have no idea what file the data came from, which in turn means it can't report the pathname of the file.

Instead, consider using find like so:

find . -type f \
    -exec grep -q 2019 {} \; \
    -exec grep -q DSL {} \; \
    ! -exec grep -qi FAILED {} \; \
    -print

This would take each regular file in the current directory and any subdirectory (recursively) and test whether it contains the strings 2019, DSL, and FAILED (case-insensitively). It would print the pathnames of the files that contain the first two strings but do not contain the third.

If a file does not contain 2019, the other two tests will not be carried out, and if it does not contain DSL, the last test will not be carried out.
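As a rough illustration of the behaviour described above, the following sketch runs the same find command against a small throwaway tree. The directory layout and file names (good.txt, bad.txt, old.txt, nodsl.txt) are made up for the example and are not from the question.

```shell
# Hypothetical sample tree in a temporary directory.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/sub"
printf '%s\n' 'report 2019' 'DSL line up' > "$tmpdir/good.txt"     # 2019 + DSL, no FAILED
printf '%s\n' 'report 2019' 'DSL Failed'  > "$tmpdir/sub/bad.txt"  # rejected by the third test
printf '%s\n' 'report 2018' 'DSL line up' > "$tmpdir/sub/old.txt"  # rejected by the first test
printf '%s\n' 'report 2019' 'cable only'  > "$tmpdir/nodsl.txt"    # rejected by the second test

# Same find command as in the answer, run from inside the sample tree.
found=$(cd "$tmpdir" && find . -type f \
    -exec grep -q 2019 {} \; \
    -exec grep -q DSL {} \; \
    ! -exec grep -qi FAILED {} \; \
    -print)

printf '%s\n' "$found"
rm -rf "$tmpdir"
```

Only ./good.txt should be printed, since it is the only file that passes all three tests.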

Note that instead of grep -v -qi FAILED I'm using a negation of grep -qi FAILED as the third test. I'm not interested in whether the file contains lines not containing FAILED, I'm interested in whether the file contains FAILED, and in that case I'd like to skip this file.
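To make the difference concrete, here is a small sketch using a made-up file (sample.log is a hypothetical name) in which one line contains FAILED and another does not. The variable names are only for illustration.

```shell
# Hypothetical sample file: one line mentions FAILED, the other does not.
tmpdir=$(mktemp -d)
printf '%s\n' '2019 DSL sync FAILED' '2019 DSL sync ok' > "$tmpdir/sample.log"

# grep -v -q succeeds as soon as ANY line lacks the pattern.
if grep -v -qi FAILED "$tmpdir/sample.log"; then
    inverted_match=yes
else
    inverted_match=no
fi

# Negating grep -q succeeds only if NO line contains the pattern.
if ! grep -qi FAILED "$tmpdir/sample.log"; then
    file_is_clean=yes
else
    file_is_clean=no
fi

printf 'grep -v -q matched: %s\n' "$inverted_match"
printf 'file free of FAILED: %s\n' "$file_is_clean"
rm -rf "$tmpdir"
```

Here grep -v -q reports a match because of the second line, even though the file does contain FAILED, while the negated test correctly says the file is not free of FAILED.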


The issue with your pipeline,

grep -r 2019 | grep -riv FAILED | grep -rl DSL

is that the last grep will look recursively in all the files in the current directory and below and will ignore the input from the previous stages of the pipeline. The two initial grep invocations may produce some data, but they would fail to forward this through the pipeline and will eventually be killed when the last grep is done.
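A quick way to see this for yourself is the sketch below. It passes . explicitly as the directory to search (GNU grep does the equivalent when -r is given without a file operand), and the file name on-disk.txt is invented for the example.

```shell
# Hypothetical setup: one file on disk, plus different text sent through a pipe.
tmpdir=$(mktemp -d)
printf '%s\n' 'DSL in a file' > "$tmpdir/on-disk.txt"

# grep -r is given a directory to search, so it never reads its standard input.
from_pipe=$(cd "$tmpdir" && printf '%s\n' 'DSL from the pipe' | grep -r DSL .)

printf '%s\n' "$from_pipe"
rm -rf "$tmpdir"
```

The output mentions only the line stored in on-disk.txt; the text written into the pipe never shows up.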

Also, as I already noted above, the middle grep would not find files that do not contain FAILED; it would find files that contain lines with things other than FAILED. Incidentally, it would also ignore the input from the preceding grep.

Kusalananda

With GNU grep (-r is already a GNU extension) and GNU xargs or compatible:

grep -rlZ 2019 . |
  xargs -r0 grep -LiZ FAILED |
  xargs -r0 grep -l DSL

You need xargs to be able to pass the list of files output by one grep as arguments to the next grep, and -Z so that the list of files is NUL-delimited. To report the list of files that don't contain FAILED, it's -L (a GNU extension as well), not -vl, which reports the files that contain at least one line that doesn't match.

That should limit the number of grep invocations to a minimum, and for a large number of files could leverage up to three processors concurrently.
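For a quick check of the pipeline, something like the following sketch could be used. It assumes GNU grep and GNU xargs are installed, and the sample files (including the one with a space in its name, to exercise the NUL delimiters) are invented for the example.

```shell
# Hypothetical sample files; requires GNU grep and GNU xargs.
tmpdir=$(mktemp -d)
printf '%s\n' '2019 DSL ok'     > "$tmpdir/keep me.txt"   # space in the name on purpose
printf '%s\n' '2019 DSL failed' > "$tmpdir/failed.txt"    # dropped by grep -Li FAILED
printf '%s\n' '2019 cable ok'   > "$tmpdir/nodsl.txt"     # dropped by grep -l DSL
printf '%s\n' '2018 DSL ok'     > "$tmpdir/old.txt"       # dropped by grep -rl 2019

# Same pipeline as above, run from inside the sample directory.
matches=$(cd "$tmpdir" &&
  grep -rlZ 2019 . |
  xargs -r0 grep -LiZ FAILED |
  xargs -r0 grep -l DSL)

printf '%s\n' "$matches"
rm -rf "$tmpdir"
```

Only ./keep me.txt should come out the other end, and the space in its name survives because the intermediate lists are NUL-delimited.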

  • I'm not familiar with xargs so I don't know what I'm doing wrong but when I try your command I get xargs: illegal option -- r and xargs: Usage: xargs: [-t] [-p] [-e[eofstr]] [-E eofstr] [-I replstr] [-i[replstr]] [-L #] [-l[#]] [-n # [-x]] [-s size] [cmd [args ...]] – Teleporting Goat Jul 24 '19 at 09:08
  • @mosvy, it would mean loading files entirely in memory (which you can only do with -z if the files don't contain NUL bytes). Also note that -P is not always enabled and is still considered experimental in GNU grep. – Stéphane Chazelas Jul 24 '19 at 09:10
  • @TeleportingGoat, your xargs doesn't appear to be GNU xargs. You can drop -r as it's only there to prevent running commands for empty inputs. But from the usage, it looks like it doesn't support -0 either so it can probably not be used reliably. What system are you on? – Stéphane Chazelas Jul 24 '19 at 09:13
  • @mosvy, no I meant that for grep -Pr to be able to report files that contain as a whole DSL, 2019 and not FAILED (using lookaround operators), you'd need to use -z and assume the files don't contain NUL characters. – Stéphane Chazelas Jul 24 '19 at 09:15
  • ok right, got it. –  Jul 24 '19 at 09:17
  • @StéphaneChazelas Solaris 10 1/13 – Teleporting Goat Jul 24 '19 at 09:26
  • @TeleportingGoat, xargs is not usable reliably on Solaris, you may want to install the GNU tools there. If you have a grep that supports -r and -L, you probably have some GNU tools already. Maybe GNU xargs is available as gxargs. – Stéphane Chazelas Jul 24 '19 at 10:03
  • @StéphaneChazelas It's a company VM, I can't install anything. I'll have to use what's available. – Teleporting Goat Jul 24 '19 at 12:19