Run `grep` excluding a file in a specific path

Question

I want to exclude the file ./test/main.cpp from my search.

Here's what I'm seeing:

$ grep -r pattern --exclude=./test/main.cpp
./test/main.cpp:pattern
./lib/main.cpp:pattern
./src/main.cpp:pattern

I know it is possible to get the output that I want by using multiple commands in a pipes-and-filters arrangement, but is there some quoting/escaping that will make grep understand what I want natively?

A solution based on filtering the output doesn't scale well because it needlessly searches the file before excluding the associated results. The issue is magnified if I want to exclude entire directories (with --exclude-dir). That's why I would like to make grep perform the exclusion natively. — Brent Bradburn, May 20 '15 at 15:10
https://www.gnu.org/software/grep/manual/grep.html#File-and-Directory-Selection, https://en.wikipedia.org/wiki/Glob_(programming) — Brent Bradburn, Jul 23 '17 at 01:34

MichalH · Accepted Answer · score 10 · 2015-05-20T15:23:35.043

grep can't exclude a file in one specific directory when files with the same name exist in other directories. Use find instead:

find . -type f \! -path './test/main.cpp' -exec grep pattern {} \+
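To see the exclusion in action, here is a minimal sketch that rebuilds the question's layout in a scratch directory. It assumes a POSIX-like shell with mktemp -d available; the temporary directory and the variable names (demo, result) are made up for illustration.

```shell
# Build a throwaway tree mirroring the question's layout (paths are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/test" "$demo/lib" "$demo/src"
for d in test lib src; do
    echo pattern > "$demo/$d/main.cpp"
done

# Search every regular file except ./test/main.cpp; sort only to get stable output.
result=$(cd "$demo" &&
    find . -type f ! -path './test/main.cpp' -exec grep pattern {} + | LC_ALL=C sort)
printf '%s\n' "$result"

rm -rf "$demo"
```

Only the lib and src copies of main.cpp should show up, and the excluded file is never opened by grep.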

edited May 20 '15 at 15:23
answered May 20 '15 at 15:15 by MichalH

Why are you escaping \! and \+? It seems to work fine without the backslashes. – Brent Bradburn May 20 '15 at 15:29
@nobar I'm used to it because some characters are shell keywords so you'll never be surprised because nothing can happen if they are escaped. – MichalH May 20 '15 at 15:32
"grep can't do this, use find instead" -- perfect. – Brent Bradburn May 20 '15 at 15:36

score 4 · Answer 2 · answered May 20 '15 at 15:20

I don't think it's possible with GNU grep. You don't need pipes though.

With find:

find . ! -path ./test/main.cpp -type f -exec grep pattern {} +

With zsh:

grep pattern ./**/*~./test/main.cpp(.)

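(This requires zsh's extendedglob option for the ~ exclusion operator. It skips hidden files, which is just as well, since that also excludes .git, .svn and the like.)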

score 3 · Answer 3 · edited May 20 '15 at 21:58

I could write a book: "The Lost Art of xargs". The find ... -exec … \; form launches a grep for each file (but the variant with -exec … + doesn't). Well, we're wasting CPU cycles these days, so why not, right? But if performance, memory, and power are an issue, use xargs:

find . -type f \! -path 'EXCLUDE-FILE' -print0 | xargs -r0 grep 'PATTERN'

GNU find's -print0 NUL-terminates each file name in its output, and xargs' -0 option reads that format as input. This ensures that whatever funny characters your file names contain, the pipeline won't get confused. The -r option keeps xargs from running grep at all when find finds nothing, so there's no error in that case.

Note, you can now do things like:

find . -type f -print0 | grep -z -v "FILENAME EXCLUDE PATTERN" | 
  xargs -r0 grep 'PATTERN'

GNU grep's -z makes it read and write NUL-terminated records, so it pairs with find's -print0 and xargs' -0.
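As a rough illustration of why the NUL-terminated pipeline matters, the sketch below searches a scratch tree that contains a file name with a space in it. It assumes a find with -print0 and an xargs with -0 (GNU and the BSDs have both); the directory layout and the variable name nul_result are invented for the example, and -r is left out so the same line also works with non-GNU xargs.

```shell
# Scratch tree with one awkward file name (all names here are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/test" "$demo/src"
echo pattern > "$demo/test/main.cpp"
echo pattern > "$demo/src/main.cpp"
echo pattern > "$demo/src/my file.cpp"    # name contains a space

# NUL-terminated names survive the pipe intact; -l prints matching file names only.
nul_result=$(cd "$demo" &&
    find . -type f ! -path './test/main.cpp' -print0 |
    xargs -0 grep -l pattern | LC_ALL=C sort)
printf '%s\n' "$nul_result"

rm -rf "$demo"
```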

edited May 20 '15 at 21:58 by Gilles 'SO- stop being evil'
answered May 20 '15 at 16:50 by Otheus

Some interesting notes, but I'm not sure you're correct about the performance issue. As I understand it find -exec (cmd) {} + works the same as xargs and find -exec (cmd) {} \; works the same as xargs -n1. In other words, your statement is only correct if the \; version is used. – Brent Bradburn May 20 '15 at 17:14
Piping into xargs is less efficient than using -exec … + (albeit marginally). None of the answers here even mention -exec … \;. – Gilles 'SO- stop being evil' May 20 '15 at 21:59
Well, s--t. I date myself. Thanks for the comments and corrections. I thought the + was a typo. Oh look, -exec ... + added in Jan 2005. Yeah, I'm not out of date ... at ... all. – Otheus May 21 '15 at 19:27
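The point made in these comments can be checked by counting process launches. This is only a sketch under the assumption of a POSIX find and an xargs that accepts -0; the inline sh -c script, the three empty files, and the variable names are invented for the demonstration.

```shell
# Three empty files are enough to tell "once per file" from "once per batch".
demo=$(mktemp -d)
touch "$demo/a" "$demo/b" "$demo/c"

# Each run of the inline script prints one line, so counting lines counts launches.
# $(( ... )) strips any padding that some wc implementations add.
per_file=$(( $(find "$demo" -type f -exec sh -c 'echo run' sh {} \; | wc -l) ))
batched=$(( $(find "$demo" -type f -exec sh -c 'echo run' sh {} + | wc -l) ))
via_xargs=$(( $(find "$demo" -type f -print0 | xargs -0 sh -c 'echo run' sh | wc -l) ))

echo "exec ;: $per_file  exec +: $batched  xargs: $via_xargs"
rm -rf "$demo"
```

With only three short file names, the + form and xargs each fit everything into a single launch, while the \; form starts one process per file.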

cuonglm · Answer 4 · score 2 · 2015-05-21T01:25:35.800

If your find supports -path (added to POSIX in 2008, but still missing on Solaris):

find . ! -path ./test/main.cpp -type f -exec grep pattern /dev/null {} +

edited May 21 '15 at 01:25
answered May 20 '15 at 15:09 by cuonglm

I don't think that will work because nobar wants main.cpp in other directories – Eric Renouf May 20 '15 at 15:10
Won't your pattern exclude main.cpp from all the other directories too? That would not be desirable – Eric Renouf May 20 '15 at 15:16
@EricRenouf: Oh, my mistake, a mis-reading. Updated my answer. – cuonglm May 20 '15 at 15:24
@Gilles: Why -path is not POSIX? – cuonglm May 21 '15 at 01:15
Ah, sorry, my mistake, it's been added in 2008. Still missing from Solaris though. – Gilles 'SO- stop being evil' May 21 '15 at 01:21
@Gilles: Thanks for the information, I updated my answer. – cuonglm May 21 '15 at 01:25

score 1 · Answer 5 · edited May 23 '17 at 12:39

For the record, here's the approach that I prefer:

grep pattern $(find . -type f ! -path './test/main.cpp')

By keeping the grep at the beginning of the command, I think this is a little more clear -- plus it doesn't disable grep's color highlighting. In a sense, using find in a command-substitution is just a way of extending/replacing the (limited) file-search subset of grep's functionality.

To me, the find -exec syntax is kind of arcane. One complication with find -exec is the occasional need to escape certain characters (notably when \; is used under Bash). Just to put things into a familiar context, the following two commands are basically equivalent:

find . ! -path ./test/main.cpp -type f -exec grep pattern {} +
find . ! -path ./test/main.cpp -type f -print0 |xargs -0 grep pattern

If you want to exclude subdirectories, it may be necessary to use a wildcard. I don't fully understand the schema here -- talk about arcane:

grep pattern $(find . -type f ! -path './test/main.cpp' ! -path './lib/*' )

One further note on generalizing find-based solutions for use in scripts: the grep command line should include the -H/--with-filename option. Otherwise grep changes its output format (dropping the filename prefix) whenever find happens to return only one filename. This is notable because it doesn't appear to be necessary when using grep's native file search (with the -r option).

...Even better, though, is to include /dev/null as a first file to search. This solves two problems:

It ensures that if there is one file to search, grep will think there are two and use the multiple-file output mode.
It ensures that if there are no files to search, grep will think there is one file and not hang waiting on stdin.

So the final answer is:

grep pattern /dev/null $(find . -type f ! -path './test/main.cpp')
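The output-format difference described above is easy to reproduce. The following is a small sketch, assuming a grep that supports -H (GNU and BSD grep do); the scratch directory, the file name only.cpp, and the variable names are invented for illustration.

```shell
# One matching file is the case where grep's output format changes.
demo=$(mktemp -d)
echo pattern > "$demo/only.cpp"

bare=$(cd "$demo" && grep pattern only.cpp)                 # one operand: no filename prefix
with_H=$(cd "$demo" && grep -H pattern only.cpp)            # -H forces the prefix
with_null=$(cd "$demo" && grep pattern /dev/null only.cpp)  # two operands: prefix appears

# With no real files at all, /dev/null keeps grep from waiting on stdin;
# it simply reports "no match" through its exit status.
none_status=0
grep pattern /dev/null || none_status=$?

printf '%s\n' "$bare" "$with_H" "$with_null" "exit status: $none_status"
rm -rf "$demo"
```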

You shouldn't use the output of find in a command substitution. This breaks if there are file names containing spaces or other special characters. Use find -exec, it's robust and easy to use. — Gilles 'SO- stop being evil', May 20 '15 at 21:55
@Gilles: Very good point -- also the output could possibly exceed the command-line size limits of some programs. Caveat emptor. — Brent Bradburn, May 20 '15 at 22:08
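The failure mode raised in these comments can be shown with a single file whose name contains a space. This is a sketch that assumes an sh or bash style shell, where unquoted command substitutions undergo word splitting (zsh does not split by default); the file name and variable names are hypothetical.

```shell
# A single file whose name contains a space.
demo=$(mktemp -d)
echo pattern > "$demo/my file.cpp"

# Unquoted substitution: the shell splits "./my file.cpp" into "./my" and "file.cpp",
# grep cannot open either one, and nothing is reported as matching.
split_out=$(cd "$demo" && grep -l pattern /dev/null $(find . -type f) 2>/dev/null) || true

# find -exec hands the name to grep as a single argument, so the match is found.
safe_out=$(cd "$demo" && find . -type f -exec grep -l pattern /dev/null {} +)

printf 'command substitution: [%s]\nfind -exec: [%s]\n' "$split_out" "$safe_out"
rm -rf "$demo"
```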
Ugh. 'find' syntax is terribly difficult. '-o' is an "or" operator (also '-or' on Linux), but its typical usage (for example with '-prune') doesn't map conceptually to the notion of a logical or. It's a functional or rather than a logical or. – Brent Bradburn, Feb 03 '17 at 18:03
Another way to exclude subdirectories based on matching a name: find -iname "*target*" -or -name 'exclude' -prune. Well, it sort of works -- the pruned directory will be listed, but not searched. If you don't want it listed, you can append a sort of redundant ! -name 'exclude' — Brent Bradburn, Feb 03 '17 at 18:12
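For excluding a whole directory, a common arrangement puts -prune on the left of -o and the real action on the right. The sketch below is one way to write it, assuming a find that supports -path; the scratch tree and the variable name pruned are invented. Because the right-hand side carries an explicit -exec, the pruned directory itself is neither listed nor searched.

```shell
# Scratch tree with a directory we want find to skip entirely.
demo=$(mktemp -d)
mkdir -p "$demo/lib/deep" "$demo/src"
echo pattern > "$demo/lib/deep/x.cpp"
echo pattern > "$demo/src/main.cpp"

# If the path is ./lib, prune it (do not descend) and stop evaluating;
# otherwise fall through -o to the file test and the grep action.
pruned=$(cd "$demo" &&
    find . -path ./lib -prune -o -type f -exec grep -l pattern /dev/null {} +)
printf '%s\n' "$pruned"

rm -rf "$demo"
```

Unlike a `! -path './lib/*'` filter, which still walks the directory and discards each entry, -prune stops find from descending into it in the first place.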
