17

I want to exclude the file ./test/main.cpp from my search.

Here's what I'm seeing:

$ grep -r pattern --exclude=./test/main.cpp
./test/main.cpp:pattern
./lib/main.cpp:pattern
./src/main.cpp:pattern

I know it is possible to get the output that I want by using multiple commands in a pipes-and-filters arrangement, but is there some quoting/escaping that will make grep understand what I want natively?

  • A solution based on filtering the output doesn't scale well because it needlessly searches the file before excluding the associated results. The issue is magnified if I want to exclude entire directories (with --exclude-dir). That's why I would like to make grep perform the exclusion natively. – Brent Bradburn May 20 '15 at 15:10
  • 1
    --exclude specifies glob not a path – PersianGulf May 20 '15 at 15:11
  • 2
    https://www.gnu.org/software/grep/manual/grep.html#File-and-Directory-Selection, https://en.wikipedia.org/wiki/Glob_(programming) – Brent Bradburn Jul 23 '17 at 01:34
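
To make the point about globs concrete, here is a small sketch. It assumes GNU grep (other implementations may behave differently) and builds a throwaway tree whose directory names simply mirror the question. During a recursive search the --exclude glob is compared against each file's base name, so a path-like value excludes nothing, while a bare name excludes every file with that name.

```shell
# Throwaway tree mirroring the question's layout (names are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/test" "$demo/lib" "$demo/src"
for d in test lib src; do
  echo pattern > "$demo/$d/main.cpp"
done

# A path-like glob never equals a base name, so nothing is excluded.
by_path=$(cd "$demo" && grep -r pattern --exclude='./test/main.cpp' . | sort)

# A base-name glob matches every main.cpp, so all three files are skipped.
by_name=$(cd "$demo" && grep -r pattern --exclude='main.cpp' . | sort)

printf 'path-style exclude:\n%s\n' "$by_path"
printf 'name-style exclude:\n%s\n' "$by_name"
rm -rf "$demo"
```

Neither form can single out ./test/main.cpp alone, which is what the question is asking for.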

5 Answers

10

grep can't exclude a file in one particular directory when files with the same name exist in other directories. Use find instead:

find . -type f \! -path './test/main.cpp' -exec grep pattern {} \+
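
A quick way to check this, as a sketch on a throwaway tree (the directory layout is copied from the question, everything else is just scaffolding for the demo). The -path test is compared against the whole path as find sees it, so the leading ./ has to match the starting point given to find.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/test" "$demo/lib" "$demo/src"
for d in test lib src; do
  echo pattern > "$demo/$d/main.cpp"
done

# Only ./test/main.cpp is skipped; the other two main.cpp files are searched.
result=$(cd "$demo" && find . -type f ! -path './test/main.cpp' -exec grep pattern {} + | sort)
printf '%s\n' "$result"
rm -rf "$demo"
```

The output should list ./lib/main.cpp and ./src/main.cpp only.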

MichalH
  • 2,379
4

I don't think it's possible with GNU grep. You don't need pipes though.

With find:

find . ! -path ./test/main.cpp -type f -exec grep pattern {} +

With zsh:

grep pattern ./**/*~./test/main.cpp(.)

(This also skips hidden files, which is just as well since it keeps grep out of .git, .svn...).

3

I could write a book: "The lost art of xargs". The find ... -exec … ';' form launches a grep for each file (but the variant with -exec … + doesn't). Well, we're wasting CPU cycles these days so why not, right? But if performance, memory, and power are an issue, use xargs:

find . -type f \! -path 'EXCLUDE-FILE' -print0 | xargs -r0 grep 'PATTERN'

GNU find's -print0 will NUL-terminate its output and xargs' -0 option honors that format as input. This ensures that whatever funny characters your file names contain, the pipeline won't get confused. The -r option keeps xargs from running grep at all when find finds nothing.
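
As a rough illustration, assuming GNU find and GNU xargs (for -print0, -0 and -r), the sketch below uses a made-up directory called "my lib" to show that a space in a path survives the pipeline, and that an empty file list produces no output.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/test" "$demo/my lib"
echo pattern > "$demo/test/main.cpp"
echo pattern > "$demo/my lib/main.cpp"
echo pattern > "$demo/my lib/util.cpp"

# The space in "my lib" is harmless because names are NUL-separated end to end.
result=$(cd "$demo" && find . -type f ! -path './test/main.cpp' -print0 | xargs -r0 grep pattern | sort)
printf '%s\n' "$result"

# Nothing matches -name here, so with -r xargs never starts grep.
empty=$(cd "$demo" && find . -type f -name 'no-such-file' -print0 | xargs -r0 grep pattern)
printf 'empty run produced: [%s]\n' "$empty"
rm -rf "$demo"
```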

Note, you can now do things like:

find . -type f -print0 | grep -z -v "FILENAME EXCLUDE PATTERN" | 
  xargs -r0 grep 'PATTERN'

GNU grep's -z plays the same role as xargs' -0: it treats input and output records as NUL-terminated instead of newline-terminated.
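
A sketch of that two-grep pipeline on a throwaway tree, assuming GNU find, grep and xargs. The -x and -F options are my own addition so the filter matches one exact path instead of a regex; the placeholder above could be any pattern.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/test" "$demo/lib" "$demo/src"
for d in test lib src; do
  echo pattern > "$demo/$d/main.cpp"
done

# First grep filters the NUL-separated list of names (-v drops the match);
# second grep searches the contents of the files that remain.
result=$(cd "$demo" && find . -type f -print0 \
  | grep -z -v -x -F './test/main.cpp' \
  | xargs -r0 grep pattern | sort)
printf '%s\n' "$result"
rm -rf "$demo"
```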

Otheus
  • 6,138
  • 3
    Some interesting notes, but I'm not sure you're correct about the performance issue. As I understand it find -exec (cmd) {} + works the same as xargs and find -exec (cmd) {} \; works the same as xargs -n1. In other words, your statement is only correct if the \; version is used. – Brent Bradburn May 20 '15 at 17:14
  • 3
    Piping into xargs is less efficient than using -exec … + (albeit marginally). None of the answers here even mention -exec … \;. – Gilles 'SO- stop being evil' May 20 '15 at 21:59
  • 1
    Well, s--t. I date myself. Thanks for the comments and corrections. I thought the + was a typo. Oh look, -exec ... + added in Jan 2005. Yeah, I'm not out of date ... at ... all. – Otheus May 21 '15 at 19:27
2

If your find supports -path, which was added to POSIX in 2008 but is still missing on Solaris:

find . ! -path ./test/main.cpp -type f -exec grep pattern /dev/null {} +
cuonglm
  • 153,898
1

For the record, here's the approach that I prefer:

grep pattern $(find . -type f ! -path './test/main.cpp')

By keeping the grep at the beginning of the command, I think this is a little more clear -- plus it doesn't disable grep's color highlighting. In a sense, using find in a command-substitution is just a way of extending/replacing the (limited) file-search subset of grep's functionality.


To me, the find -exec syntax is kind of arcane. One complication with find -exec is the occasional need to escape various characters (notably the semicolon when \; is used under Bash). Just to put things into a familiar context, the following two commands are basically equivalent:

find . ! -path ./test/main.cpp -type f -exec grep pattern {} +
find . ! -path ./test/main.cpp -type f -print0 |xargs -0 grep pattern

If you want to exclude subdirectories, it may be necessary to use a wildcard. I don't fully understand the schema here -- talk about arcane:

grep pattern $(find . -type f ! -path './test/main.cpp' ! -path './lib/*' )

One further note to generalize find-based solutions for use in scripts: the grep command line should include the -H/--with-filename option. Otherwise the output format changes when find happens to return only one file name, because grep omits the file name prefix when given a single file. This is notable because it doesn't appear to be necessary when using grep's native file search (with the -r option).

...Even better, though, is to include /dev/null as a first file to search. This solves two problems:

  • It ensures that if there is one file to search, grep will think there are two and use the multiple-file output mode.
  • It ensures that if there are no files to search, grep will think there is one file and not hang waiting on stdin.
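
Both effects are easy to see in a small sketch. The file name only.cpp is made up for the demo.

```shell
demo=$(mktemp -d)
echo pattern > "$demo/only.cpp"

# One file operand: grep omits the file name.
bare=$(cd "$demo" && grep pattern only.cpp)

# /dev/null as an extra operand: grep switches to the multi-file format.
padded=$(cd "$demo" && grep pattern /dev/null only.cpp)

# No real files at all: grep reads /dev/null instead of waiting on stdin.
# (|| true because grep exits with status 1 when nothing matches.)
none=$(cd "$demo" && grep pattern /dev/null || true)

printf 'bare:   %s\npadded: %s\nnone:   [%s]\n' "$bare" "$padded" "$none"
rm -rf "$demo"
```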

So the final answer is:

grep pattern /dev/null $(find . -type f ! -path './test/main.cpp')
  • You shouldn't use the output of find in a command substitution. This breaks if there are file names containing spaces or other special characters. Use find -exec, it's robust and easy to use. – Gilles 'SO- stop being evil' May 20 '15 at 21:55
  • @Gilles: Very good point -- also the output could possibly exceed the command-line size limits of some programs. Caveat emptor. – Brent Bradburn May 20 '15 at 22:08
  • Ugh. 'find' syntax is terribly difficult. '-o' is an "or" operator (also '-or' on Linux), but its typical usage (for example with '-prune') doesn't map conceptually to the notion of a logical or. It's a functional or rather than logical or. – Brent Bradburn Feb 03 '17 at 18:03
  • Another way to exclude subdirectories based on matching a name: find -iname "*target*" -or -name 'exclude' -prune. Well, it sort of works -- the pruned directory will be listed, but not searched. If you don't want it listed, you can append a sort of redundant ! -name 'exclude' – Brent Bradburn Feb 03 '17 at 18:12
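
For what it's worth, here is a sketch of the more common -prune idiom applied to the directory exclusion discussed above. The tree is a throwaway one and ./lib is just the example directory to skip. Because the expression ends in an explicit action (-exec), find does not add an implicit -print, so the pruned directory is neither listed nor searched.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/test" "$demo/lib/deep" "$demo/src"
echo pattern > "$demo/test/main.cpp"
echo pattern > "$demo/lib/main.cpp"
echo pattern > "$demo/lib/deep/util.cpp"
echo pattern > "$demo/src/main.cpp"

# Left of -o: if the path is ./lib, prune it (do not descend) and stop there.
# Right of -o: for everything else, keep regular files and pass them to grep.
result=$(cd "$demo" && find . -path ./lib -prune -o -type f -exec grep pattern /dev/null {} + | sort)
printf '%s\n' "$result"
rm -rf "$demo"
```

Unlike ! -path './lib/*', which still walks the whole subtree and merely filters the results, -prune stops find from descending into ./lib at all.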