This SE answer made me realize I can exclude directories using regular expressions with grep. But it's not working for me when using |:

results=$(grep --color=always -rnF "$searchTerm" . --exclude-dir='.[^.]*|node_modules')
results=$(grep --color=always -rnF "$searchTerm" . --exclude-dir='.[^.]*\|node_modules')

Is there another way to write this regex so that I exclude directories that start with a period and node_modules?

wyc

2 Answers

At least for GNU grep, --exclude-dir appears to expect a glob pattern, not a regex. The answer to your linked question alludes to that where it says "Please note that the meaning of --exclude-dir is different for pcregrep and grep. Read the corresponding manuals for details.". Specifically:

In man grep:

  --exclude-dir=GLOB
          Skip any command-line directory with a name suffix that  matches
          the   pattern   GLOB.   When  searching  recursively,  skip  any
          subdirectory whose base name matches GLOB.  Ignore any redundant
          trailing slashes in GLOB.

In man pcregrep:

   --exclude=pattern
             Files (but not directories) whose names match the pattern are
             skipped  without  being processed. This applies to all files,
             whether listed on the command  line,  obtained  from  --file-
             list, or by scanning a directory. The pattern is a PCRE regu‐
             lar expression, and is matched against the final component of
             the  file  name,  not the entire path.

At least in GNU grep, you can however use --exclude-dir multiple times if you want to exclude multiple patterns:

--exclude-dir='.?*' --exclude-dir='node_modules'
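
As a rough check of the repeated-option form, here is a minimal sketch that builds a throwaway tree and searches it. The directory and file names and the needle search term are made up for illustration, and --color=always is left out so the output stays plain text. It assumes GNU grep.

```shell
# Build a small throwaway tree (all names here are hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/.git" "$demo/node_modules" "$demo/src"
echo needle > "$demo/.git/config"
echo needle > "$demo/node_modules/index.js"
echo needle > "$demo/src/main.js"

searchTerm=needle

# One --exclude-dir per glob: hidden directories and node_modules are skipped.
results=$(cd "$demo" && grep -rnF "$searchTerm" . \
    --exclude-dir='.?*' --exclude-dir='node_modules')
printf '%s\n' "$results"

# Clean up the temporary tree.
rm -rf "$demo"
```

Only the match under src should be reported.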

I've changed the .[^.]* to .?* as:

  • There's no point trying not to exclude .. (if that was the intention), as you're searching in the current directory here. Omitting the target files with -r defaults to . in recent versions of GNU grep, and I find that with GNU grep 3.11 at least, even --exclude-dir='*' fails to exclude that implicit ., so even --exclude-dir='.*' would be enough with that version if you didn't specify any target directory.
  • .[^.]* would fail to exclude directories named ..foo or ... (three dots).
  • The POSIX glob equivalent of regex [^.] is [!.] (though [^.] is often supported as well, at least on GNU systems).
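
To compare the two globs without running grep, here is a small sketch using the shell's own case pattern matching, on the assumption that it treats these particular patterns the same way grep's glob matching does. The matches helper is a hypothetical name introduced for illustration, and [!.] is used as the POSIX spelling of [^.].

```shell
# matches NAME GLOB: succeed if NAME matches the shell glob GLOB.
# (Hypothetical helper, not part of grep.)
matches() {
  case $1 in
    $2) return 0 ;;   # unquoted $2 is interpreted as a pattern
    *)  return 1 ;;
  esac
}

# Show which sample directory names each glob would exclude.
for name in .git .. ..foo ... node_modules; do
  for glob in '.[!.]*' '.?*'; do
    if matches "$name" "$glob"; then verdict=excluded; else verdict=kept; fi
    printf '%-13s %-8s %s\n' "$name" "$glob" "$verdict"
  done
done
```

The names ..foo and ... are only caught by .?*, which is why the glob was changed.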
steeldriver

Why not use find?

tree -a
.
├── node_modules
│   ├── a
│   ├── b
│   └── c
├── .test
│   ├── 1
│   ├── 2
│   └── 3
└── x
    ├── 001
    └── 002

3 directories, 8 files

find . \( ! -path './node_modules*' -a ! -path './.*' \)
.
./x
./x/002
./x/001

Finally:

find . -type f \( ! -path './node_modules*' -a ! -path './.*' \) \
    -exec grep --color=always -nF "$searchTerm" {} +
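
Note that the -path tests above only match node_modules and hidden directories at the top level. As a variant (not part of the original answer), here is a sketch that swaps in -name with -prune so those directories are skipped at any depth, closer to what grep's --exclude-dir does. The tree and the needle search term are made up for illustration, and -H is added so grep prints the file name even when find passes it a single file.

```shell
# Throwaway tree with a nested hidden directory (all names hypothetical).
demo=$(mktemp -d)
mkdir -p "$demo/x/.cache" "$demo/node_modules" "$demo/.test"
echo needle > "$demo/x/001"
echo needle > "$demo/x/.cache/hit"
echo needle > "$demo/node_modules/a"
echo needle > "$demo/.test/1"

searchTerm=needle

# Prune hidden directories and node_modules wherever they appear,
# then grep the remaining regular files.
results=$(cd "$demo" && find . \
    -type d \( -name '.?*' -o -name node_modules \) -prune \
    -o -type f -exec grep -HnF "$searchTerm" {} +)
printf '%s\n' "$results"

# Clean up the temporary tree.
rm -rf "$demo"
```

The starting point . is not pruned because its base name does not match .?*.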