2

Say I have some files, and I want to find those among them that contain one string but do not contain another.

Because grep is line-based, a condition such as grep -q printf file && grep -vq '#include <stdio.h>' file will not work: grep -v succeeds as soon as any single line fails to match.

How should I go about doing this?

(I am on Debian, so answers specifically targeted at GNU versions of tools are fine.)

  • I'd look at something along the lines of a grep -r ... --null ... wanted-str piped to "xargs --null grep -v unwanted-str" (assuming GNU grep & xargs, for the null support) – Jeff Schaller Jan 16 '17 at 16:17
  • "grep being line based, conditions such as grep -q printf file && grep -vq '#include <stdio.h>' file will not work": while grep is a line-matching tool, it supports regular expressions for matching specific words and patterns. It really depends on what you want to do with it. – Sergiy Kolodyazhnyy Jan 16 '17 at 16:27
  • Use grep -L to get the files that do not contain a match. You can also add -q; -q is line-based, -L is file-based. – ctrl-alt-delor Jan 16 '17 at 17:04

4 Answers

3

grep -vl reports the names of the files that have at least one line that does not match the pattern. Here you want the files where none of the lines match the pattern. GNU grep (as found on Debian) has a -L option for that:

grep -rlZ printf . | xargs -r0 grep -FL '#include <stdio.h>'
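
To see the difference between -vl and -L in practice, here is a small sketch on a scratch directory. The file names uses.c and ok.c are made up for illustration, and GNU grep and xargs are assumed, as in the command above:

```shell
#!/bin/sh
# Scratch directory with two hypothetical source files.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'int main(void) { printf("hi\n"); }' > uses.c
printf '%s\n' '#include <stdio.h>' 'int main(void) { printf("hi\n"); }' > ok.c

# -vl lists files with at least one non-matching line: both files qualify.
with_v=$(grep -Fvl '#include <stdio.h>' uses.c ok.c)

# -L lists files with no matching line at all: only uses.c qualifies.
with_L=$(grep -rlZ printf . | xargs -r0 grep -FL '#include <stdio.h>')

printf 'with -vl:\n%s\nwith -L:\n%s\n' "$with_v" "$with_L"
cd / && rm -rf "$dir"
```

The -Z and -0 options keep the file list NUL-delimited, so names containing spaces or newlines survive the pipe.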

With any POSIX grep, you could just negate grep -q:

find . -type f -exec grep -q printf {} \; \
               ! -exec grep -Fq '#include <stdio.h>' {} \; \
               -print

This is a lot less efficient, as it means running one or two grep instances for every regular file.

0

Combine find with bash -c instead of a separate script. The file path is passed as the first positional parameter and stored in the file variable. The first grep -q checks whether the pattern you want is present. If it succeeds, && runs the second grep -q, negated with !, which succeeds only when the unwanted pattern is not found. Only when both conditions hold does the final && pass control to echo, which prints the file name.

In the example below, only file2.txt contains the word abra but not cadabra.

$ find . -type f -exec bash -c 'file=$1; grep -q "abra" "$file" && ! grep -q "cadabra" "$file" && echo "$file"' sh {} \;
./file2.txt
$ ls
file1.txt  file2.txt  file 3.txt
$ cat file1.txt
abra cadabra
$ cat file2.txt
abra
$ cat file\ 3.txt
abra cadabra
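
If starting one shell per file turns out to be too slow, the same test can be batched so that find hands many paths to a single shell. This is only a sketch, not part of the answer above: find_with_without is a hypothetical helper name, the abra and cadabra patterns are reused from the example, and it relies on the -exec ... {} + form that POSIX specifies for find.

```shell
#!/bin/sh
# find_with_without is a hypothetical helper, not a standard tool.
# Usage: find_with_without DIR WANTED UNWANTED
find_with_without() {
  # The patterns travel through the environment, so the inline
  # script needs no extra quoting.
  WANTED=$2 UNWANTED=$3 find "$1" -type f -exec sh -c '
    for file do
      grep -q -e "$WANTED" "$file" &&
        ! grep -q -e "$UNWANTED" "$file" &&
        printf "%s\n" "$file"
    done' sh {} +
}

# Demo on a scratch directory mirroring the three files above.
dir=$(mktemp -d) || exit 1
printf 'abra cadabra\n' > "$dir/file1.txt"
printf 'abra\n' > "$dir/file2.txt"
printf 'abra cadabra\n' > "$dir/file 3.txt"
found=$(find_with_without "$dir" abra cadabra)
printf '%s\n' "$found"
rm -rf "$dir"
```

This still runs grep once or twice per file; only the number of shell processes goes down.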
0

It's quite easy:

for fname in ./*.c; do
  if grep -q -F "printf" "$fname" && ! grep -q -F "#include <stdio.h>" "$fname"; then
     printf 'File "%s" needs to include stdio.h\n' "$fname"
  fi
done

This will look through all C source files in the current directory and report any file that uses printf() without including the stdio.h header.
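
As an alternative that reads each file only once, the same check could be sketched with awk instead of two grep calls. This swaps in a different technique from the loop above; the variable names want and avoid and the sample files bad.c and good.c are made up for illustration:

```shell
#!/bin/sh
# Scratch directory with two hypothetical C files.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'int main(void) { printf("x"); }' > bad.c
printf '%s\n' '#include <stdio.h>' 'int main(void) { printf("x"); }' > good.c

# Track both fixed strings per file and report at each file boundary.
missing=$(awk -v want='printf' -v avoid='#include <stdio.h>' '
  function report() { if (has_want && !has_avoid) print name }
  FNR == 1 { if (NR > 1) report(); name = FILENAME; has_want = has_avoid = 0 }
  index($0, want)  { has_want = 1 }
  index($0, avoid) { has_avoid = 1 }
  END { if (NR > 0) report() }
' ./*.c)

printf '%s\n' "$missing"
cd / && rm -rf "$dir"
```

index() does a plain substring search, which matches the -F behaviour of the grep calls.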

The header may be included indirectly though, so to avoid false positives, you could pass the code through the C preprocessor and look for the header in the preprocessed output (this seems to work with gcc and clang):

for fname in ./*.c; do
  if grep -q -F "printf" "$fname" && ! cc -E "$fname" | grep -q "^#.*stdio\.h\""; then
     printf 'File "%s" needs to include stdio.h\n' "$fname"
  fi
done
Kusalananda
0

If I read the requirement correctly, you want all files matching $PAT_INCL minus the files matching $PAT_EXCL.

Conceptually this is just set subtraction. Unix has no dedicated standard utility for set operations, but comm works on sorted input.

comm -23 <(grep --files-with-matches "$PAT_INCL" * | sort) \
         <(grep --files-with-matches "$PAT_EXCL" * | sort)

This can be made a bit more efficient by only grepping through the matching files in the second grep:

# Assuming filenames without whitespace and at least one file matching $PAT_INCL
grep --files-with-matches "$PAT_INCL" * | sort > incl_files
grep --files-with-matches "$PAT_EXCL" $(cat incl_files) | sort > excl_files
comm -23 incl_files excl_files
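
To make the set subtraction concrete, here is a minimal sketch with made-up file lists (a.c, b.c and c.c are hypothetical names). comm -23 suppresses columns 2 and 3, leaving only the lines unique to the first input:

```shell
#!/bin/sh
tmp=$(mktemp -d) || exit 1
printf '%s\n' a.c b.c c.c > "$tmp/incl"   # files matching the wanted pattern
printf '%s\n' b.c > "$tmp/excl"           # files matching the unwanted pattern

# Lines only in the first list: files that match the wanted pattern
# but not the unwanted one.
only_incl=$(comm -23 "$tmp/incl" "$tmp/excl")
printf '%s\n' "$only_incl"
rm -rf "$tmp"
```

Both inputs have to be sorted in the same collation order, which is why the commands above pipe through sort.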