5

I want to search all files inside a directory and its subdirectories for lines containing a certain string, but I want to exclude any match whose immediately following line contains a different string.

For example, this:

foo1 searchString bar
foo1 excludeString bar

foo2 searchString bar
something else

foo3 searchString bar

foo3 excludeString bar

foo4 searchString bar

should return this:

foo2 searchString bar
foo3 searchString bar
foo4 searchString bar

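I know that -A prints lines of trailing context after each match, and that -v inverts the match. But my current approach of grep -r -A 1 "searchString" | grep -v "excludeString" can't work: the second grep drops only the excludeString line and leaves the searchString line before it in the output.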

Is there a way to tell the second grep that it should also remove the previous line if it finds a match? Or some other way how I might achieve this?

Performance isn't my primary concern; it would be nice if the command were relatively easy to remember, though.

tim
  • 162
  • PS: I did see this question, but 1. it doesn't search recursively in directories and 2. the two search strings depend on each other, while mine are fixed. I'm hoping that this might simplify things. – tim Aug 05 '15 at 14:24
  • Check out this answer for one grep picking up where another one left off. You can use -m to maintain a "cursor" from the first grep, then execute the second grep. Putting this in a loop could achieve the functionality you want. – WAF Aug 05 '15 at 20:06

2 Answers

8

You can use pcregrep, a grep that uses Perl-compatible regular expressions:

$ pcregrep -M '(searchString.*\n)(?!.*excludeString)' file
foo2 searchString bar
foo3 searchString bar
foo4 searchString bar

The pattern matches searchString, followed by any character (.) repeated zero or more times (*), followed by a newline (\n), but only if the next line does not match .*excludeString; that condition is the negative lookahead (?!...). The -M option allows a pattern to match across multiple lines.
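If pcregrep isn't installed, the same idea (keep a matching line only when the line after it lacks excludeString) can be sketched in awk. This is a swapped-in technique, not the pcregrep command above; filter_matches is a hypothetical helper name, and both strings are compared as fixed text rather than as regular expressions.

```shell
# Hypothetical helper: print each line containing searchString unless the
# line right after it (in the same file) contains excludeString.
filter_matches() {
    awk -v s="searchString" -v x="excludeString" '
        # A new file starts: a match held from the previous file has no next line.
        FNR == 1 && have { print held; have = 0 }
        # A match is being held: the current line decides whether it is printed.
        have { if (index($0, x) == 0) print held; have = 0 }
        # The current line matches: hold it until the next line has been seen.
        index($0, s) { held = $0; have = 1 }
        # A match on the very last line has nothing after it, so print it.
        END { if (have) print held }
    ' "$@"
}
```

Calling filter_matches file on the example input from the question should leave only the foo2, foo3 and foo4 lines.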

jimmij
  • 47,140
6

With sed:

sed '/searchString/!d;$!N;/\n.*excludeString/!P;D' infile

How it works:

  • /searchString/!d deletes the line if it doesn't match searchString and reads in a new line, starting the command cycle over again (i.e. the remaining commands are no longer executed)
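  • if the line matches searchString, sed executes $!N;/\n.*excludeString/!P;D - see HERE for how it works. In short, $!N appends the next input line to the pattern space (unless the current line is the last one), P prints the pattern space up to the first newline, and D deletes up to the first newline and restarts the cycle with what is left instead of reading a new line. The difference is that here it looks for excludeString after the \n (newline) character, so that a line matching both searchString and excludeString is still printed if it's not followed by a line matching excludeString. If no line matches both searchString and excludeString (i.e. the input is known), you can drop the \n.* part and run: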
    sed '/searchString/!d;$!N;/excludeString/!P;D' infile
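Since the question asks about a whole directory tree, here is a minimal sketch that runs the first sed command above on every regular file below a directory, one file per sed invocation so that a match on the last line of one file is never paired with the first line of the next. The function name search_tree is hypothetical, and the output carries no file names.

```shell
# Hypothetical helper: apply the sed filter to each regular file under
# the given directory (defaults to the current directory).
search_tree() {
    find "${1:-.}" -type f \
        -exec sed '/searchString/!d;$!N;/\n.*excludeString/!P;D' {} \;
}
```

For example, search_tree /path/to/dir prints the surviving searchString lines of all files under that directory, in whatever order find visits them.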
don_crissti
  • 82,805
  • 2
    It took me ten minutes with the man page to understand the above, and I imagine that many of our users find it harder than I did; after all, we’ve had members say that they have trouble reading man pages (and who can blame them?). As you know, we’re looking for long answers that provide some explanation and context. I’ve had correct one-line answers that were less cryptic than this downvoted for lack of explanation. Would you care to dissect your answer, saying what each part does? (For example, I needed a couple of minutes just to figure out why your last command was D rather than d.) – G-Man Says 'Reinstate Monica' Aug 05 '15 at 18:34
  • 1
    @G-Man - thanks for your input. You're right, I've added a short explanation. Let me know if it's better now or if it's still cryptic. – don_crissti Aug 05 '15 at 19:21