8

Is there any way to use grep to search for entries matching multiple patterns in any order using a single condition?

As shown in How to run grep with multiple AND patterns?, for multiple patterns I can use

grep -e 'foo.*bar' -e 'bar.*foo'

but I have to write 2 conditions here, 6 conditions for 3 patterns, and so on. I want to write a single condition if possible. To find patterns in any order, you might suggest:

grep -e 'foo' | grep -e 'bar' # at least i do not retype patterns here

and this will work, but I would like to see colored output, and in this case only bar will be highlighted.

I would like to write the condition as easily as

awk '/foo/ && /bar/'

if that is possible with grep (awk does not highlight results, and I doubt it can be done easily).

agrep can probably do what I want, but I wonder whether my default grep (2.10-1) on Ubuntu 12.04 can do this.

Kirill
  • 995

5 Answers

5

If your version of grep supports PCRE (GNU grep does this with the -P or --perl-regexp option), you can use lookaheads to match multiple words in any order:

grep -P '(?=.*?word1)(?=.*?word2)(?=.*?word3)^.*$'

This won't highlight the words, though: lookaheads are zero-length assertions, so they're not part of the matched text.

I think your piping solution should work for that. By default, grep only colors the output when it's going to a terminal, so only the last command in the pipeline does highlighting, but you can override this with --color=always.

grep --color=always foo | grep --color=always bar
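
The lookahead expression grows mechanically with each extra word, so it can be generated rather than typed. A minimal sketch, assuming a POSIX shell; build_lookahead is a hypothetical helper name, and the words are inserted verbatim, so any regex metacharacters in them keep their special meaning:

```shell
# Hypothetical helper: print a PCRE that matches lines containing every argument.
# The body runs in a subshell "( ... )" so its variables stay local.
build_lookahead() (
    regex=''
    for word in "$@"; do
        # One zero-width assertion per word: "this word occurs somewhere on the line".
        regex="${regex}(?=.*?${word})"
    done
    # ^.*$ then consumes the whole line, as in the hand-written expression.
    printf '%s\n' "${regex}^.*\$"
)
```

With a grep that supports -P, this could be used as grep -P "$(build_lookahead word1 word2 word3)" file. It selects the same lines as the hand-written expression and, like it, does not highlight the individual words.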
Barmar
  • 9,927
  • I was only able to get the whole line highlighted using GREP_COLORS. Is the request to have only the matching words highlighted? – Andrew Mar 04 '15 at 21:20
  • No, this will not highlight the words. Lookaheads are zero-length assertions, so they don't actually form part of the match. So the option to highlight the match will highlight the part of the line that matches ^.*$, which is the whole line. – Barmar Mar 04 '15 at 21:32
4

New command based on old answer

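#!/bin/bash
# Build search pipeline, last argument first, so the first argument runs first
n=$#
SEARCH=""
while [ "$n" -gt 0 ]; do
    if [ -n "$SEARCH" ]; then
        SEARCH="|${SEARCH}"
    fi
    # ${!n} expands to the n-th positional parameter.
    # Caveat: a pattern containing a single quote breaks the quoting below.
    SEARCH="grep --color=always '${!n}'${SEARCH}"
    n=$(( n - 1 ))
done
# Execute all greps in sequence
/bin/bash -c "${SEARCH}"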

So, as requested:

I would like to write condition as easy as awk '/foo/ && /bar/'

yagrep SEARCH grep color < /usr/local/bin/yagrep

SEARCH="grep --color=always '${!n}'${SEARCH}"

Old answer

A complex option would be to generate all permutations of the patterns, supply them to grep and hope that the regexp compiler generates a reasonably optimized search tree.

But even with a small number of patterns, say six, the permutations would be 6!, or 720. It makes for an awkward command line.
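
To make the growth concrete, the alternatives can be generated mechanically. A minimal sketch, assuming a POSIX shell; permutations is a hypothetical helper whose first argument is the prefix built so far (pass an empty string when calling it), and the remaining arguments are the patterns:

```shell
# Hypothetical helper: print every ordering of the patterns, joined by ".*".
# Usage: permutations "" pattern1 pattern2 ...
# The body runs in a subshell "( ... )", so each recursion level has its own variables.
permutations() (
    prefix=$1
    shift
    if [ "$#" -eq 0 ]; then
        printf '%s\n' "$prefix"      # nothing left to place: one complete ordering
        return
    fi
    n=$#
    i=0
    while [ "$i" -lt "$n" ]; do
        first=$1
        shift
        # Put $first next, then order the remaining patterns in every possible way.
        permutations "${prefix:+$prefix.*}$first" "$@"
        # Rotate the list so that each pattern takes a turn at the front.
        set -- "$@" "$first"
        i=$((i + 1))
    done
)
```

Two patterns give two lines and three give six; six patterns would give the 720 lines mentioned above, each of which would need its own -e option.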

But seeing as you seem to have no quarrels with piped grep except

and this will work but I would like to see colored output

then, provided that:

  • the patterns do not overlap
  • the patterns do not contain terminal control characters

an acceptable solution would be to pipe several greps, each with one pattern, in order of increasing likelihood so as to minimize the load.

Then to ensure that the colorization works, you'll have to force grep into believing that its standard output is a terminal using faketerm or otherwise, or if available (it should be), you can use the --color=always option:

cat file | grep --color=always pattern1 \
         | grep --color=always pattern2 \
         ...
         | grep --color=always patternN 

(A nice twist would be to wrap the greps into a single string to be executed by a subshell, and generate the string programmatically using e.g. sed).

LSerni
  • 4,560
0

You can colorise output with sed

sed -n "/foo/{/bar/s/foo\|bar/\x1b[1;31m&\x1b[m/gp}"
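
The one-liner first selects lines containing both foo and bar, then wraps every occurrence of either in ANSI colour codes; the \| alternation and the \x1b escape are GNU sed extensions. A sketch of the same idea in awk, for systems without GNU sed; highlight_foo_bar is a hypothetical name, and the two patterns are hard-coded just as in the command above:

```shell
# Hypothetical helper: print lines containing both "foo" and "bar",
# with every occurrence of either wrapped in bold red.
highlight_foo_bar() {
    awk '/foo/ && /bar/ {
        # In the replacement string, & stands for the matched text;
        # \033 is the ESC character that starts an ANSI colour sequence.
        gsub(/foo|bar/, "\033[1;31m&\033[m")
        print
    }' "$@"
}
```

It reads the files named as arguments, or standard input when none are given, for example: highlight_foo_bar somefile.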
Costas
  • 14,916
  • I suspected it was possible, but I think you'll agree it can't be typed by hand when I want to search quickly for some substring on the command line of some random machine :) – Kirill Mar 01 '15 at 16:24
  • @Derp You can arrange it like function in bashrc :sed -n "$(for k in "$@";do a=${a:+${a}{}/$k/;b=$b$s;s='}';c=${c:+$c\\\|}$k;done;printf "%s" "$a" 's/' "$c" '/\x1b[1;31m&\x1b[m/gp' "$b")" – Costas Mar 01 '15 at 16:58
0

The following is simple to remember, although the intent is not that clear:

cat file | grep pattern1 | grep pattern2 | grep pattern3 | grep -e pattern1 -e pattern2 -e pattern3
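
The intent is that the first three greps do the filtering (a line survives only if it matches every pattern), while the final grep with several -e options matches every surviving line anyway and is there only to highlight all the patterns at once. A minimal sketch of that idea as reusable functions, assuming a POSIX shell and a grep that accepts --color=always; filter_all, highlight_any and grep_all are hypothetical names:

```shell
# Hypothetical helper: keep only the lines that match every pattern.
filter_all() (
    if [ "$#" -eq 0 ]; then
        cat                          # no patterns left: pass everything through
    else
        pattern=$1
        shift
        grep -e "$pattern" | filter_all "$@"
    fi
)

# Hypothetical helper: highlight every occurrence of any pattern.
highlight_any() (
    for p in "$@"; do
        set -- "$@" -e "$p"          # append "-e pattern" ...
        shift                        # ... and drop the original argument
    done
    # "$@" is now: -e pattern1 -e pattern2 ...
    grep --color=always "$@"
)

# Filter first (AND), then colour all patterns in the surviving lines.
grep_all() {
    filter_all "$@" | highlight_any "$@"
}
```

Usage would look like grep_all pattern1 pattern2 pattern3 < file. Only the last grep is forced to emit colour, so the escape codes never pass through another grep and cannot interfere with later matches.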
Oliver
  • 431
-4

You could do this with a single OR-based regexp. Given the file contents:

This is my file
It has many words
There are also many chars

The following grep command would highlight 'my' and 'There' with a single pattern argument (-i makes the match case-insensitive, so that 'there' also matches 'There'):

grep -P -i -e '(there|my)' testfile

Simply separate each regex pattern with a | (pipe) to indicate OR.

wraeth
  • 458
  • Yes, it will highlight both patterns, but I need AND, not OR. In your example grep will match a line containing any of the given patterns. Also, I think in this case you can replace -P -e with -E. – Kirill Mar 01 '15 at 15:59