I know how to do this with grep, but the same command doesn't work with zgrep:

grep -E 'Pattern1.*Pattern2' fileName

I'm using zgrep to match patterns inside a .json.gz file.
Because the files are too big, I want to zgrep for BOTH pattern1 AND pattern2; the order doesn't matter.

Is this possible?

2 Answers

grep has no logical AND operator. It is still possible to achieve the same result with the regex OR operator (|) by listing both orders:

zgrep -E 'pattern1.*pattern2|pattern2.*pattern1' filename
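As a minimal sketch of how this behaves, the following builds a throwaway gzip file and runs the two-order alternation against it. The sample lines and the words alpha and beta (standing in for pattern1 and pattern2) are made up for illustration, and it assumes zgrep and gzip are installed:

```shell
# Build a small gzip-compressed sample file (contents are hypothetical).
tmpdir=$(mktemp -d)
printf '%s\n' \
  '{"a": "alpha", "b": "beta"}' \
  '{"a": "beta", "b": "alpha"}' \
  '{"a": "alpha only"}' \
  '{"b": "beta only"}' | gzip > "$tmpdir/sample.json.gz"

# Match lines that contain both words, in either order.
both_orders=$(zgrep -E 'alpha.*beta|beta.*alpha' "$tmpdir/sample.json.gz")
printf '%s\n' "$both_orders"

# Clean up the temporary directory.
rm -rf "$tmpdir"
```

Only the first two sample lines should be printed, since the other two contain just one of the words.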
mashuptwice

I guess (but it's really not clear) that you want to find lines where both pattern1 and pattern2 are present, but in any order.

The straightforward solution is zcat FileName | grep -E pattern1 | grep -E pattern2, but that means the whole decompressed file will be transferred across the first pipe.

If zgrep handles the first pattern, and pattern1 occurs only a limited number of times in the file, zgrep -E pattern1 FileName | grep -E pattern2 will work. (If pattern2 is the rarer one, you might want to swap them.)
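A rough sketch comparing the two pipelines on a throwaway file; the sample lines and the words alpha and beta are hypothetical stand-ins for pattern1 and pattern2. It uses gzip -dc in place of zcat, since the two are equivalent for .gz files and gzip -dc behaves the same on more systems:

```shell
# Hypothetical sample data; alpha and beta stand in for pattern1 and pattern2.
tmpdir=$(mktemp -d)
printf '%s\n' 'alpha then beta' 'beta then alpha' 'alpha alone' 'beta alone' \
  | gzip > "$tmpdir/sample.gz"

# Variant 1: decompress everything into the pipe, then filter twice.
via_zcat=$(gzip -dc "$tmpdir/sample.gz" | grep -E 'alpha' | grep -E 'beta')

# Variant 2: zgrep applies the first filter, so only its matches reach the second grep.
via_zgrep=$(zgrep -E 'alpha' "$tmpdir/sample.gz" | grep -E 'beta')

printf '%s\n' "$via_zgrep"

# Clean up the temporary directory.
rm -rf "$tmpdir"
```

Both variants should select the same lines; they differ only in how much data crosses the visible pipe.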

The solution @mashuptwice gives in an answer will work, but depending on how complex pattern1 and pattern2 are, it might be hard to type, since each pattern has to be written twice.

And not addressing the question directly: I've wanted to search for two strings in any order (without being constrained by memory), and have found that perl -nle 'print if (/pattern1/ && /pattern2/)' (most languages allow something similar, but I like perl) is a good solution. (In your case you'd have to handle decompression in the script, or pipe the output of zcat into it.)
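As an illustration of the "most languages allow something similar" remark, here is the same both-conditions test written with awk instead of perl, with decompression handled by piping gzip -dc into it. This is a sketch only: awk is swapped in for perl here, and the sample lines and the words alpha and beta are made up:

```shell
# Hypothetical sample data; alpha and beta stand in for pattern1 and pattern2.
tmpdir=$(mktemp -d)
printf '%s\n' 'alpha then beta' 'beta then alpha' 'alpha alone' 'beta alone' \
  | gzip > "$tmpdir/sample.gz"

# Decompress to stdout and keep only lines matching both regexes (order-independent).
awk_matches=$(gzip -dc "$tmpdir/sample.gz" | awk '/alpha/ && /beta/')
printf '%s\n' "$awk_matches"

# Clean up the temporary directory.
rm -rf "$tmpdir"
```

Because each pattern is written only once and combined with &&, this form stays readable even when the patterns themselves are complicated.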