I am trying to search a directory using pcregrep. I want to search using a long, multi-line string. Basically, I am trying to look through multiple code bases for plagiarism, so I want to be able to copy/paste a code block from some code and then search a directory for any exact matches.
The problem I'm having is that when I use pcregrep with the -M option (pcregrep -M), it appears to treat each line break as a separate pattern.
So, when I take a code block that I know is unique to one file, I may still get multiple responses because some individual lines may be used elsewhere.
Here is what I am using:
pcregrep -FlMr "long, multi-line string" /directory/to/search/
What can I do to make sure that it will only return exact matches?
It's the -F option that causes this behavior, I think ("Interpret each data-matching pattern as a list of fixed strings, separated by newlines, instead of as a regular expression."). Whether you can safely omit it will depend on whether "long, multi-line string" may contain regex metacharacters. – steeldriver Jun 01 '23 at 17:21
pcregrep -M "\Q$multiline_string\E" (assuming the multiline string doesn't contain \E, or use perl in slurp mode) – Stéphane Chazelas Jun 01 '23 at 20:51