I want to print each filename together with the pattern it matches, but only once per pattern, even if the pattern occurs multiple times in the file.
E.g. I have a list of patterns, list_of_patterns.txt, and the files I need to search are in /path/to/files/*.
list_of_patterns.txt:
A
B
C
D
E
/path/to/files/
/file1
/file2
/file3
Let's say /file1 has the pattern A multiple times, like this:
/file1:
A
4234234
A
435435435
353535
A
(The same goes for the other files, which also contain multiple pattern matches.)
I have this grep command running but it prints the filename every time a pattern matches.
grep -Hof list_of_patterns.txt /path/to/files/*
output:
/file1:A
/file1:A
/file1:A
/file2:B
/file2:B
/file3:C
/file3:B
... and so on.
I know sort can do this when you pipe the grep output into it: grep -Hof list_of_patterns.txt /path/to/files/* | sort -u
but it only produces output once grep has finished. In the real world, my list_of_patterns.txt has hundreds of patterns in it, and the task sometimes takes an hour to finish.
Is there a better way to speed up the process?
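To illustrate the "sort only runs when grep is finished" point: a streaming filter such as awk can drop repeated lines as they arrive, so each file:pattern pair is printed the first time grep emits it. The sketch below is only an illustration on a throwaway copy of the example above (the temporary directory and the variable names `demo` and `result` are made up for the demo). It does not reduce the work grep itself does, so it may not shorten the total runtime.

```shell
# Build a small, disposable copy of the example layout (illustrative paths).
demo=$(mktemp -d)
printf '%s\n' A B C D E > "$demo/list_of_patterns.txt"
mkdir "$demo/files"
printf '%s\n' A 4234234 A 435435435 353535 A > "$demo/files/file1"
printf '%s\n' B 111 B > "$demo/files/file2"
printf '%s\n' C B C > "$demo/files/file3"

# awk prints a line only the first time it sees it (seen[$0] is 0 at first),
# so duplicates are dropped while grep is still running.
result=$(cd "$demo/files" && grep -Hof ../list_of_patterns.txt * | awk '!seen[$0]++')
printf '%s\n' "$result"

rm -rf "$demo"
```

Unlike sort -u, this keeps the lines in the order grep produced them and does not wait for the end of the input.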
UPDATE: some files have more than a hundred occurrences of a matching pattern. E.g. /file4 has 900 occurrences of pattern A. I think that's why grep takes an hour to finish: it prints every occurrence of the pattern match together with the filename.
E.g. output:
/file4:A
/file4:A
/file4:A
/file4:A
/file4:A
/file4:A
/file4:A
/file4:A
... and so on until it reaches 900 occurrences.
I want each match printed only once.
E.g. Desired output:
/file4:A
/file1:A
/file2:B
/file3:A
/file4:B
grep take an hour to process a few files. Are your files also very big or do you have many thousands of files to search in? – Kusalananda Feb 14 '18 at 06:43
-m1 – Sundeep Feb 14 '18 at 06:45
-m1 will cause exactly one output line per file, along with whatever pattern matched... not sure if OP wants one line for each matching pattern – Sundeep Feb 14 '18 at 06:51
-m1, grep will quit immediately after finding a matching line... – Sundeep Feb 14 '18 at 06:52
sort -u does. Like I said in my question, but it waits for grep to finish. Is there a way grep could do what sort does? Or is there another command that can perform the task better and faster? – WashichawbachaW Feb 14 '18 at 07:14
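For reference, the -m1 behaviour discussed in the comments can be checked on a small sample. This is only a sketch on made-up temporary files (the directory and the names `demo` and `m1_result` are illustrative, not from the question). It shows that grep stops reading each file after its first matching line, so a second pattern that appears later in the same file is not reported.

```shell
# Disposable sample data: file3 contains both C and B, with C first.
demo=$(mktemp -d)
printf '%s\n' A B C D E > "$demo/list_of_patterns.txt"
mkdir "$demo/files"
printf '%s\n' A 4234234 A > "$demo/files/file1"
printf '%s\n' C B C > "$demo/files/file3"

# -m1 makes grep stop after the first matching line in each file.
m1_result=$(cd "$demo/files" && grep -m1 -Hof ../list_of_patterns.txt *)
printf '%s\n' "$m1_result"

rm -rf "$demo"
```

Here file3 is reported only for C; the B on its second line is never reached, which is the limitation raised in the comment above.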