I have a file that contains more than a hundred thousand IDs. Each ID is composed of 8 to 16 hexadecimal digits:
178540899f7b40a3
6c56068d
8c45235e9c
8440809982cc
6cb8fef5e5
7aefb0a014a448f
8c47b72e1f824b
ca4e88bec
...
I need to find the related files in a directory tree that contains around 2×10⁹ files.
Given an ID like 6c56068d219144dd, I can find its corresponding files with:
find /dir -type f -name '* 6[cC]56068[dD]219144[dD][dD] *'
But that takes at least two days to complete...
What I would like to do is call find with as many -o -iname GLOB triplets as ARG_MAX allows.
Here's what I've thought of doing:
sed -e 's/.*/-o -iname "* & *"/' ids.txt |
xargs find /dir -type f -name .
My problem is that I can't force xargs to take in only complete triplets.
How can I do it?
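One possible approach, sketched under assumptions rather than tested on the real tree: each ID contributes exactly three arguments, so giving xargs an -n value that is a multiple of 3 keeps every triplet whole, and -x makes xargs abort instead of quietly using fewer arguments when a batch does not fit. The demo below uses three sample IDs in place of ids.txt and a small sh -c printer in place of find, so that the grouping is visible. The batch size of 6 is arbitrary.

```shell
# Three sample IDs stand in for ids.txt (demo data only).
# xargs honors the double quotes, so each input line yields three arguments:
#   -o, -iname, and the glob "* ID *".
# -n 6 takes two triplets per invocation; -x aborts rather than split a batch.
# The sh -c printer is a stand-in for find: it brackets each argument it receives.
out=$(
  printf '%s\n' 6c56068d 8c45235e9c ca4e88bec |
  sed -e 's/.*/-o -iname "* & *"/' |
  xargs -x -n 6 sh -c 'printf "[%s]" "$@"; echo' sh
)
printf '%s\n' "$out"
```

Each output line is one invocation, and every invocation receives a whole number of triplets. For the real run, a larger multiple of 3 would be used, and wrapping find in sh -c allows parentheses around the alternatives so that -type f applies to all of them, for example: xargs -x -n 3000 sh -c 'find /dir -type f \( -name . "$@" \)' sh (3000 is an arbitrary choice here).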
xargs worked, but I was obviously wrong. Yet another reminder not to touch that utility again :-) – Kusalananda Aug 31 '23 at 20:13
find | grep might make sense – muru Sep 01 '23 at 13:45
grep – Fravadona Sep 01 '23 at 13:59
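The find | grep idea from the comments could look roughly like the sketch below: walk the tree once and let grep match every ID as a fixed string in a single pass. The temporary directory, the file names, and the pats.txt file are all made up for the demo. The patterns are wrapped in spaces to mirror the "* ID *" glob from the question. Note that grep sees the whole path rather than just the file name, and that file names containing newlines would confuse it.

```shell
# Build a tiny throwaway tree and ID list (hypothetical demo data).
tmp=$(mktemp -d)
mkdir -p "$tmp/dir/sub"
touch "$tmp/dir/report 6C56068D219144DD final.txt" \
      "$tmp/dir/sub/x ca4e88bec y" \
      "$tmp/dir/unrelated.txt"
printf '%s\n' 6c56068d219144dd ca4e88bec > "$tmp/ids.txt"

# Surround each ID with spaces, like the "* ID *" glob in the question.
sed -e 's/.*/ & /' "$tmp/ids.txt" > "$tmp/pats.txt"

# One traversal; -F = fixed strings, -i = ignore case, -f = read patterns from file.
matches=$(find "$tmp/dir" -type f | grep -i -F -f "$tmp/pats.txt" | sort)
printf '%s\n' "$matches"

rm -rf "$tmp"
```

On the real data this would be find /dir -type f piped into the same grep, so the tree is read only once regardless of how many IDs there are.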