As long as the numbers in your example do not contain the delimiter that sed is using (by default /), the $p in your code will be interpreted as a regular expression, with everything that implies (a dot, for example, matches any character).
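To illustrate the regular expression point, here is a minimal sketch. The sample data and the value of p are made up for the demonstration and are not taken from your files.

```sh
#!/bin/sh
# Hypothetical sample data: only the first line contains the literal text "1.5".
tmpdir=$(mktemp -d)
printf '%s\n' 'a~1.5~x' 'b~125~y' 'c~200~z' >"$tmpdir/data.txt"

p='1.5'
# The dot is a regular expression wildcard, so the line with "125" is deleted too.
result=$(sed "/$p/d" "$tmpdir/data.txt")
printf '%s\n' "$result"    # prints only: c~200~z

rm -rf "$tmpdir"
```

Both the 1.5 line and the 125 line are removed, which is probably not what you would expect from a list of plain numbers.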
Your code:
search_dir='dummy'
filename='numbers.txt'
for entry in "$search_dir"/*
do
while read p;
do
sed -i '' "/$p/d" $entry
done < $filename
done
Here, you want to delete all lines in the files under $search_dir that contain any of the numbers in $filename. Whether this works or not depends on how your sed treats -i ''. With some implementations of sed you would have to use -i without an argument.
Related to sed -i and portability: How can I achieve portability with sed -i (in-place editing)?
It is safer to write the result to a temporary file and then to move that file to the original filename:
for entry in "$search_dir"/*
do
while read -r p
do
sed "/$p/d" "$entry" >"$entry.tmp" && mv "$entry.tmp" "$entry"
done <"$filename"
done
This ensures that it will work regardless of what sed implementation you happen to be working with. In general, it's a bad idea to make in-place changes to files while testing out a script, so you may well want to comment out that mv until you are happy with the way the script otherwise works.
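One way to test without touching the original is to compare the temporary file against it instead of running mv. The following is only a sketch under assumed data: the file name, its contents and the number 100 are hypothetical.

```sh
#!/bin/sh
# Hypothetical file standing in for one "$entry".
tmpdir=$(mktemp -d)
entry="$tmpdir/data.txt"
printf '%s\n' 'a~100~x' 'b~200~y' >"$entry"

p=100
sed "/$p/d" "$entry" >"$entry.tmp"

# Dry run: show what mv would change. diff exits non-zero when the
# files differ, hence the "|| true".
changes=$(diff "$entry" "$entry.tmp") || true
printf '%s\n' "$changes"

# Record both versions so it is clear the original is still intact.
original=$(cat "$entry")
kept=$(cat "$entry.tmp")

rm -rf "$tmpdir"
```

Once the diff output looks right, the mv can be put back in place.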
This is still a bit unsafe as a general solution though, since you're actually "using data as code" (the numbers are data, and you use them as part of your sed script). This means that you could easily cause a syntax error in the sed script just by inserting a / in one of the numbers in your numbers file.
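A small sketch of that failure mode, using a made-up entry that contains a slash:

```sh
#!/bin/sh
p='12/34'    # hypothetical entry containing the sed delimiter

# The script becomes /12/34/d, which sed cannot parse: the address ends
# at the second slash and "3" is not a valid command.
if printf '%s\n' 'a~12/34~x' | sed "/$p/d" >/dev/null 2>&1; then
    status='sed ran'
else
    status='sed rejected the script'
fi
printf '%s\n' "$status"    # prints: sed rejected the script
```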
Since the operation is so simple, we may instead use grep. This also gets rid of the inner while loop:
for entry in "$search_dir"/*
do
grep -Fv -f "$filename" "$entry" >"$entry.tmp" && mv "$entry.tmp" "$entry"
done
This will cause grep to read its patterns from $filename (that is what -f "$filename" does) and to apply these to the $entry file. The -v means we'll discard any line containing one of the patterns, and -F means grep will not interpret the numbers as regular expressions but as fixed strings.
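As a quick check of how these options behave together, here is a sketch on made-up data (the file names and contents are assumptions for the demonstration):

```sh
#!/bin/sh
tmpdir=$(mktemp -d)
# Hypothetical numbers file and data file.
printf '%s\n' '1.5' '200' >"$tmpdir/numbers.txt"
printf '%s\n' 'a~1.5~x' 'b~125~y' 'c~200~z' 'd~300~w' >"$tmpdir/data.txt"

# -F: fixed strings, so "1.5" no longer matches "125".
# -v: keep only the lines that match none of the strings.
result=$(grep -Fv -f "$tmpdir/numbers.txt" "$tmpdir/data.txt")
printf '%s\n' "$result"
# prints:
# b~125~y
# d~300~w

rm -rf "$tmpdir"
```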
If there may be directories under $search_dir, we would want to skip these:
for entry in "$search_dir"/*
do
[ ! -f "$entry" ] && continue
grep -Fv -f "$filename" "$entry" >"$entry.tmp" && mv "$entry.tmp" "$entry"
done
Another, even safer way to do your operation is to use awk. With both the sed and grep solutions above, the number is matched anywhere on the line, so it is conceivable that we might delete the wrong lines. With awk it's easy to match just the second ~-delimited field in the data:
for entry in "$search_dir"/*; do
[ ! -f "$entry" ] && continue
awk -F '~' 'NR==FNR { num[$0]; next } !($2 in num)' "$filename" "$entry" >"$entry.tmp" &&
mv "$entry.tmp" "$entry"
done
The awk program first populates an associative array/hash with the numbers as keys, and then prints every line from the $entry file whose second ~-delimited column is not a key in that hash.
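The difference from the grep approach can be seen in a small sketch. The data below is invented so that the number 100 appears in the first field of one line and as a substring of the second field of another:

```sh
#!/bin/sh
tmpdir=$(mktemp -d)
# Hypothetical numbers file and data file.
printf '%s\n' '100' >"$tmpdir/numbers.txt"
printf '%s\n' '100~555~keep' 'abc~100~drop' 'abc~1000~keep' >"$tmpdir/data.txt"

# awk compares the whole second field, so only the middle line is removed.
awk_result=$(awk -F '~' 'NR==FNR { num[$0]; next } !($2 in num)' \
    "$tmpdir/numbers.txt" "$tmpdir/data.txt")

# grep matches "100" anywhere, so every line is removed here. grep exits
# with status 1 when it selects no lines, hence the "|| true".
grep_result=$(grep -Fv -f "$tmpdir/numbers.txt" "$tmpdir/data.txt") || true

printf 'awk kept:\n%s\n' "$awk_result"
printf 'grep kept:\n%s\n' "$grep_result"

rm -rf "$tmpdir"
```

With this data awk keeps the first and third lines, while grep keeps nothing.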
-i in sed. is that a typo? – Siva Jun 17 '18 at 18:01
sed -i '/'"$p"'/d' – Siva Jun 17 '18 at 18:08
-i is the only issue here, GNU sed wants the backup filename suffix as part of the same argument, so you'd use -i.bak, or just -i for no backup. BSD sed is different here. – ilkkachu Jun 17 '18 at 18:12
sed, it's about the portability of sed -i and/or about working with DOS text files. – Kusalananda Jun 18 '18 at 10:26