find and sed (find and delete)

Question

I have a folder with a few subfolders and files inside it. There is also a CSV file containing the names of the subfolders and the patterns that occur in the files inside each subfolder.

What I want to do is read the CSV file with a while loop from inside the main folder and use sed to delete the matching lines from the files. I am using this bash shell script on Unix:

IFS=","
while read f1 f2
do
 find $f2/ -name vcs*.pv -exec sed -i '/$f1/d' {} +
done > export.csv

Error:

find: 'ap01\r/': No such file or directory.

CSV file:

S2AEC67X1,ap01

It is reading the f2 value properly but not doing the rest. I am keeping the CSV file inside the main directory which contains all the subfolders.

If you got the ap01 from the export.csv file to the script for processing, then you didn't use the > export.csv redirection. Post the script you actually used. — ilkkachu, Jan 20 '18 at 19:16
sorry about the mistake. Code : IFS="," while read f1 f2 do find $f2/ -name vcs*.pv -exec sed -i '/$f1/d' {} + done < export.csv — sharmili mukherjee, Jan 23 '18 at 09:51
Is find needed here? Wouldn’t just calling sed directly work, with “$f2/vcs*.pv”? — Guy, Jan 25 '18 at 02:43

score 1 · Answer 1 · answered Jan 20 '18 at 15:16

This: ap01\r

Indicates that there is a carriage return after the string ap01. Try removing it from the CSV file.

Update: also please read the comment from @RomanPerekhrest. You will want to change the > into a < in your while loop (if indeed you should even use a while loop, but that's another discussion entirely!).
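One way to confirm and strip the carriage return mentioned above is sketched here. It is only an illustration: it builds a sample export.csv with DOS line endings in a scratch directory (the workdir and cr variables are made up for the example), counts the affected lines, and rewrites the file without the \r characters.

```shell
# Illustration only: build a sample CSV with DOS line endings in a scratch directory.
workdir=$(mktemp -d)
printf 'S2AEC67X1,ap01\r\n' >"$workdir/export.csv"

# Count the lines that contain a carriage return.
cr=$(printf '\r')
grep -c "$cr" "$workdir/export.csv"

# Write a copy with every carriage return removed, then replace the original.
tr -d '\r' <"$workdir/export.csv" >"$workdir/export.csv.tmp" &&
    mv "$workdir/export.csv.tmp" "$workdir/export.csv"
```

If dos2unix is installed, running it on the file should have the same effect.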


The quoting is a mess as well (it will potentially apply filename expansion to the -name pattern but will pass $f1 literally to sed) – steeldriver Jan 20 '18 at 15:21
@steeldriver Aye. Agreed. – Jan 20 '18 at 15:55
@maulinglawns i removed the CR and now the error is : find: `ap01/': No such file or directory. – sharmili mukherjee Jan 23 '18 at 10:03
if i manually run the command putting the values directly it works fine, like: find ap01/ -name vcs*.pv -exec sed -i 'S2AEC67X1/d' {} + – sharmili mukherjee Jan 23 '18 at 11:23

Kusalananda · Answer 2 · 2018-02-04T09:00:24.187

There are a few issues with the code and input file in this question:

The input file evidently has a trailing carriage return (\r) on each line. This is probably due to it having been created on a Windows machine as a DOS text file. The usual way to get rid of these carriage returns is to run dos2unix on the file. See, for example, the question What is `^M` and how do I get rid of it?
All variable expansions should be double quoted. In your command you use $f2 unquoted as the path name of a directory. This would fail if $f2 contained spaces or filename globbing characters.
Single quotes stop the shell from expanding a variable, which means that your sed script is looking for lines matching the literal regular expression $f1. This will almost certainly never match: in a basic regular expression, $ only acts as an end-of-line anchor at the very end of the pattern, so here it stands for a literal dollar sign and sed looks for the text $f1 itself. Double quoting the sed editing script will make the shell expand the $f1 variable before invoking sed.
The pattern vcs*.pv is supposed to be an argument to the -name option of find, but since it's unquoted, the shell would expand it to any names in the current directory that match that globbing pattern. So if you had a file in the current directory whose name was vcs-test.pv, find would be invoked with -name vcs-test.pv and you would only ever find files with that name. If you had several matching names in the current directory, find would receive all of them after -name and fail with an error about its arguments.
The loop's output is redirected to export.csv, which truncates the file before the loop even starts. You want the loop to read from the file instead, which means changing > to <.
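The quoting point in the list above can be checked with a short experiment. This is only a sketch: the value of f1 is the sample from the CSV line in the question, and single and double are variable names made up for the illustration.

```shell
f1='S2AEC67X1'   # sample value taken from the CSV line in the question

# Single quotes: the shell passes the text through untouched.
single='/$f1/d'
# Double quotes: the shell substitutes the variable first.
double="/$f1/d"

# Show the two sed scripts exactly as sed would receive them.
printf '%s\n' "$single" "$double"

# Effect on sample input: only the double-quoted script deletes the matching line.
printf 'keep me\nS2AEC67X1 goes\n' | sed "$single"
printf 'keep me\nS2AEC67X1 goes\n' | sed "$double"
```

The first sed call prints both input lines unchanged, while the second prints only the line that does not contain the value of $f1.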

The script, corrected:

while IFS=',' read f1 f2; do
    find "$f2" -type f -name 'vcs*.pv' -exec sed -i "/$f1/d" {} +
done <export.csv

I have also added -type f to the find command line as we probably do not want to accidentally pick up directory names. I have also made it so that the IFS variable is set only for the read command.
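If converting the CSV file beforehand is not an option, the loop itself could drop the carriage return. The following is a sketch under some assumptions: GNU sed (for -i, as in the code above), a scratch directory created only for the demonstration, and the made-up variable names top and cr. It also uses read -r so that backslashes in the CSV are kept as-is, and it skips directories that do not exist.

```shell
# Scratch setup for illustration: one subdirectory, one data file, a CSV with CRLF endings.
top=$(mktemp -d)
mkdir "$top/ap01"
printf 'S2AEC67X1 remove\nother keep\n' >"$top/ap01/vcs1.pv"
printf 'S2AEC67X1,ap01\r\n' >"$top/export.csv"

cr=$(printf '\r')
cd "$top" || exit 1
while IFS=',' read -r f1 f2; do
    f2=${f2%"$cr"}      # drop a trailing carriage return, if any
    [ -d "$f2" ] || { printf 'skipping missing directory: %s\n' "$f2" >&2; continue; }
    find "$f2" -type f -name 'vcs*.pv' -exec sed -i "/$f1/d" {} +
done <export.csv
```

After the loop, ap01/vcs1.pv in the scratch directory should contain only the line that does not match S2AEC67X1.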

This is a variation on the above for the case where all files are located directly in the directory whose name you are reading from the CSV file:

while IFS=',' read pattern dir; do
    for name in "$dir"/vcs*.pv; do
        test -f "$name" && sed -i "/$pattern/d" "$name"
    done
done <export.csv

The good thing with this is that you get rid of find. The bad thing is that you now have one invocation of sed per file (usually not an issue unless you have hundreds or more files).

The following is a variation on the above that deletes the lines containing the string read from the CSV file. The difference is that the pieces of code above all interpret the pattern as a regular expression, not as a fixed string. This matters if your string contains characters that are interpreted as "special" in a regular expression, such as ., *, [, ] etc.

while IFS=',' read string dir; do
    for name in "$dir"/vcs*.pv; do
        [ ! -f "$name" ] && continue
        grep -v -F -e "$string" "$name" >"$name.tmp" && mv -f "$name.tmp" "$name"
    done
done <export.csv
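One edge case in the grep variant: grep exits with status 1 when it selects no lines, so if every line of a file contains the string, the mv is skipped and the file is left unchanged (with a stray .tmp file next to it). A possible way around this is sketched below. The delete_fixed function and the demo file are hypothetical names introduced only for this illustration.

```shell
# Hypothetical helper: delete every line of file "$2" that contains the fixed string "$1".
delete_fixed() {
    grep -v -F -e "$1" "$2" >"$2.tmp"
    status=$?
    if [ "$status" -le 1 ]; then
        mv -f "$2.tmp" "$2"     # 0: some lines kept, 1: no lines kept
    else
        rm -f "$2.tmp"          # 2 or more: grep failed, leave the original alone
        return "$status"
    fi
}

# Illustration on a scratch file in which every line matches.
demo=$(mktemp)
printf 'a.b one\na.b two\n' >"$demo"
delete_fixed 'a.b' "$demo"
wc -c <"$demo"
```

Here the scratch file ends up empty, because both of its lines contain the fixed string a.b.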
