How to get filenames when using find and sed

Question

I am writing a script to apply sed on certain files and then list files that have been changed so that I know which have been modified.

This is how I am finding and then using sed:

find . -type f -a \( -name "*.txt" -o -name "*.git" \) -a -exec sed -i -e "s/"str1"/"str2"/g" {} +

How do I print the file name of the changed files? I would like to print it in a sorted order so it's easier to read.

When using only sed we can do this:

sed -i "s/$pattern/$new_pattern/w changelog.txt" "$filename"
if [ -s changelog.txt ]; then
  : # CHANGES MADE, DO SOME STUFF HERE
else
  : # NO CHANGES MADE, DO SOME OTHER STUFF HERE
fi

But how do I do this when using find and sed together? I checked the man page and tried a bunch of stuff but nothing worked.

I'm not exactly sure what you're trying to achieve, can you show a list of files, and what output you're expecting to see. — EightBitTony, Jan 21 '16 at 14:49
You just add -print after your -exec; it will only be executed if the -exec was successful, e.g. find . -type f \( -name \*.git -o -name \*.txt \) -exec sed -i 'blah_blah' {} \; -print. Sure, you'll have to sort the output then. — don_crissti, Jan 21 '16 at 18:11
@don_crissti using print gives an error: "-print: command not found". — KLMM, Jan 21 '16 at 18:34
@don_crissti yes, this worked for printing the files, but how can I get them in sorted order? We are not storing the output in any variable, and there is no flag that we can use to sort. — KLMM, Jan 21 '16 at 20:28
I found that your suggestion of using -print prints all the files with the extension .txt or .git. I only want to print files that have been modified, not all that match the pattern. — KLMM, Jan 21 '16 at 22:01
Ah, yes, sed -i is dumb and will "edit" the file even if nothing changes and report success... Add a -exec grep -q str1 {} \; before the existing -exec sed... That should do. Oh, and next time you reply, make sure you prepend my username with @ so the system notifies me e.g. @don_crissti otherwise I'll never know you replied (I just happened to return here) — don_crissti, Jan 23 '16 at 02:45

score 1 · Answer 1 · edited Aug 28 '19 at 21:14

Your sed command (with proper quoting):

sed 's/str1/str2/g'

This will change all occurrences of str1 into str2. A list of files containing str1 can be had from grep -l 'str1':

find . -type f \( -name '*.txt' -o -name '*.git' \) \
    -exec grep -l 'str1' {} \; \
    -exec sed -i 's/str1/str2/g' {} + >changelist.txt

Here, grep -l will provide a list of pathnames that will be redirected into changelist.txt. It will also act like a filter for sed so that sed is only run on files that contain the pattern. sed -i will then make the changes in the files (and remain quiet).

Alternatively, let find print the pathnames of the files that contain the string:

find . -type f \( -name '*.txt' -o -name '*.git' \) \
    -exec grep -q 'str1' {} \; \
    -print \
    -exec sed -i 's/str1/str2/g' {} + >changelist.txt
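
Since the question also asks for sorted output, the list can be piped through sort before it is saved. The following is only a sketch under stated assumptions: GNU sed is assumed for -i, and the scratch directory and the sample files in it are made up so the commands can be tried safely. The list is written outside the searched tree so that find never sees it as a candidate.

```shell
# Scratch tree with hypothetical sample files.
demo=$(mktemp -d)
mkdir "$demo/tree"
printf 'has str1\n' >"$demo/tree/b.txt"
printf 'has str1 too\n' >"$demo/tree/a.txt"
printf 'nothing here\n' >"$demo/tree/c.txt"

# grep -q filters, -print emits the pathname, sed edits in batches;
# sort orders the printed pathnames before they reach the list.
find "$demo/tree" -type f \( -name '*.txt' -o -name '*.git' \) \
    -exec grep -q 'str1' {} \; \
    -print \
    -exec sed -i 's/str1/str2/g' {} + | sort >"$demo/changelist.txt"

cat "$demo/changelist.txt"
```

Only a.txt and b.txt end up in the list; c.txt is never handed to sed.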

Understanding the -exec option of `find`

score 1 · Answer 2 · answered Sep 05 '18 at 12:11

sed -i rewrites the file (actually makes full new copies of the files) regardless of whether any of the s commands in the sed script succeeded or not.
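
One way to see this for yourself is to compare the file's inode number before and after a sed -i run that matches nothing. This is a minimal sketch assuming GNU sed; the scratch file is hypothetical.

```shell
# Hypothetical scratch file that does not contain str1.
f=$(mktemp)
printf 'no match here\n' >"$f"

before=$(ls -i "$f" | awk '{print $1}')
sed -i 's/str1/str2/g' "$f"    # no substitution happens
after=$(ls -i "$f" | awk '{print $1}')

# Differing inode numbers mean sed replaced the file with a new copy,
# even though the content is unchanged.
printf 'inode before: %s, after: %s\n' "$before" "$after"
```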

Here, you'd want to avoid running sed -i on files that don't contain str1. With GNU tools:

find . -type f \( -name "*.txt" -o -name "*.git" \) -size +3c \
  -exec grep -lZ str1 {} + |
  while IFS= read -rd '' file; do
    sed -i 's/str1/str2/g' "$file" &&
      printf '%s\n' "$file"
  done

That runs one sed for each file that contains str1 and prints the file name if sed succeeded (that is, if there was no error in creating the new version of the file).

Or you can run one grep and sed per file:

find . -type f \( -name "*.txt" -o -name "*.git" \) \( -size +3c \
  -exec grep -q str1 {} \; \
  -exec sed -i 's/str1/str2/g' {} \; \
  -printf '"%p" was modified\n' \
    -o -printf '"%p" was not modified\n' \)

score 1 · Answer 3 · answered May 14 '21 at 15:40

I'm answering this in a slightly different way than Kusalananda for the fun of it. So if you like this one you should upvote his. This is a little different in that it shows you how to do multiple commands and handle a little more complexity while still making only one call to find.

The Answer

Grep will evaluate as True if it finds a match (i.e. $? == 0). So, grep -l 'str1' filename will be true if str1 is in filename. If we chain this command to the sed command with && we can ensure that sed only runs if grep matched.

The following command will output the filename only if sed is going to make changes:

grep -l 'str1' filename && sed -i 's/str1/str2/g' filename

You cannot use && in -exec directly, so we will wrap it in a call to bash.

find ./ -type f \( -name '*.txt' -o -name '*.git' \) \
    -exec bash -c "grep -l 'str1' {} && sed -i 's/str1/str2/g' {}" \; > changelist.txt

As in Kusalananda's answer, sed won't even run if grep doesn't match str1: there, -exec grep ... \; acts as a test, and find only hands a file to the following -exec sed if grep succeeded. The real difference is in the number of processes. This version starts a bash and a grep for every file, plus one sed per matching file, while his batches the matching files into a few sed calls with +. With many files that can affect execution time; for the OP's question it probably won't make much difference at all.
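
One caveat about embedding {} in the bash -c string: the file name becomes part of the shell code, so names with spaces or quotes can break the command. A common alternative, sketched here, keeps the script text fixed and passes the name as a positional parameter. GNU sed is assumed, and the scratch directory and file names are invented for illustration.

```shell
# Scratch tree with hypothetical sample files, one with a space in its name.
demo=$(mktemp -d)
mkdir "$demo/tree"
printf 'has str1\n' >"$demo/tree/my notes.txt"
printf 'nothing here\n' >"$demo/tree/other.txt"

# The script is a fixed single-quoted string; find supplies the path
# as "$1" (the word after the script, "bash", becomes $0).
find "$demo/tree" -type f \( -name '*.txt' -o -name '*.git' \) \
    -exec bash -c 'grep -l "str1" "$1" && sed -i "s/str1/str2/g" "$1"' bash {} \; \
    >"$demo/changelist.txt"

cat "$demo/changelist.txt"
```

Because {} now appears only once, outside the script text, the file name is never parsed as shell code.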

You could simplify his answer by replacing grep -q with grep -l, replacing + with \;, and getting rid of the -print.

find . -type f \( -name '*.txt' -o -name '*.git' \) \
    -exec grep -l 'str1' {} \; \
    -exec sed -i 's/str1/str2/g' {} \; >changelist.txt

All of that is just nitpicking though. So what follows is my reason for using bash -c in find's -exec option. I hope that someone might find it useful.

The Reason for my Approach

I'm here because I wanted to use sed with find to print out a section of a logfile and print the name of the log file only if sed would output anything.

I have some logs that contain something like this:

    ---- lots of lines before ----
Failed:    0

Summary
( Cases/Passed/Failed)
Frequency Test           : (    69/    67/     2)
Carrier/Data Null Test   : (    14/    13/     1)
Total Harmonic Distortion: (     9/     9/     0)
Spur Test                : (     0/     0/     0)

failed Test
freq, rf2, 0.750e9, -70.0, pm 500,    pm 1.0
    ---- lots of lines after ----

I wanted to print just the test summary and the file name, and only if sed detected the test summary.

So for a bunch of files I wanted output like this:

File: ./4662-0003-05132021-0953.log
Summary
--------------------------------------------------
( Cases/Passed/Failed)
Frequency Test           : (    69/     0/    69)
Carrier/Data Null Test   : (    14/     0/    14)
Total Harmonic Distortion: (     9/     9/     0)
Spur Test                : (     0/     0/     0)
File: ./4745-0001-05132021-1017.log
Summary

( Cases/Passed/Failed)
Frequency Test           : (    69/    68/     1)
Carrier/Data Null Test   : (    14/    14/     0)
Total Harmonic Distortion: (     9/     9/     0)
Spur Test                : (     0/     0/     0)

I achieved that with this command:

find ./ -type f -name '*.log' \
    -exec bash -c "grep -q Summary {} && echo 'File: {}' && sed -n '/Summary/,/Spur/p' {} && echo" \;

Breaking it down: nothing after grep -q Summary {} will run if Summary doesn't appear in the log file. sed -n '/Summary/,/Spur/p' prints only the section of the log between "Summary" and "Spur".

The difference between -exec cmd {} \; and -exec cmd {} +

You may be wondering why I used \; instead of +. If you use +, {} will be replaced with as many filenames as can fit on the command line. That is not what we want and in this case find will not even allow it.

From man find:

   -exec command {} +
          This variant of the -exec action runs the specified command on the selected files, but the command line is built by
          appending each selected file name at the end; the total number of invocations of the command will be much less than
          the number of matched files. The command line is built in much the same way that xargs builds its command lines.
          Only one instance of `{}' is allowed within the command. The command is executed in the starting directory. If
          find encounters an error, this can sometimes cause an immediate exit, so some pending commands may not be run at
          all. This variant of -exec always returns true.
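
The batching is easy to observe by counting how many times the command runs. A small sketch, using a made-up scratch directory with three empty log files and echo as a stand-in command:

```shell
# Scratch directory with three hypothetical log files.
demo=$(mktemp -d)
touch "$demo/1.log" "$demo/2.log" "$demo/3.log"

# \; runs echo once per file, so there is one output line per file.
per_file=$(find "$demo" -type f -name '*.log' -exec echo {} \; | wc -l)

# + runs echo once with all three names appended, so there is one line.
batched=$(find "$demo" -type f -name '*.log' -exec echo {} + | wc -l)

printf 'lines with \\; : %s\nlines with +  : %s\n' "$per_file" "$batched"
```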

Conclusion

Sorry for the novel, but I hope it helps someone.

score 0 · Answer 4 · unxnut · 2016-01-21T18:59:17.553

It should be easy enough to write a little script that does what you want and exec the script as an argument to find. You already have the script and if you replace $filename by $1, you have it. Your script will be of the form

#!/bin/bash
# pattern and new_pattern must be exported by the caller.
sed -i "s/$pattern/$new_pattern/" "$1"
echo "$1" >> changelog

Let us call this script ed_notify. Now, you can run it on selected files by

cat changelog >> changelog.old
rm changelog
find . -type f -a \( -name "*.txt" -o -name "*.git" \) -a -exec ed_notify {} \;

edited Jan 21 '16 at 18:59

answered Jan 21 '16 at 14:51

Please edit your answer so that it actually provides an answer. At the moment, this is a comment simply giving a suggestion. – terdon Jan 21 '16 at 15:06
@unxnut I am unable to understand your answer, though I do have a high-level idea of what needs to be done. Please provide some code, thanks. – KLMM Jan 21 '16 at 17:06
@unxnut Is it possible to achieve this in a single script (will using routines work)? And how do we get sorted order? – KLMM Jan 21 '16 at 20:31
Since the filenames for the session are saved in changelog, all you have to do is sort changelog to get the file in sorted order at the end of the find command. – unxnut Jan 21 '16 at 21:11
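
KLMM's follow-up about doing everything in a single script, with sorted output, could be handled along these lines. This is only a sketch: bash and GNU sed are assumed, the function name edit_and_report is hypothetical, and the pattern values and scratch files are placeholders. Unlike ed_notify above, it records a file only if the file actually contained the pattern.

```shell
#!/bin/bash
# Placeholder values and a scratch tree with hypothetical sample files.
pattern='str1'
new_pattern='str2'
demo=$(mktemp -d)
mkdir "$demo/tree"
printf 'has str1\n' >"$demo/tree/b.txt"
printf 'has str1 too\n' >"$demo/tree/a.txt"
printf 'nothing here\n' >"$demo/tree/c.txt"

# Edit one file and print its name only if it contained the pattern.
edit_and_report() {
    if grep -q -- "$pattern" "$1"; then
        sed -i "s/$pattern/$new_pattern/g" "$1" && printf '%s\n' "$1"
    fi
}

# Read NUL-delimited names from find so unusual file names survive,
# then sort the reported names into the changelog.
find "$demo/tree" -type f \( -name '*.txt' -o -name '*.git' \) -print0 |
    while IFS= read -r -d '' file; do
        edit_and_report "$file"
    done | sort >"$demo/changelog"

cat "$demo/changelog"
```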
