Recursively rename files by using a list of patterns and replacements

Question

I have the following file structure:

Some directory
- Some file.txt
- Another file here.log
- Yet another file.mp3
Another directory
- With some other file.txt
File on root level.txt
Another file on root level.ext

What I want to do now is run a little script that takes another file as input, containing pattern/replacement pairs, and renames these files recursively according to those pairs, so that every "another" (case-insensitive) gets replaced with "foo" and every "some" with "bar".

I already tried a lot of things with iterating over files and reading said input file, but nothing worked like it should and I finally managed to accidentally overwrite my testing script. But there were a lot of ls, while, sed or mv in use.

The two things I couldn't resolve myself were how to handle whitespace in filenames and how to not handle files that were already renamed in a previous pattern match.

Maybe you can point me in the right direction?

Whitespace doesn't work because you didn't put double quotes around variable substitutions ("$foo"). See Why does my shell script choke on whitespace or other special characters? — Gilles 'SO- stop being evil', Apr 14 '17 at 21:54
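
To illustrate the quoting point in the comment above, here is a minimal sketch. The helper count_args and the file name are made up for this demo; it only shows how many arguments the shell passes with and without double quotes.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Example name taken from the question's file listing.
f="Another file here.log"

count_args $f      # unquoted: split on spaces into 3 words, prints 3
count_args "$f"    # quoted: passed as a single argument, prints 1
```

The same applies to every expansion handed to mv, basename, or dirname: without the quotes, a name containing spaces arrives as several arguments.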

score 1 · Answer 1 · answered Apr 13 '17 at 21:15

rPairs="/tmp/rename_pairs" \
find . -type f -exec sh -c '
   while read -r old new; do
      rename "s/$old/$new/i" "$@"
   done < "$rPairs"
' x {} +

This assumes that the rename pairs file contains no non-ASCII characters and that the file itself is located outside the search path.
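
The answer does not show the pairs file itself. Since the loop uses read -r old new, one plausible layout is a whitespace-separated pattern and replacement per line. The sketch below is a dry run under that assumption; the temporary file and the helper show_exprs are hypothetical and only print the expressions that would be handed to rename.

```shell
# Hypothetical pairs file: the first word is the pattern,
# the rest of the line is the replacement.
pairs=$(mktemp)
printf '%s\n' 'another foo' 'some bar' > "$pairs"

# show_exprs FILE: print the s/// expression built from each line.
show_exprs() {
   while read -r old new; do
      printf 's/%s/%s/i\n' "$old" "$new"
   done < "$1"
}

show_exprs "$pairs"
```

With this layout a pattern cannot contain whitespace, because read puts only the first word into old. A pattern such as "File on root level" would need a different separator.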

Thanks for your answer, though it would fail if the same file matches multiple patterns in the input file. But ultimately you pointed me in the right direction. – derbenni Apr 14 '17 at 09:32

score 1 · Answer 2 · answered Apr 14 '17 at 09:31

After Rakesh Sharma's answer I got in the right direction after experimenting a bit more and getting some sleep.

Finally I came up with the following script:

#!/bin/bash

# Each line of patterns.csv has the form: pattern;replacement
while IFS=";" read -r pattern replacement
do
  if [[ -n $pattern ]]
  then
    echo "Checking files for pattern '$pattern'."

    # find runs once per pattern, so it always sees the current file names.
    find ./files -name "*$pattern*" -type f | while IFS= read -r fpath
    do
      fname=$(basename "$fpath")
      dname=$(dirname "$fpath")

      echo "  Found file '$fname' in directory '$dname'. Renaming to '${fname/$pattern/$replacement}'."
      mv -- "$fpath" "$dname/${fname/$pattern/$replacement}"
    done
  fi
done < patterns.csv

It reads the file patterns.csv line by line, filling the $pattern and $replacement variables. In the second step, it finds all files within the directory ./files that match the current pattern. Running find again for each pattern is necessary because an earlier pattern may already have renamed a file, and renaming it under its old path would fail. Finally, shell parameter substitution is applied to the file name only, so the directories containing it are never renamed.
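
As a small illustration of the parameter substitution used above, the following sketch (bash only) shows what the expansion produces on sample names. The helper new_name is hypothetical and not part of the script.

```shell
#!/bin/bash
# new_name NAME PATTERN REPLACEMENT (hypothetical helper):
# print NAME with the first occurrence of PATTERN replaced.
new_name() {
  local fname=$1 pattern=$2 replacement=$3
  printf '%s\n' "${fname/$pattern/$replacement}"
}

new_name "Some file.txt" "Some" "bar"   # bar file.txt

fname="some some.txt"
echo "${fname/some/bar}"    # single slash: first match only -> bar some.txt
echo "${fname//some/bar}"   # double slash: every match      -> bar bar.txt
```

Because $pattern is unquoted inside the expansion, bash treats it as a glob pattern rather than a literal string, which is worth keeping in mind if a pattern ever contains * or ?.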

What's not working is replacing the matches case-insensitively, but I can live with that.

You need to realize that in the Unix world, filenames are case-sensitive. If you renamed case-insensitively, two different files, foo.log and Foo.log, would both be mapped to newname.log, which you will surely agree is not a healthy situation to be in. I have posted another solution that hopefully improves the situation; it requires the rename utility written in Perl. — , Apr 14 '17 at 21:57
Yeah, I know. That was something I learned the hard way when switching from development in Windows to Unix years ago. But now I only see the good parts of that :) — derbenni, Apr 15 '17 at 09:05
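
The collision described in the comment above (two source files mapping to the same new name) could be guarded against with a small wrapper around mv. This is only a sketch: safe_mv is a hypothetical helper, and the existence check is not atomic, so it protects against accidents rather than concurrent writers.

```shell
# safe_mv SRC DST (hypothetical helper): refuse to overwrite an existing DST.
safe_mv() {
   if [ -e "$2" ]; then
      printf 'skip: %s already exists\n' "$2" >&2
      return 1
   fi
   mv -- "$1" "$2"
}
```

In the script above, the mv line could be swapped for a call such as safe_mv "$fpath" "$dname/${fname/$pattern/$replacement}", so that a second file mapping to the same target is left in place instead of clobbering the first.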

score 1 · Accepted Answer · 2017-04-15T09:37:16.097

TOP="`pwd -P`" \
find . -type d -exec sh -c '
   for d
   do
      cd "$d" && \
         find . ! -name . -prune -type f -exec sh -c '\''
            while IFS=\; read -r pat repl
            do
               rename "s/$pat/$repl/g" "$@"
               N=$#
               for unmoved
               do
                  if [ -f "$unmoved" ]
                  then
                     set X ${1+"$@"} "$unmoved"
                     shift
                  fi
               done
               shift "$N"
               case $# in 0 ) break ;; esac
            done < "$TOP/patterns.csv"
         '\'' x \{\} +
      cd "$TOP"
   done
' x {} +

Set up find to collect directories only and hand them to sh in batches. This minimizes the number of sh invocations.
In each of these directories, set up a second find to collect regular files at depth 1 only and hand them to sh in one batch. This minimizes the number of times the rename utility is called.
Set up a while loop to read in the pattern/replacement pairs and apply each one to all the regular files.
While renaming, keep track of whether each file still exists afterwards. If it does, it was not renamed by the current pair, so it is tried again with the next pat/repl pair. If it was renamed successfully, it is removed from the command-line argument list so that the next pat/repl pair is not applied to it.
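
The bookkeeping in the last step relies on rotating the positional parameters. The sketch below isolates that idea under the same assumption as the answer (a file that still exists under its old name was not renamed). The helper keep_existing is hypothetical and only prints the survivors.

```shell
# keep_existing FILE... (hypothetical helper): print the arguments that
# still exist as regular files, using the same set/shift rotation.
keep_existing() {
   n=$#                        # number of original arguments
   for f                       # iterates over the original "$@"
   do
      if [ -f "$f" ]; then
         set -- "$@" "$f"      # append the survivor to the end of the list
      fi
   done
   shift "$n"                  # drop the originals, leaving only survivors
   for f
   do
      printf '%s\n' "$f"
   done
}
```

The for loop expands its word list once at the start, so appending to "$@" inside the loop does not change which items are visited.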

I really got you thinking about a solution here :D Your script works nearly flawlessly, except the warning x: 4: cd: can't cd to ./Another directory when letting it run on the file structure of my question. But the renaming took place nonetheless. — derbenni, Apr 15 '17 at 09:08
I've marked this as accepted answer, since I didn't think of performance beforehand and this is the much better script than my solution. — derbenni, Apr 15 '17 at 09:16
I never got around to testing this at my end since I was too lazy to set up a separate directory structure. All this was in my head only, hence the bug with cd "$PWD". I have fixed that now by passing the launch directory to find. — , Apr 15 '17 at 09:39
What type of patterns do you have in your patterns.csv file? This has a bearing on how the rename utility's s/// command behaves. — , Apr 15 '17 at 09:40
Now the script works flawlessly :) I tested it with the patterns Another;foo, Some;bar and File on root level;Root level. There won't be more "complicated" things like special characters or umlauts. — derbenni, Apr 15 '17 at 09:45

score 0 · Answer 4 · answered Apr 14 '17 at 22:10

The important point to keep in mind is that traversing the directory tree is slow, so it is done only once. First, find looks only at the directories in the tree. For each directory, we look at the regular files directly inside it (no recursion here). We then apply the rename transformation to each of these file names and note whether it succeeded. If it did, we break out of the while loop, which prevents the next patt/repl pair from being applied to this file.

tempd="`mktemp -d`" \
find . -type d -exec sh -c '
   cd "$1" && \
   for f in ./*
   do
      [ -f "$f" ] || continue
      while IFS=\; read -r patt repl
      do
         case $f in
            ./*"$patt"* )
               rename -v "s/$patt/$repl/g" "$f" 2>&1 | tee "$tempd/$f"
               case $(cat "$tempd/$f") in "$f renamed "* ) break ;; esac ;;
         esac
      done < /tmp/patterns.csv
   done
' {} {} \;

This script renames the files of the above file structure exactly as I want it, but a lot of warnings like .: 11: .: cannot open /./Another file on root level.ext: No such file get shown. — derbenni, Apr 15 '17 at 09:13
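
The "first matching pattern wins" idea from this answer can also be tried as a dry run without touching any files. The sketch below is not the answer's method: it uses literal matching with plain POSIX parameter expansion instead of rename's regular expressions, and the helper first_match and the temporary patterns file are hypothetical.

```shell
# first_match NAME PATTERNS_FILE (hypothetical helper): print NAME with the
# first matching pattern replaced once; later patterns are not applied.
first_match() {
   name=$1
   while IFS=';' read -r patt repl
   do
      [ -n "$patt" ] || continue
      case $name in
         *"$patt"* )
            # text before the first match + replacement + text after it
            printf '%s\n' "${name%%"$patt"*}$repl${name#*"$patt"}"
            return 0 ;;
      esac
   done < "$2"
   printf '%s\n' "$name"       # no pattern matched: name unchanged
}

# Demo patterns file, using the pairs mentioned in the comments.
patterns=$(mktemp)
printf '%s\n' 'Another;foo' 'Some;bar' 'File on root level;Root level' > "$patterns"

first_match "Some Another.txt" "$patterns"        # Some foo.txt
first_match "File on root level.txt" "$patterns"  # Root level.txt
```

Printing the new name first and feeding it to mv only after checking the output makes it easy to verify a patterns file before anything is renamed.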
