I have to do some global renaming in an Eclipse plugin project, although the fact that the code is Java and it's an Eclipse plugin is irrelevant.

Basically, the steps I need to perform are the following:

  1. find . -name "*" -type f | grep -v \.git | xargs sed -i 's/com.foo/org.bar/g'
  2. for all directories (excluding .git) in src/com, rename to src/org
  3. for all directories (excluding .git) in src/org/foo, rename to src/org/bar
  4. for all file and directory names (excluding .git) with com.foo in the name, change that portion to org.bar.

I only have an explicit command line for the first step.

I could manage to do the other steps semi-manually, but I think it would be worthwhile to determine an automated method for doing this. The second and third steps are equivalent, but the fourth step is different.

I want to do this with pure bash command line, but I could settle for a simple bash or perl script.

Update:

Actually, I need to search for more than just src/com. I also need src/main/java/com and src/test/java/com.

For this, I've been experimenting with something like this:

find . -type d -name "com" | grep "\(/src/com\|/src/main/java/com\|/src/test/java/com\)" | xargs -n1 -i{} echo mv {} $(echo {} | sed -e 's/com/org/g')

This gets close, but the last sed is not matching, so it doesn't change it. It results in lines like this:

mv ./plugins/.../src/com ./plugins/.../src/com

2 Answers

1

For your first step, you do not need the -name "*" in your command, because it does not filter out any file. What you do need is to exclude .git, like this:

$ find . -name '.git' -prune -o -type f -print

That will reject (-prune) any directory exactly named .git and everything inside it.

That removes the need for grep -v ".git". The sed cannot be avoided, as it edits the contents of each file, which is not something find can do.

But still, we can simplify (and escape the dot inside sed):

$ find . -name '.git' -prune -o -type f -exec sed -i 's/com\.foo/org\.bar/g' '{}' \+

That passes as many filenames as possible to each sed invocation (similar to xargs).

But even better:

$ find . -name '.git' -prune -o -type f -execdir sed -i 's/com\.foo/org\.bar/g' '{}' \+

This runs sed from inside each file's directory, one instance of sed per directory.
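
As a quick check of the pruning, here is a minimal sketch on a throwaway tree. The file names are made up for illustration, it uses the -exec form rather than -execdir, and it assumes GNU sed (for -i without a suffix):

```shell
# Scratch tree: one source file and one file inside .git (hypothetical names).
tmp=$(mktemp -d)
mkdir -p "$tmp/src/com/foo" "$tmp/.git"
printf 'package com.foo;\n' > "$tmp/src/com/foo/A.java"
printf 'url = com.foo\n'    > "$tmp/.git/config"

# Prune .git, then edit every remaining regular file in place.
find "$tmp" -name '.git' -prune -o -type f -exec sed -i 's/com\.foo/org.bar/g' '{}' +

edited=$(cat "$tmp/src/com/foo/A.java")   # package org.bar;
untouched=$(cat "$tmp/.git/config")       # url = com.foo
printf '%s\n%s\n' "$edited" "$untouched"
rm -rf "$tmp"
```

The file under .git keeps its original contents because find never descends into the pruned directory.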

2 and 3

Let's define a function to execute the dir rename:

$ renamedir (){ find "$1" -name '.git' -prune -o -type d -exec bash -c 'mv "$1" "${1//"$2"/$3}"' sh '{}' "$2" "$3" \; ; }
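
The inner command of that function can be tried on its own. This is a small sketch with a hypothetical path, and with echo in front of mv so that nothing is actually renamed:

```shell
# Same inner command as renamedir, prefixed with echo as a dry run.
# $1 is the path, $2 the text to look for, $3 its replacement.
# The path below is hypothetical.
preview=$(bash -c 'echo mv "$1" "${1//"$2"/$3}"' sh './plugins/demo/src/com' 'src/com' 'src/org')
printf '%s\n' "$preview"
```

Note that ${1//"$2"/$3} replaces every occurrence of the text anywhere in the path, not only in the last component.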

Step 2:

  • for all directories (excluding .git) in src/com rename to src/org

    $ renamedir 'd/src/com' 'src/com' 'src/org'
    

Step 3:

  • for all directories (excluding .git) in src/org/foo rename to src/org/bar

    $ renamedir 'src/org/foo' 'org/foo' 'org/bar'
    

4

Step 4:

  • for all file and directory names (excluding .git) with "com.foo" in the name, change that portion to "org.bar".

    $ find . -name '.git' -prune -o -name "*com.foo*" -type f -exec bash -c 'mv "$1" "${1//"$2"/$3}"' sh '{}' "com.foo" "org.bar" \;
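
If directories have to be renamed as well, one possible variant (a sketch, not the command above) is to walk the tree with -depth and rewrite only the last path component. It assumes GNU find and bash; the names in the scratch tree are hypothetical. Since -prune has no effect together with -depth, .git contents are excluded with ! -path instead:

```shell
# Scratch tree with com.foo in a directory name, a file name,
# and a file inside .git that must stay as it is.
tmp=$(mktemp -d)
mkdir -p "$tmp/com.foo.core/.git" "$tmp/com.foo.core/src"
: > "$tmp/com.foo.core/src/com.foo.props"
: > "$tmp/com.foo.core/.git/com.foo.keep"

# -depth visits children before their parent directory, so each path
# is still valid when mv runs; only the basename is rewritten.
find "$tmp" -depth -name '*com.foo*' ! -path '*/.git/*' -exec bash -c '
    dir=${1%/*}; base=${1##*/}
    mv "$1" "$dir/${base//"$2"/$3}"
' sh '{}' 'com.foo' 'org.bar' \;

result=$(cd "$tmp" && find . | LC_ALL=C sort)
printf '%s\n' "$result"
rm -rf "$tmp"
```

Put echo in front of mv for a dry run before letting it rename anything.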
    

All inside a (better formatted) script:

#!/bin/bash

# NOTE: Three echo commands keep the real commands from being executed.
#       If, after testing, you decide that it is ok to execute them,
#       remove the echo words.

# Step 1:
### for all files (excluding .git) in . change
###     file contents from com.foo to org.bar
find . -name '.git' -prune -o -type f -execdir echo sed -i 's/com\.foo/org\.bar/g' '{}' \+

renamedir (){ find "$1" -name '.git' -prune -o -type d -exec bash -c '
                  echo \
                  mv "$1" "${1//"$2"/$3}"
              ' sh '{}' "$2" "$3" \;
        }

# Step 2:
### for all directories (excluding .git) in src/com rename to src/org
renamedir 'd/src/com' 'src/com' 'src/org'

# Step 3:
### for all directories (excluding .git) in src/org/foo rename to src/org/bar
renamedir 'src/org/foo' 'org/foo' 'org/bar'

# Step 4:
### for all file and directory names (excluding .git)
### with "com.foo" in the name, change that portion to "org.bar".
find . -name '.git' -prune -o -name "*com.foo*" -type f -exec bash -c '
    echo \
    mv "$1" "${1//"$2"/$3}"
' sh '{}' "com.foo" "org.bar" \;

To test the different versions of matching, create a whole tree of files like this (remove it later with rm -rf ./d/):

$ mkdir -p d/src/{{,.git}{,one},com/{{,.git}{,two},\ .git,.git/six},org/{{,.git}{,three},bar/{,.git}{,four},foo/{,.git}{,five}}}
$ touch d/src/{111,com/{222,.git/666},org/{333,bar/444,foo/555}}  
$ touch d/src/{.git/{aaa,one/aaaaa},one/aaa111,com/.git{/bbb,two/bbb222},org/{.git{/ccc,three/ccc333},bar/.git{/ddd,four/ddd444},foo/.git{/eee,five/eee555}}}               

Your original command (using GNU -z option to match find -print0) will match (in this directory):

$ find d/ -type f -print0 | grep -zZv \.git | xargs -0 printf '<%s>\n'
<d/src/org/333>
<d/src/org/foo/555>
<d/src/org/bar/444>
<d/src/com/222>
<d/src/111>

Only five files, those which have no relation to anything .git.


In general:

There are three main ways to match .git in find: -name, -regex and -path. We will leave -regex aside: besides being odd (it uses Emacs regex syntax by default), it is only needed for complex matches, which are not needed here.

A find . -path "./.git" will match exactly that: only a .git directory immediately below the starting point.
Or (for the directory created) find . -path './d/src/.git' will match only ./d/src/.git

A find d/ -name ".git" will match exactly .git at the last level (basename) of a tree.

Both -name "*.git" and -path "*.git" will match exactly the same:
any string that ends in .git.

What starts to get complex is when we add two asterisks: *.git*
In that case, -name still matches only the last level of the path (the basename of the file), but because of the two asterisks it also matches names with a prefix (note the space in " .git") and names with a suffix (note the many like .gitone):

$ find d -name '*.git*'
d/src/org/.gitthree
d/src/org/foo/.git
d/src/org/foo/.gitfive
d/src/org/bar/.gitfour
d/src/org/bar/.git
d/src/org/.git
d/src/com/.gittwo
d/src/com/.git
d/src/com/ .git
d/src/.git
d/src/.gitone

With -path, the same pattern will match .git at any level of the path, much as grep does. It is not exactly the same as grep, since grep accepts full regular expressions, while here the match is against a simple glob pattern. Note the d/src/com/.git/six below:

$ find d -path '*.git*'
d/src/org/.gitthree
d/src/org/foo/.git
d/src/org/foo/.gitfive
d/src/org/bar/.gitfour
d/src/org/bar/.git
d/src/org/.git
d/src/com/.gittwo
d/src/com/.git
d/src/com/.git/six
d/src/com/ .git
d/src/.git
d/src/.gitone

Then we can modify such matches in several ways, for example with -not (or !), which rejects whatever is matched by the test that follows:

Not (!)

If find d -name '.git' will match:

d/src/org/foo/.git
d/src/org/bar/.git
d/src/org/.git
d/src/com/.git
d/src/.git

Then find d -not -name '.git' (or, equivalently, find d ! -name '.git') will match all the others. They are not printed here, as the list is quite long.

But selecting only files is short (note the file 666):

$ find d -not -name '.git' -type f
d/src/org/333
d/src/org/foo/555
d/src/org/bar/444
d/src/com/222
d/src/com/.git/666
d/src/111

And -path in find d -not -path '*.git*' -type f will match only five files:

$ find d -not -path '*.git*' -type f
d/src/org/333
d/src/org/foo/555
d/src/org/bar/444
d/src/com/222
d/src/111

Prune

Or we can use -prune to "cut" or "remove" all directories that previous matches accepted:

$ find d \( -name '.git' -prune \) -o \( -print \)
d
d/src
d/src/org
d/src/org/three
d/src/org/333
d/src/org/.gitthree
d/src/org/foo
d/src/org/foo/five
d/src/org/foo/555
d/src/org/foo/.gitfive
d/src/org/bar
d/src/org/bar/.gitfour
d/src/org/bar/four
d/src/org/bar/444
d/src/one
d/src/com
d/src/com/.gittwo
d/src/com/222
d/src/com/two
d/src/com/ .git
d/src/111
d/src/.gitone

There must be something after -prune: usually an -o, and here also a -print to output everything that -prune did not cut. As you can see, this matches much more than the five files of your command. Changing '.git' to '*.git' will additionally remove only d/src/com/ .git. But using '*.git*' will work closer to grep (still not exactly the same):

$ find d \( -name '*.git*' -prune \) -o \( -print \)
d
d/src
d/src/org
d/src/org/three
d/src/org/333
d/src/org/foo
d/src/org/foo/five
d/src/org/foo/555
d/src/org/bar
d/src/org/bar/four
d/src/org/bar/444
d/src/one
d/src/com
d/src/com/222
d/src/com/two
d/src/111

This will match only the five files (not directories) of your command:

$ find d \( -name '*.git*' -prune \) -o \( -type f -print \)

This will also match only the initial five files; adding -type f hides a lot of detail:

$  find d \( -name '.git' -prune \) -o \( -type f -print \)

Usually it is written without parentheses (less clear):

$ find d -name '.git' -prune -o -type f -print

Warning: removing -print may have unintended side effects.
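
A small sketch of that side effect, on a throwaway tree with made-up names. Without an explicit action, find applies an implicit -print to the whole expression, so the pruned .git directories themselves get printed too:

```shell
# Scratch tree: one regular file and one .git directory with a file in it.
tmp=$(mktemp -d)
mkdir -p "$tmp/src/.git"
: > "$tmp/src/111"
: > "$tmp/src/.git/aaa"

with_print=$(cd "$tmp" && find . -name '.git' -prune -o -type f -print | LC_ALL=C sort)
without_print=$(cd "$tmp" && find . -name '.git' -prune -o -type f | LC_ALL=C sort)

printf 'with -print:\n%s\n\nwithout -print:\n%s\n' "$with_print" "$without_print"
rm -rf "$tmp"
```

With -print only ./src/111 is listed; without it, ./src/.git shows up as well, which matters if the output is fed to sed or mv.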

  • I don't understand the "! -name" syntax. I just tested 'find . ! -name ".git" -type f' in a random repo, and the first lines it printed out were all of the files in ".git". Also, one of your samples here has 'find . ! -name ".git"' and the other has 'find . ! -name "*.git"'. The latter seems wrong. – David M. Karr May 06 '16 at 18:17
  • @DavidM.Karr Maybe the added part will help you understand. In fact, the analysis made me decide that -prune is the correct solution to avoid directories named just .git (and everything inside them). Please also read the section about -prune. –  May 07 '16 at 05:11

Replacing in files

Your command isn't reliable for several reasons: it excludes all paths containing .git as a substring, it doesn't work with paths containing whitespace or \'", and it replaces any text that has an arbitrary character between com and foo. The last problem is easily solved by adding a backslash before the dot. The problems with file names may or may not be an issue in your setup: source trees do tend to be tame. Also, -name "*" is redundant (it always matches). Here's a safe and simpler alternative that properly omits the .git subtree(s).

find . -name .git -prune -o -type f -exec sed -i 's/com\.foo/org.bar/g' {} +

This still replaces e.g. com.fooalso by org.baralso, which is probably not desirable if it happens. Here's a safer construction:

find . -name .git -prune -o -type f -exec sed -i -e 's/com\.foo$/org.bar/' -e 's/com\.foo\([^-_0-9A-Za-z]\)/org.bar\1/' {} +
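
To see what the two expressions do, here is a small sketch that feeds three made-up lines through the same sed script, without touching any files:

```shell
# Three sample lines (hypothetical): a qualified name, a name at the
# end of the line, and a longer name that must not be touched.
safer=$(printf '%s\n' \
        'import com.foo.Widget;' \
        'package com.foo' \
        'import com.fooalso.Thing;' |
    sed -e 's/com\.foo$/org.bar/' -e 's/com\.foo\([^-_0-9A-Za-z]\)/org.bar\1/')
printf '%s\n' "$safer"
# import org.bar.Widget;
# package org.bar
# import com.fooalso.Thing;
```

Neither expression carries the g flag, so each one changes at most one occurrence per line.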

Renaming files with zsh

Load the zmv function with autoload -U zmv, then you can use it to rename files based on wildcard patterns. I think what you're looking for is actually

zmv -w '**/com/foo' '$1/org/bar'
zmv -w '**/*com.foo' '$1/${2}org.bar'
zmv '(**/)(*)com.foo([^-_0-9A-Za-z]*)' '$1${2}org.bar$3'