I have a directory ~/music that contains all my music files and a directory /media/backup/music that I wish to keep synchronized with the first one using rsync. Initially, I ran rsync -a ~/music /media/backup, which created the music directory inside /media/backup with all my music files, as expected.

Since then I have renamed a lot of files inside ~/music and now I want to sync these changes. Doing a rsync -ain seems to register the renamed files as new files, so it would create a set of new files in the backup directory instead of just updating the existing target filenames. Most of my files are big and I don't want to re-copy them each time their names change.

Is there a way to tell rsync to synchronize identical files with different names by only updating the filenames from the source and not creating new ones? I might use the --delete option to delete extraneous files, but if there's a better way to do this I'd like to know.


Example:
$ cd example
$ rsync -a var backup/                 # rsync var under backup/var
$ tree .
.
├── backup
│   └── var
│       ├── JSON.gif
│       └── logs
│           ├── xinit.log
│           └── x.log
└── var
    ├── JSON.gif
    └── logs
        ├── xinit.log
        └── x.log


$ mv var/JSON.gif var/JSON-LOGO.gif    # rename some file
$ mv var/logs var/log                  # rename some directory
$ rsync -a var backup/                 # sync the changes
$ tree .
.
├── backup
│   └── var
│       ├── JSON.gif                   # don't want this one
│       ├── JSON-LOGO.gif              # want -only- this one
│       ├── log                        # same here
│       │   ├── xinit.log
│       │   └── x.log
│       └── logs                       # don't want this one either
│           ├── xinit.log
│           └── x.log
└── var
    ├── JSON-LOGO.gif
    └── log
        ├── xinit.log
        └── x.log

2 Answers


You might have a look at the --fuzzy option.

Quoting from the manpage:

This option tells rsync that it should look for a basis file for any destination file that is missing. The current algorithm looks in the same directory as the destination file for either a file that has an identical size and modified-time, or a similarly-named file. If found, rsync uses the fuzzy basis file to try to speed up the transfer.

Note that the use of the --delete option might get rid of any potential fuzzy-match files, so either use --delete-after or specify some filename exclusions if you need to prevent this.

According to this description the algorithm isn't very intelligent and won't work for renamed directories or identical files in different directories.
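As a rough sketch of how the option quoted above could be combined with --delete-after, the two invocations can be wrapped in small shell functions. The names preview_fuzzy and sync_fuzzy are made up for illustration; the flags themselves (-a, -n, -i, --fuzzy, --delete-after, --no-whole-file) are regular rsync options. One assumption worth checking against your rsync version: between two local paths rsync defaults to --whole-file, so the fuzzy basis file is probably only used if --no-whole-file is given or the transfer goes over a network.

```shell
# Hypothetical helpers; the names are illustrative and not part of rsync.

# Dry run: -n makes no changes, -i itemizes what would be transferred or deleted.
preview_fuzzy() {
    rsync -a -n -i --fuzzy --delete-after "$1" "$2"
}

# Real run. --fuzzy looks in the destination directory for a basis file with
# identical size and mtime (or a similar name); --delete-after postpones
# deletions so the old copies are still there to serve as basis files.
# --no-whole-file enables the delta algorithm between local paths
# (assumption: without it a local copy ignores the basis file).
sync_fuzzy() {
    rsync -a --fuzzy --delete-after --no-whole-file "$1" "$2"
}

# Usage with the paths from the question:
#   preview_fuzzy ~/music /media/backup
#   sync_fuzzy ~/music /media/backup
```

Even when a match is found, rsync still writes a new file under the new name and removes the old one afterwards; the option only reduces the amount of data sent, and it does not help with renamed directories.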

scai
  • The problem with --fuzzy is that if the file is modified, its size may change, so --fuzzy won't match it and he will again end up with two files, the new one and the old one, in the backup directory. That's why I said the best way to avoid this kind of problem is to use --delete. Re-copying a music file (say 20MB, since you say your music is over 10MB) takes 1-2 seconds. –  Feb 25 '13 at 12:49
  • There is --delete-after which solves this problem. --delete won't work together with --fuzzy as already explained in the quoted paragraph. – scai Feb 25 '13 at 13:03
  • You didn't understand what I said. I'm not talking about using --delete together with --fuzzy. You may use --fuzzy --delete-after, but is it really worth checking each file's metadata to find potential matches and, if there are no matches, copying and deleting anyway? Isn't it faster to just copy and delete? And regarding --fuzzy, what if two files have the same size and last modification time? --fuzzy is a little risky to use; there are too many variables. –  Feb 25 '13 at 13:19
  • Whether deleting/copying is faster than fuzzy matching depends heavily on the use case and has to be decided by the user. I assume this decision was already made by the author. And yes, the fuzzy option can lead to incorrect synchronization if checksums are not taken into account. Unfortunately, this option isn't well documented. – scai Feb 25 '13 at 13:26

You need to use --delete.

I don't know if there is another way, but music files are 10MB at most, so copying them again is not a problem.

edit: as scai said, you may use --fuzzy, but I really wouldn't. It doesn't guarantee the result and depends on many variables, which may lead to false positives.
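A minimal sketch of the --delete approach, assuming the same source and target layout as in the question. The function names preview_delete and sync_delete are hypothetical; -n (dry run) and -i (itemize changes) let you see which files would be copied and which would be removed before anything is touched.

```shell
# Hypothetical helpers; the names are illustrative and not part of rsync.

# Dry run: list what would be copied and what would be deleted,
# without changing anything in the destination.
preview_delete() {
    rsync -a -n -i --delete "$1" "$2"
}

# Real run: copy files that are new or renamed in the source and delete
# destination files that no longer exist in the source.
sync_delete() {
    rsync -a --delete "$1" "$2"
}

# Usage with the paths from the question:
#   preview_delete ~/music /media/backup
#   sync_delete ~/music /media/backup
```

Renamed files are copied again in full under their new names, so this trades extra copying for a predictable result.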

  • Most of my music files are over 10M in size, I'd prefer not having to re-copy them. –  Feb 25 '13 at 12:30