I used the terminal to move files from one drive to another.

sudo mv -vi /location/to/drive1/ /location/to/drive2/

However, it suddenly stopped some hours in, without an error, right after creating a directory.

My usual fix is a mix of hashing and comparing, which is mostly a time-consuming mess, as I now have to recover from a partial copy without really knowing which files are missing. It is written as one very long zsh one-liner; note that the script does not work in bash as written:

source_directory="/path/to/source_directory/";
target_directory="/path/to/target_directory/";
while read hash_and_file; do {
    echo "${hash_and_file}" | read hash file;
    echo "${file}" | sed "s/^/${source_directory}/g" | read copy_from;
    echo "${copy_from}" | sed "s/${source_directory}/${target_directory}/g" | read copy_to;
    mv -v "${copy_from}" "${copy_to}" | tee -a log;
    rm -v "${copy_from}" | tee -a log; };
done <<<$(
    comm -23 <( find ${source_directory} -type f -exec sha256sum "{}" \; |
                sed "s: ${source_directory}: :g" | sort;
           ) <( find ${target_directory} -type f -exec sha256sum "{}" \; |
                sed "s: ${target_directory}: :g" | sort; ) )

This is error-prone if the name of the target or source directory also appears elsewhere in a path, and it deletes files that were not moved because they were marked as duplicates. It also does not remove the source directory at the end.

Is there a best practice for recovering from an interrupted mv?

What
  • I wrote a similar script, which uses cmp instead of hashing. It has dependencies, and the same issues with while read that Gilles mentioned. It is also slow and verbose. But it frees up disk-space sooner than the rsync method, because files are (re)moved from the source as it runs. It may serve as inspiration for the brave. – joeytwiddle Dec 01 '18 at 14:46
  • @joeytwiddle rsync offers --delete-during (the receiver deletes during the transfer) and also several other useful alternatives: --delete, --delete-before, --delete-delay, --delete-after, --delete-excluded. So, yes, rsync is the best alternative. –  Dec 01 '18 at 19:31
  • I must be missing something. Why doesn't just repeating the same mv command work? Perhaps with * appended to source path if the original source was a directory. – jpa Dec 02 '18 at 11:04
  • @isaac No, I'm afraid rsync --delete* would be a disaster! It will remove things from dest which are not currently in src, so all files which were successfully moved in the previous attempt will now be deleted! You were probably thinking of rsync --remove-source-files which I agree would be a good alternative. (more1, more2) – joeytwiddle Dec 03 '18 at 03:29
  • @joeytwiddle No, rsync --delete will only remove other files that are not part of the source. From man rsync: "delete extraneous files from dest dirs". Understand what extraneous means: not being synced. And yes, rsync also provides a way to remove source files after they have been correctly transmitted. –  Dec 03 '18 at 04:53
  • @jpa A mv will restart the whole copy from zero. It will re-copy files that were already copied successfully, whereas rsync tests for equality and leaves alone files that do not need to be re-synced. –  Dec 03 '18 at 04:55
  • @Isaac My mistake. My concern was that things would have already been removed from the source during the initial mv that was interrupted. However it appears that is only the case for individual arguments passed to mv. Since the OP only specified one directory as source, then that directory should be fully intact even though mv was interrupted. In which case your --delete* is harmless. Apologies. As others have mentioned, a --dry-run is the safe way to be sure. – joeytwiddle Dec 03 '18 at 05:37
  • @Isaac Ah, indeed; like joeytwiddle, I also mistakenly thought that already moved files would have been removed as soon as they were transferred. But looks like they are only removed after everything has been copied. – jpa Dec 03 '18 at 06:55
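
The rsync --remove-source-files variant raised in the comments above might look roughly like the sketch below. The function name move_freeing_space is made up for illustration, and the find options -empty and -delete assume GNU or BSD find rather than strict POSIX.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): move SRC's contents into DST,
# letting rsync delete each source file after it has been transferred.
move_freeing_space() {
    src=${1%/}/   # trailing slash: copy the contents, not the directory itself
    dst=${2%/}/
    rsync -a --remove-source-files "$src" "$dst" || return 1
    # rsync removes only non-directories from the source, so prune the
    # now-empty directory tree afterwards (-empty/-delete: GNU/BSD find).
    find "$src" -depth -type d -empty -delete
}
```

Usage would be along the lines of move_freeing_space /location/to/drive1 /location/to/drive2, ideally after a --dry-run of the rsync step.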

1 Answer

Forget about trying to reinvent rsync, and use rsync.

sudo rsync -av /location/to/drive1/ /location/to/drive2/

Make sure you use a trailing slash on the source, otherwise it would copy to /location/to/drive2/drive1.

Double-check that the command succeeded, then run rm -rf /location/to/drive1/.
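
As a rough sketch of that sequence, the copy and the removal can be tied together so that the source is deleted only when rsync exits successfully. The helper name move_tree is hypothetical, sudo is left out, and a zero exit status only means rsync reported no error, so a manual check is still worthwhile for important data.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC's contents into DST with rsync and
# remove SRC only if rsync reported success.
move_tree() {
    src=${1%/}/   # force a trailing slash so the contents are copied,
    dst=${2%/}/   # not a nested SRC directory inside DST
    rsync -a -- "$src" "$dst" || {
        echo "rsync failed; leaving $src untouched" >&2
        return 1
    }
    rm -rf -- "$src"
}
```

For the paths in the question this would be called as move_tree /location/to/drive1 /location/to/drive2.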

The command above will overwrite any preexisting file on drive2. If you want to prompt the user to skip files that already existed on drive2, as with mv -i, it's more complicated, because you now need to distinguish files that have already been copied from files that haven't. You can pass the --ignore-existing option to rsync to skip files that already exist on the destination regardless of their content. Note that if the original mv was interrupted in the middle of creating a file, this file will remain in its half-copied state (whereas a bare rsync -a would properly finish copying it).
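
A minimal sketch of the --ignore-existing variant follows. The wrapper name resume_keep_existing is invented for this example; the only rsync options used are -a and --ignore-existing.

```shell
#!/bin/sh
# Hypothetical wrapper: copy only files that are missing on the destination.
# Files that already exist there are skipped regardless of content, so a
# half-written file left by the interrupted mv stays half-written.
resume_keep_existing() {
    src=${1%/}/   # trailing slash: sync the contents of SRC
    dst=${2%/}/
    rsync -a --ignore-existing "$src" "$dst"
}
```

Because existing files are never compared, it is worth checking the file that mv was writing when it stopped before deleting anything from the source.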

If you want to reproduce the exact behavior of mv -i, including the prompting, it could be done, but it's a lot more complicated.

Note that your one-giant-liner is very fragile. If there are file names containing backslashes or newlines, they may not be copied properly or they may even trick your script into removing arbitrary files. So do not use the code in the question unless you're sure that you can trust the file names not to contain backslashes or newlines.
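
To illustrate the backslash problem in isolation, here is a small sketch comparing a plain read with read -r. The function names plain_read and raw_read and the sample file name are made up for the demonstration.

```shell
#!/bin/sh
# plain_read: what the question's loop effectively does; without -r,
# read treats a backslash as an escape character and drops it.
plain_read() { printf '%s\n' "$1" | { read file; printf '%s\n' "$file"; }; }

# raw_read: IFS= keeps leading/trailing blanks, -r keeps backslashes.
raw_read() { printf '%s\n' "$1" | { IFS= read -r file; printf '%s\n' "$file"; }; }

plain_read 'dir\name.txt'   # prints dirname.txt
raw_read 'dir\name.txt'     # prints dir\name.txt
```

Even read -r still splits on newlines, so names containing newlines need NUL-delimited handling, which is one more reason to hand the job to rsync.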

For future reference, I recommend never using mv for large cross-drive moves, precisely because it's hard to control what happens if it gets interrupted. Use rsync to do the copying, and then remove the original.

  • What promises does rsync make that mv does not? – What Dec 01 '18 at 09:48
  • well, for example rsync does what you're trying to do, while mv does not. Also: copying between different machines; compression for transfer; skipping files existing at the destination based on timestamp- or hash-based equality; configurable handling of ownership, permissions, links and special files; etc. https://linux.die.net/man/1/rsync – Silly Freak Dec 01 '18 at 11:00
  • @SillyFreak should I conclude from that, that I should always use rsync instead of mv, not only as Gilles said for cross-drive, but any operation, as the boundary of "too large" is relatively subjective and if it comes to a problem it would have been solved by rsync anyway? – What Dec 01 '18 at 11:07
  • well, when I'm moving files or directories inside one partition, I usually use mv (or the file manager) because it's only moving a reference to the file/directory. If I need to do actual data transfer, then I use rsync if one of the following is true: 1) I'm moving more files than I can check for correct transfer at a glance; 2) I anticipate that I'll need to keep files in sync; 3) I expect the transfer could be interrupted. My point is, for the use case you're presenting in the question, rsync is simply the right tool, and mv or cp are not. – Silly Freak Dec 01 '18 at 11:22
  • @what that’s pretty much what I do: I use rsync all the time when moving large amounts of data between mountpoints. It’s one of my absolute favorite tools. – Josh Dec 01 '18 at 15:57
  • I would advise to always run any rsync command with -v and --dry-run first to confirm exactly what it's going to do. – Darren Dec 01 '18 at 18:49
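
The dry-run advice in the last comment could be sketched as below. The wrapper name preview_sync is hypothetical; -a, -v and --dry-run are standard rsync options.

```shell
#!/bin/sh
# Hypothetical wrapper: list what rsync would transfer from SRC to DST
# without changing anything on either side.
preview_sync() {
    src=${1%/}/   # trailing slash: compare the contents of SRC with DST
    dst=${2%/}/
    rsync -av --dry-run "$src" "$dst"
}
```

Running preview_sync /location/to/drive1 /location/to/drive2 first, and the real rsync -av only once the listing looks right, keeps surprises to a minimum.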