66

On Ubuntu, I want to copy a big file from my hard drive to a removable drive with rsync. For unrelated reasons, the operation cannot complete in a single run, so I am trying to figure out how to make rsync resume copying the file from where it left off last time.

I have tried the options --partial and --inplace, but together with --progress I found that rsync with --partial or --inplace actually starts from the beginning instead of from where it left off. Manually stopping rsync early and checking the size of the received file confirmed this.

With --append, however, rsync resumes from where it left off.

I am confused, because according to the man page --partial, --inplace, and --append all seem to relate to resuming a copy from where it left off. Can someone explain the difference? Why don't --partial or --inplace work for resuming a copy? Is it true that rsync can only resume a copy with the --append option?

Also, if a partial file was left by mv or cp, not by rsync, will rsync --append correctly resume copying the file?

tshepang
  • 65,642
Tim
  • 101,790

6 Answers

50

To resume an interrupted copy, you should use rsync --append. From the man page's explanation of --append:

This causes rsync to update a file by appending data onto the end of the file, which presumes that the data that already exists on the receiving side is identical with the start of the file on the sending side. [...] Implies --inplace, [...]

Option --inplace makes rsync (over)write the destination file contents directly; without --inplace, rsync would:

  1. create a new file with a temporary name,
  2. copy updated content into it,
  3. swap it with the destination file, and finally
  4. delete the old copy of the destination file.

The normal mode of operation mainly prevents conflicts with applications that might have the destination file open, and a few other mishaps which are duly listed in the rsync manpage.
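
The temporary-file steps listed above can be imitated with plain shell tools. This is only a rough sketch of the general write-then-rename pattern, not rsync's actual implementation; the file names (source.bin, dest.bin) are made up for illustration, and the final rename stands in for both the swap and the removal of the old copy.

```shell
#!/bin/sh
# Sketch of the temp-file-then-rename pattern (illustrative, not rsync's code).
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

printf 'new contents\n' > "$workdir/source.bin"   # hypothetical source file
printf 'old contents\n' > "$workdir/dest.bin"     # hypothetical existing destination

# 1. Create a new file with a temporary name next to the destination.
tmpfile=$(mktemp "$workdir/.dest.bin.XXXXXX")
# 2. Copy the updated content into it.
cp "$workdir/source.bin" "$tmpfile"
# 3./4. Rename it over the destination; on the same filesystem the rename
#       replaces the old copy in a single step.
mv -f "$tmpfile" "$workdir/dest.bin"

cat "$workdir/dest.bin"
```

Until the rename happens, a program that has dest.bin open keeps seeing the complete old contents, which is the kind of conflict the normal mode is meant to avoid.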

Note that if a copy or update operation fails during steps 1 to 3 above, rsync deletes the temporary destination file. The --partial option disables this behavior, so rsync leaves partially transferred files on the destination filesystem. Thus, resuming a single-file copy will not gain much unless the first rsync run used --partial or --partial-dir (which has the same effect as --partial and additionally tells rsync to keep partial files in a specific directory).

tshepang
  • 65,642
  • Thanks! If a partial file was left by mv or cp not by rsync, will rsync --append correctly resume the file copying? – Tim Sep 26 '10 at 19:05
  • 2
    @Tim In short, --append makes rsync believe that, if two corresponding files have different length, then the shorter one is identical to the initial part of the longer one. So, yes, if you start copying a large file with cp and interrupt the copy process, then rsync --append will copy only the remaining part of the file. (Note: if cp is interrupted by a system crash, there is a small chance that the file contents and metadata are not in sync, i.e., the file is corrupted. In this case, running rsync once more without --append should fix the problem.) – Riccardo Murri Sep 26 '10 at 20:05
  • 3
    So If I understand this correctly, there is no way to tell rsync to verify a partial file and resume transfer to that partially transferred file? – Winny Jul 20 '14 at 17:29
  • @Winny See TomG's answer below. – Riccardo Murri Jul 25 '14 at 13:14
  • 1
    @Winny, very belatedly: for a local copy there is no sensible way to do this. For a network copy this is the default mode when you specify --partial without --append. – Chris Davies Apr 17 '16 at 07:55
  • 1
    @Winny --append and --append-verify have a dangerous failure case: when the receiver's file is the same size or larger but has different data. I suggest a solution based around --no-whole-file instead. – Tom Hale Oct 06 '19 at 10:27
30

Be aware that --append implies --inplace, which itself implies --partial.

  • By just using --partial you should cause rsync to leave partial transfers and resume them in subsequent attempts.

  • By using --append you should cause rsync to both leave partial files and resume them next time. After the transfer, rsync should verify the checksum of the transmitted data only.

  • --append-verify includes the whole file in the checksum verification, including any portion transferred in a previous transfer.

  • With either --append or --append-verify, a failed checksum verification should cause the file to be re-transmitted completely (using --inplace).

You should be able to resume a mv or cp operation with rsync but you may want to use the --append-verify option for peace of mind.
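
A small experiment along these lines might look as follows. It is a sketch under assumptions: the file names are invented, the "interrupted cp" is faked by truncating a copy, and it assumes an rsync new enough to know --append-verify.

```shell
#!/bin/sh
# Demo only: all file names below are made up for illustration.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
src="$workdir/big.bin"
dst="$workdir/copy.bin"

# Build a 1 MiB source file, then fake an interrupted cp by keeping
# only its first 300 KiB at the destination.
head -c 1048576 /dev/urandom > "$src"
head -c 307200 "$src" > "$dst"

if command -v rsync >/dev/null 2>&1; then
    # --append-verify sends the missing tail, then checksums the whole file.
    rsync --append-verify "$src" "$dst"
    if cmp -s "$src" "$dst"; then
        echo "resumed copy matches source"
    fi
else
    echo "rsync not installed; skipping demo"
fi
```

Comparing the two files with cmp afterwards is an independent check that the resumed copy is byte-for-byte identical to the source.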

Note that with --append, rsync copies only files that are shorter on the receiver than on the sender (regardless of timestamps) or that are absent on the receiver. From the documentation for this option:

If a file needs to be transferred and its size on the receiver is the same or longer than the size on the sender, the file is skipped.
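
The size rule quoted above can be written down as a tiny decision function. The helper name append_action is hypothetical; it only mirrors the documented rule for files that rsync has already decided need transferring, and is not taken from rsync's source.

```shell
#!/bin/sh
# Hypothetical helper mirroring the documented --append size rule.
# Usage: append_action SENDER_SIZE [RECEIVER_SIZE]
#        (omit RECEIVER_SIZE when the file is absent on the receiver)
append_action() {
    sender_size=$1
    receiver_size=${2:-}
    if [ -z "$receiver_size" ]; then
        echo "copy"      # nothing on the receiver yet: transfer the whole file
    elif [ "$receiver_size" -lt "$sender_size" ]; then
        echo "append"    # receiver is shorter: send only the missing tail
    else
        echo "skip"      # same size or longer: --append leaves the file alone
    fi
}

append_action 1000 400     # prints "append"
append_action 1000 1000    # prints "skip"
append_action 1000 1500    # prints "skip"
append_action 1000         # prints "copy"
```

The two "skip" cases are the ones to keep in mind: a destination file of the same or greater length is left untouched even if its contents differ.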

More info in the man page

ruvim
  • 105
TomG
  • 401
  • 4
    --append and --append-verify have a dangerous failure case: when the receiver's file is the same size or larger but has different data. I suggest a solution based around --no-whole-file instead. – Tom Hale Oct 06 '19 at 10:29
  • 1
    @TomHale the documentation suggests that in order for a file to be skipped it would need to have exactly the same size and modification time at both ends. If this is a plausible concern then --checksum should be used. I can't find it specified explicitly, but logically any of the resume-able options should imply --no-whole-file because --whole-file should be incompatible. – TomG Oct 07 '19 at 13:26
  • --append-verify will skip same or larger sized files with different dates, which may be "unexpected". There's no need to --checksum all files, as rsync will do a whole file checksum anyway, but only on what it transfers. – Tom Hale Oct 08 '19 at 05:13
  • 1
    --checksum tells rsync to checksum the files before sending which ensures that all changed files are transferred, regardless of size/time. Have you got a source for the unexpected --append-verify behaviour as what you describe doesn't match with the documentation or my (limited) experience? – TomG Oct 09 '19 at 09:07
  • 1
    --append-verify refers to --append which says: If a file needs to be transferred and its size on the receiver is the same or longer than the size on the sender, the file is skipped. Even if a file needs to be transferred because of --checksum, it may still be skipped. – Tom Hale Oct 09 '19 at 15:40
7

David Schwartz is correct: --partial (or better, -P) does do what you want. I verified this over a network on a 37G file that was stopped ~8G into the transfer. rsync quickly scanned the part that was already there (showing progress as it went, thanks to -P), and then resumed the transfer from the end of the partial file.

  • A network copy is treated differently to a local copy, which is the issue here. – Chris Davies Apr 17 '16 at 07:50
  • @roaima Do you have a source for that, or a document which explains in more detail what the differences are? I fail to find it in the (huge) manpage. – Jonas Schäfer Feb 04 '18 at 14:39
  • @JonasWielicki the man page alludes to it under the --whole-file option description. – Chris Davies Feb 04 '18 at 19:11
  • @roaima Thank you very much! This also means that the proper workaround is --no-W (which actually works!) – Jonas Schäfer Feb 04 '18 at 20:11
  • @JonasWielicki it's exceedingly inefficient, which is why it's disabled by default. You really do not want to use --no-W unless you understand exactly what setting it means for local files. See https://unix.stackexchange.com/a/181018/100397 – Chris Davies Feb 04 '18 at 20:32
  • @roaima Thanks. I think I am aware of the implications; if a destination is write-limited (and/or the partial destination file’s bytes are still in the cache), using --no-W makes sense, doesn’t it? – Jonas Schäfer Feb 05 '18 at 07:53
  • @JonasWielicki yes – Chris Davies Feb 05 '18 at 08:28
6

By default, rsync will enable --whole-file if transferring from local disk to local disk. This will restart an interrupted transfer from the beginning, rather than checking the parts that are already there.

To disable this, use:

--no-whole-file

Combining this with either --inplace or --partial will allow resuming the transfer later.

My alias for using rsync to copy is:

rscp='rsync -ax --inplace --sparse --no-whole-file --protect-args'
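
To see the effect on a local copy, one could try something like the sketch below. The file names are invented and the interrupted transfer is simulated by truncating a copy; the --stats output should show how much data was matched at the destination versus sent as literal data.

```shell
#!/bin/sh
# Demo only: file names are hypothetical.
set -eu
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
src="$workdir/big.bin"
dst="$workdir/copy.bin"

head -c 2097152 /dev/urandom > "$src"   # 2 MiB source file
head -c 1048576 "$src" > "$dst"         # simulate a copy interrupted halfway

if command -v rsync >/dev/null 2>&1; then
    # --no-whole-file turns the delta-transfer algorithm back on for local copies;
    # --inplace updates copy.bin directly instead of writing a temporary file.
    rsync --inplace --no-whole-file --stats "$src" "$dst"
    if cmp -s "$src" "$dst"; then
        echo "destination matches source"
    fi
else
    echo "rsync not installed; skipping demo"
fi
```

Note that rsync still has to read the existing part of the destination to match it against the source, so this saves writes rather than reads.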

Warning: be careful of using --append-verify as it will skip any destination files which are the same size or larger.

Tom Hale
  • 30,455
  • For network transfers rsync compares source and corresponding destination files prior to transfer in order to send only those parts which have changed (delta-transfer). --no-whole-file tells rsync to do the same thing for local-to-local copying. The documentation doesn't suggest it would have any effect on resuming partial transfers of single files.

    rsync will skip files with exactly the same size and timestamps by design. Neither --append-verify nor --no-whole-file should change that behaviour, but --checksum should work with either for peace of mind, at the cost of disk I/O.

    – TomG Oct 07 '19 at 14:34
  • --append-verify will skip same or larger sized files with different dates, which may be "unexpected". There's no need to --checksum all files, as rsync will do a whole file checksum anyway, but only on what it transfers. – Tom Hale Oct 08 '19 at 05:11
3

You were doing it right: --partial does what you want. It appears to be starting from the beginning because it always starts at the beginning of the list of file data chunks it needs to copy. The --append option is dangerous and will result in a corrupt file if the data does not match for some reason.

0

--partial leaves partial files when you interrupt rsync (otherwise it deletes the partial file).

--inplace makes it update a file in place instead of going through a temporary file (normally, for a file like foo.xyz, rsync would write to a file with a name like .foo.xyz.xxxxx, then rename it to foo.xyz when it's done transferring).

--append just adds the "rest of" the file to the end of the existing one.

What's the difference between append and inplace? inplace checks the existing portion of the file, and re-sends any parts of the existing file that have changed (and without --inplace, it checks the existing file, and copies the matching data into the temp file as it goes so it doesn't have to resend that data.) --append would work fine in the case where you're interrupting and resuming transfer of some large file; but rsync cannot assume you're doing that so by default it checks the existing part of the file first.

hwertz
  • 155