
btrfs send and receive can be used to transfer terabytes of data, but these commands don't produce helpful progress output (even with -v). How can I check if they succeeded?

For example, suppose I create a new subvolume called source, write 1 GB of random data into it, and make it read-only so that it can be sent:

# btrfs subvolume create source
# head -c 1G < /dev/urandom > source/random_data
# btrfs property set source ro true

Then I create a copy of the new subvolume using btrfs send and receive, but interrupt the process before it completes:

# mkdir destination
# btrfs send source | btrfs receive destination
At subvol source
At subvol source
^C

btrfs subvolume list will not indicate that anything has gone wrong:

# btrfs subvolume list .
ID 1216 gen 370739 top level 5 path source
ID 1219 gen 371244 top level 5 path destination/source

The new subvolume can be browsed normally, although its data is clearly incomplete:

# exa -lT
   - ├── destination
   - │  └── source
251M │     └── random_data
   - └── source
1.1G    └── random_data

btrfs subvolume show destination/source does not warn us that the subvolume is incomplete. It does show that destination/source has a different UUID to source, and it looks as though destination/source's Received UUID will be set to source's UUID if and only if btrfs receive ran to completion.

Does the presence of the Received UUID guarantee that a subvolume created by btrfs receive is a complete and unmodified copy of the subvolume with that UUID on another filesystem?

This part of man btrfs-send suggests not, and seems to imply that using destination/source in the above example as the parent of a future snapshot of source would fail to detect and repair the corruption as well. However, I'm still not completely clear on the purpose of send -c and whether this advice also applies to send -p.

In the incremental mode (options -p and -c), previously sent snapshots that are available on both the sending and receiving side can be used to reduce the amount of information that has to be sent to reconstruct the sent snapshot on a different filesystem.

The -p <parent> option can be omitted when -c <clone-src> options are given, in which case btrfs send will determine a suitable parent from among the clone sources.

You must not specify clone sources unless you guarantee that these snapshots are exactly in the same state on both sides—both for the sender and the receiver.

From what I can tell, snap-sync, buttersink and other similar tools deal with this problem by redirecting the output of btrfs send to a series of files, and transferring them using a reliable method like rsync rather than a simple pipe. Is that the right approach to take, if I want to develop my own incremental backup solution without relying on third-party software that isn't packaged by my distro?

sjy
  • If anyone else comes across this question and is looking at snap-sync and buttersink, currently I'm exploring btrbk, which seems like a more promising option with better documentation. – sjy Feb 29 '20 at 13:44

4 Answers


TL;DR: If Received UUID and the readonly flag are both set, then it's quite unlikely that something went wrong, unless carelessness or malice is involved.

As @timakro already said in his answer, Received UUID is not set until the transfer is complete. Neither is the readonly flag. This, combined with the fact that every command in the stream is checksummed (and that, as far as I can understand, sent metadata also includes checksums), makes it quite unlikely that you will end up with a corrupt snapshot on the receiving side with both readonly and Received UUID set. If either of them is unset, btrfs will refuse to use that snapshot as a reference for a future btrfs receive.
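
The two conditions above can be checked from the output of btrfs subvolume show. The following is only a minimal sketch, not part of btrfs-progs: looks_complete is a hypothetical helper name, and it assumes the Received UUID and Flags lines are printed the way I have seen btrfs-progs print them, with a dash standing in for an unset value.

```shell
# Hypothetical helper: reads `btrfs subvolume show` output on stdin and
# succeeds only if a Received UUID is present AND the readonly flag is set.
# Assumption: unset values are printed as "-".
looks_complete() {
    awk '
        $1 == "Received" && $2 == "UUID:" && NF >= 3 && $3 != "-" { uuid = 1 }
        $1 == "Flags:" && index($0, "readonly") > 0               { ro = 1 }
        END { exit !(uuid && ro) }
    '
}
```

It could be used as btrfs subvolume show destination/source | looks_complete && echo "looks complete". A passing check only says that receive finished; it does not rule out the tampering cases described below.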

What could still corrupt the received snapshot is intentional corruption (receiving a specially crafted stream) or some process or user changing the contents of the snapshot while it is being received. From the btrfs-receive manpage:

BUGS

btrfs receive sets the subvolume read-only after it completes successfully. However, while the receive is in progress, users who have write access to files or directories in the receiving path can add, remove, or modify files, in which case the resulting read-only subvolume will not be an exact copy of the sent subvolume.

If the intention is to create an exact copy, the receiving path should be protected from access by users until the receive operation has completed and the subvolume is set to read-only.

Additionally, receive does not currently do a very good job of validating that an incremental send stream actually makes sense, and it is thus possible for a specially crafted send stream to create a subvolume with reflinks to arbitrary files in the same filesystem. Because of this, users are advised to not use btrfs receive on send streams from untrusted sources, and to protect trusted streams when sending them across untrusted networks.

It's also worth noting that it's possible to disable the readonly flag on a subvolume, modify things, and then enable it again. If this has been done on either side, all guarantees go out the window.

Note that piping the output to a file and transferring that file does not provide any protection from the above. Personally I see absolutely no reason why it would be insecure to pipe the output of btrfs send directly to ssh. The benefit of storing the stream in files intermediately is that it makes it possible to resume an interrupted transfer on an unreliable connection, but it does not provide any guarantee in the way of data integrity.

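A good (though not foolproof) way to verify that the received snapshot matches the sent snapshot is to run rsync -avcn --del path/to/sent/snapshot/ user@remote:path/to/received/snapshot/. Here -c compares files by checksum rather than by size and modification time, and -n makes it a dry run, so differences are only listed, not fixed.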

mbloms
  • So, there seems to be no btrfs-native solution, but rsync -avcn --del does the job well enough, as it compares files by content (checksum). Cheers! – Greendrake Aug 26 '21 at 12:14

I have more than ten backup systems based on exactly the last part of what you said. Direct pipes have never been an option for me, since I deal with backups over the network that are > 1 TB. I could not risk losing a single bit and wasting hours of work.

My final setup is as follows.

Bootstrap Phase:

  1. Take first full snapshot
  2. Send snapshot to local file (-f option)
  3. Rsync or physical media transfer of snapshot file to remote site.
  4. Remote receive of first snapshot

Incremental Phase:

  1. New local snapshot

  2. Local generation and send to file of diff between current and last snapshot

  3. Rsync to remote site

  4. Remote import of transferred snapshot file

  5. Cleaning logic (think about retention, remove old snapshots...)
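
The incremental phase above could be sketched roughly like this. It is not the exact script I run: incremental_backup is a hypothetical name, every path, snapshot name and host is a placeholder passed in by the caller, and the cleaning logic of step 5 is left out. Each step is checked on its own, so a failed transfer never reaches the remote import.

```shell
# Hypothetical sketch of the incremental phase.
# Usage: incremental_backup SUBVOL SNAPDIR PREV_NAME NEW_NAME HOST REMOTE_DIR
# Returns the number of the step that failed, or 0 on success.
incremental_backup() {
    subvol=$1 snapdir=$2 prev=$3 new=$4 host=$5 remote_dir=$6
    stream="$snapdir/$new.stream"

    # 1. New local read-only snapshot.
    btrfs subvolume snapshot -r "$subvol" "$snapdir/$new" || return 1

    # 2. Write the diff between the previous and the new snapshot to a file.
    btrfs send -p "$snapdir/$prev" -f "$stream" "$snapdir/$new" || return 2

    # 3. Copy the stream file; --partial keeps partial files so a rerun can resume.
    rsync --partial "$stream" "$host:$remote_dir/" || return 3

    # 4. Import the transferred stream on the remote side.
    ssh "$host" "btrfs receive -f '$remote_dir/$new.stream' '$remote_dir'" || return 4

    # The local stream file is no longer needed once the import succeeded.
    rm -f "$stream"
}
```

For example, incremental_backup /data /snapshots daily-1 daily-2 backuphost /backups, where all six arguments are made up for illustration. The remote copy of the stream file would still have to be removed by the cleaning step.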

This has been up and running for 3 years. In the worst case, when snapshots don't match, it's enough to delete the last two (1 local, 1 remote) to get it working again with the next send.

Good luck

  • Thanks very much for your answer. It seems that it is definitely necessary to use another tool (eg. rsync) to run backups over the network, and I will have to look closer at how tools like buttersink deal with the bootstrapping problem without requiring eg. 2 TB of free space to transfer a 1 TB subvolume. – sjy Feb 01 '20 at 07:31
  • I would say that native send/receive pipes via ssh are only feasible in a quiet LAN segment. IMHO, as soon as you encounter a gateway along your path, that's already reason enough to switch to file-based transfer ;) – realpclaudio Feb 03 '20 at 14:24

Does the presence of the Received UUID guarantee that a subvolume created by btrfs receive is a complete and unmodified copy of the subvolume with that UUID on another filesystem?

The Received UUID field is only set after the subvolume is received. From the btrfs-progs source:

@received_uuid: UUID of the subvolume this subvolume was received from, or all zeroes if this subvolume was not received. Note that this field, @stransid, @rtransid, @stime, and @rtime are set manually by userspace after a subvolume is received.

You can also observe this in verbose mode:

$ btrfs send -v 2020-12-28/ | ssh root@link "btrfs receive -v /mnt/test"
At subvol 2020-12-28/
BTRFS_IOC_SEND returned 0
joining genl thread
At subvol 2020-12-28
receiving subvol 2020-12-28 uuid=778ec7aa-6709-d240-b41d-58d99a6fb9a0, stransid=9
BTRFS_IOC_SET_RECEIVED_SUBVOL uuid=778ec7aa-6709-d240-b41d-58d99a6fb9a0, stransid=9
timakro

I tested @timakro's case, but it didn't behave as described: an incompletely transferred snapshot still had a UUID.

Firstly, I had two separately mounted btrfs volumes:

➜  lsblk|grep loop99
loop99       7:99   0     1G  0 loop 
|-loop99p1 259:0    0   512M  0 part /mnt/tmp1
`-loop99p2 259:1    0   486M  0 part /mnt/tmp2

Then, I created a new subvolume at the first btrfs volume and made a random file on the subvolume:

➜  sudo btrfs subvolume create tmp1/ori
Create subvolume 'tmp1/ori'
➜  sudo dd if=/dev/urandom of=tmp1/ori/test.iso bs=1M count=256
256+0 records in
256+0 records out
268435456 bytes (268 MB, 256 MiB) copied, 1.58701 s, 161.3 MB/s

After creating a read-only snapshot, I transferred it to the second btrfs volume and interrupted the transfer:

➜  sudo btrfs subvolume snapshot -r tmp1/ori tmp1/snap         
Create a readonly snapshot of 'tmp1/ori' in 'tmp1/snap'
➜  sudo btrfs send tmp1/snap | sudo btrfs receive tmp2 
At subvol tmp1/snap
At subvol snap
^C

Hashing shows that the received snapshot differs from the original one, which means it is corrupt.

➜  sudo sha256sum tmp1/snap/test.iso                
49455edf92d582346215679a52eb6d72f0afa10748ef62f1ce3fc5e417e70f6a  tmp1/snap/test.iso
➜  sudo sha256sum tmp2/snap/test.iso
bb30f487a39579dabc768129d59d4abeca25777e4eeb4f41b9beca366d379bab  tmp2/snap/test.iso
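
The same hash comparison can be extended from one file to a whole tree. This is a rough sketch under some assumptions: tree_digest and same_tree are names made up here, sha256sum from coreutils is available, both snapshots are mounted locally, and only the contents of regular files are compared (not ownership, timestamps, symlinks or empty directories).

```shell
# Hypothetical helper: print "hash  ./relative/path" for every regular file
# under a directory, sorted by path so that two trees can be compared.
tree_digest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2 )
}

# Succeeds only if both directories hold the same files with the same contents.
# Returns 2 if either argument is not a directory.
same_tree() {
    [ -d "$1" ] && [ -d "$2" ] || return 2
    [ "$(tree_digest "$1")" = "$(tree_digest "$2")" ]
}
```

With the volumes above, same_tree tmp1/snap tmp2/snap || echo "snapshots differ" would report the interrupted copy.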

Now, let's check their UUIDs:

➜  sudo btrfs subvolume list tmp1 -u
ID 257 gen 25 top level 5 uuid b5f8ff5d-fafb-814a-a31b-25bb96c185b5 path ori
ID 258 gen 25 top level 5 uuid d2bfa408-b330-954f-9aa8-138de5030898 path snap
➜ sudo btrfs subvolume list tmp2 -u
ID 256 gen 31 top level 5 uuid 77950b38-812d-3646-88cd-8107def1eced path snap

The incomplete snapshot has its own UUID, and nothing indicates that it is incomplete except the unset read-only flag.
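
Since the unset read-only flag is the only visible sign, a small loop can point out suspicious snapshots in a receive directory. This is a sketch with assumptions: list_writable is a hypothetical name, the directory is assumed to contain only received subvolumes, and btrfs property get is assumed to print ro=true for a read-only subvolume.

```shell
# Hypothetical helper: print every entry of a receive directory whose
# read-only flag is not set, i.e. candidates for an interrupted receive.
# Assumption: every subdirectory of "$1" is a received subvolume.
list_writable() {
    for sub in "$1"/*; do
        [ -d "$sub" ] || continue
        if [ "$(btrfs property get -ts "$sub" ro 2>/dev/null)" != "ro=true" ]; then
            printf '%s\n' "$sub"
        fi
    done
}
```

Running sudo sh -c '. ./helpers.sh; list_writable tmp2' (helpers.sh being wherever the function is saved) would list tmp2/snap in the example above.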