How to find out if a file on btrfs is copy-on-write?

Question

I know that cp has a --reflink option to control full copies vs. copy-on-write "copies".

On btrfs, can I use ls (or some other command) to find out whether a file shares (in a copy-on-write sense) some storage with another file?

EDIT: @StéphaneChazelas points me to filefrag, but that fails for me:

root@void:/tmp/mount# mount | tail -1
/tmp/back on /tmp/mount type btrfs (rw,relatime,space_cache)
root@void:/tmp/mount# df -h | tail -1
/dev/loop0       32M   13M   20M  38% /tmp/mount
root@void:/tmp/mount# ls -lh
total 8.0M
-rw-r--r-- 1 root root 8.0M Jan 19 08:43 one
root@void:/tmp/mount# cp --reflink=always one two
root@void:/tmp/mount# sync
root@void:/tmp/mount# ls -lh
total 16M
-rw-r--r-- 1 root root 8.0M Jan 19 08:43 one
-rw-r--r-- 1 root root 8.0M Jan 19 08:45 two
root@void:/tmp/mount# df -h | tail -1
/dev/loop0       32M   13M   20M  38% /tmp/mount
root@void:/tmp/mount# filefrag -kvx one 
Filesystem type is: 9123683e
File size of one is 8388608 (8192 blocks of 1024 bytes)
FIEMAP failed with unknown flags 2
one: FIBMAP unsupported
root@void:/tmp/mount# uname -a
Linux void 4.1.7+ #817 PREEMPT Sat Sep 19 15:25:36 BST 2015 armv6l GNU/Linux

With filefrag -v, you can check whether two files have data in common. — Stéphane Chazelas, Jan 18 '16 at 15:26

webminal.org · Answer 1 · 2021-01-21T18:02:56.230

Update (Jan-2021): see comment by @bitinerant: "btrfs-debug-tree is now obsolete; use btrfs inspect-internal dump-tree"

I don't know of a way to find this with ls. But if you really need it, you can use btrfs-debug-tree from btrfs-progs.

With --reflink=always, the files share common data blocks. These shared blocks (extents) have a reference count (refs) greater than 1.

First you need to find the objectids of the files one and two:

 #./btrfs-debug-tree  /dev/xvdc
 (Check under FS_TREE)
   <snip>
     item 8 key (256 DIR_INDEX 4) itemoff 15842 itemsize 33
         location key (259 INODE_ITEM 0) type FILE
         namelen 3 datalen 0 name: one
     item 9 key (256 DIR_INDEX 5) itemoff 15809 itemsize 33
         location key (260 INODE_ITEM 0) type FILE
         namelen 3 datalen 0 name: two
   </snip>

From the above we can see that they are 259 (one) and 260 (two).

Now find their refs in the extent tree. The command below finds the data block shared between the two files.

 # ./btrfs-debug-tree  /dev/xvdc | grep -A2 "refs 2"
         extent refs 2 gen 9 flags DATA
         extent data backref root 5 objectid 260 offset 0 count 1
         extent data backref root 5 objectid 259 offset 0 count 1

Bonus: Create another reference:

# cp --reflink=always one three

Verify that the refcount has been incremented by 1:

# ./btrfs-debug-tree   /dev/xvdc | grep -A3 "refs 3"
        extent refs 3 gen 9 flags DATA
        extent data backref root 5 objectid 260 offset 0 count 1
        extent data backref root 5 objectid 261 offset 0 count 1
        extent data backref root 5 objectid 259 offset 0 count 1

Here the data block is shared between three files, which have objectids 259, 260 and 261.
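
The grep above only catches one refcount at a time. As a rough sketch (not part of the original answer), the same dump can be filtered for any data extent with refs greater than 1. The function name shared_objectids is made up for illustration, and the parsing assumes the line layout shown above ("refs N gen G flags DATA" followed by "extent data backref ... objectid N ..." lines), which may differ between btrfs-progs versions.

```shell
# Sketch: list the objectids that reference shared data extents.
# Reads dump-tree text on stdin, so it can also be tried on saved output.
shared_objectids() {
    awk '
        # Extent item line, e.g. "extent refs 3 gen 9 flags DATA".
        /refs [0-9]+ gen [0-9]+ flags/ {
            shared = 0
            for (i = 1; i < NF; i++)
                if ($i == "refs" && $(i + 1) + 0 > 1 && $0 ~ /flags DATA/)
                    shared = 1
            next
        }
        # Backref lines following a shared extent name the referencing inodes.
        shared && /extent data backref/ {
            for (i = 1; i < NF; i++)
                if ($i == "objectid") print $(i + 1)
        }
    ' | sort -n | uniq
}

# Usage (needs root; /dev/xvdc is the device used in this answer):
#   btrfs inspect-internal dump-tree /dev/xvdc | shared_objectids
#   ls -i one two three   # objectids are inode numbers within a subvolume
```

Note that a refcount above 1 only says the extent is referenced more than once. The backref lines, including their root field, are what tell you which files in which subvolume hold those references.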

btrfs-debug-tree is now obsolete; use btrfs inspect-internal dump-tree — bitinerant, Sep 04 '20 at 22:15
@endolith - sorry, but no. That's more than a simple edit, and I'm not the author. — bitinerant, Jan 20 '21 at 05:24

score 3 · Answer 2 · answered Jun 25 '21 at 14:04

Just use:

$ btrfs filesystem du .
       Total   Exclusive  Set shared  Filename
    1.11GiB     1.11GiB           -  ./file1
    1.12GiB     1.12GiB           -  ./file2
    1.31GiB       0.00B           -  ./file3
    3.54GiB     2.23GiB     1.31GiB  .

In this example, 'file3' is a reflink copy as it is not consuming any Exclusive space.
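
For a directory with many files, the same check can be scripted. This is only a sketch that assumes the default human-readable column layout shown above (Total, Exclusive, Set shared, Filename); the function name zero_exclusive_files is invented for illustration.

```shell
# Sketch: print the files whose Exclusive column is zero although Total is not,
# i.e. files whose data is entirely shared with something else.
# Reads `btrfs filesystem du` output on stdin.
zero_exclusive_files() {
    awk '$2 == "0.00B" && $1 != "0.00B" {
        # Drop the three numeric columns; keep the filename (may contain spaces).
        sub(/^[ \t]*[^ \t]+[ \t]+[^ \t]+[ \t]+[^ \t]+[ \t]+/, "")
        print
    }'
}

# Usage:
#   btrfs filesystem du . | zero_exclusive_files
```

Empty files are skipped on purpose, since they report 0.00B in both columns without sharing anything.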

LuckyDams


That doesn't seem to work when other snapshots exist, e.g. for backup purposes. – Frederick Nord Jan 26 '23 at 10:18

jrw32982 · Answer 3 · 2021-12-01T18:57:08.413

@pwaller's answer shows that a listing of the data extents of each file can be compared to see if two files share identical extents. filefrag from the e2fsprogs package can (almost) do this. filefrag -v FILE1 FILE2 will show if FILE1 and FILE2 have the same extents, in which case they are reflinks of each other.

Doing this programmatically in a script is harder because filefrag outputs the filename. To do this, I have a patched copy of filefrag which makes two changes:

Output the device ID
Do not output the filename if only one filename is specified

With these changes, the outputs from two calls to filefrag can be compared. If identical, then the two files are reflinks of each other.

One final caveat: If the output from filefrag matches the regex inline|unknown_loc|delalloc, then the file cannot be reflinked since it has no data block. To handle that case, I wrap my patched filefrag with a check for that pattern and append the filename itself to the output if I find it (to make the output unique per filename, so that it will not match the output for a different filename). See @StéphaneChazelas's comments here for more details.
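
Without patching filefrag, roughly the same comparison can be sketched by stripping the output down to the extent rows. This is not the patched tool described above, just an approximation with made-up function names (extent_map, has_unshareable_extent, same_extents). It assumes the usual filefrag -v row layout (ext: logical: physical: length: ...) and does not check the device ID, so both files must be known to live on the same filesystem.

```shell
# Print "logical physical length" for every extent row of `filefrag -v` output
# read from stdin; header and filename lines are dropped.
extent_map() {
    awk '/^ *[0-9]+:/ {
        gsub(/ /, "")          # remove column padding
        split($0, f, ":")      # f[2]=logical range, f[3]=physical range, f[4]=length
        print f[2], f[3], f[4]
    }'
}

# Succeeds if any extent row carries one of the flags from the caveat above.
has_unshareable_extent() {
    grep -E '^ *[0-9]+:' | grep -Eq 'inline|unknown_loc|delalloc'
}

# Hypothetical wrapper: succeeds if both files report the same non-empty extent map.
same_extents() {
    se_a=$(filefrag -v "$1") || return 2
    se_b=$(filefrag -v "$2") || return 2
    if printf '%s\n%s\n' "$se_a" "$se_b" | has_unshareable_extent; then
        return 1
    fi
    se_ma=$(printf '%s\n' "$se_a" | extent_map)
    se_mb=$(printf '%s\n' "$se_b" | extent_map)
    [ -n "$se_ma" ] && [ "$se_ma" = "$se_mb" ]
}

# Usage:
#   same_extents one two && echo "one and two share all extents"
```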

I submitted a pull request (https://github.com/tytso/e2fsprogs/pull/87) and an issue (https://github.com/tytso/e2fsprogs/issues/88) for this.

score 1 · Answer 4 · edited Apr 13 '17 at 12:36

I have just released a program called fienode which computes a SHA1 hash of the physical extents of a file. Identical CoW copies have the same hash.

There is also a more detailed answer, explaining why this is necessary, under "How to verify a file copy is reflink/CoW?".

Note, however, that BTRFS is at liberty to change the physical extents. I've observed a large reflinked file change its physical extents without provocation, making the fienode output different, even though the majority of the physical extents were still shared.
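
To illustrate the idea of hashing the physical extents (this is not how fienode itself is implemented, only a sketch built on filefrag output), one could hash the physical ranges and lengths reported by filefrag -v. The function name extent_fingerprint is hypothetical.

```shell
# Sketch: SHA1 over the "physical length" columns of `filefrag -v` output
# read from stdin. Two files with the same physical extents get the same hash.
extent_fingerprint() {
    awk '/^ *[0-9]+:/ {
        gsub(/ /, "")
        split($0, f, ":")      # f[3]=physical range, f[4]=length
        print f[3], f[4]
    }' | sha1sum | cut -d' ' -f1
}

# Usage:
#   filefrag -v one | extent_fingerprint
#   filefrag -v two | extent_fingerprint
```

As with the caveat above, this is all-or-nothing: a single differing extent changes the whole hash, so it cannot detect files that share only most of their data.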

answered Apr 17 '16 at 08:24

pwaller

would https://github.com/pwaller/sharedextents/ be more appropriate? – Frederick Nord Jan 26 '23 at 10:18

score 0 · Answer 5 · answered May 13 '22 at 00:02

On xfs at least, if the files have not been altered, filefrag sets a shared flag on the extents.

For example:

 >filefrag -e foo
 Filesystem type is: 58465342
 File size of foo is 1344 (1 block of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
 0:        0..       0:  348117738.. 348117738:      1:             last,eof
 foo: 1 extent found
 >cp --reflink=auto foo bar
 >filefrag -e foo
 Filesystem type is: 58465342
 File size of foo is 1344 (1 block of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
 0:        0..       0:  348117738.. 348117738:      1:             last,shared,eof
 foo: 1 extent found

caveat: I'm not sure what happens if part of a file is altered so that only some blocks are in common.

caveat 2: I don't know if this works on btrfs (comment or edit if you do)
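
Regarding the first caveat, a small sketch that counts how many extents carry the shared flag would at least show partial sharing. It only parses filefrag -e output as shown above; the function name count_shared_extents is made up, and I have only the xfs output above to go on.

```shell
# Sketch: print "<shared extents> <total extents>" for `filefrag -e` output
# read from stdin.
count_shared_extents() {
    awk '/^ *[0-9]+:/ {
        total++
        # The flags column is the last field, a comma-separated list.
        if (("," $NF ",") ~ /,shared,/) shared++
    }
    END { print shared + 0, total + 0 }'
}

# Usage:
#   filefrag -e foo | count_shared_extents
```

If the first number is lower than the second, only some extents are still shared.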
