21

I'm replacing a failed hard drive in a mirrored btrfs filesystem.

btrfs device delete missing /[mountpoint] is taking a very long time, so I assume that it's actually rebalancing data across to the replacement drive.

Is there any way to monitor the progress of such an operation?

I don't necessarily expect a pretty-looking GUI or even a % counter, and I'm willing to write a couple of lines of shell script if that's necessary, but I don't even know where to start looking for relevant data. btrfs filesystem show, for example, just hangs, presumably waiting for the balance operation to finish before it displays any information about the mirrored filesystem.

user50849

5 Answers

34
btrfs balance status /mountpoint

man 8 btrfs

 [filesystem] balance status [-v] <path>
        Show status of running or paused balance.

        Options

        -v   be verbose
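
While a balance is running, the status output looks roughly like this (an illustrative example; the chunk counts and mount point depend on your filesystem):

Balance on '/mountpoint' is running
28 out of about 171 chunks balanced (1156 considered), 84% left

Note that a device delete is not tracked as a balance (see the comments below); if you had used btrfs replace instead, the analogous query would be btrfs replace status /mountpoint.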

llua
  • Thanks, it turns out that in my case btrfs doesn't appear to consider the current operation a balance, as that returns nothing; but I see there's also a "replace status", which I could probably have used had I used the replace command. Good answer regardless. – user50849 Jan 26 '14 at 15:28
  • The balance status should look something like: Balance on '/volume1' is running 28 out of about 171 chunks balanced (1156 considered), 84% left. Unusually, the percentage counts down. – mwfearnley Apr 30 '18 at 09:41
8
sudo btrfs fi show

This will output something like the following:

Label: none  uuid: 2c97e7cd-06d4-4df0-b1bc-651397edf74c
        Total devices 16 FS bytes used 5.36TiB
        devid    1 size 931.51GiB used 770.48GiB path /dev/sdc
        devid    2 size 931.51GiB used 770.48GiB path /dev/sdg
        devid    3 size 931.51GiB used 770.48GiB path /dev/sdj
        devid    4 size 0.00 used 10.02GiB path
        devid    5 size 931.51GiB used 770.48GiB path /dev/sdh
        devid    6 size 931.51GiB used 770.48GiB path /dev/sdi
        devid    7 size 931.51GiB used 770.48GiB path /dev/sdd
        devid    8 size 931.51GiB used 770.48GiB path /dev/sdo
        devid    9 size 465.76GiB used 384.31GiB path /dev/sdn
        devid    10 size 931.51GiB used 770.48GiB path /dev/sdp
        devid    11 size 931.51GiB used 770.48GiB path /dev/sdr
        devid    12 size 931.51GiB used 770.48GiB path /dev/sdm
        devid    13 size 931.51GiB used 769.48GiB path /dev/sdq
        devid    14 size 931.51GiB used 770.48GiB path /dev/sdl
        devid    15 size 931.51GiB used 770.48GiB path /dev/sde
        devid    16 size 3.64TiB used 587.16GiB path /dev/sdf

Btrfs v3.12

Notice that device ID #4 looks a little different from the rest. When you run btrfs device delete missing /mntpoint, btrfs starts to regenerate the RAID metadata and data necessary to free up that "missing" drive.

If you run something like

watch -n 10 sudo btrfs fi show

then you can watch the space used on the offending "missing" device gradually shrink until the operation completes and the device is removed from the filesystem.
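
To watch only the device being emptied, you can filter for its devid; a rough sketch, assuming the missing device is devid 4 as in the example above:

watch -n 10 "sudo btrfs fi show | grep 'devid    4'"

The "used" figure on that line should shrink toward zero as the delete proceeds.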

Ace
4

BTRFS may spend some time reading or rearranging data before it starts writing to the drive you expect it to write to.

You can see how much CPU time is being devoted to BTRFS operations, including rebalance, add, delete, and convert:

ps -ef | grep btrfs
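
For a device delete, you would typically just see the userspace command itself accruing time, e.g. (illustrative output, fields as printed by ps -ef):

root      1234     1 12 10:02 pts/0    00:23:45 btrfs device delete missing /mountpoint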

To see how busy each drive is, install sysstat and run:

iostat

Add some options to make iostat show stats in megabytes and update every 30 seconds:

iostat -m -d 30

Sample output from a scrub, so there are no writes during this interval:

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda             700.30       170.10         0.00       6804          0
sdb               0.78         0.00         0.01          0          0
sdc             520.20       127.98         0.00       5119          0
sdd             405.72        92.02         0.00       3680          0
sde             630.05       153.66         0.00       6146          0
sdf             627.43       153.60         0.00       6144          0
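
If you only care about the drives in the affected filesystem, iostat also accepts device names as arguments; sdc and sdg here are hypothetical stand-ins for the mirror's members:

iostat -m -d 30 sdc sdg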

Install and run munin to see historical graphs of drive activity and lots of other info: https://www.digitalocean.com/community/tutorials/how-to-install-the-munin-monitoring-tool-on-ubuntu-14-04

1

I was also wondering when a long-running delete would finish, so I came up with this little piece of shell code:

get_bytes() {
  # The device being removed is reported with a negative (unallocated)
  # value; extract its magnitude, i.e. the bytes still left to move.
  btrfs device usage --raw /mnt/data | grep -E -- '-[0-9]+' | sed -E 's/[^0-9]+([0-9]+)/\1/'
}

prev=$(get_bytes)

while true; do
  current=$(get_bytes)
  # Bytes moved off the device since the last check (the remaining
  # count shrinks, so it is previous minus current).
  diff=$((prev-current))
  if [ "$diff" -gt 0 ]; then
    # Feed that many zero bytes through the pipe so pv can count them.
    dd if=/dev/zero iflag=count_bytes count="$diff" 2>/dev/null
  fi
  prev="$current"
  sleep 1
done | pv -petraW -s "$(get_bytes)" >/dev/null

This will give you a nice progress bar like this:

0:13:54 [0,00 B/s] [16,0MiB/s] [>                             ]  1% ETA 19:23:19

The general idea is to use pv to display progress. Since that command can only monitor bytes flowing through a pipe, we use dd to generate the corresponding number of zero bytes and pipe them into pv.

The advantage of this method is that you get a nice progress bar. However, since btrfs seems to always move data in 1 GiB chunks, it can take a while before a new difference in byte counts is observable.

To address this, the -a flag is added to pv's usual flags so that it displays an average transfer rate (the instantaneous rate will be 0 most of the time).

I realize this is not the best solution, but it's the best I could come up with. If someone has ideas for improvements, please let me know! :)

-1

For simply monitoring a command's output you can use watch (it works almost everywhere); for example, running as root:

watch -n1 btrfs balance status /

(The switch -n1 means "wait 1 second and then run the command again".)
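
The same approach works with the other status commands in this thread; since a device delete is not reported as a balance, you might instead watch the per-device usage (mount point hypothetical):

watch -n1 btrfs device usage /mountpoint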

  • Why is this better than @llua's accepted answer? All you did was add watch to a monitoring command. – number9 Jun 30 '22 at 13:03