For some unknown reason, my BTRFS filesystem is corrupted. dmesg prints
BTRFS critical (device sda2): corrupt leaf, slot offset bad: block=43231330304,root=1, slot=47
(more than 1000 times in the dmesg trace).
How can I repair block #43231330304?
You should install smartmontools and run a long self-test (this will take a while):
# smartctl -t long /dev/sd?
If the disk has a bad block, the test fails on it, and the self-test log shows something like:
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 80% 682 1193046
This gives you the LBA of the bad block (1193046).
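If you want to pull that number out of the log without reading it by eye, here is a minimal sketch. It assumes the self-test log is piped on stdin and that the LBA is the last column of the entry, as in the excerpt above. The function name first_error_lba is made up for illustration.

```shell
# Hypothetical helper: print the LBA from the first self-test entry that
# reports a read failure. Reads the self-test log on stdin.
# Assumption: the LBA is the last whitespace-separated field of the entry.
first_error_lba() {
    awk '/^# *[0-9]+/ && /read failure/ { print $NF; exit }'
}
```

You could feed it with something like smartctl -l selftest /dev/sdX | first_error_lba. It prints nothing when no entry reports a read failure.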
Then install sg3_utils and run sg_verify with the LBA from above:
# sg_verify --lba=1193046 /dev/sda
You will get a response like:
# sg_verify --lba=1193046 /dev/sda
verify (10): Fixed format, current; Sense key: Medium Error
Additional sense: Unrecovered read error
Info fld=0x123456 [1193046]
Field replaceable unit code: 228
Actual retry count: 0x008b
medium or hardware error, reported lba=0x123456
This confirms that the sector is really bad and could not be added automatically to the defect list by the disk's microcontroller.
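Note that smartctl reports the LBA in decimal while the sense data shows it in hex. A small sketch for cross-checking the two (the function name lba_to_hex is just an illustration):

```shell
# Hypothetical helper: convert a decimal LBA to the hex notation that
# appears in the "Info fld" and "reported lba" lines of the sense data.
lba_to_hex() {
    printf '0x%x\n' "$1"
}

lba_to_hex 1193046   # prints 0x123456
```

So 1193046 from the self-test log and 0x123456 from sg_verify are the same sector.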
You can check the grown defect list with
# sg_reassign --grown /dev/sda
>> Elements in grown defect list: 0
If you then reallocate the sector with
# sg_reassign --address=1193046 -v /dev/sda
and check the grown defect list again with
# sg_reassign --grown /dev/sda
>> Elements in grown defect list: 1
you should see the counter grow by 1.
After this, run
# smartctl -t long /dev/sd?
again and repeat this procedure until the disk is clean and the long test completes without errors.
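The stop condition for that repeat loop can be sketched as follows. This assumes the self-test log is piped on stdin and that the most recent test is listed as entry "# 1". The function name latest_selftest_clean is hypothetical.

```shell
# Hypothetical helper: exit 0 only if the most recent self-test entry
# ("# 1") in the log on stdin says "Completed without error".
# Exits 1 if that entry reports anything else or is missing.
latest_selftest_clean() {
    awk '/^# *1 / { found = 1; ok = ($0 ~ /Completed without error/); exit }
         END { exit !(found && ok) }'
}
```

For example, smartctl -l selftest /dev/sdX | latest_selftest_clean && echo "disk is clean". It only reads the log; starting the test and reassigning sectors still has to be done by hand as described above.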
In this case I would use this disk only for non-important stuff like a Steam library or something similar. I would still replace the disk just to be sure, but for the moment the disk should be OK.
To see a result such as "Completed: read failure" (or "Completed without error", or whatever), you need to run a command like this after the allotted time: sudo smartctl -a /dev/sd?
And I'll note that in my case, smartctl didn't find any error, but btrfs still seems to have a corruption.
– nealmcb
Jul 25 '21 at 00:27
If the problem comes from a hard-drive failure (e.g. a bad block), it is not repairable.
To check for bad blocks:
badblocks -n /dev/sdX
To find out which files are corrupted, see "How to list files part of a BTRFS block?"
Please don't suggest running btrfs check --repair unless you know exactly what caused the problem. It should be the last option, and by that point you should have a working backup in place.
The man page states:
Warning: Do not use --repair unless you are advised to do so by a developer or an experienced user, and then only after having accepted that no fsck can successfully repair all types of filesystem corruption. Eg. some other software or hardware bugs can fatally damage a volume.
To get information on a volume, btrfs device stats /MountPoint
will give you plenty of hints on the state of the filesystem.
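Since most of those counters are normally zero, it can help to show only the ones that are not. A minimal sketch, assuming the output is piped on stdin and has one "[device].counter value" pair per line (the function name nonzero_stats is made up):

```shell
# Hypothetical helper: print only the error counters whose value is not 0.
# Assumption: each input line has exactly two fields, counter name and value.
nonzero_stats() {
    awk 'NF == 2 && $2 + 0 != 0 { print }'
}
```

Usage would be something like btrfs device stats /MountPoint | nonzero_stats. No output means every counter is zero, which does not by itself prove the filesystem is healthy.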
For an unmounted volume, btrfs check --repair /dev/TheDevice
will check and repair the filesystem.
BTRFS developers recommend contacting them via IRC or the linux-btrfs mailing list about any (relatively serious) issues with BTRFS, according to the BTRFS Wiki FAQ entry "I have a problem with my Btrfs filesystem!":
See the Problem FAQ for commonly-encountered problems and solutions.
If that page doesn't help you, try asking on IRC or the Btrfs mailing list.
Explicitly said: please report bugs and issues to the mailing list (you are not required to subscribe).
See Btrfs mailing list for details on how to post to the mailing list and what information to include when asking for help.
In my case, the simplest resolution of the "corrupt leaf" errors was to delete the affected files, as they didn't contain anything important.
To find out which files are affected by the corrupted leaves:
btrfs inspect-internal logical-resolve 43231330304 <mountpoint>
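When dmesg repeats the message many times, you first need the distinct block numbers. A minimal sketch, assuming the kernel log is piped on stdin and the messages look like the one in the question (the function name corrupt_leaf_blocks is hypothetical):

```shell
# Hypothetical helper: list each distinct block number that appears in
# "corrupt leaf" kernel messages read from stdin.
corrupt_leaf_blocks() {
    sed -n 's/.*corrupt leaf.*block=\([0-9][0-9]*\).*/\1/p' | sort -u
}
```

For example, dmesg | corrupt_leaf_blocks prints one block number per line, and each number can then be passed to the logical-resolve command above.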
Other general recommendations are to
Warning: Do not use --repair unless you are advised to do so by a developer or an experienced user, and then only after having accepted that no fsck can successfully repair all types of filesystem corruption. Eg. some other software or hardware bugs can fatally damage a volume.