Both of my hard drives, where all my data is stored, are failing. My system inconsistently refuses to detect the disks and mount the partitions. I moved one hard drive to another computer, where it is recognized with less trouble, but the partition has many errors and I still get I/O errors in dmesg for that drive.

To start with, the partition has a bad superblock. It can be read with an alternative superblock, but that shows even more errors, so I first made a master backup of the partition on an external hard drive. I ran two passes of ddrescue for this, and it finished with only one error of 512 bytes according to the log, which I think is promising.

Listing the backup with lsblk looks even more promising.

lsblk for the damaged partition shows:

$ lsblk -f
NAME   FSTYPE   LABEL        UUID                                 MOUNTPOINT
...
sda                                                               
└─sda1  
...

The new master copy shows:

sdc                                                               
├─sdc1 ext4     new          8cab6f75-1ea7-4451-9f48-2bbcce167184 

I then made another backup of this master partition at the end of the same drive, so the current output of lsblk is:

$ lsblk -f
NAME   FSTYPE   LABEL        UUID                                 MOUNTPOINT
fd0                                                               
loop0  squashfs                                                   /snap/anbox-installer/25
loop2  squashfs                                                   /snap/core/9669
loop3  squashfs                                                   /snap/core/10911
sda                                                               
└─sda1                                                            
sdb                                                               
├─sdb1 ext4     Debian_copia ce2c8e8f-f3ef-4005-9cb1-0bb9d5870f43 /
└─sdb2 swap                  d60a8ad0-5528-4bbc-af5e-092b96282df4 [SWAP]
sdc                                                               
├─sdc1 ext4     new          8cab6f75-1ea7-4451-9f48-2bbcce167184 
└─sdc2 ext4     new          8cab6f75-1ea7-4451-9f48-2bbcce167184 
sr0                                                               

Here is where I messed things up: I mistook fsck's -f option for -p, so I ran

fsck -fy /dev/sdc2

which made things worse: it deleted many inodes, and after mounting, the partition listed only half of the files it should have. Fortunately this was a copy of a copy of the damaged hard drive, so this time I will be more cautious.

Could you please tell me some good practices? All my data is at stake right now, so please be precise.

Does lsblk make any changes to the partitions? Can I mount a partition without making any changes to it? I have this link handy, by the way: https://www.sans.org/blog/how-to-mount-dirty-ext4-file-systems/

How can I safely run fsck so I can save some time here? Does fsck -n still make changes to the partition? Does it make any difference where on the disk a copy of a partition is located?

Is there any way of recovering the files without dealing with the filesystem? I have read about photorec, but I have many Audacity files that it would not recognize. Isn't there anything more generic?

3 Answers

2

If your disk is physically failing, then any further write operations on it (such as those done by fsck) can only make things worse. To increase your chances of recovering your data from that disk, you should stop using it immediately. Unmount it now. Order a new disk, and when it arrives, boot a plain Linux distro to a command prompt and ddrescue the old disk onto the new one, as described here. Remember: do not mount any file systems from the old disk, to avoid causing further damage.
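
Before imaging, it can help to confirm that nothing from the failing disk is still mounted. The following is only a sketch, assuming a Linux system that lists mounts in /proc/mounts; the function name mounted_parts and its optional second argument are made up for illustration, and /dev/sda stands in for whatever your failing disk is called.

```shell
# List mounted filesystems whose source device name starts with the given disk.
# Usage: mounted_parts /dev/sda [mounts_file]
# The second argument defaults to /proc/mounts; it exists only to ease testing.
mounted_parts() {
    disk=$1
    mounts=${2:-/proc/mounts}
    [ -r "$mounts" ] || return 0
    # Field 1 is the device, field 2 the mount point.
    # Note: this is a simple prefix match, so /dev/sda would also match /dev/sdaa.
    awk -v d="$disk" 'index($1, d) == 1 { print $1, $2 }' "$mounts"
}

# Example: warn if any partition of /dev/sda is still mounted.
if [ -n "$(mounted_parts /dev/sda)" ]; then
    echo "WARNING: /dev/sda still has mounted partitions; unmount them first" >&2
fi
```

Active swap partitions are listed in /proc/swaps rather than /proc/mounts, so check that file as well if the failing disk carries swap.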

Pourko
1

Don't panic

It appears you are trying to recover failing hard drives with dirty ext4 filesystems on them.

Do you have backups? Restore from backups if you have them. If you don't have backups, you must tread very carefully here. The first thing to do is to take your hands away from the keyboard and develop a game plan. And make sure to fire up info or man for each command you're going to run, especially tools that touch the hard disk directly.

Limit access to the damaged media

If the hard disks are failing, you should cease any further attempt to access files directly off the disk. You should cease any attempt to run fsck. The more activity you throw at the hard disk, the more wear you are putting on the possibly-failing hard disks. If you are booting an OS off one of these disks, cease this activity as well. Boot from live media such as GRML Linux.

You should instead try to image your failing hard drives. This involves copying the hard disk bit-for-bit into a file on another storage device. Ideally that other storage device should be pretty large, so you can store multiple copies of the image. Once your recovery tool has completed recovering as much data as possible, mark this image as read-only. This will become the master copy. You don't touch this image. Instead, make a copy of the master copy and run fsck and mount on this working copy. If you make a mistake, it's not a big deal - you just create a new working copy from the master copy.

Creating the master copy

See also the unix SE answer that Pourko linked.

GNU ddrescue is well suited to recovering data from hard disks. Run it something like:

ddrescue --idirect /dev/sdX /mnt/big-storage-filesystem/sdX.img /mnt/big-storage-filesystem/sdX.mapfile

(The --idirect option makes ddrescue read the source with direct disc access, bypassing the kernel cache.)
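
If the disk has bad areas, a common pattern is to split the job into passes that share one mapfile: a quick pass that grabs the easy areas, then a slower pass that retries the bad ones. This sketch only prints the commands so you can review them before running anything; ddrescue_plan is a made-up helper name and the retry count of 3 is an arbitrary choice.

```shell
# Print (not run) a two-pass ddrescue plan.
# Usage: ddrescue_plan SOURCE_DEVICE IMAGE_FILE MAPFILE
ddrescue_plan() {
    src=$1 img=$2 map=$3
    # Pass 1: copy the easy areas first, skipping the slow scraping phase (-n).
    echo "ddrescue --idirect -n $src $img $map"
    # Pass 2: go back to the bad areas, retrying them up to 3 times (-r3).
    # Reusing the same mapfile lets ddrescue resume instead of starting over.
    echo "ddrescue --idirect -r3 $src $img $map"
}

ddrescue_plan /dev/sdX /mnt/big-storage-filesystem/sdX.img /mnt/big-storage-filesystem/sdX.mapfile
```

Keep the mapfile next to the image; without it ddrescue cannot tell which areas were already rescued.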

Once ddrescue has finished I recommend running chmod a-w sdX.img sdX.mapfile. These shouldn't be modified afterwards.

Attempting to recover from a working copy

First make your working copy:

cp /mnt/big-storage-filesystem/sdX.img /mnt/big-storage-filesystem/work/work-sdX.img
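
Since everything afterwards depends on the working copy being a faithful duplicate, it may be worth verifying it right after the copy. A minimal sketch, with make_working_copy as a hypothetical helper and a small throwaway file standing in for the real image:

```shell
# Protect a master image, make a working copy, and confirm the copy is identical.
# Usage: make_working_copy MASTER WORK
make_working_copy() {
    master=$1 work=$2
    chmod a-w "$master"           # remove write permission from the master for everyone
    cp "$master" "$work" || return 1
    chmod u+w "$work"             # cp carries over the read-only mode; the working copy must be writable
    cmp -s "$master" "$work"      # exit status 0 only if both files are byte-identical
}

# Demo on a small throwaway file instead of a real disk image.
demo_dir=$(mktemp -d)
head -c 4096 /dev/urandom > "$demo_dir/sdX.img"
if make_working_copy "$demo_dir/sdX.img" "$demo_dir/work-sdX.img"; then
    echo "working copy verified"
fi
```

Note that root ignores file permission bits, so chmod a-w is a guard against slips rather than a hard lock.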

Then use losetup to map the image to a block device file:

losetup --find --show --partscan /mnt/big-storage-filesystem/work/work-sdX.img

You might need to run kpartx -a /dev/loopN where /dev/loopN is your loopback device indicated by the above command's output.

Now you can access the image as if it were just another hard disk.

Check lsblk; you should now be able to run fsck -y /dev/loop0p1 or the like.
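
Before letting fsck -y change anything, you may want a report-only pass with e2fsck -n, which opens the filesystem read-only and answers "no" to every question. The exit status is a bitmask documented in fsck(8); the helper below (explain_fsck_status, a name invented for this sketch) just decodes it.

```shell
# Decode an fsck/e2fsck exit status (a bitmask, see EXIT STATUS in fsck(8)).
explain_fsck_status() {
    status=$1
    [ "$status" -eq 0 ] && { echo "no errors"; return 0; }
    [ $((status & 1)) -ne 0 ] && echo "errors were corrected"
    [ $((status & 2)) -ne 0 ] && echo "system should be rebooted"
    [ $((status & 4)) -ne 0 ] && echo "errors left uncorrected"
    [ $((status & 8)) -ne 0 ] && echo "operational error"
    [ $((status & 16)) -ne 0 ] && echo "usage or syntax error"
    [ $((status & 32)) -ne 0 ] && echo "canceled by user request"
    [ $((status & 128)) -ne 0 ] && echo "shared-library error"
    return 0
}

# Intended use, on a WORKING COPY only (never the master, never the failing disk):
#   e2fsck -fn /dev/loop0p1; explain_fsck_status $?
# Here we only decode a sample status of 4.
explain_fsck_status 4
```

A status of 4 after a -n run simply means problems were found and, as requested, nothing was fixed. From there you can decide whether to try the more conservative fsck -p before fsck -y.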

If you're lucky, you can just do a mount /dev/loop0p1 /mnt/recovery then go from there.
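
If you only want to look at the files without changing the image at all, a read-only variant is worth considering: attach the loop device read-only and mount with ro,noload so that the ext4 journal is not replayed. This sketch prints the commands rather than running them, since they need root; ro_mount_plan is a made-up name and /dev/loopNp1 is a placeholder for whatever device losetup reports.

```shell
# Print (not run) the commands for a read-only inspection of an image file.
# Usage: ro_mount_plan IMAGE MOUNTPOINT
ro_mount_plan() {
    img=$1 mnt=$2
    # Attach the image read-only; --show prints the loop device that was picked,
    # --partscan exposes partitions inside a whole-disk image as /dev/loopNp1, ...
    echo "losetup --find --show --read-only --partscan $img"
    # ro: read-only mount; noload: do not load (replay) the ext4 journal on mount.
    echo "mount -o ro,noload /dev/loopNp1 $mnt"
}

ro_mount_plan /mnt/big-storage-filesystem/work/work-sdX.img /mnt/recovery
```

A plain read-only mount of a dirty ext4 filesystem may still replay the journal if the underlying device is writable, which is why both the read-only loop device and noload are used here.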

If you're not so lucky, you may need to use forensic tools to grab data off the corrupted filesystem. See this unix SE post for an example.

Learn from this experience

Make backups & verify your backups. Imagine what you could be doing if you weren't asking this question on unix SE and tearing your hair out trying to recover irreplaceable data. Technology is always changing, and technology does not age well, so it's a good idea to anticipate data loss.

Winny
  • You should cease any attempt to run fsck.

    Even with option n? fsck -n

    – Lerian Acosenossa Apr 04 '21 at 16:33
  • ddrescue --idirect /dev/sdX /mnt/big-storage-filesystem/sdX.img /mnt/big-storage-filesystem/sdX.mapfile

    Why is a raw image better than cloning the partition?

    I know there are obvious advantages, like not overwriting a partition in use, but is there anything more?

    I already have a master copy, but it is a partition and not a raw file, as you can see with sdc1; this is why I'm asking. If I now copy the disk to an image file, it would be 1 TB in size instead of 500 GB, which is the size of the damaged partition.

    – Lerian Acosenossa Apr 04 '21 at 16:43
  • Can I do ddrescue from partition to partition even on the same disk or is it only used for cloning whole disks?

    Like:

    ddrescue /dev/sdc1 /dev/sdc2

    – Lerian Acosenossa Apr 04 '21 at 16:44
  • Shouldn't I first run fsck -p on the copied partition instead of fsck -y? – Lerian Acosenossa Apr 04 '21 at 16:45
  • @LerianAcosenossa Re fsck -n: You should avoid any access to the failing media besides making the master copy. Even a fsck -n will put wear on the hard disk spindle and heads. Re imaging just the partition: If that is the only partition with data you wish to recover and the partition table is intact (i.e. not damaged/accidentally adjusted), then imaging the partition (just the filesystem) would suffice. Re ddrescue /dev/sdc1 /dev/sdc2: If /dev/sdc is not failing, then this is okay. If this is the problematic disk, you should write the output to a different location. – Winny Apr 04 '21 at 21:53
  • @LerianAcosenossa Re fsck -p: Sure, probably no harm in that. The trick is you don't run any things like fsck against your master copy, so you can always revert back to the original copy of the filesystem if your recovery efforts happen to damage the filesystem more. Just a reminder, your master copy is NOT the failing disk, it's the disk/filesystem image kept elsewhere – Winny Apr 04 '21 at 22:00
  • Re the reminder: yes, I know that part. Indeed, it is a way to isolate the physical damage and prevent more of it. Any error in a master copy comes from its filesystem. – Lerian Acosenossa Apr 05 '21 at 00:41
  • Re ddrescue /dev/sdc1 /dev/sdc2: I ended up taking a hybrid approach, because I think you are right to steer me toward image files instead of partitions, since they copy faster and are safer. I hope I'm not making a blunder here, but I'm running ddrescue /dev/sdc1 mnt/Hugefilesystem/sdc1.img instead. I'm not sure how to mount this image without its master block, but I figure it is the same as if I had the full disk image. Am I right? – Lerian Acosenossa Apr 05 '21 at 00:44
  • If you're copying a partition with a filesystem on it, you can try: mount -o loop sdc1-working-copy.img /mnt/your-mnt-point. This will create a loop device which maps the file to a device file, then mounts the mapped loop device as a filesystem. See the section of the mount(8) manpage detailing Loop-device support. You can also run fsck directly on the losetup-created loop device file, or on the file directly. – Winny Apr 05 '21 at 01:35
  • Loop device for single image partition: thanks that helps me a lot. Now I'm 99.99% sure how to proceed. – Lerian Acosenossa Apr 05 '21 at 02:32
  • mount -o loop sdc1-working-copy.img /mnt/your-mnt-point :

    That was magic, man! Thanks, it worked straight away and I'm seeing all my files now. I might have come up with other methods, but you saved me hours of work and testing.

    There is still one small inconsistency I don't understand yet: after successfully mounting my image partition as a read-only loop device, I copied its files to a safe location, but the total file size differs by a few GB from that on the loop device.

    Right now I'm trying to figure out what happened.

    I used 'cp -rav' and 'du -sh'.

    – Lerian Acosenossa Apr 07 '21 at 02:46
  • By the way, can you explain the difference between what fsck sees and what mount sees? If I run fsck -y, I'm sure it will wipe half of the files, and the -p option wouldn't be that gentle either. But mounting the image solved the issue. I still have a lot to learn about the low-level mechanics of filesystems, especially ext4. I used NTFS almost all my life; I have used ext4 for 5 years and I'm still not used to it. – Lerian Acosenossa Apr 07 '21 at 02:51
  • This question seems to explain fsck better than I can. fsck reads/writes the filesystem directly to repair any problems, whereas mounting the filesystem will use the kernel's ext4 driver to read/write files. Filesystems are carefully designed to ensure a degree of reliability when bad things happen, so fsck will try its best to only remove/unlink files that it doesn't know what to do with (due to corruption). – Winny Apr 08 '21 at 02:34
  • Thanks to you I saved a lot of time and I have my data back from my 500 GB drive.

    As they're telling me here https://unix.stackexchange.com/questions/645480/sata-iii-hard-drives-not-recognized-by-ecs-motherboard-caused-data-corruption-of the problem was in the SATA controller.

    – Lerian Acosenossa Apr 17 '21 at 22:31
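
On the size mismatch raised in the comments above: du totals can differ between two filesystems for reasons unrelated to data loss, such as sparse files, different block sizes, or hard links. Comparing file contents is a more direct check. This is a rough sketch using the POSIX cksum tool; compare_trees is a hypothetical helper, and the demo runs on throwaway directories.

```shell
# Compare two directory trees by file content rather than by disk usage.
# Returns 0 when both trees hold the same regular files with the same checksums.
# Usage: compare_trees SOURCE_DIR COPY_DIR
compare_trees() {
    # cksum prints "checksum size name"; sort by name so both lists line up.
    a=$(cd "$1" && find . -type f -exec cksum {} + | sort -k3)
    b=$(cd "$2" && find . -type f -exec cksum {} + | sort -k3)
    [ "$a" = "$b" ]
}

# Demo on throwaway directories.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dst"
echo "project data" > "$demo/src/a.txt"
cp -a "$demo/src/." "$demo/dst/"
if compare_trees "$demo/src" "$demo/dst"; then
    echo "trees match"
else
    echo "trees differ"
fi
```

On a 500 GB tree this reads every file twice, so expect it to take a while; run it against the loop-mounted working copy, not the failing disk.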
0

Solved for now. If you have a problem like mine, make an image-file copy of your partition on another disk, then copy this image with cp at least once, mark all the copies as read-only and mount them as loop devices. I have all my files back, except for a bunch of logs I don't care about.

Good luck to everybody, and remember to back up your files! Even a DVD rescue boat is better than nothing. Try to keep your essential files replicated many times and, if possible, upload them to the internet. You won't care at all about all those shitty Firefox downloads.