
My goal is to rescue a partition to an image file (I don't need a full disk copy or a bootable drive), then mount that image and copy the folders out to the filesystem on my new drive, which is three times larger. I also want to know which specific files were corrupted.

I want to be able to run the commands from Synology DSM, where both the source and target drives are connected (both are ext4 non-RAID basic volumes). I couldn't find a ddrutility package for DSM, so either I use it over the network (booting from Parted Magic and accessing the image and mapfile over SMB), or I need to work out how to use ddrescue's fill mode to identify those files.

My course of action for the rescue:

  • install synocommunity package SynoCli Disk Tools
  • connect corrupted HDD to Synology DS216+II via USB
  • immediately unmount it via the web UI (USB eject) if it shows up as an external drive
  • get the device name (e.g. sda), the partition name (e.g. sda1) and the physical sector size by running fdisk -l
  • make sure it isn't in any mounts listed by the command mount
  • run ddrescue first pass, skipping the suspicious parts:

ddrescue -f -n /dev/sda1 /volume2/rescue/sda1_rescue.img /volume2/rescue/rescue.log

  • run second pass with 3 retries and direct disk access:

ddrescue -d -f -r3 /dev/sda1 /volume2/rescue/sda1_rescue.img /volume2/rescue/rescue.log

  • somehow figure out which files are corrupted inside the image, if there were any bad blocks
  • mount the image:

sudo mkdir /mnt/newfolder

mount /volume2/rescue/sda1_rescue.img /mnt/newfolder -o loop,ro

  • cp or rsync everything from inside the /mnt/newfolder to a /volume2/salvaged folder
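The "make sure it isn't mounted" step in the list above can be scripted. This is only a minimal sketch, assuming the system lists its mounts in /proc/mounts; the function name is_mounted and its optional second argument are made up for illustration:

```shell
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds (exit status 0) if any mount source in MOUNTS_FILE starts with
# DEVICE, so /dev/sda also matches /dev/sda1, /dev/sda2, and so on.
# MOUNTS_FILE defaults to /proc/mounts.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    awk -v dev="$dev" '
        index($1, dev) == 1 { found = 1 }   # first field is the mount source
        END { exit !found }                 # 0 if something matched, 1 otherwise
    ' "$mounts"
}
```

It could be used as a guard before the first ddrescue run, for example: is_mounted /dev/sda && echo "still mounted, stop here".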

Questions:

I will be using a SATA/USB 3.0 adapter to connect the corrupted disk. Is there any reason to use USB 2.0 instead of USB 3.0?

-f Force overwrite of outfile.

Should I just skip this option, since in my case I'm writing to a file? The docs say: "This option is just a safeguard to prevent the inadvertent destruction of partitions, and is ignored for regular files."

-n Skip the scraping phase.

I guess it's best used for the first run only?

-d Direct disk access.

It looks like this is better for recovery, but it requires the correct sector size to work. I know I can't trust what USB adapters report, but this datasheet suggests that my WD4002FYYZ is 512n, not 512e. Does that mean I don't need to set the size explicitly? Or should I set it anyway if I use the -d option?

The target disk is on SATA and shows up as 4096 physical sector, but I guess it doesn't matter if I just want to copy folders, and not recreate a bootable disk etc.
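Besides fdisk -l, the sector sizes the kernel reports can be read from sysfs. Here is a rough sketch, assuming the usual Linux layout /sys/block/<device>/queue/; the function name sector_sizes and the optional sysfs root argument are made up for illustration. A USB bridge may still misreport these values, so the output is only what the kernel believes:

```shell
# sector_sizes DEVICE [SYSFS_BLOCK_DIR]
# Prints the logical and physical sector sizes the kernel reports for DEVICE
# (for example "sda"). SYSFS_BLOCK_DIR defaults to /sys/block.
sector_sizes() {
    q=${2:-/sys/block}/$1/queue
    echo "logical=$(cat "$q/logical_block_size") physical=$(cat "$q/physical_block_size")"
}
```

Calling sector_sizes sda prints one line of the form logical=N physical=M, which can be compared with the drive's datasheet.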

-v Verbose mode.

I probably won't need it? Not sure what its output looks like.

/dev/sda vs /dev/sda1

Do I need a whole-drive image at all for my purpose? As far as I know, all my data is on the largest partition, but I'm a bit curious to look inside the system partitions too. Does reading the whole device instead of a single partition add more risk of damaging the disk further? If so, I can try to image the system partitions after my data partition is secure. Do I understand correctly that if I point ddrescue at a partition (/dev/sda1), I won't need an offset option to mount the image, and this would work as is? mount /volume2/rescue/sda1_rescue.img /mnt/newfolder -o loop,ro

If you are trying to rescue a whole partition, first repair the copy with e2fsck or some other tool appropriate for the type of partition you are trying to rescue, then mount the repaired copy somewhere and try to recover the files in it.

e2fsck -v -f /dev/sdb1

Do I need to run e2fsck on my (mounted) destination image before copying my data from it, even if I supposedly don't need it to function partition-wise as long as I get to copy my data anyway?

Now the fill mode part of the docs is a bit confusing to me.

From the docs it looks like ddrescue's fill mode can help identify the affected files. But they also say that fill mode is not a rescue mode, and I don't want to damage the created image or waste another pass over the disk because of it. On the 12TB drive there might be enough space for a second copy of the 4TB image to experiment on, though. It is still not clear to me how best to combine the most effective rescue procedure with detecting the affected files via fill mode.

Example 4: Figure out what files are in the bad areas of the disc.

 ddrescue -b2048 /dev/cdrom cdimage mapfile
 printf "DEADBEEF" > tmpfile
 ddrescue --fill-mode=l- tmpfile cdimage mapfile
 rm tmpfile
 mount -t iso9660 -o loop,ro cdimage /mnt/cdimage
 find /mnt/cdimage -type f -exec grep -l "DEADBEEF" '{}' ';'
   (note that my_thesis.txt has a bad sector at pos 0x12345000)
 umount /mnt/cdimage
 ddrescue -b2048 -i0x12345000 -s2048 -dr9 /dev/cdrom cdimage mapfile
 ddrescue --fill-mode=- /dev/zero cdimage mapfile
 mount -t iso9660 -o loop,ro cdimage /mnt/cdimage
 cp -a /mnt/cdimage/my_thesis.txt /safe/place/my_thesis.txt

Does the line ddrescue --fill-mode=l- tmpfile cdimage mapfile overwrite parts of cdimage? Should I run it after my first rescue pass? And it doesn't touch the infile (the corrupted HDD) in any way, right? Should I run this command on sda1_rescue.img or on a copy (sda1_rescue.img.bak)? Does it just write the tmpfile text into the areas of the image where the corruption already is anyway? If I need to run fill mode before the second pass (the one with retries), do I need to adjust my second pass in any way? In the example

ddrescue -b2048 -i0x12345000 -s2048 -dr9 /dev/cdrom cdimage mapfile

it looks like you need to know the exact file size after the bad sector position, or something similar; I'm not sure what -s2048 is based on. Does it then retry 9 times to restore that file in the image from its corrupted state? And why is fill mode run again afterwards?

ddrescue --fill-mode=- /dev/zero cdimage mapfile

Compared with the first fill run, there is no l flag, so it doesn't write location data (which is only needed for the second rescue run anyway?). It also looks like it now writes zeroes instead of the marker text over the bad blocks that couldn't be restored in the previous pass, and the next steps are mounting and recovery. Is this step simply there to replace the no-longer-needed marker text with zeroes?

Most of my files can be downloaded again, and I would simply replace those from elsewhere if they're corrupted. Some I would prefer to recover this way if possible, but I'm not sure which course to take. Would focusing on the important endangered files (in multiple passes, one per file?) be much safer for the disk than a second pass over the whole partition with -r3?
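For reference, the search step from Example 4 could be wrapped as below for an image mounted read-only at /mnt/newfolder. This is a sketch only: the function name files_with_marker is made up, and it assumes the bad areas of the image were already filled with the marker text by a fill-mode run. As far as I understand it, this only catches files whose data blocks overlap a bad area, not files whose metadata was damaged.

```shell
# files_with_marker DIR [MARKER]
# Lists every regular file under DIR whose contents include MARKER
# (default: DEADBEEF, the marker used in the ddrescue manual's example).
# grep -l prints only the names of matching files, binary files included.
files_with_marker() {
    find "$1" -type f -exec grep -l -- "${2:-DEADBEEF}" {} +
}
```

Running files_with_marker /mnt/newfolder > damaged_files.txt would give a list of candidates to re-download or to retry individually.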

  • Don't run fsck on your source partition. Reread that part. It says run it on the [destination] image before you try to mount [that destination image] – Chris Davies Feb 18 '21 at 09:01
  • Before you do any of this. Are you getting read and/or write errors to the disk? What does SMART say? There's no point using ddrescue unless you have unrecoverable hardware media errors – Chris Davies Feb 18 '21 at 09:04
  • Thanks, but I was asking about the destination image that I will mount, I'm trying to figure out if I really need to run e2fsck on it before copying my data from it, even if I don't need my mounted image to function partition-wise as long as I get to copy my data anyway. I'll try to correct my question. I do not intend to do anything with the corrupted source drive other than unmounting and running ddrescue. – Melody Nelson Feb 18 '21 at 09:08
  • I had I/O errors in DSM notifications, 1 pending sector in SMART, then 7 pending sectors after 2 hours, other than that SMART looked fine. I decided not to wait any longer and extracted the disk from the server. Now trying to figure out which commands to run after I plug it back via SATA/USB. – Melody Nelson Feb 18 '21 at 09:11

1 Answer


Disclaimer: I don't have a Synology. I've used (and like) QNAP, which is similar.

The process should be fairly straightforward. However, if your disk really is dying you may only get one chance at this. You should probably read the online documentation for ddrescue before starting.

You can attempt to recover partitions in order of importance, or just go for the entire disk. Replace /dev/sda with /dev/sda1, etc., to pick off partitions one by one. Remember to amend the destination image filename too.

Make sure that the disk is idle and filesystems are unmounted. Do not run fsck on the source disk.

You only need rescue.log for the duration of each recovery, so remove it before starting on each new partition. If the recovery of a given partition is interrupted and needs to be restarted, do not remove rescue.log; ddrescue uses it to resume where it left off.

rm -f /volume2/rescue/rescue.log
ddrescue -v /dev/sda /volume2/rescue/sda_rescue.img /volume2/rescue/rescue.log
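Before deciding whether to restart or retry, it can help to see how much of the disk the mapfile records as rescued, bad, or still untried. The following is a rough sketch, assuming the plain-text mapfile layout described in the ddrescue manual (comment lines starting with #, one current-position line, then "pos size status" lines with hexadecimal numbers); the function name mapfile_summary is made up for illustration:

```shell
# mapfile_summary MAPFILE
# Prints byte totals by ddrescue block status:
#   rescued = '+' blocks, bad = '-' blocks,
#   pending = everything else ('?' non-tried, '*' non-trimmed, '/' non-scraped)
mapfile_summary() {
    seen_status_line=0
    rescued=0; bad=0; pending=0
    while read -r pos size status _; do
        case $pos in ''|'#'*) continue ;; esac   # skip blank and comment lines
        if [ "$seen_status_line" -eq 0 ]; then   # first data line is the current position
            seen_status_line=1
            continue
        fi
        case $status in
            '+') rescued=$((rescued + $size)) ;; # $size is hex, e.g. 0x00001000
            '-') bad=$((bad + $size)) ;;
            *)   pending=$((pending + $size)) ;;
        esac
    done < "$1"
    echo "rescued=$rescued bad=$bad pending=$pending"
}
```

For example, mapfile_summary /volume2/rescue/rescue.log prints a single line of totals in bytes. If bad and pending are both 0, there is nothing left to retry.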

If the recovered image file, sda_rescue.img, represents a whole disk, you will need to read its partition table to determine the offsets of its partitions. Fortunately, losetup (with --partscan) and a recent kernel can usually do this for you:

losetup --find --show --partscan /volume2/rescue/sda_rescue.img

For example, if losetup returns /dev/loop0, you should now find /dev/loop0p1, /dev/loop0p2, etc., one for each partition in the disk image. Otherwise, you'll have to work them out from the partition table yourself (sector maths and offsets).
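If you do end up doing the sector maths by hand, the byte offset of a partition is its start sector multiplied by the sector size used in the partition table listing. A minimal sketch; the helper name part_offset is made up, and the start sector 2048 in the usage comment is purely an example value, not something read from your disk:

```shell
# part_offset START_SECTOR [SECTOR_SIZE]
# Prints the byte offset of a partition inside a whole-disk image.
# SECTOR_SIZE defaults to 512 bytes.
part_offset() {
    echo $(( $1 * ${2:-512} ))
}

# Hypothetical usage, assuming fdisk -l on the image shows a start sector of 2048:
#   mount -o ro,noload,loop,offset="$(part_offset 2048)" \
#       /volume2/rescue/sda_rescue.img /mnt/0p1
```

Take the start sector from fdisk -l run against the image file itself, and use the same sector size that fdisk prints in its "Units" line.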

You may need to run the appropriate instance of fsck on the image or each partition within it. Try mounting each image or partition; if that fails, cautiously try fsck.

mkdir /mnt/0p1
mount -o ro,noload /dev/loop0p1 /mnt/0p1    # ext4, read-only, without replaying the journal
Chris Davies