I have a computer with two hard drives in it. One carries the OS and a whole bunch of other stuff; the other was mounted inside /media as a dump-space for an additional terabyte of storage space. Recently, I upgraded the system from Ubuntu Maverick to Debian Jessie, which involved the removal of a bunch of incompatible packages and the installation of a bunch more, and could have broken stuff; it's also possible that the hard drive was dying, and when I rebooted, it decided to give up.
There's nothing utterly crucial on this drive, but I would like to retrieve it rather than rebuild - also, I prefer to know what went wrong, rather than just mask the problem and move on. So what I'm asking for is recommendations on how to debug a strange hard drive failure in which a hard drive's file system is no longer recognized. If the question isn't appropriate here, I apologize, and please redirect me to a better place!
Prior to the upgrade, the primary drive was (I believe) /dev/sda, and the secondary was /dev/sdb. Now, they show up as:
$ sudo parted -l
Model: ATA WDC WD1002FAEX-0 (scsi)
Disk /dev/sda: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number  Start   End     Size    Type     File system  Flags
 1      32.3kB  1000GB  1000GB  primary
Model: ATA ST31000333AS (scsi)
Disk /dev/sdb: 1000GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
Number  Start   End     Size    Type      File system     Flags
 1      1049kB  997GB   997GB   primary   ext4            boot
 2      997GB   1000GB  3143MB  extended
 5      997GB   1000GB  3143MB  logical   linux-swap(v1)
Note that the file systems on /dev/sdb show up correctly (most of the terabyte as ext4, bootable, and 3GB swap partition), and the partitions on /dev/sda are most likely correct (though I don't have pre-upgrade partition table dumps), but with no file systems listed.
Attempting to fsck /dev/sda1 produces this error:
$ sudo fsck /dev/sda1
fsck from util-linux 2.25.2
e2fsck 1.42.12 (29-Aug-2014)
ext2fs_open2: Bad magic number in super-block
fsck.ext2: Superblock invalid, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev/sda1
The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem. If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>
or
e2fsck -b 32768 <device>
Working through a list of likely alternate superblock locations produced no helpful results, but I'm wondering whether the partition table might have the wrong start offset. Is there a way to search for a valid superblock?
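For reference, a search like that can be sketched in a few lines. This is only a rough illustration, assuming the filesystem is ext2/ext3/ext4 (primary superblock 1024 bytes into the filesystem, with the 16-bit little-endian magic 0xEF53 stored 56 bytes into the superblock) and that it starts on a 512-byte sector boundary. The function name `find_ext_starts` is made up for the example, and the demo runs on an in-memory image rather than a real disk.

```python
import io
import struct

SECTOR = 512
SB_OFFSET = 1024      # the primary ext superblock starts 1024 bytes into the filesystem
MAGIC_OFFSET = 56     # s_magic is a 16-bit little-endian field 56 bytes into the superblock
EXT_MAGIC = 0xEF53    # magic value shared by ext2, ext3 and ext4


def find_ext_starts(dev, max_sectors):
    """Return sector numbers at which an ext filesystem could plausibly start."""
    hits = []
    for sector in range(max_sectors):
        dev.seek(sector * SECTOR + SB_OFFSET + MAGIC_OFFSET)
        raw = dev.read(2)
        if len(raw) < 2:          # ran off the end of the device or image
            break
        if struct.unpack("<H", raw)[0] == EXT_MAGIC:
            hits.append(sector)
    return hits


if __name__ == "__main__":
    # Self-contained demo on a fake 2 MiB image, with the magic planted
    # as if a filesystem started at sector 2048.
    image = bytearray(4096 * SECTOR)
    struct.pack_into("<H", image, 2048 * SECTOR + SB_OFFSET + MAGIC_OFFSET, EXT_MAGIC)
    print(find_ext_starts(io.BytesIO(image), 4096))   # prints [2048]
```

On a real disk the same function could be pointed at the whole device opened read-only (for example `open("/dev/sda", "rb")` as root). A two-byte match is weak evidence: hits may be backup superblocks or coincidental data, so each candidate start still needs to be checked by other means.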
Also, I'm not 100% certain that this was an ext3/ext4 file system. It's possible that I used a different file system, but I don't know what. Is there any way to explore the partition and figure out what it would be using, such that I can install an additional file system driver?
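As a rough idea of what "exploring the partition" could look like: most filesystems keep a fixed signature at a known byte offset from the start of the partition, so checking a handful of those offsets gives a first guess. The sketch below covers only four signatures I'm sure of (ext2/3/4, XFS, Btrfs, and Linux swap with 4 KiB pages); the table and the function name `guess_filesystems` are illustrative, not an exhaustive or authoritative detector.

```python
import io
import struct

# (label, byte offset from the start of the partition, expected bytes)
SIGNATURES = [
    ("ext2/ext3/ext4", 1024 + 56, struct.pack("<H", 0xEF53)),
    ("XFS", 0, b"XFSB"),
    ("Btrfs", 0x10040, b"_BHRfS_M"),
    ("Linux swap (4 KiB pages)", 4096 - 10, b"SWAPSPACE2"),
]


def guess_filesystems(dev, start_byte=0):
    """Return the labels of all signatures found relative to start_byte."""
    found = []
    for label, offset, magic in SIGNATURES:
        dev.seek(start_byte + offset)
        if dev.read(len(magic)) == magic:
            found.append(label)
    return found


if __name__ == "__main__":
    # Demo on a fake 128 KiB image that carries only an XFS signature.
    image = bytearray(128 * 1024)
    image[0:4] = b"XFSB"
    print(guess_filesystems(io.BytesIO(image)))   # prints ['XFS']
```

Passing a `start_byte` such as `63 * 512` lets the same check be tried against different candidate partition starts on the raw disk, which matters when the partition table itself is suspect.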
Any pointers would be a help. Thanks!
EDIT: doktor5000 suggested I grab testdisk and see what it says.
Disk /dev/sda - 1000 GB / 931 GiB - CHS 121601 255 63
Current partition structure:
     Partition                  Start        End    Size in sectors
No ext2, JFS, Reiser, cramfs or XFS marker
 1 P Linux                    0   1  1 121600 254 63 1953520002
 1 P Linux                    0   1  1 121600 254 63 1953520002
No partition is bootable
Selecting 'Quick Search' produces this:
Disk /dev/sda - 1000 GB / 931 GiB - CHS 121601 255 63
     Partition                  Start        End    Size in sectors
>* Linux                      0  32 33 118619 237 18 1905627136
 P Linux Swap            118620  14 51 121601  57 56   47892480
I previously forgot to quote what fdisk said, so here's that:
$ sudo fdisk /dev/sda -l
Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x000a56a5
Device     Boot Start        End    Sectors   Size Id Type
/dev/sda1          63 1953520064 1953520002 931.5G 83 Linux
So, the discrepancies I'm seeing are: fdisk reckons the partition starts at 63, but testdisk says 32 (I think); and testdisk says that there's a swap partition there, which doesn't make sense to me (it's a secondary drive, I don't know why I'd have allocated any swap space). But, coolness of coolnesses, I can dig into the file system and copy files off! This is awesome compared to what I had to work with the last time I tried disk recovery - but then, that was back in the 1990s using OS/2, so no surprises there :)
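A note on the "63 versus 32" comparison: testdisk's Start and End columns look like cylinder/head/sector triples rather than plain sector numbers, so `0 32 33` probably doesn't mean sector 32. Assuming the geometry from testdisk's header line (`CHS 121601 255 63`, i.e. 255 heads and 63 sectors per track), the usual CHS-to-LBA formula can be sketched like this; `chs_to_lba` is just a name picked for the example.

```python
def chs_to_lba(cylinder, head, sector, heads=255, sectors_per_track=63):
    """Convert a CHS triple to a zero-based sector number (LBA).

    CHS sector numbers start at 1, hence the "- 1".
    """
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


if __name__ == "__main__":
    # Start of the partition in the current (broken) table: CHS 0 1 1
    print(chs_to_lba(0, 1, 1))      # prints 63
    # Start of the Linux partition found by Quick Search: CHS 0 32 33
    print(chs_to_lba(0, 32, 33))    # prints 2048
```

Under that assumption the existing table and fdisk agree on sector 63, while Quick Search places the filesystem at sector 2048. That would mean the partition table pointed 1985 sectors too early, which fits e2fsck finding no superblock where it expected one.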
I'm confident enough to let testdisk write out a new partition table. And yep! All the data's there and readable, the drive appears to be working just fine. Many thanks, doktor5000! So, follow-up question... any idea how the partition table could have come to be corrupted?
How the partition table got corrupted is more of a crystal-ball type of question. Something similar can happen when you use several different partitioners on one drive, because they sometimes interpret things slightly differently. So it's best to stick to only one. If you like gparted, use it for everything. Like the bundled partitioner of your favorite Linux OS? Use it for everything.
– doktor5000 May 10 '15 at 16:38