
What was there: RAID 1 with mdadm on whole disks (/dev/md0 assembled from /dev/sda and /dev/sdb), GPT with one partition on md0, and ext4 on that partition (md0p1). What I think happened: after changing the motherboard, Linux detected problems with ext4 on md0p1. I ran fsck on md0p1 and answered "yes" to all questions. There were some bad checksums, lots of "extent tree could be narrower" messages and some non-empty journals. It seemed to finish successfully, but when I tried to mount /dev/md0p1 I got the same error about a bad filesystem. I ran fsck on md0p1 again, but now it said "no superblock", and alternative superblock numbers didn't help. I rebooted, and now mdadm can't find its superblock on either sda or sdb. The GPT partition table still looks fine, but testdisk finds no trace of ext4 on either disk (only "MS data").
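For reference, the commands at this stage were roughly the following (reconstructed from memory; the mount point and the backup superblock number are examples, not the exact values used):

fsck -y /dev/md0p1        # answered "yes" to everything: bad checksums, extent trees, non-empty journal
mount /dev/md0p1 /mnt     # still failed with the same bad-filesystem error
fsck /dev/md0p1           # second run: bad magic number / no superblock found
mke2fs -n /dev/md0p1      # -n only prints where the backup superblocks would sit, writes nothing
fsck -b 32768 /dev/md0p1  # none of the alternative superblocks helped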

# fdisk -l
GPT PMBR size mismatch (3907028991 != 3907029167) will be corrected by w(rite).
Disk /dev/sda: 1,8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D09686A6-2428-48EC-868B-D3C8CE5E0C23

Device     Start        End    Sectors  Size Type
/dev/sda1     34 3907024064 3907024031  1,8T Microsoft basic data

Partition 1 does not start on physical sector boundary.

...

GPT PMBR size mismatch (3907028991 != 3907029167) will be corrected by w(rite).
Disk /dev/sde: 1,8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D09686A6-2428-48EC-868B-D3C8CE5E0C23

Device     Start        End    Sectors  Size Type
/dev/sde1     34 3907024064 3907024031  1,8T Microsoft basic data

Partition 1 does not start on physical sector boundary.

# mdadm --examine /dev/sd*
/dev/sda:
MBR Magic : aa55
Partition[0] :   3907028991 sectors at            1 (type ee)
mdadm: No md superblock detected on /dev/sda1.
...
/dev/sde:
MBR Magic : aa55
Partition[0] :   3907028991 sectors at            1 (type ee)
mdadm: No md superblock detected on /dev/sde1.
# gdisk /dev/sda
GPT fdisk (gdisk) version 1.0.1

Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present

Found valid GPT with protective MBR; using GPT.

Command (? for help): p
Disk /dev/sda: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): D09686A6-2428-48EC-868B-D3C8CE5E0C23
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 8-sector boundaries
Total free space is 5070 sectors (2.5 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
1              34      3907024064   1.8 TiB     0700

Command (? for help): i
Using 1
Partition GUID code: EBD0A0A2-B9E5-4433-87C0-68B6B72699C7 (Microsoft basic data)
Partition unique GUID: E11B0DE3-9ABD-47B2-9F09-E993F76FBC6F
First sector: 34 (at 17.0 KiB)
Last sector: 3907024064 (at 1.8 TiB)
Partition size: 3907024031 sectors (1.8 TiB)
Attribute flags: 0000000000000000
Partition name: ''

So my questions:

  1. Is there a chance to recover the filesystem? I can recover some files with foremost, but it is 2 TB of mostly junk with very few valuable files, so that is meaningless without at least the filenames (see the foremost sketch below).
  2. More importantly: what went wrong? It seems I did everything that was suggested (except having a backup), and still lost my data.
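Regarding question 1, the best I can do so far is file carving, roughly like this (the type list and output directory are just examples):

foremost -t jpg,pdf,doc -i /dev/sda -o /recovered   # carves one mirror half by file signature; names and directories are lost

That is exactly why it is almost useless here: 2 TB of carved, unnamed files.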

Because the situation is weird, I'll tell the whole story: Linux runs on an SSD and most of the data (including the home directory) lives on a RAID over 2 HDDs.
The RAID had worked fine the whole time since 2011 or 2012.
6-8 months ago I upgraded the computer: changed the processor from 2 to 8 cores, added RAM and put in an SSD for Windows.
After this the computer stopped turning on at the first attempt; I had to press the reset button 1-2 times within 10-20 seconds to get it started. But everything else worked fine.
1-2 months ago, on two occasions, all applications started crashing and I saw I/O errors in the console, but after a reboot everything worked fine again.
1 month ago I upgraded Kubuntu to the latest release.
2 weeks ago things went bad.
Linux wouldn't start (some error on the SSD). I bought another SSD and managed to save most of the filesystem with ddrescue, but it still would not boot, so I installed a fresh OS on an empty partition of the new SSD. After installing mdadm it assembled the RAID, but would not add the partition to /dev: there was /dev/md127 but no md127p1. I fixed the GPT table on md127 with gdisk, following its suggestions (the primary GPT table was fine) and rebuilding the backup GPT table, which was corrupted; roughly as in the gdisk sketch below. fsck on md127p1 (by then it had become md0p1) was fine, and I mounted it successfully.
It worked for one or two days, then the computer refused to start at all.
I managed to get into the BIOS once and there were no IDE devices listed, so I bought a new motherboard (the old one was an ASRock 900FX Extreme3, the new one a Gigabyte 970-DS3P).
After the motherboard change I booted Linux; it started in recovery mode (/dev/md0p1 had filesystem problems), and then came what I wrote at the beginning of my question.
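The md127/gdisk repair mentioned above went roughly like this (from memory; the exact recovery-menu keys may differ between gdisk versions):

gdisk /dev/md127           # reported the primary GPT fine, backup GPT corrupted
# in gdisk: r opens the recovery menu; rebuild the backup header and table
# from the primary (d and e in my version), then w to write
partprobe /dev/md127       # re-read the partition table so md127p1 appears
fsck /dev/md127p1          # came back clean; mounted successfully afterwards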

What was done wrong?

  1. No backup, of course. Now I understand that RAID is not a backup.
  2. Ignoring the I/O errors? They eventually killed the SSD with the system on it, which only forced me to install a fresh system.
  3. Is it a bad idea to have partitions inside a RAID? Is it better to assemble the RAID from sda1 and sdb1 than from sda and sdb? (See the sketch after this list.)
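For question 3, if I were building this again from partitions rather than whole disks, it would look roughly like the sketch below (device names and the config path are the usual Ubuntu ones, not taken from my old setup):

# one "Linux RAID" partition per disk, then the mirror on top of the partitions
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mkfs.ext4 /dev/md0                                # filesystem directly on the array, no GPT inside md0
mdadm --detail --scan >> /etc/mdadm/mdadm.conf    # record the array so it assembles at boot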

addition:

# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0   1,8T  0 disk <-- raid
└─sda1   8:1    0   1,8T  0 part
sdb      8:16   1 957,9M  0 disk
└─sdb1   8:17   1 956,9M  0 part
sdc      8:32   0  59,6G  0 disk
├─sdc1   8:33   0     1M  0 part
├─sdc2   8:34   0  29,8G  0 part /old
└─sdc3   8:35   0  29,8G  0 part /
sdd      8:48   0 119,2G  0 disk
└─sdd1   8:49   0 119,2G  0 part
sde      8:64   0   1,8T  0 disk <-- raid
└─sde1   8:65   0   1,8T  0 part
sr0     11:0    1     2G  0 rom
  • I have a few questions for you: 1) what was your RAID configuration, e.g. RAID-1 on /dev/sd{ab}? 2) In your fstab, are you using the UUID or /dev/mdX to mount partitions? If you no longer have the superblock, mdadm can create a new one. Otherwise you can also recreate your RAID array on top of the existing one in an attempt to recover it. – Louis Ouellet Feb 21 '18 at 22:50
  • In fstab it was /dev/md0p1. About recreating: would the data, or maybe the geometry of the disk in the recreated RAID, be different from the used disk? Would md0p1 be different from sda1 if I create the RAID from sda? – Vestild Feb 22 '18 at 08:19
  • Recreating the array is a last resort; let's see about recovering the superblock first. You can use mdadm -E /dev/sdX to check for the presence of a superblock on each of your disks. – Louis Ouellet Feb 23 '18 at 08:21
  • As a side note, I strongly recommend you mount your partitions in your fstab using the UUID, as it is more reliable. Also, I suggest you create multiple RAIDs instead of partitioning one RAID, as it makes recovery a lot easier. – Louis Ouellet Feb 23 '18 at 08:24
