
CentOS 7: software RAID 10 corrupted after reboot.

A RAID 10 array with ext4 was created (1, 2). After the standard procedures, I ran a sector alignment pass over the whole array (3) — don't ask me why. The array then ran for 4 months, during which I never restarted the computer. After a reboot, the RAID did not start (4). I have since made a full sector-by-sector copy of the array. Unfortunately, my last data backup is 2 months old.

Question: how can I restore the RAID? Could the alignment step have caused the corruption, and if so, how do I recover from it?

1. mdadm --create --verbose /dev/md0 --level=10 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
2. mkfs.ext4 -F /dev/md0
3. parted -a optimal /dev/md0
4. mdadm --detail /dev/md0
/dev/md0:
           Version : 1.2
        Raid Level : raid0
     Total Devices : 2
       Persistence : Superblock is persistent
             State : inactive
              Name : clustera.lab:0  (local to host clustera.lab)
              UUID : def60eb0:d92a0ca5:5297ab23:446fdcdc
            Events : 140006
    Number   Major   Minor   RaidDevice
       -       8       32        -        /dev/sdc
       -       8       48        -        /dev/sdd

cat /proc/mdstat

Personalities : 
md0 : inactive sdd[3](S) sdc[2](S)
      27344502784 blocks super 1.2
unused devices: <none>
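
Update: a commenter below asks why "Personalities" is empty in /proc/mdstat. A minimal sketch for checking that the md personality modules are actually loaded (module names assume a stock CentOS 7 kernel):

    lsmod | grep raid          # which md personality modules are loaded?
    modprobe raid10            # load the personality the array needs
    cat /proc/mdstat           # "Personalities :" should now list raid10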

fdisk -l

Disk /dev/sda: 14000.5 GB, 14000519643136 bytes, 27344764928 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk label type: gpt
Disk identifier: A8EE15FE-6D52-4559-9DDE-F48143F736F3
#         Start          End    Size  Type            Name
 1         2048  27344762879   12.8T  Microsoft basic primary
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sdb: 14000.5 GB, 14000519643136 bytes, 27344764928 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk label type: gpt
Disk identifier: A8EE15FE-6D52-4559-9DDE-F48143F736F3

#         Start          End    Size  Type            Name
 1         2048  27344762879   12.8T  Microsoft basic primary

Disk /dev/sdc: 14000.5 GB, 14000519643136 bytes, 27344764928 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk label type: dos
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1  4294967295  2147483647+  ee  GPT
Partition 1 does not start on physical sector boundary.

Disk /dev/sdd: 14000.5 GB, 14000519643136 bytes, 27344764928 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disk label type: dos
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1  4294967295  2147483647+  ee  GPT
Partition 1 does not start on physical sector boundary.

WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.
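
Update: before writing anything to the disks, the surviving metadata can be captured with read-only commands (a hedged sketch along the lines suggested in the comments below; device names as above):

    # Dump whatever md superblocks survive on each member disk.
    mdadm --examine /dev/sda /dev/sdb /dev/sdc /dev/sdd

    # See what other signatures (partition tables, filesystems)
    # now sit on the drives and any partitions.
    blkid /dev/sd[a-d]*
    lsblk -o NAME,SIZE,TYPE,FSTYPE
    file -sL /dev/sda1 /dev/sdb1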

  • All parted -a optimal does is the check; it doesn't perform any alignment or other writes to the device. Also, did you really create the array from raw devices, with the filesystem directly on them — no partitioning, no volume management, etc.? How and where was the boot loader installed? The RAID was created as raid10 with 4 devices, but --detail shows raid0 and two devices; are you sure that's the same array? How did a "Microsoft basic" partition manage to get there? What is the real structure of the disks involved? – Nikita Kipriyanov Jan 11 '23 at 12:55
  • Why is "Personalities" empty in /proc/mdstat? Do you have all the necessary kernel modules loaded? – Nikita Kipriyanov Jan 11 '23 at 13:03
  • Yes, the OS has all the necessary kernel modules. Initially, the devices were not divided into partitions; in any case, I did not note that in my action log. But I definitely didn't create any additional volumes in the RAID, and no LVM. The array was created as RAID 10 and worked for 4 months. Maybe it is worth simply trying to recreate the array as RAID 0 from the two remaining disks, or running mdadm --assemble --scan. – V. Kifirenko Jan 11 '23 at 14:24
  • This looks like you created the RAID on bare drives without a partition table, and then something else (Windows?) created a partition table and killed your RAID metadata in the process. You should never use bare drives directly; always use a partition table, because this happens quite often. – frostschutz Jan 11 '23 at 14:57
  • fdisk -l should not show any partitions on unpartitioned drives. Check whether there are filesystems on those partitions (file -sL, blkid, lsblk, ...); if there are, then the drives have been formatted in addition to the partition table creation. Check whether you still get anything from mdadm --examine. If you know all the right settings, you can try your luck re-creating: https://unix.stackexchange.com/a/131927/30851 – frostschutz Jan 11 '23 at 15:01
  • @ВикторСметанюк Actually, the previous commenter is right: especially with metadata v1.2, it should not be possible to see any partition tables, because even if you had partitioned the RAID after assembling it (which is possible), the partition-table metadata would end up in the wrong places when the drives are viewed directly. So it really looks like your RAID was destroyed by the subsequent creation of other metadata. // No, don't recreate the array as raid0; rather, try to assemble the raid10 in its degraded state (see the sketch after these comments). But better, find a decent data recovery service that knows how MD RAID works; I am very serious. – Nikita Kipriyanov Jan 11 '23 at 15:21
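
As the last comment suggests, a degraded assembly can be attempted instead of re-creating the array. A hedged sketch, assuming /dev/sdc and /dev/sdd still carry valid raid10 superblocks (whether two survivors are enough depends on which mirror halves they are; if both members of one mirror pair are missing, the array cannot start):

    # Stop the inactive, half-assembled array first.
    mdadm --stop /dev/md0

    # Try to start the raid10 from the surviving members only:
    # --run starts it despite missing devices, --readonly avoids
    # any writes while the data is checked.
    mdadm --assemble --run --readonly /dev/md0 /dev/sdc /dev/sdd

    # If it starts, verify the state before touching the filesystem.
    cat /proc/mdstat
    mdadm --detail /dev/md0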

0 Answers