
I have a CentOS 8 installation, where the partitioning and RAID 1 configuration were done using the automatic partitioning of the CentOS installer. Here is the output of lsblk:

sda         8:0    0 558.9G  0 disk
├─sda1      8:1    0    50G  0 part
│ └─md127   9:127  0    50G  0 raid1 /
├─sda2      8:2    0    20G  0 part
│ └─md126   9:126  0    20G  0 raid1 [SWAP]
├─sda3      8:3    0     1G  0 part
│ └─md125   9:125  0  1022M  0 raid1 /boot
├─sda4      8:4    0   600M  0 part
│ └─md124   9:124  0   600M  0 raid1 /boot/efi
└─sda5      8:5    0 487.3G  0 part
  └─md123   9:123  0 487.2G  0 raid1 /home
sdb         8:16   0 558.9G  0 disk
├─sdb1      8:17   0    50G  0 part
│ └─md127   9:127  0    50G  0 raid1 /
├─sdb2      8:18   0    20G  0 part
│ └─md126   9:126  0    20G  0 raid1 [SWAP]
├─sdb3      8:19   0     1G  0 part
│ └─md125   9:125  0  1022M  0 raid1 /boot
├─sdb4      8:20   0   600M  0 part
│ └─md124   9:124  0   600M  0 raid1 /boot/efi
└─sdb5      8:21   0 487.3G  0 part
  └─md123   9:123  0 487.2G  0 raid1 /home

As you can see, the /boot/efi partition is mirrored in RAID 1 like any other partition. Now, I'm trying to recreate the same setup when installing Debian, and I'm unable to proceed. If I set up the partitions and RAID 1 in this way, I get a failure from the installer during the GRUB installation (with no specific error message, just the generic "Some installation step has failed").

[Screenshot: installer error dialog]

The error goes away if I do not mirror the ESP partition.

I realise that mirroring the ESP partition is something that sounds unfeasible, and looking around it seems everybody agrees. But the CentOS installer manages to do it somehow.

What do I have to do to recreate the same setup on Debian?

  • Try using just sda4 for /boot/efi, and then turn it into a raid-1 mirror with mdadm after the system has installed and booted. BTW, a raid-1 mirror of the ESP partition is fine (but don't use other raid types like raid-0 or 10 or 5 or 6), but remember that you'll have to tell your UEFI to use the other disk if the primary disk dies - UEFI doesn't understand Linux mdadm raid and won't automatically switch to the mirror. – cas Apr 08 '21 at 12:12
  • So the steps after installation are: create the md device, format the partition as FAT32, change its type to ESP with parted/fdisk/etc, mount it again on /boot/efi, and then how do I tell grub to repopulate it? – Nicola Gigante Apr 08 '21 at 12:54
  • Make a degraded raid-1 using only /dev/sdb4. format it as FAT32, mount it somewhere convenient (/mnt, perhaps) and copy everything from /boot/efi to it (use cp -a or rsync or some other method that recurses any sub-directories). unmount /boot/efi and then add /dev/sda4 to the raid-1 with sdb4. This will cause sda4 to be synced with the contents of sdb4. Unmount this raid-1 mirror and remount it as /boot/efi (and don't forget to update /etc/fstab so that it mounts the mirror device instead of /dev/sda4 - use a LABEL or UUID instead of a /dev/ entry). – cas Apr 08 '21 at 13:05
  • Thanks, so the contents of /boot/efi cannot be "recreated" from the grub package? Just curious. – Nicola Gigante Apr 08 '21 at 13:09
  • If you need more details, there are numerous questions with answers here on this site with detailed instructions for doing this kind of thing with degraded (i.e. missing one or more devices) raid mirrors. e.g. https://unix.stackexchange.com/questions/63928/can-i-create-a-software-raid-1-with-one-device – cas Apr 08 '21 at 13:10
  • Depends what's on /boot/efi. IIRC, update-grub can & will copy its own boot-loader there, but can't do anything about any others that might have been installed by the bios or other programs or operating systems. Easiest to just copy everything from sda4 to the mirror, it's only a few hundred MB at most, anyway. – cas Apr 08 '21 at 13:12
  • When you get it working, please write up what you did as an answer and then accept it (unless someone else posts a better answer), so that this question gets flagged as answered. So take notes :) – cas Apr 08 '21 at 13:15
  • I've tried it, and it worked and booted. But then I tried removing the first disk to simulate a failure and, as you pointed out, the system does not boot anymore. Is it just a problem with EFI that I have to solve in my EFI firmware? (I'm on VirtualBox by the way) – Nicola Gigante Apr 08 '21 at 13:30
  • yeah, you need to tell UEFI where to find its ESP partition. I think some motherboard/BIOS manufacturers allow you to give it a list of partitions to try, but I don't know if vbox's uefi bios code is capable of that or not. Maybe try setting the partition type to ESP on both sda4 and sdb4 (which will probably mean you have to manually define that raid array in /etc/mdadm/mdadm.conf rather than rely on mdadm auto-detect). – cas Apr 08 '21 at 13:36
  • This is notoriously complicated. See this bug and the (numerous) other bugs linked from there for discussion and workaround. EFI partition mirroring is feasible but definitely not nice or easy. Here’s the exact configuration that “works” for me. – Andrej Podzimek Dec 22 '22 at 14:36

2 Answers


Thanks to the comments by @cas I got this working.

The steps are mainly:

  1. I installed Debian without setting up RAID for the ESP. During partitioning, I created two identical partitions marked as ESP, on /dev/sda1 and /dev/sdb1.
  2. I copied the contents of /boot/efi somewhere else (/boot/eficopy).
  3. umount /boot/efi
  4. mdadm --create --verbose /dev/md3 --level=1 --raid-devices=2 --metadata=1.0 /dev/sda1 /dev/sdb1 (change /dev/md3 to something else if /dev/md3 is already an active MD device)
  5. mkfs.vfat /dev/md3
  6. I looked up the UUID of the new filesystem in /dev/disk/by-uuid.
  7. I changed the /boot/efi entry in /etc/fstab to use the new UUID.
  8. mount /boot/efi
  9. I copied the data from the backup into /boot/efi again.

The reboot worked.
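
For reference, steps 2 to 9 above can be collected into a rough shell sketch. The device names and the backup path are the ones from the steps; the DRY_RUN switch, the run helper and the two function names are my own additions for illustration. By default it only prints the commands, so they can be reviewed before anything touches a disk. It uses cp -r for the copy back because a FAT filesystem cannot store Unix ownership.

```shell
#!/bin/sh
# Sketch of steps 2-9: move an existing ESP onto an mdadm RAID 1 mirror.
# Assumptions (not from the answer): DRY_RUN, run(), and the function names.
ESP_A=${ESP_A:-/dev/sda1}        # first ESP partition (step 1)
ESP_B=${ESP_B:-/dev/sdb1}        # second ESP partition (step 1)
MD=${MD:-/dev/md3}               # pick an MD device that is not in use
BACKUP=${BACKUP:-/boot/eficopy}  # where the ESP contents are parked
DRY_RUN=${DRY_RUN:-1}            # 1 = only print the commands

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

mirror_esp_create() {
    run cp -a /boot/efi "$BACKUP"          # step 2
    run umount /boot/efi                   # step 3
    run mdadm --create --verbose "$MD" --level=1 \
        --raid-devices=2 --metadata=1.0 "$ESP_A" "$ESP_B"   # step 4
    run mkfs.vfat "$MD"                    # step 5
    run blkid -s UUID -o value "$MD"       # step 6: prints the new UUID
    echo "Edit the /boot/efi line in /etc/fstab to use that UUID (step 7)."
}

mirror_esp_restore() {
    run mount /boot/efi                    # step 8
    run cp -r "$BACKUP"/. /boot/efi/       # step 9
}
```

Call mirror_esp_create, edit /etc/fstab by hand, then call mirror_esp_restore. Setting DRY_RUN=0 (as root) runs the commands for real.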

EDIT: Instead of backing up the /boot/efi partition, it seems that

grub-install --efi-directory=/boot/efi

does the job of restoring its contents (at step 9 above), even though I got a lot of warnings I cannot understand.

EDIT2: One should probably consider using metadata version 1.0 in favor of 0.9, as per the wiki page A guide to mdadm.

Version 1.0 still meets the requirement (for this use case) of placing the superblock at the end of the device, but also includes "the modern features of mdadm", because it uses the same layout format as 1.1 and 1.2.

twan
  • Had trouble with this; this step is wrong: mkfs.vfat /dev/md3. The resulting filesystem is messed up because mkfs.vfat doesn't know that the sector size is really 4096 bytes. – Joshua Sep 24 '23 at 20:35
  • Also I think step 4 doesn't create a mirror with a write-intent. Thankfully recovery of /boot/efi is rare. – Joshua Sep 24 '23 at 20:40
  • And the second disk isn't bootable yet. There's a missing command. :( – Joshua Sep 24 '23 at 20:42

After encountering numerous issues with gigabyte's solution, I gave up and came up with a completely different solution.

Problems:

  • The filesystem needs to be aligned
  • fsck.msdos doesn't actually handle failed writes to the FAT on the SSD (due to power loss mid write).
  • The mdraid1.0 trick only seems to work if you don't use write intents, which is a bad idea.
  • The Linux kernel doesn't know about ordered writes; writing to /boot/efi can still disable boot if the power goes out in the middle. Too bad; the DOS kernel got it right (by accident).

My solution to the final problem was ultimately to give up on actually mirroring and keep the second SSD with a backup copy of EFI synced from a known good state after upgrades. I assembled a full solution from pieces.

I only have the necessary tools available in 16-bit assembly, so this solution is about as crazy as it looks. I regret that I have nothing better, but quite frankly I'm not going to port the tools to x64 for reputation on Stack Exchange. I needed the 16-bit tools anyway for old DOS games I have on archive. What we have here is FreeDOS doing maintenance work on a modern system, which is both fascinating and horrifying at the same time.

You will need:

  1. 8086tiny https://github.com/ecm-pushbx/8086tiny

(Dosbox-x works just as well if you have a working SDL console; but you will also need SHUTDOWN.EXE)

  2. FreeDOS (actually included with 8086tiny, so no more packages)

  3. My SSDFMT, which actually emits an aligned filesystem

  4. My flushbuf, because no emulator I could lay hands on actually forces writes through to the disk. I could easily patch 8086tiny to do so, but this is expedient. flushbuf just opens its argument and calls fdatasync on it.

begin-base64 755 /boot/flushbuf
f0VMRgIBAQMAAAAAAAAAAAIAPgABAAAAeAAgAAAAAABAAAAAAAAAAHgAAAAA
AAAAAAAAAEAAOAABAAAAAAAAAAEAAAAFAAAAAAAAAAAAAAAAACAAAAAAAAAA
AAAAAAAA/gAAAAAAAAD+AAAAAAAAAAAQAAAAAAAAWEiD+AJ1QV9fSDH2uAIA
AAAPBUiFwHwJSJe4SwAAAA8F99h0G1C/AgAAAEiNNCX3ACAAugcAAAC4AQAA
AA8FWJe4PAAAAA8FvwIAAABIjTQl4AAgALoXAAAAuAEAAAAPBb8OAAAA69lV
c2FnZTogZmx1c2hidWYgZGV2aWNlCkVycm9yIQo=
====

(Yes, that's an entire Linux x64 binary.)

  5. The ability to boot removable media in case something goes wrong.
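
If you would rather not run a pasted binary, a rough stand-in for flushbuf can be sketched in shell. This is not the tool above: it assumes a sync command that accepts -d with a file argument (GNU coreutils 8.24 or newer does, and calls fdatasync on the named file). The function name flush_dev is made up. As far as I can tell, the begin-base64 header on the block above is the format uudecode (from sharutils) reads, if you do want the original.

```shell
#!/bin/sh
# Rough stand-in for flushbuf (assumption: sync supports -d FILE,
# as in GNU coreutils 8.24+). flush_dev is a hypothetical name.
flush_dev() {
    if [ "$#" -ne 1 ]; then
        echo "Usage: flush_dev device" >&2
        return 2
    fi
    # Flush the data of this one file or block device to stable storage.
    sync -d "$1"
}
```

Usage would be flush_dev /dev/sda1 wherever the scripts below call /boot/flushbuf /dev/sda1.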

Setting up

This answer is written using /dev/sda1 and /dev/sdb1 as devices. If the identities of /dev/sda and /dev/sdb are not stable on your system, you must use /dev/disk/by-id/... values instead. /dev/disk/by-uuid won't work. Trust me on this. You should also write scripts for absolutely everything, because typing /dev/disk/by-id/... devices every time is a pain. I found the best place to keep your scripts is in /boot.
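
To find out which persistent name currently maps to which kernel device, something like the following sketch can help. The function name is hypothetical, and the directory is a parameter only so that it can be tried on a test directory; by default it reads /dev/disk/by-id.

```shell
#!/bin/sh
# List persistent device names and the kernel device each one points to.
# list_by_id is a hypothetical helper, not part of any package.
list_by_id() {
    dir=${1:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Only symlinks are interesting; skip anything else.
        [ -L "$link" ] || continue
        printf '%s -> %s\n' "$link" "$(readlink -f "$link")"
    done
}
```

Run list_by_id, pick the two entries for your ESP partitions, and put them in variables at the top of each script instead of sda1 and sdb1.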

  1. Get SSDFMT.COM onto 8086tiny's fd.img (mtools or mount -o loop will work).

  2. As root do

   (cd /boot/efi && tar -zcf /boot/efi-bak.tgz *)
   umount /boot/efi
   STTY_SAVE=`stty -g` 
   stty cols 80 rows 25
   ./8086tiny bios.bin fd.img hd.img
   SSDFMT
  3. In SSDFMT, select disk 1 (the only disk visible), your actual sector size on your SSD, and select compatibility 5. Force LBA won't work with 8086tiny.

  4. In order to avoid a wedged console later, we need to install FreeDOS on the hard disk.

    QUITEMU
    ./8086tiny bios.bin fd.img hd.img
    SYS C:
    XCOPY /e *.* C:\
    QUITEMU
    /boot/flushbuf /dev/sda1
    stty "$STTY_SAVE"
  5. This installs, but CONFIG.SYS and AUTOEXEC.BAT still point to files on the floppy. Let's fix that.
    mount /boot/efi
    sed -i 's/A:\\/C:\\/g' /boot/efi/CONFIG.SYS
    sed -i 's/A:\\/C:\\/g' /boot/efi/AUTOEXEC.BAT
  6. Time to put EFI back:
    (cd /boot/efi && tar -zxf /boot/efi-bak.tgz)
  7. Create the script to sync the mirror after upgrading and rebooting (to verify the boot works)
    #!/bin/sh -x
    umount /boot/efi
    /boot/flushbuf /dev/sda1 # replace sda1 and sdb1 with your devices
    cat /dev/sda1 > /dev/sdb1
    /boot/flushbuf /dev/sdb1
    mount /boot/efi
  8. Create the reverse script (for when the filesystem cannot be repaired)
    #!/bin/sh -x
    [ -d /boot/efi/EFI ] && umount /boot/efi # probably won't be mounted
    /boot/flushbuf /dev/sdb1 # replace sda1 and sdb1 with your devices
    cat /dev/sdb1 > /dev/sda1
    /boot/flushbuf /dev/sda1
    mount /boot/efi
  9. Run the script given in step 7
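
After the sync script has run, it may be worth checking that the copy really is byte-for-byte identical. A minimal sketch, assuming both partitions have exactly the same size and that the primary one is unmounted while comparing; verify_mirror is a made-up name:

```shell
#!/bin/sh
# Compare the primary ESP with its backup copy.
# Assumes equal partition sizes; verify_mirror is a hypothetical name.
verify_mirror() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}
```

For example, verify_mirror /dev/sda1 /dev/sdb1 as root, with /boot/efi unmounted.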

Note that the second disk isn't fully registered to boot as an EFI disk yet. I'm pretty sure I fixed this in BIOS setup, not from my OS.
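
If you prefer to register the second disk from the OS, efibootmgr can create an additional boot entry. The sketch below only prints the command so it can be reviewed first. Everything in it is an assumption for illustration: the disk, the partition number, the label, and especially the loader path (on a Debian install with Secure Boot it is more likely \EFI\debian\shimx64.efi). Check what is actually in your /boot/efi/EFI directory.

```shell
#!/bin/sh
# Sketch: build an efibootmgr command that registers the ESP on the
# second disk as an extra UEFI boot entry. All values are assumptions
# and must be adapted to the actual system.
DISK=${DISK:-/dev/sdb}
PART=${PART:-1}
LABEL=${LABEL:-debian (second disk)}
LOADER_DEFAULT='\EFI\debian\grubx64.efi'
LOADER=${LOADER:-$LOADER_DEFAULT}

add_boot_entry_cmd() {
    # Print the command rather than run it.
    printf "efibootmgr --create --disk %s --part %s --label '%s' --loader '%s'\n" \
        "$DISK" "$PART" "$LABEL" "$LOADER"
}

add_boot_entry_cmd
```

Run the printed command as root once it looks right, then check the result with efibootmgr -v.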

Should fsck.msdos find errors on boot, it might not be able to fix them. If it finds nontrivial errors, the torn write fixer from SSDFMT needs to run first. How? By booting the EFI partition in the emulator (!).

The launcher script should be saved:

#!/bin/sh -x
umount /boot/efi # making sure
./8086tiny bios.bin /dev/null @/dev/sda1
fsck.msdos /dev/sda1
/boot/flushbuf /dev/sda1
mount /boot/efi

Note that running this script leaves you at a DOS prompt after performing the torn write check/repair pass. You can run QUITEMU to return. I'm pretty sure there are honest differences of opinion on whether to run CHKDSK inside the emulator. I don't have enough experience to know which is better.

Bonus: I discovered you can get faster boots if you defragment the EFI partition after Debian major updates (as far as I can tell it's the boot logo that causes this). You can grab DEFRAG.EXE from ibiblio, copy it into your EFI partition, and run it later.

Joshua