Mbm329 (talk) 06:48, April 20, 2015 (UTC)

The purpose of this document is to chronicle the setup of a RAID1 boot within CentOS 7. In trying to set this up, I've encountered several pitfalls and complexities. I hope to address those here and provide an end-to-end guide for myself and others wishing to do similar activities.


System Description

The Dell server (an R515) appears to be incapable of using an alternate boot path in "BIOS" boot mode, so I had to go with "UEFI" boot mode. From what I found while researching this setup, UEFI seems to be more "extensible" anyway, so this worked out well. Unfortunately, the CentOS 7 Installation DVD I'm using doesn't handle RAID1 setups for /boot very well. Because of this, I needed to execute some manual steps to end up with a complete, working installation.

In an attempt to cut costs, I ordered the server without a RAID card and with only a single drive, since I already had SATA drives here that I could use. In retrospect, it probably cost me more in time and effort to get the additional drive carriers and screws needed to fit the drives into the hot-swap bays. If you go this route, I would advise just spending the extra money on the drives.

But if you're interested in proceeding down this path, here's a Google search of the part number for the drive carriers. I bought mine direct from Dell as a spare part. Here's where I bought the screws to mount the carriers to the drives. I had other drives I'm using in the server, so I bought enough carriers to utilize them.

One interesting item to note is that I couldn't order the server without a drive. The drive that came packaged with the R515, and I assume all Dell servers for that matter, had a diagnostic partition loaded on it. Because I'm not 100% sure if I'll ever want or need these diagnostic utilities, I decided to keep them and also mirror them onto the alternate drive of my RAID1 setup. If you're following this guide and you do not have a diagnostic partition packaged with your drive or you wish to remove it, these instructions do not vary much. Just be aware of those steps and skip them accordingly. In my case, the diagnostic partition is partition 1. So skipping should be easy.

With all that said, let's begin the quest!

Installation

BIOS Boot -> UEFI Boot

As mentioned previously, BIOS boot mode does not give an option for an alternate boot path if the primary drive goes south. Before we can boot the DVD and begin setup, we must change the boot mode to "UEFI". This may also come into play if you are booting from drives larger than 2TB.

  1. Set Boot Mode to UEFI
    1. Press F2 (System Setup) when visible in the top right corner during POST
    2. Navigate to "Boot Settings"
    3. Change Boot Mode from "BIOS" to "UEFI"
    4. Press Escape twice and select "Save changes and exit"
  2. Set DVD-ROM as primary boot device
    1. Press F11 (UEFI Boot Manager) when visible in the top right corner during POST
    2. Navigate to "UEFI Boot Sequence"
    3. Press Enter to enter "UEFI Boot Sequence"
    4. Press Enter again to REALLY enter "UEFI Boot Sequence"
    5. Use + to elevate the DVD-ROM to the top of the list
    6. Press Enter to accept the changes
    7. Press Escape and answer "Y" to save the changes
    8. Press Escape to exit the UEFI Boot Settings screen
    9. Select "Continue" and press Enter

Mirror Partition Table to Secondary Drive

Background

Dell loads, on the installed drive, an MBR (DOS) partition table containing a diagnostic partition with a handful of utilities and an empty EFI boot partition. This layout causes an error during installation in BIOS boot mode, because the offset of the first partition does not leave enough room for GRUB's core.img to be installed on the drive when the RAID modules are needed for the /boot partition.

Generally, this core.img requires more than 30 sectors of space prior to the first partition. To solve this problem, we could keep the MBR partition table and move the partitions further out on the drive. Or, since we're already using UEFI because BIOS boot mode does not allow alternate boot paths, we could go with a GPT partition table. With UEFI, we no longer need core.img to load the GRUB boot loader. However, the existing layout is still not optimal because the first partition starts at sector 20, which does not give good partition alignment for performance. Most guidance on partition alignment recommends starting every partition on a 1MB boundary, since that is a multiple of the common block sizes and so performs well on most drives.

Also of note is the size of the EFI partition (partition 2). Dell sizes this partition at 2GB, which by most accounts is overkill. I decided to lower it to 128MB, which is still more than an EFI partition needs.

To achieve a proper GPT layout while still keeping the Dell diagnostic utilities, I utilize the second drive to build the initial partition table.

Currently, Drive 1 (/dev/sda), looks like this:

Partition  Type                  Size     Start Sector  End Sector
1          Diagnostic            32 MB    20            65579
2          EFI System Partition  2048 MB  67584         4261887

For Drive 2 (/dev/sdc), I kept the diagnostic partition at the same size, since it's rather small and Dell has already created the filesystem. We also create a 3rd partition for /boot. Even though the installer can't RAID it properly, having it as a placeholder on both drives means the installer will create the LVM Physical Volume, which contains all other filesystems, as the 4th partition. NOTE: Due to how I have populated the drives in my system, my second drive is /dev/sdc. Yours may be /dev/sdb.

Partition  Type                  Size    Start Sector  End Sector
1          Diagnostic            32 MB   2048          67607
2          EFI System Partition  128 MB  69632         331776
3          /boot                 500 MB  333824        1357824

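The math behind this is fairly simple. The idea is to keep each start sector on a 1MB boundary. The sector size of the drive is 512 bytes (see the fdisk output in the next section), so 1MB is 2048 sectors. In the formulas below, whole() means the integer part of the division.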

  1. Calculate Diagnostic Partition Start/End Sectors
    1. Start Sector = 2048 (1MB)
    2. End Sector = Start Sector + (sda1EndSector - sda1StartSector)
  2. Calculate EFI Partition Start/End Sectors
    1. Start Sector = 2048 * (whole(sdc1EndSector / 2048) + 1)
    2. End Sector = sdc2StartSector + ((128 * 1024 * 1024) / 512)
  3. Calculate /boot Filesystem Partition Start/End Sectors
    1. Start Sector = 2048 * (whole(sdc2EndSector / 2048) + 1)
    2. End Sector = sdc3StartSector + ((500 * 1024 * 1024) / 512)
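
The arithmetic above can be sketched as a couple of shell functions. This is a minimal sketch, assuming 512-byte sectors and 1MB (2048-sector) alignment; the function names are hypothetical helpers, not part of any tool. Note that end sectors are inclusive, so the formulas as written produce partitions one sector larger than the nominal size (262145 sectors for the 128MB EFI partition). That is harmless here and matches the values used in the rest of this guide.

```shell
#!/bin/sh
# Hypothetical helpers reproducing the sector math above.
# Assumes 512-byte sectors, so 1MB = 2048 sectors.
SECTORS_PER_MB=2048

# First 1MB boundary strictly after the given end sector.
next_aligned_start() {
    echo $(( SECTORS_PER_MB * ( $1 / SECTORS_PER_MB + 1 ) ))
}

# End sector for a partition of $2 MB starting at sector $1,
# using the same formula as the guide (start + size in sectors).
end_sector() {
    echo $(( $1 + $2 * 1024 * 1024 / 512 ))
}

sdc1_start=2048
sdc1_end=$(( sdc1_start + (65579 - 20) ))      # same length as sda1
sdc2_start=$(next_aligned_start "$sdc1_end")
sdc2_end=$(end_sector "$sdc2_start" 128)
sdc3_start=$(next_aligned_start "$sdc2_end")
sdc3_end=$(end_sector "$sdc3_start" 500)

echo "1 $sdc1_start $sdc1_end"
echo "2 $sdc2_start $sdc2_end"
echo "3 $sdc3_start $sdc3_end"
```

Run as-is, this prints the same start and end sectors as the Drive 2 table above. To get partitions of exactly the nominal size, subtract 1 from each computed end sector.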

Setup Initial Partitions

First things first: we need to get to the command line. There are two methods of doing this, with and without SSH. Running the commands over SSH may make copy/paste easier, and the terminal width may be easier on the eyes.

  1. Boot the system from the Installation DVD.
  2. To perform the tasks in the following section with SSH
    1. Highlight "Install CentOS 7"
    2. Press "e" key to edit boot parameters
    3. On line starting with "linuxefi", add "sshd" to the end of the line.
    4. Press CTRL+x to boot
    5. On the welcome screen, press CTRL+ALT+F2 to get a shell prompt
    6. To get a list of interfaces to use, run: ip link show
    7. To add an IP address to connect to, run: ip addr add <IP ADDRESS>/<CIDR PREFIX> dev <INTERFACE>
    8. Use your favorite SSH client to connect as root to the host on the IP address you assigned. There will be no password.
  3. To perform the tasks in the following section without SSH
    1. Highlight "Install CentOS 7"
    2. Press Enter key
    3. Press CTRL+ALT+F2 to get a shell prompt

Now that we have a command-line prompt, let's gather some information from the partition tables of the drives before we begin. If you are logged in via SSH, you may want to copy this information in case you later want to restore the partitions to their original state (see "Reset Back Partitions" section). If you're on the console, maybe snap a picture. Here I've captured the output of both fdisk and parted for good measure. The "Disk identifier" may be worth writing down in either case if you decide to reset your system. In the Linux world it isn't used much, but from what I understand, in the Windows world it can be used by licensing software.

# fdisk -l /dev/sda

Disk /dev/sda: 500.1 GB, 500107862016 bytes, 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0xd4e911ee

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1              20       65579       32780   de  Dell Utility
/dev/sda2   *       67584     4261887     2097152    c  W95 FAT32 (LBA)
# parted /dev/sda unit s print
Model: ATA WDC WD5003ABYX-1 (scsi)
Disk /dev/sda: 976773168s
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End       Size      Type     File system  Flags
 1      20s     65579s    65560s    primary  fat16        diag
 2      67584s  4261887s  4194304s  primary  fat32        boot, lba
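
If you save the fdisk output to a file, the "Disk identifier" can be pulled back out of it later. The following is a small sketch, not part of the original procedure; disk_identifier is a hypothetical helper name, and it assumes the "Disk identifier:" line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the MBR disk identifier found in
# `fdisk -l` output supplied on stdin.
disk_identifier() {
    awk '/^Disk identifier:/ { print $3 }'
}

# Example using an excerpt of the output shown above.
sample='Disk label type: dos
Disk identifier: 0xd4e911ee'
printf '%s\n' "$sample" | disk_identifier
```

On the live system the same helper could be fed directly, for example with fdisk -l /dev/sda | disk_identifier.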

Create the partitions on Drive 2 (/dev/sdc).

# parted /dev/sdc mklabel gpt mkpart Dell_Utility fat16 2048s 67607s mkpart EFI fat32 69632s 331776s mkpart BOOTFS ext4 333824s 1357824s set 1 diag on set 2 boot on set 3 raid on unit mib print align-check opt 1 align-check opt 2 align-check opt 3
Model: ATA ST320DM000-1BD14 (scsi)
Disk /dev/sdc: 305245MiB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start    End      Size     File system  Name          Flags
 1      1.00MiB  33.0MiB  32.0MiB               Dell_Utility  diag
 2      34.0MiB  162MiB   128MiB                EFI           boot
 3      163MiB   663MiB   500MiB                BOOTFS        raid

1 aligned
2 aligned
3 aligned
Information: You may need to update /etc/fstab.

Copy data from Dell Diagnostic Partition of Drive 1 to Drive 2

# dd if=/dev/sda1 of=/dev/sdc1
65560+0 records in
65560+0 records out
33566720 bytes (34 MB) copied, 0.958902 s, 35.0 MB/s
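
Before wiping anything in the next step, it may be worth confirming that the copy is byte-identical. This is an optional sketch, not a step I ran originally; verify_copy is a hypothetical helper, and it assumes cmp is available in your environment and that both partitions are the same size (they are here, at 65560 sectors each).

```shell
#!/bin/sh
# Hypothetical helper: compare two files or block devices byte for byte.
# Prints "identical" and succeeds, or prints "different" and fails.
verify_copy() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}

# Intended use in this guide (run as root, after the dd above):
#   verify_copy /dev/sda1 /dev/sdc1
```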

Wipe partition table on Drive 1 (/dev/sda)

# dd if=/dev/zero of=/dev/sda bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000375992 s, 1.4 MB/s

Replicate partition table from Drive 2 to Drive 1

# sgdisk -R /dev/sda /dev/sdc
The operation has completed successfully.

Assign new GUIDs to Drive 1 and its partitions so that they differ from those on Drive 2.

# sgdisk -G /dev/sda
The operation has completed successfully.

Copy data from Dell Diagnostic Partition of Drive 2 to Drive 1

# dd if=/dev/sdc1 of=/dev/sda1
65560+0 records in
65560+0 records out
33566720 bytes (34 MB) copied, 0.988221 s, 34.0 MB/s

Reboot system for good measure since partition table has been changed

# reboot

Finish Install with DVD

Now that we have identical partitions on both drives that will participate in our RAID1 boot setup, we can install the OS and allow disk setup to use LVM for all other filesystems. SSH will not be needed during this step, so just boot from the DVD and select "Install CentOS 7". Choose the proper keyboard setup and click "Continue". Since I've noticed that the installer seems to work the hostname into the LVM setup (the default Volume Group is named "centos_<hostname>"), I complete the networking setup first, followed by date/time so that we can enable NTP.

In the "Installation Destination" screen, you will want to select both drive icons you wish to use as the boot drives. Then ensure the "I will configure partitioning" is selected and click "Done". You will be taken to a page where you can complete the disk setup. I used the following layout:

Device / VG  Device Type         Mount Point  Filesystem            Size (MiB)
sda2         Standard Partition  /boot/efi    EFI System Partition  128
sda3         Standard Partition  /boot        ext4                  500
vg00         LVM                 /            ext4                  3072
vg00         LVM                 swap         swap                  6144
vg00         LVM                 /home        ext4                  1024
vg00         LVM                 /opt         ext4                  1024
vg00         LVM                 /tmp         ext4                  1024
vg00         LVM                 /usr         ext4                  3072
vg00         LVM                 /var         ext4                  2048

Notice the devices "sda2" and "sda3" are listed as Standard Partition. For sda2, this is expected, as we cannot RAID this volume. For sda3, the installer does not work as intended when making this volume a RAID1 device, so we'll build the RAID1 device by hand in the next section below.

For vg00 above, when configuring the "/" partition, I renamed the Volume Group from "centos_<hostname>" to "vg00". Also, because my drives are not exactly identical, I chose Size Policy = Fixed and gave it a size fitting within the smaller drive (295 GiB).

When the configuration is satisfactory, click "Done". A window will appear with a summary of changes showing "Destroy Format" and "Create Device/Format". Click "Accept Changes" when satisfied.

Of course, add the root password and a standard user account for yourself.

When the install is complete, click "Reboot".

After Installation

Update and Install Additional Utilities

TIP: If you had connected to the server via SSH during the install, you should remove the host-key from your ~/.ssh/known_hosts file on the client system. Example:

[client]$ ssh-keygen -R 10.1.1.2
/home/user/.ssh/known_hosts updated.
Original contents retained as /home/user/.ssh/known_hosts.old

Update to latest patches. This will also install any newer kernels needed, which will assist with creating initrd images later.

# yum -y update

Install helper apps to assist in completion of the rest of the exercise

# yum -y install patch gdisk

Reboot system to boot into fresh kernel and updated packages

# reboot

Configure Alternate Drive for Failover

Move /boot to RAID1

Create RAID1 device from a "missing" drive and sdc3. This will become /boot later.

# mdadm --create /dev/md/boot --level=1 --raid-devices=2 --metadata=default --bitmap=internal missing /dev/sdc3
mdadm: array /dev/md/boot started.

Create ext4 filesystem on the device just created.

# mkfs.ext4 /dev/md/boot
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
Stride=4 blocks, Stripe width=4 blocks
128016 inodes, 511680 blocks
25584 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=34078720
63 block groups
8192 blocks per group, 8192 fragments per group
2032 inodes per group
Superblock backups stored on blocks: 
	8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

Add entry into /etc/mdadm.conf for newly created RAID device

# mdadm --examine --scan | grep /dev/md/boot >>/etc/mdadm.conf
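
The command above appends every time it is run, so repeating the step would leave duplicate ARRAY lines in /etc/mdadm.conf. A minimal sketch of an append that is safe to repeat follows; append_once is a hypothetical helper name, not an mdadm feature.

```shell
#!/bin/sh
# Hypothetical helper: append line $1 to file $2 unless an identical
# line is already present. Creates the file if it does not exist.
append_once() {
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Intended use in this guide (run as root):
#   mdadm --examine --scan | grep /dev/md/boot |
#       while read -r line; do append_once "$line" /etc/mdadm.conf; done
```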

Copy over /boot (/dev/sda3) to /tmp/boot (/dev/md/boot), modify /etc/fstab with new device, and remount filesystems

# umount /boot/efi
# mkdir /tmp/boot
# mount /dev/md/boot /tmp/boot
# cd /boot
# cp -a . /tmp/boot
# cd
# umount /boot
# umount /tmp/boot
# sed -i "s%^UUID=$(blkid /dev/sda3 | awk -F\" '{print $2}') %/dev/md/boot            %" /etc/fstab
# mount /boot
# mount /boot/efi
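
Since a bad /etc/fstab can leave the system unbootable, it can help to preview the substitution before applying it with sed -i. The sketch below performs the same kind of replacement as the sed command above, but reads fstab content on stdin and writes the result to stdout. preview_fstab_swap is a hypothetical helper, and the UUIDs in the example are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: replace a leading "UUID=<uuid>" device field
# with a new device path. $1 = old filesystem UUID, $2 = new device.
# Reads fstab content on stdin and prints the modified content.
preview_fstab_swap() {
    sed "s%^UUID=$1\([[:space:]]\)%$2\1%"
}

# Example with made-up UUIDs; only the /boot line should change.
preview_fstab_swap 0a1b2c3d-0000-4000-8000-000000000001 /dev/md/boot <<'EOF'
UUID=0a1b2c3d-0000-4000-8000-000000000001 /boot     ext4 defaults   1 2
UUID=ABCD-1234                            /boot/efi vfat umask=0077 0 0
EOF
```

On the live system, the old UUID would come from blkid /dev/sda3 as in the command above, and the input would be /etc/fstab.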

Add /dev/sda3 to the /dev/md/boot RAID1 device

# mdadm /dev/md/boot --add /dev/sda3
mdadm: added /dev/sda3

Create a new initrd in /boot for pre-loading the system from RAID and LVM (we changed /etc/mdadm.conf and /etc/fstab)

# mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
# dracut

Add UUID for /dev/md/boot to /etc/default/grub so that the grub config will contain it.

# sed -i "s/\(rd.md.uuid\)/\1=$(mdadm --detail /dev/md/boot | awk '/UUID/ {print $3}') \1/" /etc/default/grub
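
To see what the sed expression above does, here is the same substitution applied to a sample line. This is a sketch with made-up UUIDs; add_md_uuid is a hypothetical helper that reads the file content on stdin. Note that the expression only has an effect if the line already contains an rd.md.uuid entry (the installer adds one for the RAID1 device under the LVM Physical Volume), and it inserts the new entry in front of the first existing one.

```shell
#!/bin/sh
# Hypothetical helper: insert "rd.md.uuid=<new>" before the first
# existing rd.md.uuid entry on each line. $1 = UUID of the new array.
add_md_uuid() {
    sed "s/\(rd\.md\.uuid\)/\1=$1 \1/"
}

# Made-up UUIDs, for illustration only.
echo 'GRUB_CMDLINE_LINUX="rd.lvm.lv=vg00/root rd.md.uuid=11111111:22222222:33333333:44444444 rhgb quiet"' |
    add_md_uuid aaaaaaaa:bbbbbbbb:cccccccc:dddddddd
```

The output line carries both entries, the new UUID first and the original one after it. Lines without rd.md.uuid pass through unchanged.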

Patch GRUB2 due to bug - https://bugs.centos.org/view.php?id=7651

# cp -a /usr/share/grub/grub-mkconfig_lib /usr/share/grub/grub-mkconfig_lib.orig
# vi /tmp/grub-mkconfig_lib.patch

Place following data into file

--- a/util/grub-mkconfig_lib.in	2014-06-30 16:16:11.000000000 +0000
+++ a/util/grub-mkconfig_lib.in	2014-12-08 23:05:56.936903046 +0000
@@ -263,13 +263,14 @@ 
 
 version_find_latest ()
 {
-  version_find_latest_a=""
-  for i in "$@" ; do
-    if version_test_gt "$i" "$version_find_latest_a" ; then
-      version_find_latest_a="$i"
-    fi
-  done
-  echo "$version_find_latest_a"
+  {
+    for i in "$@"; do
+      echo $i
+    done | grep -v rescue | sed 's/.x86_64$//g' | sort -V -r | sed 's/$/.x86_64/g'
+    for i in "$@"; do
+      echo $i
+    done | grep rescue | sort -V
+  } | head -n 1
 }
 
 # One layer of quotation is eaten by "" and the second by sed; so this turns

Then patch the /usr/share/grub/grub-mkconfig_lib file with the patch just created.

# patch -b /usr/share/grub/grub-mkconfig_lib /tmp/grub-mkconfig_lib.patch
patching file /usr/share/grub/grub-mkconfig_lib

Re-build the grub2 configuration now that /dev/md/boot is our new boot device

# cp -a /boot/efi/EFI/centos/grub.cfg /boot/efi/EFI/centos/grub.cfg.orig
# grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-3.10.0-229.1.2.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-229.1.2.el7.x86_64.img
/usr/sbin/grub2-probe: warning: Couldn't find physical volume ‘(null)’. Some modules may be missing from core image..
/usr/sbin/grub2-probe: warning: Couldn't find physical volume ‘(null)’. Some modules may be missing from core image..
/usr/sbin/grub2-probe: warning: Couldn't find physical volume ‘(null)’. Some modules may be missing from core image..
Found linux image: /boot/vmlinuz-3.10.0-229.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-229.el7.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-90e17130631b4b7193eaeb900f3b1631
Found initrd image: /boot/initramfs-0-rescue-90e17130631b4b7193eaeb900f3b1631.img
done

Configure Alternate UEFI Firmware Boot Path

Copy contents of /boot/efi partition (/dev/sda2) to second drive

# umount /dev/sda2
# dd if=/dev/sda2 of=/dev/sdc2
262145+0 records in
262145+0 records out
134218240 bytes (134 MB) copied, 2.49582 s, 53.8 MB/s
# mount /boot/efi

Add in an EFI entry for Drive 2 high availability booting

# efibootmgr -v -c -L CentOS-Alt -d /dev/sdc -p 2 -l '\EFI\centos\shim.efi'
BootCurrent: 0005
Timeout: 0 seconds
BootOrder: 0007,0005,0000,0001,0002,0003,0004,0006
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS	HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)File(\EFI\centos\shim.efi)
Boot0006* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)
Boot0007* CentOS-Alt	HD(2,11000,40001,7a8886e6-0a6b-4a4d-b344-b514a4b82338)File(\EFI\centos\shim.efi)

Keep an eye out for when the drives are synced up.

# cat /proc/mdstat
Personalities : [raid1] 
md126 : active raid1 sda3[2] sdc3[1]
      511680 blocks super 1.2 [2/1] [_U]
      	resync=DELAYED
      bitmap: 1/1 pages [4KB], 65536KB chunk

md127 : active raid1 sdc4[1] sda4[0]
      309374976 blocks super 1.2 [2/2] [UU]
      [========>............]  resync = 41.0% (127129728/309374976) finish=28.8min speed=105261K/sec
      bitmap: 3/3 pages [12KB], 65536KB chunk

unused devices: <none>

Example of synced RAID volumes

# cat /proc/mdstat
Personalities : [raid1] 
md126 : active raid1 sda3[2] sdc3[1]
      511680 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md127 : active raid1 sdc4[1] sda4[0]
      309374976 blocks super 1.2 [2/2] [UU]
      bitmap: 0/3 pages [0KB], 65536KB chunk

unused devices: <none>
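
Rather than judging the two outputs above by eye, the check can be scripted. This is a rough sketch; md_all_synced is a hypothetical helper, and it assumes the /proc/mdstat format shown above: an underscore inside the [UU] status marks a missing member, and the words "resync" or "recovery" mark work still in progress.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the mdstat text on stdin shows
# no missing member and no resync or recovery in progress.
md_all_synced() {
    ! grep -Eq '\[[U_]*_[U_]*\]|resync|recovery'
}

# Example using an excerpt of the first (unsynced) output above.
sample='md126 : active raid1 sda3[2] sdc3[1]
      511680 blocks super 1.2 [2/1] [_U]
      resync=DELAYED'
if printf '%s\n' "$sample" | md_all_synced; then
    echo "all arrays synced"
else
    echo "still syncing or degraded"
fi
```

On the live system the input would be /proc/mdstat itself, for example md_all_synced < /proc/mdstat && echo "safe to reboot".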

Collect output from efibootmgr for comparison after reboot (next).

# efibootmgr -v
BootCurrent: 0005
Timeout: 0 seconds
BootOrder: 0007,0005,0000,0001,0002,0003,0004,0006
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS	HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)File(\EFI\centos\shim.efi)
Boot0006* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)
Boot0007* CentOS-Alt	HD(2,11000,40001,7a8886e6-0a6b-4a4d-b344-b514a4b82338)File(\EFI\centos\shim.efi)

Reboot to ensure everything is working as expected.
NOTE: DO NOT REBOOT UNLESS MDSTAT OUTPUT LOOKS SYNC'D AS ABOVE!!!

# reboot

Collect output from efibootmgr for comparison against output taken before reboot.

# efibootmgr -v
BootCurrent: 0007
Timeout: 0 seconds
BootOrder: 0007,0005,0000,0001,0002,0003,0004,0006,0008,0009
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS	HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)File(\EFI\centos\shim.efi)
Boot0006* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)
Boot0007* CentOS-Alt	HD(2,11000,40001,7a8886e6-0a6b-4a4d-b344-b514a4b82338)File(\EFI\centos\shim.efi)
Boot0008* CentOS	HD(2,11000,40001,7a8886e6-0a6b-4a4d-b344-b514a4b82338)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 2	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,7a8886e6-0a6b-4a4d-b344-b514a4b82338)

As we can see, my Dell UEFI firmware appears to auto-discover my /dev/sdc drive (7a8886e6-0a6b-4a4d-b344-b514a4b82338), as it added entries for it automatically (Boot0008 and Boot0009). If this is the case on your system, feel free to remove the boot entry you created earlier (CentOS-Alt).

# efibootmgr -v -B -b 7
BootCurrent: 0007
Timeout: 0 seconds
BootOrder: 0005,0000,0001,0002,0003,0004,0006,0008,0009
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS	HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)File(\EFI\centos\shim.efi)
Boot0006* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)
Boot0008* CentOS	HD(2,11000,40001,7a8886e6-0a6b-4a4d-b344-b514a4b82338)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 2	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,7a8886e6-0a6b-4a4d-b344-b514a4b82338)

Confirm which drive the system booted from.

# efibootmgr -v | grep BootCurrent
BootCurrent: 0007
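
BootCurrent only gives the entry number. To see which drive that is, you need the matching BootNNNN line and its partition GUID. A small sketch that prints that line follows; current_boot_entry is a hypothetical helper that reads efibootmgr -v output on stdin, and it assumes the BootCurrent line appears before the entries, as in the outputs above.

```shell
#!/bin/sh
# Hypothetical helper: print the full BootNNNN line that BootCurrent
# points at, reading `efibootmgr -v` output on stdin.
current_boot_entry() {
    awk '
        /^BootCurrent:/ { cur = "Boot" $2 }
        cur != "" && index($0, cur) == 1 { print }
    '
}

# Example using abbreviated lines from the output above.
printf '%s\n' \
    'BootCurrent: 0007' \
    'BootOrder: 0007,0005' \
    'Boot0005* CentOS HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)' \
    'Boot0007* CentOS-Alt HD(2,11000,40001,7a8886e6-0a6b-4a4d-b344-b514a4b82338)' |
    current_boot_entry
```

The GUID in the printed line can then be compared against the partition GUIDs of /dev/sda2 and /dev/sdc2.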

Set proper boot order so that boot paths are Drive 1, Drive 2, DVD, and everything else.

# efibootmgr -v -o 0005,0008,0000,0001,0002,0003,0004,0006,0009
BootCurrent: 0007
Timeout: 0 seconds
BootOrder: 0005,0008,0000,0001,0002,0003,0004,0006,0009
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS	HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)File(\EFI\centos\shim.efi)
Boot0006* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)
Boot0008* CentOS	HD(2,11000,40001,7a8886e6-0a6b-4a4d-b344-b514a4b82338)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 2	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,7a8886e6-0a6b-4a4d-b344-b514a4b82338)
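
Typing out the whole boot order by hand is error-prone. The sketch below builds the new order from the existing one by moving chosen entries to the front and leaving the rest in place. promote_entries is a hypothetical helper; it uses plain POSIX shell, so its working variables are global.

```shell
#!/bin/sh
# Hypothetical helper: move the given entries to the front of a
# comma-separated BootOrder string, keeping the rest in original order.
# $1 = current order, remaining arguments = entries to promote.
promote_entries() {
    order=$1
    shift
    front=""
    rest=",$order,"
    for id in "$@"; do
        case "$rest" in
            *",$id,"*)
                front="$front$id,"
                # Cut ",id," out of the remaining list.
                rest="${rest%%,$id,*},${rest#*,$id,}"
                ;;
        esac
    done
    rest=${rest#,}
    rest=${rest%,}
    echo "$front$rest" | sed 's/,$//'
}

promote_entries 0005,0000,0001,0002,0003,0004,0006,0008,0009 0005 0008
```

This prints the same order passed to efibootmgr -o above. Entries that are not present in the current order are ignored.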

Reboot System off Drive 1

# reboot

Confirm the system booted from Drive 1

# efibootmgr -v | grep BootCurrent
BootCurrent: 0005

Set Drive 2 as next boot path

# efibootmgr -v -n 8
BootNext: 0008
BootCurrent: 0005
Timeout: 0 seconds
BootOrder: 0005,0008,0000,0001,0002,0003,0004,0006,0009
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS	HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)File(\EFI\centos\shim.efi)
Boot0006* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)
Boot0008* CentOS	HD(2,11000,40001,7a8886e6-0a6b-4a4d-b344-b514a4b82338)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 2	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,7a8886e6-0a6b-4a4d-b344-b514a4b82338)

Reboot System off Drive 2

# reboot

Confirm the system booted from Drive 2

# efibootmgr -v | grep BootCurrent
BootCurrent: 0008

Testing Drive Failures

Of course, who would trust this procedure without testing it? We need to test drive failures to avoid a false sense of security. First we'll test the alternate drive, then the primary. These tests were run on a system with hot-swap drive bays. If your system doesn't have hot-swap drive bays, you can perform the same tests by shutting down the system, removing or adding the drive, and then booting back up.

During bootup you may see the following "error". This is most likely because the DVD drive is empty. If you leave it be, it should automatically boot into the OS. If you don't like this, you can always disable the DVD drive from UEFI.

error: failure reading sector 0x0 from 'hd0'.

Press any key to continue...

Alternate Drive

Simulate Immediate Failure of Drive 2

Boot the system.

Review drive and partition layout.

# lsblk -i
NAME            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda               8:0    0 465.8G  0 disk  
|-sda1            8:1    0    32M  0 part  
|-sda2            8:2    0   128M  0 part  
|-sda3            8:3    0   500M  0 part  
| `-md127         9:127  0 499.7M  0 raid1 /boot
`-sda4            8:4    0 295.2G  0 part  
  `-md126         9:126  0   295G  0 raid1 
    |-vg00-swap 253:0    0     6G  0 lvm   [SWAP]
    |-vg00-usr  253:1    0     3G  0 lvm   /usr
    |-vg00-root 253:2    0     3G  0 lvm   /
    |-vg00-home 253:3    0     1G  0 lvm   /home
    |-vg00-opt  253:4    0     1G  0 lvm   /opt
    |-vg00-tmp  253:5    0     1G  0 lvm   /tmp
    `-vg00-var  253:6    0     2G  0 lvm   /var
sdb               8:16   0 931.5G  0 disk  
sdc               8:32   0 298.1G  0 disk  
|-sdc1            8:33   0    32M  0 part  
|-sdc2            8:34   0   128M  0 part  /boot/efi
|-sdc3            8:35   0   500M  0 part  
| `-md127         9:127  0 499.7M  0 raid1 /boot
`-sdc4            8:36   0 295.2G  0 part  
  `-md126         9:126  0   295G  0 raid1 
    |-vg00-swap 253:0    0     6G  0 lvm   [SWAP]
    |-vg00-usr  253:1    0     3G  0 lvm   /usr
    |-vg00-root 253:2    0     3G  0 lvm   /
    |-vg00-home 253:3    0     1G  0 lvm   /home
    |-vg00-opt  253:4    0     1G  0 lvm   /opt
    |-vg00-tmp  253:5    0     1G  0 lvm   /tmp
    `-vg00-var  253:6    0     2G  0 lvm   /var
sdd               8:48   0 931.5G  0 disk  
sde               8:64   0 931.5G  0 disk  
sdf               8:80   0 931.5G  0 disk  
sr0              11:0    1   636M  0 rom   

Identify /dev/sdc (Drive 2) if unsure. Look at the front of the server for a near-steady activity light. Press CTRL+c to end the dd command.

# dd if=/dev/sdc of=/dev/null
^C1128969+0 records in
1128968+0 records out
578031616 bytes (578 MB) copied, 4.72669 s, 122 MB/s

Physically hot-pull /dev/sdc from the server.

Check dmesg for kernel output. Notice how the drive went offline.

# dmesg
...
[  454.528094] md: md127 still in use.
[  454.528095] md: md126 still in use.
[  454.529013] md/raid1:md126: Disk failure on sdc4, disabling device.
md/raid1:md126: Operation continuing on 1 devices.
[  454.529266] md/raid1:md127: Disk failure on sdc3, disabling device.
md/raid1:md127: Operation continuing on 1 devices.
[  454.583338] RAID1 conf printout:
[  454.583343]  --- wd:1 rd:2
[  454.583346]  disk 0, wo:0, o:1, dev:sda3
[  454.583348]  disk 1, wo:1, o:0, dev:sdc3
[  454.595469] RAID1 conf printout:
[  454.595473]  --- wd:1 rd:2
[  454.595476]  disk 0, wo:0, o:1, dev:sda4
[  454.595478]  disk 1, wo:1, o:0, dev:sdc4
[  454.616357] sd 0:0:2:0: [sdc] Synchronizing SCSI cache
[  454.616382] sd 0:0:2:0: [sdc]  
[  454.616384] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[  454.616526] mpt2sas0: removing handle(0x000d), sas_addr(0x500065b36789abe6)
[  454.625952] RAID1 conf printout:
[  454.625955]  --- wd:1 rd:2
[  454.625957]  disk 0, wo:0, o:1, dev:sda3
[  454.625963] RAID1 conf printout:
[  454.625965]  --- wd:1 rd:2
[  454.625967]  disk 0, wo:0, o:1, dev:sda4
[  454.658500] md: unbind<sdc3>
[  454.663928] md: export_rdev(sdc3)

Check RAID status. It should show one drive missing from each RAID1 array associated with this drive.

# cat /proc/mdstat
Personalities : [raid1] 
md126 : active raid1 sdc4[1](F) sda4[0]
      309374976 blocks super 1.2 [2/1] [U_]
      bitmap: 1/3 pages [4KB], 65536KB chunk

md127 : active raid1 sda3[2]
      511680 blocks super 1.2 [2/1] [U_]
      bitmap: 0/1 pages [0KB], 65536KB chunk

unused devices: <none>
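
On a system with more arrays, it can be handy to list only the ones that have lost a member. A rough sketch follows; degraded_arrays is a hypothetical helper, and it assumes the mdstat layout shown above, where the array name starts a line and the [U_] status follows on a later line.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every md array whose status
# shows a missing member, reading /proc/mdstat content on stdin.
degraded_arrays() {
    awk '
        /^md[^ ]* :/        { name = $1 }
        /\[[U_]*_[U_]*\]/   { print name }
    '
}

# Example using an excerpt of the output above.
sample='md126 : active raid1 sdc4[1](F) sda4[0]
      309374976 blocks super 1.2 [2/1] [U_]
md127 : active raid1 sda3[2]
      511680 blocks super 1.2 [2/1] [U_]'
printf '%s\n' "$sample" | degraded_arrays
```

For the excerpt above this lists both md126 and md127, matching what we expect after pulling Drive 2.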

Check the mounted filesystems. If /dev/sdc2 is still shown mounted on /boot/efi, go ahead and unmount it. You may instead see an error like the one below. In either case, unmount /boot/efi and then mount it again. This time, /dev/sda2 should show up.

# df
df: ‘/boot/efi’: Input/output error
Filesystem            1K-blocks   Used Available Use% Mounted on
/dev/mapper/vg00-root   3030800  30216   2826916   2% /
devtmpfs                1878604      0   1878604   0% /dev
tmpfs                   1921220      0   1921220   0% /dev/shm
tmpfs                   1921220   8712   1912508   1% /run
tmpfs                   1921220      0   1921220   0% /sys/fs/cgroup
/dev/mapper/vg00-usr    3030800 876624   1980508  31% /usr
/dev/md127               487314 153730    303904  34% /boot
/dev/mapper/vg00-var    1998672  78916   1798516   5% /var
/dev/mapper/vg00-opt     999320   2564    927944   1% /opt
/dev/mapper/vg00-tmp     999320   2604    927904   1% /tmp
/dev/mapper/vg00-home    999320   2580    927928   1% /home
# umount /boot/efi
# mount /boot/efi
# df
Filesystem            1K-blocks   Used Available Use% Mounted on
/dev/mapper/vg00-root   3030800  30216   2826916   2% /
devtmpfs                1878604      0   1878604   0% /dev
tmpfs                   1921220      0   1921220   0% /dev/shm
tmpfs                   1921220   8712   1912508   1% /run
tmpfs                   1921220      0   1921220   0% /sys/fs/cgroup
/dev/mapper/vg00-usr    3030800 876624   1980508  31% /usr
/dev/md127               487314 153730    303904  34% /boot
/dev/mapper/vg00-var    1998672  78916   1798516   5% /var
/dev/mapper/vg00-opt     999320   2564    927944   1% /opt
/dev/mapper/vg00-tmp     999320   2604    927904   1% /tmp
/dev/mapper/vg00-home    999320   2580    927928   1% /home
/dev/sda2                130800   9980    120820   8% /boot/efi
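
If you would rather not eyeball the df listing, here is a small sketch that prints only the device backing /boot/efi. It assumes df output in the layout shown above; efi_backing_device is a name I made up.

```shell
# Hypothetical helper: print the device mounted on /boot/efi, given df output
# on stdin. Prints nothing when /boot/efi is not listed (for example, while df
# is reporting the Input/output error).
efi_backing_device() {
    awk '$NF == "/boot/efi" { print $1 }'
}

# Usage on a live system: df | efi_backing_device
```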

Reboot to test full boot process with only Drive 1.

# reboot

Review drive and partition layout.

# lsblk -i
NAME            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda               8:0    0 465.8G  0 disk  
|-sda1            8:1    0    32M  0 part  
|-sda2            8:2    0   128M  0 part  /boot/efi
|-sda3            8:3    0   500M  0 part  
| `-md127         9:127  0 499.7M  0 raid1 /boot
`-sda4            8:4    0 295.2G  0 part  
  `-md126         9:126  0   295G  0 raid1 
    |-vg00-swap 253:0    0     6G  0 lvm   [SWAP]
    |-vg00-usr  253:1    0     3G  0 lvm   /usr
    |-vg00-root 253:2    0     3G  0 lvm   /
    |-vg00-home 253:3    0     1G  0 lvm   /home
    |-vg00-opt  253:4    0     1G  0 lvm   /opt
    |-vg00-tmp  253:5    0     1G  0 lvm   /tmp
    `-vg00-var  253:6    0     2G  0 lvm   /var
sdb               8:16   0 931.5G  0 disk  
sdc               8:32   0 931.5G  0 disk  
sdd               8:48   0 931.5G  0 disk  
sde               8:64   0 931.5G  0 disk  
sr0              11:0    1   636M  0 rom   

Check to ensure /boot/efi is mounted with /dev/sda2.

# df
Filesystem            1K-blocks   Used Available Use% Mounted on
/dev/mapper/vg00-root   3030800  30236   2826896   2% /
devtmpfs                1878604      0   1878604   0% /dev
tmpfs                   1888252      0   1888252   0% /dev/shm
tmpfs                   1888252   8724   1879528   1% /run
tmpfs                   1888252      0   1888252   0% /sys/fs/cgroup
/dev/mapper/vg00-usr    3030800 877240   1979892  31% /usr
/dev/md127               487314 153815    303819  34% /boot
/dev/mapper/vg00-opt     999320   2564    927944   1% /opt
/dev/mapper/vg00-home    999320   2580    927928   1% /home
/dev/sda2                130800   9980    120820   8% /boot/efi
/dev/mapper/vg00-tmp     999320   2604    927904   1% /tmp
/dev/mapper/vg00-var    1998672  79416   1798016   5% /var

Fabricate a "New" Drive

We need to simulate a new drive insertion/rebuild, so we must properly clean the partitions of any filesystem and RAID metadata, then wipe the partition table.

Physically hot-plug Drive 2 back into the server.

Check dmesg for kernel output. Notice which drive it shows up as (/dev/sdf in my case).

# dmesg
...
[  282.356198] scsi 0:0:6:0: Direct-Access     ATA      ST320DM000-1BD14 KC48 PQ: 0 ANSI: 5
[  282.356207] scsi 0:0:6:0: SATA: handle(0x0011), sas_addr(0x500065b36789abe6), phy(6), device_name(0xc5005000d04c64af)
[  282.356211] scsi 0:0:6:0: SATA: enclosure_logical_id(0x500065b37689abff), slot(2)
[  282.356289] scsi 0:0:6:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y)
[  282.356292] scsi 0:0:6:0: qdepth(32), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[  282.361040] sd 0:0:6:0: [sdf] physical block alignment offset: 4096
[  282.361046] sd 0:0:6:0: [sdf] 625142448 512-byte logical blocks: (320 GB/298 GiB)
[  282.361049] sd 0:0:6:0: [sdf] 4096-byte physical blocks
[  282.390463] sd 0:0:6:0: [sdf] Write Protect is off
[  282.390467] sd 0:0:6:0: [sdf] Mode Sense: 7f 00 00 08
[  282.419658] sd 0:0:6:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  282.550111]  sdf: sdf1 sdf2 sdf3 sdf4
[  282.588461] sd 0:0:6:0: [sdf] Attached SCSI disk
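
Since the next steps are destructive, it helps to be certain of the device name. The following is a rough sketch, assuming dmesg lines in the format shown above, that prints the name of the disk that attached most recently. last_attached_disk is a hypothetical name, not an existing tool.

```shell
# Hypothetical helper: print the kernel name (e.g. sdf) of the last disk that
# logged "Attached SCSI disk" in dmesg-formatted text read from stdin.
last_attached_disk() {
    sed -n 's/.*\[\(sd[a-z]*\)\] Attached SCSI disk.*/\1/p' | tail -n 1
}

# Usage on a live system: dmesg | last_attached_disk
```

Treat the result as a hint only and confirm it against lsblk before wiping anything.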

Wipe partition 1 (Dell Utilities)

# dd if=/dev/zero of=/dev/sdf1 bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.0184175 s, 27.8 kB/s

Wipe partition 2 (EFI System Partition)

# dd if=/dev/zero of=/dev/sdf2 bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000612183 s, 836 kB/s

Zero RAID superblock on partition 3 (/boot)

# mdadm --zero-superblock /dev/sdf3

Zero RAID superblock on partition 4 (pv00/vg00)

# mdadm --zero-superblock /dev/sdf4

Wipe partition table of Drive 2.

# dd if=/dev/zero of=/dev/sdf bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000822915 s, 622 kB/s
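
Note that bs=512 count=1 overwrites only the first 512-byte sector of the target. To make that concrete without risking a real disk, the sketch below rehearses the same dd call on a scratch file. The scratch file and the variable names are my own stand-ins, not part of the procedure.

```shell
# Rehearsal on a scratch file, NOT a real disk; the file stands in for /dev/sdf.
img=$(mktemp)

# Fill 1 MiB with a non-zero byte so the effect of the wipe is visible.
head -c 1048576 /dev/zero | tr '\0' 'x' > "$img"

# Same invocation as above. conv=notrunc stops dd from truncating a regular
# file; a block device is never truncated, so the real command does not need it.
dd if=/dev/zero of="$img" bs=512 count=1 conv=notrunc 2>/dev/null

# Count the non-zero bytes left in the first and second 512-byte sectors.
first=$(head -c 512 "$img" | tr -d '\0' | wc -c | tr -d ' ')
second=$(head -c 1024 "$img" | tail -c 512 | tr -d '\0' | wc -c | tr -d ' ')
echo "non-zero bytes: sector 0 = $first, sector 1 = $second"

rm -f "$img"
```

Only sector 0 is cleared; everything after it is left as it was. On my drives that was enough for the kernel to report an unknown partition table on the next insertion.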

Physically hot-pull Drive 2 from system.

Reboot system from only Drive 1 again.

# reboot

Simulate Replacement Drive Insertion and Rebuild

Now that we have a "new" empty drive, we want to test the insertion and rebuild process.
NOTE: This can be referenced for actual replacement drive rebuild.

Physically hot-plug Drive 2 into the server.

Check dmesg for kernel output. You should see the new drive appear (in my case, it was /dev/sdf again).

# dmesg
...
[  278.110896] scsi 0:0:6:0: Direct-Access     ATA      ST320DM000-1BD14 KC48 PQ: 0 ANSI: 5
[  278.110903] scsi 0:0:6:0: SATA: handle(0x0011), sas_addr(0x500065b36789abe6), phy(6), device_name(0xc5005000d04c64af)
[  278.110905] scsi 0:0:6:0: SATA: enclosure_logical_id(0x500065b37689abff), slot(2)
[  278.110982] scsi 0:0:6:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y)
[  278.110986] scsi 0:0:6:0: qdepth(32), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[  278.115742] sd 0:0:6:0: [sdf] physical block alignment offset: 4096
[  278.115747] sd 0:0:6:0: [sdf] 625142448 512-byte logical blocks: (320 GB/298 GiB)
[  278.115749] sd 0:0:6:0: [sdf] 4096-byte physical blocks
[  278.145135] sd 0:0:6:0: [sdf] Write Protect is off
[  278.145139] sd 0:0:6:0: [sdf] Mode Sense: 7f 00 00 08
[  278.175597] sd 0:0:6:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  278.276884]  sdf: unknown partition table
[  278.356262] sd 0:0:6:0: [sdf] Attached SCSI disk

As we can see from the above output, there is no partition table. We will replicate one from the partition table of Drive 1 (/dev/sda).

# sgdisk -R /dev/sdf /dev/sda
Caution! Secondary header was placed beyond the disk's limits! Moving the
header, but other problems may occur!
The operation has completed successfully.

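Because of the caution above, we should confirm that both drive partition tables are identical. Everything looks alright here. The message is very likely the result of the two drives being different sizes: Drive 1 (/dev/sda) is larger than Drive 2 (/dev/sdf), so the backup header position copied from Drive 1 falls past the end of Drive 2 and sgdisk has to move it. This is why we set the Volume Group during the install to "Fixed Size", so that partition 4 still fits on the smaller drive.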

# parted /dev/sda unit s print
Model: ATA WDC WD5003ABYX-1 (scsi)
Disk /dev/sda: 976773168s
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start     End         Size        File system  Name                  Flags
 1      2048s     67607s      65560s      fat16        Dell_Utility          diag
 2      69632s    331776s     262145s     fat16        EFI System Partition  boot
 3      333824s   1357824s    1024001s    ext4         BOOTFS                raid
 4      1359872s  620371967s  619012096s                                     raid
# parted /dev/sdf unit s print
Model: ATA ST320DM000-1BD14 (scsi)
Disk /dev/sdf: 625142448s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start     End         Size        File system  Name                  Flags
 1      2048s     67607s      65560s                   Dell_Utility          diag
 2      69632s    331776s     262145s                  EFI System Partition  boot
 3      333824s   1357824s    1024001s                 BOOTFS                raid
 4      1359872s  620371967s  619012096s                                     raid
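
Comparing the two listings by eye works for four partitions, but the File system column differs between the drives and is easy to trip over. Below is a minimal sketch that compares only the number, start, end and size columns. It assumes the parted output has been saved to files; partition_geometry and same_geometry are hypothetical names.

```shell
# Hypothetical helper: reduce "parted ... unit s print" output (stdin) to
# "number start end size" rows, dropping the file system, name and flag columns.
partition_geometry() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+s$/ { print $1, $2, $3, $4 }'
}

# same_geometry FILE_A FILE_B: succeed if both saved listings describe the
# same partition layout.
same_geometry() {
    [ "$(partition_geometry < "$1")" = "$(partition_geometry < "$2")" ]
}

# Usage on a live system:
#   parted /dev/sda unit s print > sda.txt
#   parted /dev/sdf unit s print > sdf.txt
#   same_geometry sda.txt sdf.txt && echo "layouts match"
```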

When we replicated the partition table, we also replicated the UUIDs of the partitions. We can rectify this with sgdisk's -G flag. New UUIDs will be generated for the partitions.

# sgdisk -G /dev/sdf
The operation has completed successfully.

Copy partition 1 (Dell Utilities) from Drive 1 to Drive 2.

# dd if=/dev/sda1 of=/dev/sdf1
65560+0 records in
65560+0 records out
33566720 bytes (34 MB) copied, 0.942867 s, 35.6 MB/s

Copy partition 2 (EFI System Partition) from Drive 1 to Drive 2. It's good to do this while the filesystem is quiesced, so we unmount it first.

# umount /boot/efi
# dd if=/dev/sda2 of=/dev/sdf2
262145+0 records in
262145+0 records out
134218240 bytes (134 MB) copied, 2.26022 s, 59.4 MB/s
# mount /boot/efi
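
dd reports how much it copied but not whether the result matches the source. As a sketch only, a byte-for-byte check can be chained onto the copy with cmp. copy_and_verify is a name I made up, and it assumes source and target are the same size, as the two partitions are here.

```shell
# Hypothetical helper: copy SRC onto DST with dd, then confirm the two are
# byte-for-byte identical. Succeeds only if both steps succeed.
copy_and_verify() {
    dd if="$1" of="$2" conv=notrunc 2>/dev/null && cmp -s "$1" "$2"
}

# Usage on a live system, with /boot/efi unmounted:
#   copy_and_verify /dev/sda2 /dev/sdf2 && echo "ESP copy verified"
```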

Remove old entry for Drive 2 in UEFI Firmware. To ensure we remove the one for Drive 2, and because we changed the UUIDs above, we will search for Drive 1 (/dev/sda2), and remove the other entry.

# efibootmgr -v
BootCurrent: 0005
Timeout: 0 seconds
BootOrder: 0005,0008,0000,0001,0002,0003,0004,0006
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS	HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)File(\EFI\centos\shim.efi)
Boot0006* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)
Boot0008* CentOS	HD(2,11000,40001,7a8886e6-0a6b-4a4d-b344-b514a4b82338)File(\EFI\centos\shim.efi)
# blkid /dev/sda2
/dev/sda2: SEC_TYPE="msdos" UUID="1FC9-8164" TYPE="vfat" PARTLABEL="EFI System Partition" PARTUUID="46388b8f-a628-4fb7-a9d9-011759eb22de" 
# efibootmgr -v -B -b 8
BootCurrent: 0005
Timeout: 0 seconds
BootOrder: 0005,0000,0001,0002,0003,0004,0006
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS	HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)File(\EFI\centos\shim.efi)
Boot0006* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)
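
Matching the PARTUUID from blkid against the HD(...) nodes by eye is error-prone with this many entries. The sketch below, which assumes efibootmgr -v output in the format shown above, lists the entries whose HD(...) node points at a partition other than the one we want to keep. foreign_hd_entries is a hypothetical name.

```shell
# Hypothetical helper: given "efibootmgr -v" output on stdin and the PARTUUID
# of the partition to keep ($1), print the numbers of boot entries whose HD()
# node points at some other partition.
foreign_hd_entries() {
    grep 'HD(' | grep -iv -- ",$1)" |
        sed -n 's/^Boot\([0-9A-Fa-f]\{4\}\).*/\1/p'
}

# Usage on a live system (PARTUUID value taken from the blkid output above):
#   efibootmgr -v | foreign_hd_entries 46388b8f-a628-4fb7-a9d9-011759eb22de
```

It only prints candidates. Review the list by hand before passing any number to -B, since an entry for another operating system would show up here too.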

Add a new entry to the UEFI Firmware for Drive 2 since the UUID is different. By default, when a new entry is added, it is set first in the boot order. This is fine for now. It will need to be tested.

# efibootmgr -v -c -L CentOS-Alt -d /dev/sdf -p 2 -l '\EFI\centos\shim.efi'
BootCurrent: 0005
Timeout: 0 seconds
BootOrder: 0007,0005,0000,0001,0002,0003,0004,0006
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS	HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)File(\EFI\centos\shim.efi)
Boot0006* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)
Boot0007* CentOS-Alt	HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)File(\EFI\centos\shim.efi)

Add partition 3 back to /dev/md/boot RAID 1 array.

# mdadm /dev/md/boot --add /dev/sdf3
mdadm: added /dev/sdf3

Add partition 4 back to /dev/md/pv00 RAID 1 array.

# mdadm /dev/md/pv00 --add /dev/sdf4
mdadm: added /dev/sdf4

Watch /proc/mdstat until both arrays have finished synchronizing.

# cat /proc/mdstat
Personalities : [raid1] 
md126 : active raid1 sdf4[2] sda4[0]
      309374976 blocks super 1.2 [2/1] [U_]
      [======>..............]  recovery = 30.2% (93459840/309374976) finish=33.6min speed=106971K/sec
      bitmap: 0/3 pages [0KB], 65536KB chunk

md127 : active raid1 sdf3[3] sda3[2]
      511680 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

unused devices: <none>

Example of sync'd raid volumes

# cat /proc/mdstat
Personalities : [raid1] 
md126 : active raid1 sdf4[2] sda4[0]
      309374976 blocks super 1.2 [2/2] [UU]
      bitmap: 0/3 pages [0KB], 65536KB chunk

md127 : active raid1 sdf3[3] sda3[2]
      511680 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

unused devices: <none>
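
Rather than re-running cat by hand, the check can be scripted. This is a minimal sketch, assuming mdstat text in the format shown above; mdstat_synced is a hypothetical name. It treats any "_" in a status string, or any recovery, resync or reshape line, as "not ready".

```shell
# Hypothetical helper: succeed only if mdstat-formatted text on stdin shows no
# missing member ("_" in a status string like [U_]) and no rebuild in progress.
mdstat_synced() {
    ! grep -Eq '\[[U_]*_[U_]*\]|recovery|resync|reshape'
}

# Usage on a live system: poll once a minute until both arrays are clean.
#   until mdstat_synced < /proc/mdstat; do sleep 60; done
```

mdadm also documents a --wait option that blocks until resync or recovery finishes on the arrays you name. I did not use it here, so check your version's man page before relying on it.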

Reboot the system to ensure booting from Drive 2 works as expected.
NOTE: DO NOT REBOOT UNLESS MDSTAT OUTPUT LOOKS SYNC'D AS ABOVE!!!

# reboot

Have a look at the UEFI Firmware configuration. Two things of interest here:
a) BootCurrent - This tells us we successfully booted from Drive 2 via Boot0007, the CentOS-Alt entry we added (its ESP is /dev/sdc2 after this reboot).
b) Boot0008,Boot0009 - This tells us Dell UEFI firmware automatically discovered and added EFI entries for the alternate path.

# efibootmgr -v
BootCurrent: 0007
Timeout: 0 seconds
BootOrder: 0007,0005,0000,0001,0002,0003,0004,0006,0008,0009
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS	HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)File(\EFI\centos\shim.efi)
Boot0006* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)
Boot0007* CentOS-Alt	HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)File(\EFI\centos\shim.efi)
Boot0008* CentOS	HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 2	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)

Let's clear up a few items with the UEFI Firmware configuration:
a) Since it is reasonable to expect the Dell UEFI Firmware is automatically adding entries, it's safe to remove the entry we manually added (-B -b). If your UEFI Firmware is not doing this, keep the manual entry.
b) Because of item "a", we should reboot onto the alternate path the server automatically added (-n).
c) Set a proper boot order of preferred devices (-o).

# efibootmgr -v -B -b 7 -n 8 -o 0005,0008,0000,0001,0002,0003,0004,0006,0009
BootNext: 0008
BootCurrent: 0007
Timeout: 0 seconds
BootOrder: 0005,0008,0000,0001,0002,0003,0004,0006,0009
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS	HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)File(\EFI\centos\shim.efi)
Boot0006* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)
Boot0008* CentOS	HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 2	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)
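
The -o list above was typed out by hand. As a sketch of how it could be derived from the previous BootOrder line, the function below puts the preferred entries first, skips the entry being deleted, and keeps everything else in its existing order. new_boot_order and its three arguments are my own invention, not an efibootmgr feature.

```shell
# Hypothetical helper: new_boot_order CURRENT PREFERRED DROP
#   CURRENT   - existing BootOrder, comma separated
#   PREFERRED - entries to place first, comma separated, in the wanted order
#   DROP      - entries being deleted, comma separated
new_boot_order() {
    current=$1 preferred=$2 drop=$3
    result=$preferred
    old_ifs=$IFS
    IFS=,
    for entry in $current; do
        case ",$preferred,$drop," in
            *",$entry,"*) ;;                  # already placed, or being deleted
            *) result="$result,$entry" ;;     # keep in its existing position
        esac
    done
    IFS=$old_ifs
    printf '%s\n' "$result"
}

# Usage with the values from this step:
#   new_boot_order 0007,0005,0000,0001,0002,0003,0004,0006,0008,0009 0005,0008 0007
```

With the values from this step it yields the same list passed to -o above.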

Confirm all RAID devices are sync'd.

# cat /proc/mdstat
Personalities : [raid1] 
md126 : active raid1 sdc4[2] sda4[0]
      311472128 blocks super 1.2 [2/2] [UU]
      bitmap: 0/3 pages [0KB], 65536KB chunk

md127 : active raid1 sdc3[3] sda3[2]
      511680 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

unused devices: <none>

Reboot the system.

# reboot

Have a look at the UEFI Firmware configuration. Items to note: BootCurrent shows we booted from Drive 2 (Boot0008), and the server's UEFI Firmware created no additional entries.

# efibootmgr -v
BootCurrent: 0008
Timeout: 0 seconds
BootOrder: 0005,0008,0000,0001,0002,0003,0004,0006,0009
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS	HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)File(\EFI\centos\shim.efi)
Boot0006* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)
Boot0008* CentOS	HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 2	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)

Primary Drive

Simulate Immediate Failure of Drive 1

Boot the system.

Review drive and partition layout.

# lsblk -i
NAME            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda               8:0    0 465.8G  0 disk  
|-sda1            8:1    0    32M  0 part  
|-sda2            8:2    0   128M  0 part  
|-sda3            8:3    0   500M  0 part  
| `-md127         9:127  0 499.7M  0 raid1 /boot
`-sda4            8:4    0 295.2G  0 part  
  `-md126         9:126  0   295G  0 raid1 
    |-vg00-swap 253:0    0     6G  0 lvm   [SWAP]
    |-vg00-usr  253:1    0     3G  0 lvm   /usr
    |-vg00-root 253:2    0     3G  0 lvm   /
    |-vg00-home 253:3    0     1G  0 lvm   /home
    |-vg00-opt  253:4    0     1G  0 lvm   /opt
    |-vg00-tmp  253:5    0     1G  0 lvm   /tmp
    `-vg00-var  253:6    0     2G  0 lvm   /var
sdb               8:16   0 931.5G  0 disk  
sdc               8:32   0 298.1G  0 disk  
|-sdc1            8:33   0    32M  0 part  
|-sdc2            8:34   0   128M  0 part  /boot/efi
|-sdc3            8:35   0   500M  0 part  
| `-md127         9:127  0 499.7M  0 raid1 /boot
`-sdc4            8:36   0 295.2G  0 part  
  `-md126         9:126  0   295G  0 raid1 
    |-vg00-swap 253:0    0     6G  0 lvm   [SWAP]
    |-vg00-usr  253:1    0     3G  0 lvm   /usr
    |-vg00-root 253:2    0     3G  0 lvm   /
    |-vg00-home 253:3    0     1G  0 lvm   /home
    |-vg00-opt  253:4    0     1G  0 lvm   /opt
    |-vg00-tmp  253:5    0     1G  0 lvm   /tmp
    `-vg00-var  253:6    0     2G  0 lvm   /var
sdd               8:48   0 931.5G  0 disk  
sde               8:64   0 931.5G  0 disk  
sdf               8:80   0 931.5G  0 disk  
sr0              11:0    1   636M  0 rom   

If you are unsure which physical drive is /dev/sda (Drive 1), run the dd command below and look at the front of the server for a near-steady activity light. Press CTRL+c to end the dd command.

# dd if=/dev/sda of=/dev/null
^C1381385+0 records in
1381384+0 records out
707268608 bytes (707 MB) copied, 5.13658 s, 138 MB/s

Physically hot-pull Drive 1 from the server.

Check dmesg for kernel output. Notice how the drive went offline.

# dmesg
...
[  203.769220] md: md126 still in use.
[  203.769223] md: md127 still in use.
[  203.770136] md/raid1:md126: Disk failure on sda4, disabling device.
md/raid1:md126: Operation continuing on 1 devices.
[  203.770137] md/raid1:md127: Disk failure on sda3, disabling device.
md/raid1:md127: Operation continuing on 1 devices.
[  203.831328] RAID1 conf printout:
[  203.831333]  --- wd:1 rd:2
[  203.831335]  disk 0, wo:1, o:0, dev:sda3
[  203.831337]  disk 1, wo:0, o:1, dev:sdc3
[  203.843822] RAID1 conf printout:
[  203.843826]  --- wd:1 rd:2
[  203.843829]  disk 0, wo:1, o:0, dev:sda4
[  203.843831]  disk 1, wo:0, o:1, dev:sdc4
[  203.885064] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[  203.885090] sd 0:0:0:0: [sda]  
[  203.885092] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
[  203.885236] mpt2sas0: removing handle(0x000b), sas_addr(0x500065b36789abe2)
[  203.887866] RAID1 conf printout:
[  203.887869]  --- wd:1 rd:2
[  203.887871]  disk 1, wo:0, o:1, dev:sdc3
[  203.890862] RAID1 conf printout:
[  203.890866]  --- wd:1 rd:2
[  203.890869]  disk 1, wo:0, o:1, dev:sdc4
[  203.925833] md: unbind<sda3>
[  203.930821] md: export_rdev(sda3)
[  203.938263] md: unbind<sda4>
[  203.943798] md: export_rdev(sda4)

Check RAID status. Each RAID1 array that had a member on this drive should now show one device missing.

# cat /proc/mdstat
Personalities : [raid1] 
md126 : active raid1 sdc4[2]
      309374976 blocks super 1.2 [2/1] [_U]
      bitmap: 1/3 pages [4KB], 65536KB chunk

md127 : active raid1 sdc3[3]
      511680 blocks super 1.2 [2/1] [_U]
      bitmap: 0/1 pages [0KB], 65536KB chunk

unused devices: <none>

Check the mounted filesystems. If /dev/sda2 still shows as mounted on /boot/efi, unmount it. More likely, df will report an Input/output error for /boot/efi, as below. Either way, unmount /boot/efi, then mount it again. This time, /dev/sdc2 should show up.

# df
df: ‘/boot/efi’: Input/output error
Filesystem            1K-blocks   Used Available Use% Mounted on
/dev/mapper/vg00-root   3030800  30216   2826916   2% /
devtmpfs                1878604      0   1878604   0% /dev
tmpfs                   1921220      0   1921220   0% /dev/shm
tmpfs                   1921220   8712   1912508   1% /run
tmpfs                   1921220      0   1921220   0% /sys/fs/cgroup
/dev/mapper/vg00-usr    3030800 876624   1980508  31% /usr
/dev/md127               487314 153730    303904  34% /boot
/dev/mapper/vg00-var    1998672  78916   1798516   5% /var
/dev/mapper/vg00-opt     999320   2564    927944   1% /opt
/dev/mapper/vg00-tmp     999320   2604    927904   1% /tmp
/dev/mapper/vg00-home    999320   2580    927928   1% /home
# umount /boot/efi
# mount /boot/efi
# df
Filesystem            1K-blocks   Used Available Use% Mounted on
/dev/mapper/vg00-root   3030800  30216   2826916   2% /
devtmpfs                1878604      0   1878604   0% /dev
tmpfs                   1921220      0   1921220   0% /dev/shm
tmpfs                   1921220   8712   1912508   1% /run
tmpfs                   1921220      0   1921220   0% /sys/fs/cgroup
/dev/mapper/vg00-usr    3030800 876624   1980508  31% /usr
/dev/md127               487314 153730    303904  34% /boot
/dev/mapper/vg00-var    1998672  78916   1798516   5% /var
/dev/mapper/vg00-opt     999320   2564    927944   1% /opt
/dev/mapper/vg00-tmp     999320   2604    927904   1% /tmp
/dev/mapper/vg00-home    999320   2580    927928   1% /home
/dev/sdc2                130800   9980    120820   8% /boot/efi

Reboot to test full boot process with only Drive 2.

# reboot

Review drive and partition layout. Notice this time Drive 2 shows as /dev/sda.

# lsblk -i
NAME            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda               8:0    0 298.1G  0 disk  
|-sda1            8:1    0    32M  0 part  
|-sda2            8:2    0   128M  0 part  /boot/efi
|-sda3            8:3    0   500M  0 part  
| `-md126         9:126  0 499.7M  0 raid1 /boot
`-sda4            8:4    0 295.2G  0 part  
  `-md127         9:127  0   295G  0 raid1 
    |-vg00-swap 253:0    0     6G  0 lvm   [SWAP]
    |-vg00-usr  253:1    0     3G  0 lvm   /usr
    |-vg00-root 253:2    0     3G  0 lvm   /
    |-vg00-home 253:3    0     1G  0 lvm   /home
    |-vg00-opt  253:4    0     1G  0 lvm   /opt
    |-vg00-tmp  253:5    0     1G  0 lvm   /tmp
    `-vg00-var  253:6    0     2G  0 lvm   /var
sdb               8:16   0 931.5G  0 disk  
sdc               8:32   0 931.5G  0 disk  
sdd               8:48   0 931.5G  0 disk  
sde               8:64   0 931.5G  0 disk  
sr0              11:0    1   636M  0 rom   

Check to ensure /boot/efi is mounted with /dev/sda2.

# df
Filesystem            1K-blocks   Used Available Use% Mounted on
/dev/mapper/vg00-root   3030800  30236   2826896   2% /
devtmpfs                1878604      0   1878604   0% /dev
tmpfs                   1888252      0   1888252   0% /dev/shm
tmpfs                   1888252   8724   1879528   1% /run
tmpfs                   1888252      0   1888252   0% /sys/fs/cgroup
/dev/mapper/vg00-usr    3030800 877240   1979892  31% /usr
/dev/md126               487314 153815    303819  34% /boot
/dev/mapper/vg00-var    1998672  81748   1795684   5% /var
/dev/mapper/vg00-home    999320   2580    927928   1% /home
/dev/mapper/vg00-tmp     999320   2604    927904   1% /tmp
/dev/sda2                130800   9980    120820   8% /boot/efi
/dev/mapper/vg00-opt     999320   2564    927944   1% /opt

Fabricate a "New" Drive

We need to simulate a new drive insertion/rebuild, so we must properly clean the partitions of any filesystem and RAID metadata, then wipe the partition table.

Physically hot-plug Drive 1 back into the server.

Check dmesg for kernel output. Notice which drive it shows up as (/dev/sdf in my case).

# dmesg
...
[  217.112988] scsi 0:0:6:0: Direct-Access     ATA      WDC WD5003ABYX-1 1S05 PQ: 0 ANSI: 5
[  217.112994] scsi 0:0:6:0: SATA: handle(0x0011), sas_addr(0x500065b36789abe2), phy(2), device_name(0x4ee050019c145942)
[  217.112997] scsi 0:0:6:0: SATA: enclosure_logical_id(0x500065b37689abff), slot(0)
[  217.113068] scsi 0:0:6:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y)
[  217.113071] scsi 0:0:6:0: qdepth(32), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[  217.117518] sd 0:0:6:0: [sdf] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[  217.125146] sd 0:0:6:0: [sdf] Write Protect is off
[  217.125149] sd 0:0:6:0: [sdf] Mode Sense: 7f 00 00 08
[  217.127573] sd 0:0:6:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[  217.190716]  sdf: sdf1 sdf2 sdf3 sdf4
[  217.204172] sd 0:0:6:0: [sdf] Attached SCSI disk

Wipe partition 1 (Dell Utilities)

# dd if=/dev/zero of=/dev/sdf1 bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.0184175 s, 27.8 kB/s

Wipe partition 2 (EFI System Partition)

# dd if=/dev/zero of=/dev/sdf2 bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000612183 s, 836 kB/s

Zero RAID superblock on partition 3 (/boot)

# mdadm --zero-superblock /dev/sdf3

Zero RAID superblock on partition 4 (pv00/vg00)

# mdadm --zero-superblock /dev/sdf4

Wipe partition table of Drive 1.

# dd if=/dev/zero of=/dev/sdf bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000822915 s, 622 kB/s

Physically hot-pull Drive 1 from system.

Reboot system from only Drive 2 again.

# reboot

Simulate Replacement Drive Insertion and Rebuild

Now that we have a "new" empty drive, we want to test the insertion and rebuild process.
NOTE: This can be referenced for actual replacement drive rebuild.

Physically hot-plug Drive 1 into the server.

Check dmesg for kernel output. You should see the new drive appear (in my case, it was /dev/sdf again).

# dmesg
...
[   57.516999] scsi 0:0:6:0: Direct-Access     ATA      WDC WD5003ABYX-1 1S05 PQ: 0 ANSI: 5
[   57.517005] scsi 0:0:6:0: SATA: handle(0x0011), sas_addr(0x500065b36789abe2), phy(2), device_name(0x4ee050019c145942)
[   57.517007] scsi 0:0:6:0: SATA: enclosure_logical_id(0x500065b37689abff), slot(0)
[   57.517078] scsi 0:0:6:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y)
[   57.517081] scsi 0:0:6:0: qdepth(32), tagged(1), simple(0), ordered(0), scsi_level(6), cmd_que(1)
[   57.521408] sd 0:0:6:0: [sdf] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[   57.529091] sd 0:0:6:0: [sdf] Write Protect is off
[   57.529095] sd 0:0:6:0: [sdf] Mode Sense: 7f 00 00 08
[   57.531539] sd 0:0:6:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   57.567808]  sdf: unknown partition table
[   57.605265] sd 0:0:6:0: [sdf] Attached SCSI disk

As we can see from the above output, there is no partition table. We will replicate one from the partition table of Drive 2 (/dev/sda).

# sgdisk -R /dev/sdf /dev/sda
The operation has completed successfully.

For good measure, we should confirm that both drive partition tables are identical.

# parted /dev/sda unit s print
Model: ATA ST320DM000-1BD14 (scsi)
Disk /dev/sda: 625142448s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start     End         Size        File system  Name                  Flags
 1      2048s     67607s      65560s      fat16        Dell_Utility          diag
 2      69632s    331776s     262145s     fat16        EFI System Partition  boot
 3      333824s   1357824s    1024001s                 BOOTFS                raid
 4      1359872s  620371967s  619012096s                                     raid

# parted /dev/sdf unit s print
Model: ATA WDC WD5003ABYX-1 (scsi)
Disk /dev/sdf: 976773168s
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start     End         Size        File system  Name                  Flags
 1      2048s     67607s      65560s                   Dell_Utility          diag
 2      69632s    331776s     262145s                  EFI System Partition  boot
 3      333824s   1357824s    1024001s    ext4         BOOTFS                raid
 4      1359872s  620371967s  619012096s                                     raid

When we replicated the partition table, we also replicated the UUIDs of the partitions. We can rectify this with sgdisk's -G flag. New UUIDs will be generated for the partitions.

# sgdisk -G /dev/sdf
The operation has completed successfully.

Copy partition 1 (Dell Utilities) from Drive 2 to Drive 1.

# dd if=/dev/sda1 of=/dev/sdf1
65560+0 records in
65560+0 records out
33566720 bytes (34 MB) copied, 0.942867 s, 35.6 MB/s

Copy partition 2 (EFI System Partition) from Drive 2 to Drive 1. It's good to do this while the filesystem is quiesced, so we unmount it first.

# umount /boot/efi
# dd if=/dev/sda2 of=/dev/sdf2
262145+0 records in
262145+0 records out
134218240 bytes (134 MB) copied, 2.26022 s, 59.4 MB/s
# mount /boot/efi

Remove the old entry for Drive 1 from the UEFI firmware. Because we changed the UUIDs above, that entry no longer points at a real partition. To be sure we remove the right one, we look up the partition UUID of Drive 2 (/dev/sda2) with blkid and remove the other CentOS entry.

# efibootmgr -v
BootCurrent: 0008
Timeout: 0 seconds
BootOrder: 0005,0008,0000,0001,0002,0003,0004,0009
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS	HD(2,11000,40001,46388b8f-a628-4fb7-a9d9-011759eb22de)File(\EFI\centos\shim.efi)
Boot0008* CentOS	HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)
# blkid /dev/sda2
/dev/sda2: SEC_TYPE="msdos" UUID="1FC9-8164" TYPE="vfat" PARTLABEL="EFI System Partition" PARTUUID="9bf5d69e-0335-458f-85a6-b42cd2d720da" 
# efibootmgr -v -B -b 5
BootCurrent: 0008
Timeout: 0 seconds
BootOrder: 0008,0000,0001,0002,0003,0004,0009
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0008* CentOS	HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)
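
Matching the UUID from blkid against the efibootmgr listing can also be scripted. The sketch below reads `efibootmgr -v` output on stdin and prints the numbers of the entries whose HD(...) path ends in the given partition UUID. The function name is invented for this example, and the output layout is assumed to match the listings shown in this guide.

```shell
# Print the boot entry numbers in `efibootmgr -v` output (stdin) whose
# HD(...) device path refers to partition UUID $1.
boot_entries_for_partuuid() {
    grep -i "HD([^)]*,$1)" | sed -n 's/^Boot\([0-9A-Fa-f]\{4\}\).*/\1/p'
}

# Usage:
#   uuid=$(blkid -s PARTUUID -o value /dev/sda2)
#   efibootmgr -v | boot_entries_for_partuuid "$uuid"
# Any "CentOS" entry NOT listed belongs to the other (old) partition.
```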

Add a new entry to the UEFI Firmware for Drive 1 since the UUID is different. By default, when a new entry is added, it is set first in the boot order. This is fine for now. It will need to be tested.

# efibootmgr -v -c -L CentOS-Pri -d /dev/sdf -p 2 -l '\EFI\centos\shim.efi'
BootCurrent: 0008
Timeout: 0 seconds
BootOrder: 0005,0008,0000,0001,0002,0003,0004,0009
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0008* CentOS	HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)
Boot0005* CentOS-Pri	HD(2,11000,40001,69260b8d-6f7a-4503-b252-8b5f31170bd2)File(\EFI\centos\shim.efi)

Add partition 3 back to /dev/md/boot RAID 1 array.

# mdadm /dev/md/boot --add /dev/sdf3
mdadm: added /dev/sdf3

Add partition 4 back to /dev/md/pv00 RAID 1 array.

# mdadm /dev/md/pv00 --add /dev/sdf4
mdadm: added /dev/sdf4

Watch /proc/mdstat until both arrays have finished syncing.

# cat /proc/mdstat
Personalities : [raid1] 
md126 : active raid1 sdf4[3] sda4[2]
      309374976 blocks super 1.2 [2/1] [_U]
      [=====>...............]  recovery = 28.1% (87165184/309374976) finish=33.9min speed=108926K/sec
      bitmap: 0/3 pages [0KB], 65536KB chunk

md127 : active raid1 sdf3[2] sda3[3]
      511680 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

unused devices: <none>

Example of sync'd raid volumes

# cat /proc/mdstat
Personalities : [raid1] 
md126 : active raid1 sdf4[3] sda4[2]
      309374976 blocks super 1.2 [2/2] [UU]
      bitmap: 0/3 pages [0KB], 65536KB chunk

md127 : active raid1 sdf3[2] sda3[3]
      511680 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

unused devices: <none>
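
Rather than judging the mdstat output by eye, the check can be scripted. The following is a rough sketch, assuming the mdstat layout shown above: it treats the system as synced only if at least one array is listed, no status field such as [_U] contains an underscore, and no recovery or resync progress line is present. The function name md_all_synced is made up for this example.

```shell
# Succeed only if every md array in the given mdstat text
# (default: /proc/mdstat) has all members up and nothing is rebuilding.
md_all_synced() {
    awk '
        /\[[U_]+\]/            { arrays++; if ($NF ~ /_/) degraded++ }
        /(recovery|resync) *=/ { busy++ }
        END { exit !(arrays > 0 && degraded == 0 && busy == 0) }
    ' "${1:-/proc/mdstat}"
}

# Usage: wait for the rebuild to finish before rebooting.
#   until md_all_synced; do sleep 60; done
```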

Reboot the system to ensure booting from Drive 1 works as expected.
NOTE: DO NOT REBOOT UNLESS MDSTAT OUTPUT LOOKS SYNC'D AS ABOVE!!!

# reboot

Have a look at the UEFI Firmware configuration. Two things of interest here:
a) BootCurrent - This tells us we successfully booted from Drive 1.
b) Boot0006,Boot0007 - This tells us Dell UEFI firmware automatically discovered and added EFI entries for the alternate path. (Boot0005 is the "CentOS-Pri" entry we added manually.)

# efibootmgr -v
BootCurrent: 0005
Timeout: 0 seconds
BootOrder: 0005,0008,0000,0001,0002,0003,0004,0009,0006,0007
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0005* CentOS-Pri	HD(2,11000,40001,69260b8d-6f7a-4503-b252-8b5f31170bd2)File(\EFI\centos\shim.efi)
Boot0006* CentOS	HD(2,11000,40001,69260b8d-6f7a-4503-b252-8b5f31170bd2)File(\EFI\centos\shim.efi)
Boot0007* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,69260b8d-6f7a-4503-b252-8b5f31170bd2)
Boot0008* CentOS	HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 2	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)

Let's clear up a few items with the UEFI Firmware configuration:
a) Since the Dell UEFI Firmware evidently adds entries automatically, it's safe to remove the entry we added manually (-B -b 5). If your UEFI Firmware is not doing this, keep the manual entry.
b) Because of item "a", the next boot should use the entry the server added automatically for Drive 1 (-n 6).
c) Set a proper boot order of preferred devices (-o).

# efibootmgr -v -B -b 5 -n 6 -o 0006,0008,0000,0001,0002,0003,0004,0007,0009
BootNext: 0006
BootCurrent: 0005
Timeout: 0 seconds
BootOrder: 0006,0008,0000,0001,0002,0003,0004,0007,0009
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0006* CentOS	HD(2,11000,40001,69260b8d-6f7a-4503-b252-8b5f31170bd2)File(\EFI\centos\shim.efi)
Boot0007* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,69260b8d-6f7a-4503-b252-8b5f31170bd2)
Boot0008* CentOS	HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 2	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)
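
The value passed to -o can be derived from the current BootOrder instead of being typed by hand. The sketch below moves one entry to the front and drops another, keeping the remaining entries in their existing relative order. That means its result will not be identical to the hand-picked order used above, which also swapped 0007 and 0009. The function name and its argument order are assumptions of this example.

```shell
# new_boot_order CURRENT_ORDER FIRST DROP
# Print CURRENT_ORDER (comma-separated) with entry FIRST moved to the
# front and entry DROP removed.
new_boot_order() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -v first="$2" -v drop="$3" '
        BEGIN { out = first }
        {
            id = $0 ""    # force string comparison, IDs are hex text
            if (id != "" && id != (first "") && id != (drop "")) out = out "," id
        }
        END { print out }
    '
}

# Usage:
#   new_boot_order 0005,0008,0000,0001,0002,0003,0004,0009,0006,0007 0006 0005
```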

Confirm all RAID devices are sync'd.

# cat /proc/mdstat
Personalities : [raid1] 
md126 : active raid1 sdc4[2] sda4[3]
      309374976 blocks super 1.2 [2/2] [UU]
      bitmap: 0/3 pages [0KB], 65536KB chunk

md127 : active raid1 sdc3[3] sda3[2]
      511680 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

unused devices: <none>

Reboot the system.

# reboot

Have a look at the UEFI Firmware configuration. Items to note: BootCurrent shows we booted from Drive 1 (Boot0006), and the server's UEFI Firmware created no additional entries.

# efibootmgr -v
BootCurrent: 0006
Timeout: 0 seconds
BootOrder: 0006,0008,0000,0001,0002,0003,0004,0007,0009
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0006* CentOS	HD(2,11000,40001,69260b8d-6f7a-4503-b252-8b5f31170bd2)File(\EFI\centos\shim.efi)
Boot0007* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,69260b8d-6f7a-4503-b252-8b5f31170bd2)
Boot0008* CentOS	HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 2	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)

Confirm we can still boot successfully from Drive 2

# efibootmgr -v -n 8
BootNext: 0008
BootCurrent: 0006
Timeout: 0 seconds
BootOrder: 0006,0008,0000,0001,0002,0003,0004,0007,0009
Boot0000* DVDRAM SP60NB50 	ACPI(a0841d0,0)PCI(12,2)USB(2,0)USB(0,0)
Boot0001* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,0)MAC(MAC(001018bfec10,0)
Boot0002* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,1)MAC(MAC(001018bfec11,0)
Boot0003* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,2)MAC(MAC(001018bfec12,0)
Boot0004* Broadcom NetXtreme Gigabit Ethernet (BCM5719)	ACPI(a0841d0,0)PCI(b,0)PCI(0,3)MAC(MAC(001018bfec13,0)
Boot0006* CentOS	HD(2,11000,40001,69260b8d-6f7a-4503-b252-8b5f31170bd2)File(\EFI\centos\shim.efi)
Boot0007* EFI Fixed Disk Boot Device 1	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e2ab8967b3650050000000000000000012000100)HD(2,11000,40001,69260b8d-6f7a-4503-b252-8b5f31170bd2)
Boot0008* CentOS	HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)File(\EFI\centos\shim.efi)
Boot0009* EFI Fixed Disk Boot Device 2	ACPI(a0841d0,0)PCI(4,0)PCI(0,0)VenMsg(d487ddb4-008b-11d9-afdc-001083ffca4d,00000000e6ab8967b3650050000000000000000012020100)HD(2,11000,40001,9bf5d69e-0335-458f-85a6-b42cd2d720da)

Reboot the system

# reboot

Confirm we booted from Drive 2 successfully

# efibootmgr -v | grep BootCurrent
BootCurrent: 0008
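
Device names can change between boots (Drive 1 shows up as both sdf and sdc in this guide), so the partition UUID is a more reliable way to tell which drive was used. A minimal sketch, assuming the `efibootmgr -v` output format shown above; the function name is invented for this example.

```shell
# Read `efibootmgr -v` output on stdin and print the partition UUID
# from the HD(...) path of the entry named by BootCurrent.
current_boot_partuuid() {
    awk '
        /^BootCurrent:/ { cur = $2 }
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ { line[substr($0, 5, 4)] = $0 }
        END {
            entry = line[cur]
            if (match(entry, /HD\([^)]*\)/)) {
                hd = substr(entry, RSTART + 3, RLENGTH - 4)   # text inside HD(...)
                n = split(hd, f, ",")
                print f[n]                                    # last field is the UUID
            }
        }
    '
}

# Usage: compare against the EFI System Partition of each drive.
#   efibootmgr -v | current_boot_partuuid
#   blkid -s PARTUUID -o value /dev/sda2
```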

Reset System Back

If you want to run through the procedure again, or you didn't like it and want to try something else, this section describes how to put things back the way they were.

First, boot the system from the Installation DVD. This will allow you to re-partition and wipe the drives without them being active.

Access the commandline as shown in section 2.2.2.

Assemble all RAID volumes so we can act on them appropriately.

# mdadm --assemble --scan
mdadm: /dev/md/host.example.com:boot has been started with 2 drives.

Review all partitions and RAID configurations

# lsblk -i
NAME            MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda               8:0    0 465.8G  0 disk  
|-sda1            8:1    0    32M  0 part  
|-sda2            8:2    0   128M  0 part  
|-sda3            8:3    0   500M  0 part  
| `-md126         9:126  0 499.7M  0 raid1 
`-sda4            8:4    0 297.2G  0 part  
  `-md127         9:127  0   297G  0 raid1 
    |-vg00-swap 253:3    0     6G  0 lvm   
    |-vg00-home 253:4    0     1G  0 lvm   
    |-vg00-opt  253:5    0     1G  0 lvm   
    |-vg00-tmp  253:6    0     1G  0 lvm   
    |-vg00-usr  253:7    0     3G  0 lvm   
    `-vg00-var  253:8    0     2G  0 lvm   
sdb               8:16   0 931.5G  0 disk  
sdc               8:32   0 298.1G  0 disk  
|-sdc1            8:33   0    32M  0 part  
|-sdc2            8:34   0   128M  0 part  
|-sdc3            8:35   0   500M  0 part  
| `-md126         9:126  0 499.7M  0 raid1 
`-sdc4            8:36   0 297.2G  0 part  
  `-md127         9:127  0   297G  0 raid1 
    |-vg00-swap 253:3    0     6G  0 lvm   
    |-vg00-home 253:4    0     1G  0 lvm   
    |-vg00-opt  253:5    0     1G  0 lvm   
    |-vg00-tmp  253:6    0     1G  0 lvm   
    |-vg00-usr  253:7    0     3G  0 lvm   
    `-vg00-var  253:8    0     2G  0 lvm   
sdd               8:48   0 931.5G  0 disk  
sde               8:64   0 931.5G  0 disk  
sdf               8:80   0 931.5G  0 disk  
sr0              11:0    1   636M  0 rom   /run/install/repo
loop0             7:0    0 274.8M  1 loop  
loop1             7:1    0     2G  1 loop  
|-live-rw       253:0    0     2G  0 dm    /
`-live-base     253:1    0     2G  1 dm    
loop2             7:2    0   512M  0 loop  
`-live-rw       253:0    0     2G  0 dm    /
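
While the arrays are assembled, the member partitions can also be read from /proc/mdstat. It is worth noting them down now, because they are needed later when the RAID superblocks are zeroed, and by then the arrays are stopped and no longer listed. A small sketch, assuming the usual mdstat layout; md_members is a made-up name.

```shell
# Print the member devices (as /dev/... paths) of every md array listed
# in the given mdstat text (default: /proc/mdstat).
md_members() {
    awk '
        $1 ~ /^md/ && $2 == ":" {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\[[0-9]+\]/) {      # e.g. "sda4[2]"
                    dev = $i
                    sub(/\[.*/, "", dev)      # strip "[2]" and any flags
                    print "/dev/" dev
                }
            }
        }
    ' "${1:-/proc/mdstat}"
}

# Usage: save the list before stopping the arrays.
#   md_members > /tmp/md-members.txt
```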


Remove RAID Volumes

Remove Volume Group vg00 (and all of its logical volumes) so we can remove the RAID volumes

# vgremove -f vg00
  Logical volume "root" successfully removed
  Logical volume "swap" successfully removed
  Logical volume "home" successfully removed
  Logical volume "opt" successfully removed
  Logical volume "tmp" successfully removed
  Logical volume "usr" successfully removed
  Logical volume "var" successfully removed
  Volume group "vg00" successfully removed

Deactivate RAID devices

# mdadm --stop /dev/md127       
mdadm: stopped /dev/md127
# mdadm --stop /dev/md126
mdadm: stopped /dev/md126

Remove RAID superblock on disk devices previously participating in RAID volume

# mdadm --zero-superblock /dev/sda3
# mdadm --zero-superblock /dev/sda4
# mdadm --zero-superblock /dev/sdc3
# mdadm --zero-superblock /dev/sdc4

Reset Partitions on Drive 1

Wipe Filesystem from /dev/sda2

# dd if=/dev/zero of=/dev/sda2 bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000296246 s, 1.7 MB/s

Wipe Filesystem from /dev/sda1

# dd if=/dev/zero of=/dev/sda1 bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000393731 s, 1.3 MB/s

Wipe entire partition table on /dev/sda

# dd if=/dev/zero of=/dev/sda bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000378895 s, 1.4 MB/s

Create original partition table on Drive 1

Place original disk identifier back on disk

NOTE: The disk identifier in this example is 0xd4e911ee. Replace it with your own.

# echo -e "x\ni\n0xd4e911ee\nw" | fdisk /dev/sda
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table
Building a new DOS disklabel with disk identifier 0x472100cc.

Command (m for help): 
Expert command (m for help): New disk identifier (current 0x472100cc): Disk identifier: 0xd4e911ee

Expert command (m for help): The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Place original partition table back on disk

# echo -e "20 65560 de -\n67584 4194304 c *" | sfdisk -u S /dev/sda
Checking that no-one is using this disk right now ...
OK

Disk /dev/sda: 60801 cylinders, 255 heads, 63 sectors/track
Old situation:
Units: sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sda1             0         -          0   0  Empty
/dev/sda2             0         -          0   0  Empty
/dev/sda3             0         -          0   0  Empty
/dev/sda4             0         -          0   0  Empty
New situation:
Units: sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/sda1            20     65579      65560  de  Dell Utility
/dev/sda2   *     67584   4261887    4194304   c  W95 FAT32 (LBA)
/dev/sda3             0         -          0   0  Empty
/dev/sda4             0         -          0   0  Empty
Warning: partition 1 does not end at a cylinder boundary
Warning: partition 2 does not start at a cylinder boundary
Warning: partition 2 does not end at a cylinder boundary
Successfully wrote the new partition table

Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd(1)
to zero the first 512 bytes:  dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
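
Each input line given to sfdisk above has the form "start size id bootable", with start and size in sectors because of -u S. The "End" column sfdisk prints is therefore start + size - 1. A quick way to double-check the numbers before writing a table (end_sector is just a helper name for this example):

```shell
# Print the last sector of a partition given its start sector and size.
end_sector() {
    echo $(( $1 + $2 - 1 ))
}

# Usage, with the two partitions from this guide:
#   end_sector 20 65560         # Dell Utility partition
#   end_sector 67584 4194304    # FAT32 partition
```

The results should match the End values sfdisk reports for /dev/sda1 and /dev/sda2 in the "New situation" listing.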

Copy the Dell diagnostic utilities back to partition 1

# dd if=/dev/sdc1 of=/dev/sda1
65560+0 records in
65560+0 records out
33566720 bytes (34 MB) copied, 0.98346 s, 34.1 MB/s

Confirm partition 1 mounts appropriately and we see contents

# mkdir /tmpmnt
# mount /dev/sda1 /tmpmnt
# ls -l /tmpmnt
total 114
-rwxr-xr-x. 1 root root 57389 Aug 13  2008 COMMAND.COM
-r-xr-xr-x. 1 root root 23856 Aug 13  2008 DELLBIO.BIN
-r-xr-xr-x. 1 root root 30978 Aug 13  2008 DELLRMK.BIN
# umount /tmpmnt
# 

Create an empty FAT32 filesystem on partition 2 again

# mkfs.fat -F 32 /dev/sda2
mkfs.fat 3.0.20 (12 Jun 2013)

Confirm partition table layout looks like it did before we started work

# fdisk -l /dev/sda

Disk /dev/sda: 500.1 GB, 500107862016 bytes, 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0xd4e911ee

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1              20       65579       32780   de  Dell Utility
/dev/sda2   *       67584     4261887     2097152    c  W95 FAT32 (LBA)
# parted /dev/sda unit s print
Model: ATA WDC WD5003ABYX-1 (scsi)
Disk /dev/sda: 976773168s
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End       Size      Type     File system  Flags
 1      20s     65579s    65560s    primary  fat16        diag
 2      67584s  4261887s  4194304s  primary  fat32        boot, lba

Reset Partitions on Drive 2

Wipe filesystem from /dev/sdc2

# dd if=/dev/zero of=/dev/sdc2 bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.00052408 s, 977 kB/s

Wipe filesystem from /dev/sdc1

# dd if=/dev/zero of=/dev/sdc1 bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.0312699 s, 16.4 kB/s

Wipe partition table from /dev/sdc

# dd if=/dev/zero of=/dev/sdc bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000776999 s, 659 kB/s

Remove all EFI entries from UEFI Firmware

Removing the EFI entries from the UEFI firmware can be achieved by the following three methods:

  1. Physically moving a designated jumper on the system board of your server to clear the NVRAM of your firmware. Follow your vendor-supplied documentation.
     Note: I believe there may be a bug with efibootmgr or my Dell UEFI Firmware in which it does not release memory as entries are removed. This leads to a "No space left on device" error when modifying the UEFI Firmware from the OS. A google search for "efibootmgr enospc" yields plenty of other systems where this appears. My advice is that if you are running through this procedure as often as I am, you use this option before making each run. Example strace output:

    access("/sys/firmware/efi/efivars/BootNext-8be4df61-93ca-11d2-aa0d-00e098032b8c", F_OK) = 0
    open("/sys/firmware/efi/efivars/BootNext-8be4df61-93ca-11d2-aa0d-00e098032b8c", O_WRONLY|O_CREAT, 0600) = 3
    write(3, "\7\0\0\0\t\0", 6)             = -1 ENOSPC (No space left on device)
    close(3)                                = 0

  2. Accessing the UEFI firmware interface during POST of the system.
  3. Removing the entries within Linux using efibootmgr:

    # efibootmgr -v -B -b 1
    # efibootmgr -v -B -b 2
    # efibootmgr -v -B -b 3
    # efibootmgr -v -B -b 4
    # efibootmgr -v -B -b 5
    ...etc
    # efibootmgr -v
    BootCurrent: 0001
    Timeout: 0 seconds
    #
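
Instead of typing one efibootmgr command per entry, the entry numbers can be pulled out of the efibootmgr listing and fed to a loop. This is only a sketch under the assumption that the listing looks like the ones in this guide; list_boot_entries is a hypothetical helper name. Check the list it prints before deleting anything.

```shell
# Print the four-digit number of every BootXXXX entry found in
# efibootmgr output read from stdin (BootCurrent/BootOrder are skipped).
list_boot_entries() {
    sed -n 's/^Boot\([0-9A-Fa-f]\{4\}\)[* ].*/\1/p'
}

# Usage (removes ALL entries, so review the list first):
#   efibootmgr | list_boot_entries
#   for n in $(efibootmgr | list_boot_entries); do efibootmgr -B -b "$n"; done
```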
    
