Bug 2058386
| Summary: | Duplicate PV errors prevent system from booting after installation when mdadm is on direct disks | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Lark Gordon <lagordon> |
| Component: | lvm2 | Assignee: | LVM and device-mapper development team <lvm-team> |
| lvm2 sub component: | Activating existing Logical Volumes | QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED CURRENTRELEASE | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | agk, heinzm, jbrassow, jcastran, jkonecny, ldigby, msnitzer, nweddle, prajnoha, rmetrich, sbarcomb, zkabelac |
| Version: | 8.5 | | |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-06-13 10:52:01 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | Kickstart to reproduce (bz2058386.ks) | | |
Note: in RHEL 9 the LVM filter will no longer be used by default; the system.devices file will be used instead. See man lvmdevices and man vgimportdevices (a short sketch of those commands appears after the comments below).

Could you please provide us with logs? You can find them in /tmp/*log in the installation environment, or in /var/log/anaconda/*log after the installation.

I'm reassigning this issue to LVM because the issue is not with Anaconda but with LVM itself.

The customer facing the issue has an IMSM RAID configured in the BIOS on disks sda and sdb. He **doesn't** use the IMSM RAID for the OS tree, only for /home, which is the condition that triggers the issue. The OS tree is on another disk (an NVMe disk).

When installing the system and rebooting, the /home partition is NOT mounted, causing the system to enter emergency mode.

This happens because /home is not part of the critical OS tree, so dracut **didn't** include the "mdraid" dracut module in the initramfs, which is **expected behaviour** (/home is not required to mount / on /sysroot).

Because of the IMSM RAID, LVM **hijacks** the partitions hosting /home, seen in the initramfs phase as /dev/sda1 and /dev/sdb1. This creates a "duplicate PVs" warning and prevents /home from being mounted after switch root.

The issue is clearly on the LVM side, which treats /dev/sda1 and /dev/sdb1 as "normal partitions" **even though** they sit on an IMSM RAID (which cannot be assembled in the initramfs due to the absence of the "mdraid" dracut module). LVM needs to be smarter and have built-in code to detect the IMSM RAID and ignore such partitions (see the mdadm --examine sketch after the reproducer steps).

REPRODUCER:

You don't need real IMSM hardware here, just a QEMU/KVM guest with 3 disks:
- /dev/vda for the OS tree
- /dev/sda and /dev/sdb configured as IMSM RAID and hosting "/home"

Reproducer kickstart in attachment (bz2058386.ks). Because there is no real IMSM hardware, a hack is used; if you have real hardware, remove the "export IMSM_NO_PLATFORM=1" line.

REPRODUCER STEPS:

1. Install the system with the kickstart.

2. Upon reboot, STOP at the GRUB menu and add "rd.break".

3. Check that the devices for "/home" were "hijacked" by LVM:

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
switch_root:/# lvm lvs
  WARNING: Not using device /dev/sdb1 for PV 8tbqAB-ihqI-G4dZ-hyRt-hL32-mwqp-WrsbvS.
  WARNING: PV 8tbqAB-ihqI-G4dZ-hyRt-hL32-mwqp-WrsbvS prefers device /dev/sda1 because device was seen first.
  LV   VG   Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  home data -wi-------  1.99g
  root rhel -wi-ao---- 10.00g
  swap rhel -wi-a-----  2.00g
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

4. Continue the boot; the system will end up in emergency mode:

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
[FAILED] Failed to start LVM event activation on device 8:1.
See 'systemctl status lvm2-pvscan@8:1.service' for details.
[FAILED] Failed to start LVM event activation on device 8:17.
See 'systemctl status lvm2-pvscan@8:17.service' for details.
[ ***] A start job is running for dev-mapp…ta\x2dhome.device (29s / 1min 30s)
...
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------

5. On non-real IMSM hardware, you can assemble the IMSM RAID as shown below, but it won't be mountable anyway since LVM hijacked the devices and sees **duplicate PVs**:

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# export IMSM_NO_PLATFORM=1
# mdadm --assemble --scan
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
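As a reference point for the "LVM should detect this" argument: a quick way to confirm that the member disks really carry IMSM metadata is mdadm --examine. This is only a sketch; the device names follow the reproducer above, and the exact output wording may differ between mdadm versions:

```
# IMSM container metadata lives on the whole disk, not on the partition,
# so examine /dev/sda and /dev/sdb rather than sda1/sdb1:
mdadm --examine /dev/sda /dev/sdb   # expect an Intel IMSM signature in the output
mdadm --examine --scan              # should list an ARRAY line with metadata=imsm
```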
Created attachment 1889348 [details]
Kickstart to reproduce
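The attachment itself is not reproduced here. Purely as an illustration of the IMSM_NO_PLATFORM hack described above, a hypothetical %pre fragment (not the literal bz2058386.ks) might look like this:

```
%pre
# Hypothetical sketch, NOT the actual attachment: fake an IMSM setup on a
# guest without Intel RAID firmware by creating the container by hand.
export IMSM_NO_PLATFORM=1
mdadm --create /dev/md/imsm0 --metadata=imsm --raid-devices=2 /dev/sda /dev/sdb --run
mdadm --create /dev/md/Volume0_0 --level=1 --raid-devices=2 /dev/md/imsm0 --run
%end
```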
Works fine with 8.6.
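Regarding the RHEL 9 note in the first comment: on RHEL 9 the device scope comes from /etc/lvm/devices/system.devices rather than a filter. A minimal sketch of the commands involved (see man lvmdevices and man vgimportdevices; the device name is only illustrative):

```
# Import the PVs of all visible VGs into system.devices:
vgimportdevices -a

# Or add a single PV explicitly (illustrative device name):
lvmdevices --adddev /dev/md126p1

# Show the devices LVM will restrict itself to:
lvmdevices
```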
Description of problem:
New RHEL 8 installations on systems which have mdadm on direct disks fail to boot due to duplicate PV errors. An LVM filter has to be added after installation; the expectation is that the installer should do this automatically.

Version-Release number of selected component (if applicable):
RHEL 8.5

How reproducible:
Manual installation in this situation always results in an unbootable system.

Steps to Reproduce:
1. Install RHEL 8 via the graphical installer on a system with a BIOS RAID 1 of two equal SSDs
2. Reboot after installation
3. Boot fails with duplicate PV errors like:

dracut-initqueue[610]: Scanning devices nvme0n1p3 sda1 sdb1 for LVM logical volumes vg01/root vg01/swap
localhost.localdomain dracut-initqueue[610]: WARNING: Not using device /dev/sdb1 for PV XXXX
dracut-initqueue[610]: WARNING: PV XXXX prefers device /dev/sda1 because device was seen first.

Actual results:
System fails to boot after installation due to duplicate PV errors.

Expected results:
The system should automatically create an LVM filter to prevent this.

Additional info:
This is the filter added after installation which allows the system to boot (a sketch of applying it follows at the end):

filter = [ "a|/dev/md*|", "a|/dev/nvme*|", "r|.*|" ]

This partition scheme fails to boot:

--------------------------------------------------------
part pv.2521 --fstype="lvmpv" --ondisk=nvme0n1 --size=486761
part pv.3885 --fstype="lvmpv" --ondisk=Volume0_0 --size=927916
raid --device=imsm --level=CONTAINER --noformat --spares=1 --useexisting
--------------------------------------------------------

But this works:

--------------------------------------------------------
part pv.2521 --fstype="lvmpv" --ondisk=nvme0n1 --size=486761
part raid.3885 --ondisk=Volume0_0 --size=927916
raid pv.3885 --device=imsm --level=CONTAINER --noformat --spares=1 --useexisting
--------------------------------------------------------
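For completeness, the post-install workaround quoted above has to reach the initramfs as well, because the duplicate-PV scan happens in early boot (dracut-initqueue). A hedged sketch, assuming the filter from "Additional info" and a standard dracut setup:

```
# /etc/lvm/lvm.conf, in the devices { } section:
#   filter = [ "a|/dev/md*|", "a|/dev/nvme*|", "r|.*|" ]

# Regenerate the initramfs so the early-boot LVM scan picks up the new filter:
dracut -f
```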