Description of problem:

A machine installed with ovirt-node built with lvm2-2.03.14-1.el8.x86_64 fails during boot while trying to mount /var, due to the filter added to /etc/lvm/lvm.conf by vdsm-tool config-lvm-filter (which is run on ovirt-node by imgbase from the imgbased project during installation).

Version-Release number of selected component (if applicable):
Current master vdsm, lvm2-2.03.14-1.el8.x86_64

How reproducible:
Always

Steps to Reproduce:
1. See above

Actual results:
The machine boots into an emergency shell, failing to mount /var (and /var/log, /var/log/audit - all are separate filesystems in node).

Expected results:
The machine boots successfully.

Additional info:
Nir already pushed a patch for this bug [1]. It reverts the fix for bug 1635614, so we need to decide whether it is enough as-is, while also making sure we do not regress.

[1] https://gerrit.ovirt.org/c/vdsm/+/117748
That build of lvm somehow contains RHEL9-only features and changes, and should never have appeared in RHEL8. That build needs to be removed as soon as possible.
(In reply to David Teigland from comment #1)
> That build of lvm somehow contains RHEL9-only features and changes, and
> should never have appeared in RHEL8. That build needs to be removed as soon
> as possible.

Do we need an LVM bug for this?
Adding more info from an internal mail thread:

On the broken system:
=====================

journalctl -o json-pretty has:

"MESSAGE" : "/dev/vda3 excluded by filters: device is rejected by filter config.",
...
"_CMDLINE" : "/usr/sbin/lvm pvscan --cache --listvg --checkcomplete --vgonline --udevoutput --journal=output /dev/vda3",

[root@localhost ~]# grep ^filter /etc/lvm/lvm.conf
filter = ["a|^/dev/disk/by-id/lvm-pv-uuid-5M4lBs-4jIZ-2NBF-DRSl-dHwR-6Vfs-PQIOEB$|", "r|.*|"]

/dev/disk/by-id/lvm-pv-uuid-5M4lBs-4jIZ-2NBF-DRSl-dHwR-6Vfs-PQIOEB is a symlink to /dev/vda3.

This makes it fail to mount /var (and /var/log, /var/log/audit). If I run 'vgchange -ay', they do mount successfully.

lvm version: lvm2-2.03.14-1.el8.x86_64

On the working system:
======================

"MESSAGE" : " pvscan[1238] PV /dev/vda3 online, VG onn_ibm-p8-kvm-03-guest-02 is complete.",
...
"_CMDLINE" : "/usr/sbin/lvm pvscan --cache --activate ay 252:3",

(The filter line and the symlink are similar, with a different ID.)

lvm version: lvm2-2.03.12-10.el8.x86_64

...
> "_CMDLINE" : "/usr/sbin/lvm pvscan --cache --listvg
> --checkcomplete --vgonline --udevoutput --journal=output /dev/vda3",

This is in the udev rule for the new RHEL9 autoactivation method. Related change:
https://github.com/lvmteam/lvm2/commit/67722b312390cdab29c076c912e14bd739c5c0f6#diff-6a1e9a3e15f9d614cbda0b5b26084c30ffa609a9c951ed1f89fd8a25a12edbb3R82

But this happens on CentOS Stream 8: lvm2-2.03.14-1.el8.x86_64
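For reference, the symlink mapping and the manual workaround described above can be checked directly; a minimal sketch, using the PV UUID from this comment:

# readlink -f /dev/disk/by-id/lvm-pv-uuid-5M4lBs-4jIZ-2NBF-DRSl-dHwR-6Vfs-PQIOEB
/dev/vda3
# vgchange -ay    # activates the VG manually, after which /var and /var/log mount fine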
From the vdsm point of view, this regression in lvm shows that we cannot depend on the udev links (/dev/disk/by-id/lvm-pv-uuid-*). These links do not work for multipath devices (bug 2016173), and they are very fragile: they can break when lvm changes the udev rules (this bug). The reason we use these links is that device names are not stable (bug 1635614). This bug will be resolved by the switch to lvm devices (bug 2012830).
Vdsm no longer uses the udev links /dev/disk/by-id/lvm-pv-uuid-xxx. The LVM filter created by vdsm now uses the device name /dev/sd{x} for SCSI devices, and /dev/mapper/{wwid} for multipath devices. This should fix the issue when booting from SAN.

When adding a host to engine, the new filter will use the new format. To upgrade a host with an older LVM filter to the new format, run:

vdsm-tool config-lvm-filter

This change will be available in the next ovirt-4.5 build.
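For illustration, the change in filter format looks roughly like this (the SCSI device name and multipath WWID below are hypothetical; the udev-link example reuses the UUID from comment 3):

# Old format, based on the udev symlink:
filter = ["a|^/dev/disk/by-id/lvm-pv-uuid-5M4lBs-4jIZ-2NBF-DRSl-dHwR-6Vfs-PQIOEB$|", "r|.*|"]

# New format, based on the device name (SCSI disk) or /dev/mapper/{wwid} (multipath):
filter = ["a|^/dev/sda2$|", "r|.*|"]
filter = ["a|^/dev/mapper/360014051234567890abcdef123456789$|", "r|.*|"]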
This seems to affect oVirt 4.4.9 new installs as well. Can you please confirm? If that's the case we need an urgent backport to 4.4 as well.
Diego Ercolani is reporting failure on the oVirt Italia Telegram channel with:

[root@ovirt-node2 ~]# rpm -qa | grep systemd
systemd-container-239-51.el8.x86_64
systemd-libs-239-51.el8.x86_64
systemd-pam-239-51.el8.x86_64
systemd-udev-239-51.el8.x86_64
python3-systemd-234-8.el8.x86_64
clevis-systemd-15-4.el8.x86_64
systemd-239-51.el8.x86_64

[root@ovirt-node2 ~]# rpm -qa | grep vdsm
vdsm-jsonrpc-4.40.90.4-1.el8.noarch
vdsm-python-4.40.90.4-1.el8.noarch
vdsm-gluster-4.40.90.4-1.el8.x86_64
vdsm-common-4.40.90.4-1.el8.noarch
vdsm-client-4.40.90.4-1.el8.noarch
vdsm-4.40.90.4-1.el8.x86_64
vdsm-yajsonrpc-4.40.90.4-1.el8.noarch
vdsm-http-4.40.90.4-1.el8.noarch
vdsm-network-4.40.90.4-1.el8.x86_64
vdsm-api-4.40.90.4-1.el8.noarch
(In reply to Sandro Bonazzola from comment #7)
> Diego Ercolani is reporting failure on oVirt Italia Telegram channel with:

This issue exists only on CentOS Stream 8 or RHEL 8.6 nightly; both have the broken lvm (bug 2026640). I don't know about any issue with RHEL 8.5.

To debug, please get the output of:

rpm -q lvm2
grep ^filter /etc/lvm/lvm.conf
lsinitrd -f /etc/lvm/lvm.conf | grep ^filter
I reproduced the same issue with a CentOS Stream 8 host.

1. Installed a new host from CentOS-Stream-8-x86_64-20211206-dvd1.iso
2. dnf update
3. Add ovirt-release44.rpm
4. Add the host to engine 4.5 master
5. Reboot the host

The host failed to boot. Restarting in rescue mode, I found that the host was using the expected /dev/disk/by-id/lvm-pv-uuid-xxx link in the lvm filter. Replacing the filter with /dev/vda2 and running "dracut -f" fixes the issue.

Packages:
lvm2-2.03.14-1.el8.x86_64
vdsm-4.40.90.4-1.el8.x86_64

Working configuration:

# grep ^filter /etc/lvm/lvm.conf
filter = ["a|^/dev/vda2$|", "r|.*|"]

# lsinitrd -f /etc/lvm/lvm.conf | grep ^filter
filter = ["a|^/dev/vda2$|", "r|.*|"]

Running vdsm-tool config-lvm-filter suggests replacing the working filter:

# vdsm-tool config-lvm-filter
Analyzing host...
Found these mounted logical volumes on this host:

  logical volume:  /dev/mapper/cs-root
  mountpoint:      /
  devices:         /dev/disk/by-id/lvm-pv-uuid-klrMLR-8GHy-L3nS-qLrS-32Wp-IjeD-DPoMux

  logical volume:  /dev/mapper/cs-swap
  mountpoint:      [SWAP]
  devices:         /dev/disk/by-id/lvm-pv-uuid-klrMLR-8GHy-L3nS-qLrS-32Wp-IjeD-DPoMux

This is the recommended LVM filter for this host:

  filter = [ "a|^/dev/disk/by-id/lvm-pv-uuid-klrMLR-8GHy-L3nS-qLrS-32Wp-IjeD-DPoMux$|", "r|.*|" ]

This filter allows LVM to access the local devices used by the hypervisor, but not shared storage owned by Vdsm. If you add a new device to the volume group, you will need to edit the filter manually.

This is the current LVM filter:

  filter = [ "a|^/dev/vda2$|", "r|.*|" ]
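A minimal sketch of the workaround described above, assuming /dev/vda2 is the PV backing the local volume group on the affected host (adjust the device name to your layout):

# 1. In rescue mode, edit /etc/lvm/lvm.conf so the filter uses the device name
#    instead of the udev link:
filter = ["a|^/dev/vda2$|", "r|.*|"]

# 2. Rebuild the initrd so the embedded lvm.conf picks up the new filter, then reboot:
dracut -f
reboot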
Ok, I'm going to blacklist the broken lvm2 build in the node build kickstart. Can you backport the vdsm fix to 4.4.10?

From Diego Ercolani:

lvm2-libs-2.03.14-1.el8.x86_64
libblockdev-lvm-2.24-7.el8.x86_64
llvm-compat-libs-12.0.1-3.module_el8.6.0+1029+6594c364.x86_64
lvm2-2.03.14-1.el8.x86_64

[root@ovirt-node2 ~]# uname -a
Linux ovirt-node2.ovirt 4.18.0-348.2.1.el8_5.x86_64 #1 SMP Tue Nov 16 14:42:35 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
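A rough sketch of what such a blacklist could look like, using dnf's exclude option in a repo file so the broken build is never pulled in (the repo id is hypothetical; the actual node build kickstart may implement this differently):

# /etc/yum.repos.d/baseos.repo (illustrative)
[baseos]
...
exclude=lvm2-2.03.14-1.el8* lvm2-libs-2.03.14-1.el8*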
(In reply to Sandro Bonazzola from comment #10)
> Ok, I'm going to blacklist the broken lvm2 build in node build kickstart.
> Can you backport the vdsm fix to 4.4.10?

I don't want to change this in 4.4.10. This is an issue only with the broken lvm version, which should be fixed before RHEL 8.6 is released, so RHV users should never see this issue.

This change also disables the fix for bug 1635614, so delivering it in 4.4.10 may break users that needed that fix, in order to solve an issue they don't have.

Porting this to 4.4.10 makes sense only upstream, if we want to deliver it on CentOS Stream 8, but I understand that we don't plan such a release.

This fix will not be needed once we switch to lvm devices, replacing the lvm filter; see bug 2012830.

Blacklisting the broken lvm2 build sounds like the right way, for now.
This bug resulted from an issue with the LVM filter, which has been replaced with the LVM devices file in oVirt 4.5.
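For context, a minimal sketch of the devices-file mechanism that replaces the filter (the device name is illustrative, and this shows plain lvm2 usage rather than the exact vdsm integration):

# Once the devices file is in use, LVM ignores any block device not listed in
# /etc/lvm/devices/system.devices. Add the local PV explicitly:
lvmdevices --adddev /dev/vda2

# List the devices LVM is allowed to use:
lvmdevices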
This bugzilla is included in oVirt 4.5.0 release, published on April 20th 2022. Since the problem described in this bug report should be resolved in oVirt 4.5.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.