Description of problem:
My Debian bullseye systems occasionally fail to boot due to missing LVs: systemd times out waiting for the devices backing various local filesystems to appear and the emergency shell is invoked. However, an immediate `vgchange -ay` activates all LVs in all VGs and lets the boot continue successfully.

Version-Release number of selected component (if applicable):
2.03.11-2.1

How reproducible:
Happens approximately once in 20 boots.

Steps to Reproduce:
1. Have two VGs on two disks (each disk a PV).
2. Use LVs from both to mount filesystems in fstab.
3. Keep rebooting until the boot fails into the emergency shell.

Actual results:
Eventually the boot fails as described above.

Expected results:
Successful boots only.

Additional info:
This is a default initramfs-tools based boot, not Dracut. Running systemd-udevd with the `--debug` option in the initramfs and grepping for `pvscan` in the log gives:

```
vdb: /usr/lib/udev/rules.d/69-lvm-metad.rules:127 RUN '/sbin/lvm pvscan --cache --activate ay --major $major --minor $minor'
vdb: Running command "/sbin/lvm pvscan --cache --activate ay --major 254 --minor 16"
vdb: Starting '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 16'
vdc: /usr/lib/udev/rules.d/69-lvm-metad.rules:127 RUN '/sbin/lvm pvscan --cache --activate ay --major $major --minor $minor'
vdc: Running command "/sbin/lvm pvscan --cache --activate ay --major 254 --minor 32"
vdc: Starting '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 32'
vdb: '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 16'(out) ' pvscan[147] PV /dev/vdb online, VG ivy is complete.'
vdb: '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 16'(out) ' pvscan[147] VG ivy run autoactivation.'
vdc: '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 32'(out) ' pvscan[148] PV /dev/vdc online, VG ldap_ivy is complete.'
vdc: '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 32'(out) ' pvscan[148] VG ldap_ivy run autoactivation.'
vdc: '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 32'(err) ' /dev/mapper/control: mknod failed: File exists'
vdc: '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 32'(err) ' Failure to communicate with kernel device-mapper driver.'
vdc: '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 32'(err) ' Check that device-mapper is available in the kernel.'
vdc: '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 32'(err) ' Incompatible libdevmapper 1.02.175 (2021-01-08) and kernel driver (unknown version).'
vdc: '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 32'(out) ' 0 logical volume(s) in volume group "ldap_ivy" now active'
vdc: '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 32'(err) ' ldap_ivy: autoactivation failed.'
vdb: '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 16'(out) ' 4 logical volume(s) in volume group "ivy" now active'
vdc: Process '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 32' failed with exit code 5.
vdc: Command "/sbin/lvm pvscan --cache --activate ay --major 254 --minor 32" returned 5 (error), ignoring.
vdb: Process '/sbin/lvm pvscan --cache --activate ay --major 254 --minor 16' succeeded.
```

However, `/run/lvm/pvs_online/` at this point contains files indicating that both VGs (ldap_ivy and ivy) are online. The opposite result can happen as well: the ldap_ivy VG gets activated and the ivy VG stays inactive. Most of the time the above error doesn't appear and both VGs activate for real.
But in all cases `/run/lvm/online/` indicates full activation, even if only one VG was actually activated. Creating `/dev/mapper/control` with the right major and minor device numbers before starting systemd-udevd in the initramfs works around the problem and results in reliable booting. I guess the message comes from `_create_control()` in `libdm-iface.c`; branching back to the `_control_exists()` check when `mknod()` fails with EEXIST would avoid the problem.

Thanks for your time,
Feri.
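To illustrate the idea, here is a minimal sketch of the EEXIST-tolerant creation I have in mind. The names (`create_control_node`, `control_node_matches`) and signatures are simplified stand-ins of my own, not lvm2's actual `_create_control()`/`_control_exists()` code:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Check that an existing node is a character device with the expected
 * major/minor numbers (roughly what a _control_exists()-style check does). */
int control_node_matches(const char *path, unsigned major_num, unsigned minor_num)
{
	struct stat st;

	if (stat(path, &st) < 0)
		return 0;

	return S_ISCHR(st.st_mode) && st.st_rdev == makedev(major_num, minor_num);
}

/* Create the control node, tolerating a concurrent creator: when mknod()
 * fails with EEXIST, validate the node that is already there instead of
 * reporting a failure. */
int create_control_node(const char *path, unsigned major_num, unsigned minor_num)
{
	if (mknod(path, S_IFCHR | 0600, makedev(major_num, minor_num)) == 0)
		return 1;

	if (errno == EEXIST && control_node_matches(path, major_num, minor_num))
		return 1;	/* the other pvscan won the race; nothing is wrong */

	fprintf(stderr, "%s: mknod failed: %s\n", path, strerror(errno));
	return 0;
}
```

Falling back to validating the existing node would make concurrent pvscan invocations tolerate each other instead of one of them failing the whole autoactivation.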
The Debian lvmetad udev rule is likely a left-over relic from the past, since the 2.03 version of lvm2 no longer provides the lvmetad daemon (it was replaced with a different autoactivation mechanism). However, there could indeed have been a race in the creation of the /dev/mapper/control device, as suggested above - so here comes the upstream patch: https://listman.redhat.com/archives/lvm-devel/2023-February/024597.html
Great, thanks for the fix! It looks like the lvm-metad udev rule is not present in the latest Debian package, so that's probably fixed already. By the way, man/lvmautoactivation.7_main still references 69-dm-lvm-metad.rules; is that intended?
Ahh - thanks for noticing - this is going to be updated soon - there are some ongoing reworks...