Bug 2090169
Summary: Invalid entry in /etc/multipath/wwids causes unbootable ovirt-node
Product: [oVirt] vdsm
Component: General
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Version: 4.50.0.13
Target Milestone: ovirt-4.5.1
Target Release: 4.50.1.4
Hardware: Unspecified
OS: Unspecified
Reporter: Jean-Louis Dupond <jean-louis>
Assignee: Nir Soffer <nsoffer>
QA Contact: Yaning Wang <yaniwang>
CC: aefrat, aesteve, ahadas, bugs, bzlotnik, cshao, michal.skrivanek, sbonazzo, ymankad
Flags: pm-rhel: ovirt-4.5?, michal.skrivanek: exception+
Fixed In Version: vdsm-4.50.1.4
Doc Type: Bug Fix
Doc Text:
Cause: The LVM check for multipath components, which relies on the multipath wwids file, is incorrect. When configuring LVM devices, LVM may skip some devices because it wrongly treats them as multipath components.
Consequence: On the next boot, the host cannot find the missing devices and the boot ends in emergency mode.
Fix: Disable the LVM check that uses the wwids file. The check is not useful when an LVM devices file or filter is in use, as is always the case for RHV. (A configuration sketch follows this table.)
Result: Hosts that previously failed to boot now boot correctly.
Clones: 2095588
Last Closed: 2022-06-23 07:55:04 UTC
Type: Bug
oVirt Team: Storage
Bug Blocks: 2095588
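The fix lands in the lvmlocal.conf that vdsm installs. Below is a minimal sketch of the kind of change involved, assuming the devices-file options documented in lvm.conf(5); it is an illustration, not the verbatim vdsm patch.

```
# Sketch of /etc/lvm/lvmlocal.conf (assumed layout, not the actual vdsm file).
devices {
    # Use the LVM devices file instead of a filter; vdsm enables this
    # via "vdsm-tool config-lvm-filter".
    use_devicesfile = 1

    # Assumed option: point multipath component detection at no wwids
    # file, so a stale entry in /etc/multipath/wwids can no longer make
    # LVM skip a real device.
    multipath_wwids_file = ""
}
```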
Description (Jean-Louis Dupond, 2022-05-25 09:35:40 UTC)
Albert, any chance it's related to https://github.com/oVirt/vdsm/pull/228 ?

> Albert, any chance it's related to https://github.com/oVirt/vdsm/pull/228 ?
It sounds similar, but I don't think so. What triggered that change was an update in LVM that affected nodes running on RHEL 9 systems: LVM now uses only event activation on RHEL 9, so the flag 'event_activation=0' caused a misbehavior that left LVs inactive and broke the boot. In this case, it seems to be a problem with LVM and multipath.
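The 'event_activation' flag mentioned above is a global lvm.conf setting. A quick way to inspect its effective value, assuming stock LVM tooling:

```sh
# Print the effective event_activation setting (lvmconfig ships with LVM 2).
lvmconfig global/event_activation

# The problematic configuration would look like this in lvm.conf or
# lvmlocal.conf:
#   global {
#       event_activation = 0
#   }
```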
We didn't test this with node/rhvh yet, so this might be a node issue that we'll face once we test upgrades with RHVH.

I think this is a duplicate of bug 2076262. David already fixed this in LVM, but I don't know when the fix will be available.

Jean-Louis, do you want to test the fix? You can use the rpms built by GitHub here: https://github.com/oVirt/vdsm/actions/runs/2470998957

Updating severity: this causes node upgrades to fail and requires fixing the host in emergency mode.

Nir: What would be a proper way to test on ovirt-node? Can't I just change the lvmlocal.conf and rebuild the initramfs (how to do that correctly?)

(In reply to Jean-Louis Dupond from comment #8)
> Nir: What would be a proper way to test on ovirt-node? Can't I just change
> the lvmlocal.conf and rebuild the initramfs (how to do that correctly?)

A good case to test is an existing host that has this issue: a device listed in /etc/multipath/wwids for which running "vdsm-tool config-lvm-filter" does not import the VG's devices into the devices file.

Steps:
1. Update lvmlocal.conf with the changes from the patch (new option, new revision).
2. Configure lvm: vdsm-tool config-lvm-filter
3. Reboot.

Expected results (see the command sketch after this comment):
- use_devicesfile should be enabled in lvm.conf
- the lvm filter should be removed from lvm.conf
- the lvmdevices command should report all the relevant devices used by all host VGs
- the host should reboot successfully

I'm not sure that rebuilding the initramfs is needed, since lvm does not use the devices file during early boot. If you want to be sure, you can run dracut -f.

This may not be enough for ovirt-node. The other use case we need to test is upgrade: "vdsm-tool configure" installs a new lvmlocal.conf, and we need to make sure the new file is used in the new layer after rebooting.
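A shell sketch of the verification pass described above. The commands (vdsm-tool config-lvm-filter, lvmdevices, dracut -f) come from the comment itself; the grep patterns are assumptions about how the settings appear in lvm.conf:

```sh
# Configure LVM the way vdsm does.
vdsm-tool config-lvm-filter

# use_devicesfile should be enabled, and the old filter should be gone
# (grep patterns are assumptions about the file layout).
grep 'use_devicesfile' /etc/lvm/lvm.conf
grep 'filter =' /etc/lvm/lvm.conf

# Every device backing a host VG should be listed in the devices file.
lvmdevices

# Optional: rebuild the initramfs to rule out early-boot use of the
# devices file.
dracut -f

reboot
```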