Description of problem:

Customer tried to update a RHV-H host via the GUI; after repeated failures, they performed a yum update on the RHV-H node itself. The Manager portal reported that the install failed. The node is part of a replicated GlusterFS storage pool. Failures in the logs show:

warning: %post(redhat-virtualization-host-image-update-4.1-20180410.1.el7_5.noarch) scriptlet failed, exit status 1
2018-04-17 09:58:37 DEBUG otopi.plugins.otopi.packagers.yumpackager yumpackager.verbose:76 Yum Done: redhat-virtualization-host-image-update.noarch 0:4.1-20180410.1.el7_5 - u
2018-04-17 09:58:37 ERROR otopi.plugins.otopi.packagers.yumpackager yumpackager.error:85 Yum Non-fatal POSTIN scriptlet failure in rpm package redhat-virtualization-host-image-update-4.1-20180410.1.el7_5.noarch
2018-04-17 09:58:37 DEBUG otopi.plugins.otopi.packagers.yumpackager yumpackager.verbose:76 Yum Done: redhat-virtualization-host-image-update-4.1-20180410.1.el7_5.noarch
2018-04-17 09:58:37 DEBUG otopi.plugins.otopi.packagers.yumpackager yumpackager.verbose:76 Yum Done: redhat-virtualization-host-image-update-4.1-20180410.1.el7_5.noarch
2018-04-17 09:58:37 INFO otopi.plugins.otopi.packagers.yumpackager yumpackager.info:80 Yum updated: 2/2: redhat-virtualization-host-image-update
2018-04-17 09:58:37 DEBUG otopi.plugins.otopi.packagers.yumpackager yumpackager.verbose:76 Yum Script sink: D: ========== +++ redhat-virtualization-host-image-update-4.1-20180314.0.el7_4 noarch-linux 0x0
D: erase: redhat-virtualization-host-image-update-4.1-20180314.0.el7_4 has 3 files
D: erase 100644 1 ( 0, 0) 235 /usr/share/redhat-virtualization-host/image/redhat-virtualization-host-4.1-20180314.0.el7_4.squashfs.img.meta
D: erase 100644 1 ( 0, 0) 635158528 /usr/share/redhat-virtualization-host/image/redhat-virtualization-host-4.1-20180314.0.el7_4.squashfs.img
D: skip 040755 2 ( 0, 0) 4096 /usr/share/redhat-virtualization-host/image
2018-04-17 09:58:37 DEBUG otopi.plugins.otopi.packagers.yumpackager yumpackager.verbose:76 Yum Done: redhat-virtualization-host-image-update-4.1-20180314.0.el7_4.noarch
2018-04-17 09:58:37 INFO otopi.plugins.otopi.packagers.yumpackager yumpackager.info:80 Yum Verify: 1/2: redhat-virtualization-host-image-update.noarch 0:4.1-20180410.1.el7_5 - u
2018-04-17 09:58:37 INFO otopi.plugins.otopi.packagers.yumpackager yumpackager.info:80 Yum Verify: 2/2: redhat-virtualization-host-image-update.noarch 0:4.1-20180314.0.el7_4 - ud
2018-04-17 09:58:37 DEBUG otopi.plugins.otopi.packagers.yumpackager yumpackager.verbose:76 Yum Transaction processed
2018-04-17 09:58:37 DEBUG otopi.context context._executeMethod:142 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-tnFFlJNVEl/pythonlib/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/tmp/ovirt-tnFFlJNVEl/otopi-plugins/otopi/packagers/yumpackager.py", line 261, in _packages
    self._miniyum.processTransaction()
  File "/tmp/ovirt-tnFFlJNVEl/pythonlib/otopi/miniyum.py", line 1050, in processTransaction
    _('One or more elements within Yum transaction failed')
RuntimeError: One or more elements within Yum transaction failed
2018-04-17 09:58:37 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Package installation': One or more elements within Yum transaction failed
2018-04-17 09:58:37 DEBUG otopi.transaction transaction.abort:119 aborting 'Yum Transaction'
2018-04-17 09:58:37 INFO otopi.plugins.otopi.packagers.yumpackager yumpackager.info:80 Yum Performing yum transaction rollback

Version-Release number of selected component (if applicable):
rhevm-4.1.10.3-0.1.el7.noarch
redhat-virtualization-host-image-update-4.1-20180314.0.el7_4.noarch, as well as the latest RHV-H

How reproducible:
Unknown

Actual results:
The upgrade fails in %post with the yum error "One or more elements within Yum transaction failed"; the transaction is then aborted and rolled back.

Expected results:
The upgrade completes without error to the latest redhat-virtualization-host-image-update package.

Additional info:
Attachments to be linked in private
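For anyone triaging a similar failure, the following commands capture the relevant state before retrying. This is a sketch: the package name is taken from the log above, and /var/log/imgbased.log is the usual imgbased log location on RHV-H, not something confirmed in this report.

# Show the %post scriptlet of the package that failed
rpm -q --scripts redhat-virtualization-host-image-update

# imgbased's own log usually records the underlying error
less /var/log/imgbased.log

# The layer imgbased believes is currently running
imgbase layer --current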
This is actually a failure case we haven't seen before. Is LVM OK on this system? lvmdiskscan shows LVs, but in the sosreport, the captured lvs output shows nothing:

# cat sos_commands/lvm2/lvs_-a_-o_lv_tags_devices_--config_global_locking_type_0
  WARNING: Locking disabled. Be careful! This could corrupt your metadata.
#

imgbased is very dependent on LVM. Here's what I'm seeing from the logs:

There have been a number of failed upgrades. Those logs are gone, so I can't tell what happened there. What's happening now is:

- imgbased believes the running layer is rhvh-4.1-0-20171101.0+1 (possibly due to LVM problems)
- In updating, it tries to grab fstab from /dev/rhvh/rhvh-4.1-0-20171101.0+1. fstab on that layer does not have /var (maybe it was never migrated due to a previously failed upgrade?), so we look for /etc/systemd/system/var.mount, which doesn't exist, because /var is actually in fstab
- Since that fails, we can't ensure the partition layout is NIST 800-53 compliant, and we fail
- Successive upgrades fail because the new LV is already there

What I would ask for, so we can find a root cause:

- The output of `imgbase layer --current`
- The output of `lvs -o lv_name,tags`
- The above again, after running `vgchange -ay --select 'vg_tags = imgbased:vg'`
- Removal of all failed-upgrade LVs, with something basically like:

  for lv in `lvs --noheadings -o lv_name`; do echo $lv | grep -q `imgbase layer --current | sed -e 's/\+1//'` || lvremove rhvh/$lv; done

  (a commented sketch of this loop, with a dry run first, follows below)
- What `imgbase layer --current` says after that

If it correctly points to 20170706, please re-try the upgrade. Unfortunately, I cannot say how it got into the current state, but it definitely looks like LVM is not OK on this system.

Ultimately, this comes from:

2018-04-14 12:12:12,008 [DEBUG] (MainThread) Fetching image for '/'
2018-04-14 12:12:12,008 [DEBUG] (MainThread) Calling binary: (['findmnt', '--noheadings', '-o', 'SOURCE', '/'],) {}
2018-04-14 12:12:12,008 [DEBUG] (MainThread) Calling: (['findmnt', '--noheadings', '-o', 'SOURCE', '/'],) {'close_fds': True, 'stderr': -2}
2018-04-14 12:12:12,016 [DEBUG] (MainThread) Returned: /dev/mapper/rhvh-rhvh--4.1--0.20170706.0+1
2018-04-14 12:12:12,017 [DEBUG] (MainThread) Found '/dev/mapper/rhvh-rhvh--4.1--0.20170706.0+1'

But later, LVM appears to go haywire. A patch is up to work around this.
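To make that cleanup step less error-prone, here is a commented sketch of the loop with a dry-run pass first. It assumes the VG is named rhvh (as elsewhere in this bug) and that `imgbase layer --current` prints a name ending in "+1". Review the dry-run output before removing anything: lvremove is destructive, and the match as written would also flag non-layer LVs (swap, the thin pool, etc.) as candidates.

#!/bin/bash
# Sketch only: list, then remove, LVs left behind by failed upgrades.
# Assumes the imgbased VG is named "rhvh".

# Current layer with the trailing "+1" stripped, e.g.
# "rhvh-4.1-0.20170706.0+1" -> "rhvh-4.1-0.20170706.0"
current=$(imgbase layer --current | sed -e 's/+1$//')

# Dry run: show which LVs do NOT match the current layer.
lvs --noheadings -o lv_name rhvh | while read -r lv; do
    echo "$lv" | grep -q "$current" || echo "candidate for removal: rhvh/$lv"
done

# Once the list is confirmed to contain only failed-upgrade layer LVs,
# replace the second echo above with:
#     lvremove rhvh/$lv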
*** Bug 1578857 has been marked as a duplicate of this bug. ***
*** Bug 1583700 has been marked as a duplicate of this bug. ***
Reproducing this requires an RHHI environment with custom LVM filtering (see the illustrative lvm.conf excerpt below). In general, RHVH tries to ensure that all RHVH LVs are activated before starting an upgrade. However, an upgrade that failed for some other reason can leave behind an activated LV that contains no actual upgrade data. Neither engineering nor Virt QE has a reproducer; the patch was written on the basis of the log output.
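For context, the custom filtering in question lives in /etc/lvm/lvm.conf. The excerpt below is purely illustrative (device paths are not from this customer's system); it shows how an overly strict filter can hide the PV backing the rhvh VG, so that the pre-upgrade activation of RHVH LVs has nothing to act on.

# /etc/lvm/lvm.conf -- illustrative excerpt, not taken from this report
devices {
    # Accept only one local disk and reject everything else. If the PV
    # backing the "rhvh" VG is not accepted here, "vgchange -ay" cannot
    # activate the RHVH layer LVs before an upgrade starts.
    filter = [ "a|^/dev/sda2$|", "r|.*|" ]
}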
VERIFIED on the basis of logs and patch review. If this is encountered again, please re-open.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:1820