Bug 1573334 - RHV-H update to latest version fails on RHV 4.1 due to yum transaction failure
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: imgbased
Version: 4.1.10
Hardware: x86_64 Linux
Priority: unspecified  Severity: high
Target Milestone: ovirt-4.2.3-1
Target Release: ---
Assigned To: Ryan Barry
QA Contact: Yaning Wang
Keywords: Rebase, ZStream
Duplicates: 1578857, 1583700
Depends On:
Blocks: imgbased-1.0.17
 
Reported: 2018-04-30 17:14 EDT by Robert McSwain
Modified: 2018-06-11 02:57 EDT
CC List: 21 users

See Also:
Fixed In Version: imgbased-1.0.17
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-06-11 02:56:53 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Node
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 90803 master MERGED osupdater: activate our VGs earlier 2018-05-23 07:32 EDT
oVirt gerrit 91527 ovirt-4.2 MERGED osupdater: activate our VGs earlier 2018-05-23 07:33 EDT
Red Hat Product Errata RHSA-2018:1820 None None None 2018-06-11 02:57 EDT

Description Robert McSwain 2018-04-30 17:14:56 EDT
Description of problem:
The customer tried to update an RHV-H host via the GUI; after repeated failures, they ran a yum update on the RHV-H node itself. The Manager portal reported that the install failed. The node is part of a replicated GlusterFS storage pool.

Failures in the logs show... 
warning: %post(redhat-virtualization-host-image-update-4.1-20180410.1.el7_5.noarch) scriptlet failed, exit status 1

2018-04-17 09:58:37 DEBUG otopi.plugins.otopi.packagers.yumpackager yumpackager.verbose:76 Yum Done: redhat-virtualization-host-image-update.noarch 0:4.1-20180410.1.el7_5 - u
2018-04-17 09:58:37 ERROR otopi.plugins.otopi.packagers.yumpackager yumpackager.error:85 Yum Non-fatal POSTIN scriptlet failure in rpm package redhat-virtualization-host-image-update-4.1-20180410.1.el7_5.noarch
2018-04-17 09:58:37 DEBUG otopi.plugins.otopi.packagers.yumpackager yumpackager.verbose:76 Yum Done: redhat-virtualization-host-image-update-4.1-20180410.1.el7_5.noarch
2018-04-17 09:58:37 DEBUG otopi.plugins.otopi.packagers.yumpackager yumpackager.verbose:76 Yum Done: redhat-virtualization-host-image-update-4.1-20180410.1.el7_5.noarch
2018-04-17 09:58:37 INFO otopi.plugins.otopi.packagers.yumpackager yumpackager.info:80 Yum updated: 2/2: redhat-virtualization-host-image-update
2018-04-17 09:58:37 DEBUG otopi.plugins.otopi.packagers.yumpackager yumpackager.verbose:76 Yum Script sink: D: ========== +++ redhat-virtualization-host-image-update-4.1-20180314.0.el7_4 noarch-linux 0x0
D:     erase: redhat-virtualization-host-image-update-4.1-20180314.0.el7_4 has 3 files
D: erase      100644  1 (   0,   0)   235 /usr/share/redhat-virtualization-host/image/redhat-virtualization-host-4.1-20180314.0.el7_4.squashfs.img.meta
D: erase      100644  1 (   0,   0)635158528 /usr/share/redhat-virtualization-host/image/redhat-virtualization-host-4.1-20180314.0.el7_4.squashfs.img
D: skip       040755  2 (   0,   0)  4096 /usr/share/redhat-virtualization-host/image

2018-04-17 09:58:37 DEBUG otopi.plugins.otopi.packagers.yumpackager yumpackager.verbose:76 Yum Done: redhat-virtualization-host-image-update-4.1-20180314.0.el7_4.noarch
2018-04-17 09:58:37 INFO otopi.plugins.otopi.packagers.yumpackager yumpackager.info:80 Yum Verify: 1/2: redhat-virtualization-host-image-update.noarch 0:4.1-20180410.1.el7_5 - u
2018-04-17 09:58:37 INFO otopi.plugins.otopi.packagers.yumpackager yumpackager.info:80 Yum Verify: 2/2: redhat-virtualization-host-image-update.noarch 0:4.1-20180314.0.el7_4 - ud
2018-04-17 09:58:37 DEBUG otopi.plugins.otopi.packagers.yumpackager yumpackager.verbose:76 Yum Transaction processed
2018-04-17 09:58:37 DEBUG otopi.context context._executeMethod:142 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-tnFFlJNVEl/pythonlib/otopi/context.py", line 132, in _executeMethod
    method['method']()
  File "/tmp/ovirt-tnFFlJNVEl/otopi-plugins/otopi/packagers/yumpackager.py", line 261, in _packages
    self._miniyum.processTransaction()
  File "/tmp/ovirt-tnFFlJNVEl/pythonlib/otopi/miniyum.py", line 1050, in processTransaction
    _('One or more elements within Yum transaction failed')
RuntimeError: One or more elements within Yum transaction failed
2018-04-17 09:58:37 ERROR otopi.context context._executeMethod:151 Failed to execute stage 'Package installation': One or more elements within Yum transaction failed
2018-04-17 09:58:37 DEBUG otopi.transaction transaction.abort:119 aborting 'Yum Transaction'
2018-04-17 09:58:37 INFO otopi.plugins.otopi.packagers.yumpackager yumpackager.info:80 Yum Performing yum transaction rollback


Version-Release number of selected component (if applicable):
rhevm-4.1.10.3-0.1.el7.noarch                               
redhat-virtualization-host-image-update-4.1-20180314.0.el7_4.noarch as well as latest RHV-H

How reproducible:
Unknown

Actual results:
The upgrade fails in %post with the yum error "One or more elements within Yum transaction failed", then aborts and rolls back.

Expected results:
Upgrade completes without error to the latest redhat-virtualization-host-image-update package.

Additional info:
Attachments to be linked in private
Comment 2 Ryan Barry 2018-05-01 09:52:25 EDT
This is actually a failure case we haven't seen before.

Is LVM ok on this system? lvmdiskscan shows LVs, but the sosreport shows nothing under:

# cat sos_commands/lvm2/lvs_-a_-o_lv_tags_devices_--config_global_locking_type_0 
  WARNING: Locking disabled. Be careful! This could corrupt your metadata.
#

imgbased is very dependent on LVM.
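
For anyone debugging a similar state, here is a rough sketch of the equivalent LVM sanity checks run on the live host rather than against the sosreport (it assumes the default rhvh VG name and the imgbased:vg tag; adjust to the actual layout):

  lvmdiskscan                          # should list the PVs backing the RHV-H VG
  pvs
  vgs -o vg_name,vg_tags               # the RHV-H VG should carry the imgbased:vg tag
  lvs -a -o lv_name,lv_tags,devices    # same fields as the empty sos_commands/lvm2 output above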

Here's what I'm seeing from the logs --

There have been a number of failed upgrades. Those logs are gone, so I can't tell what happened there. What's happening now is:

- imgbased believes the running layer is rhvh-4.1-0-20171101.0+1 (possibly due to LVM problems)
- During the update, it tries to grab fstab from /dev/rhvh/rhvh-4.1-0-20171101.0+1. fstab on that layer does not have /var (maybe it was never migrated due to a previously failed upgrade?), so we fall back to looking for /etc/systemd/system/var.mount, which doesn't exist either, because /var actually is in fstab (see the commands sketched after this list)
- Since that fails, we can't ensure the partition layout is NIST 800-53 compliant, and we fail
- Successive upgrades fail because the new LV is already there
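
To see the same mismatch by hand, something like the following should be enough (sketch only; <layer> is a placeholder for the layer LV you want to inspect, and the default rhvh VG name is assumed):

  findmnt --noheadings -o SOURCE /       # what is actually mounted as /
  imgbase layer --current                # what imgbased believes the current layer is
  grep /var /etc/fstab                   # /var handled via fstab on the booted layer...
  ls /etc/systemd/system/var.mount       # ...or via a systemd mount unit
  # fstab on another layer can be checked by mounting its LV read-only:
  mount -o ro /dev/rhvh/<layer> /mnt && grep /var /mnt/etc/fstab; umount /mnt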

What I would ask so we can find a root cause is:

- The output of `imgbase layer --current`
- The output of `lvs -o lv_name,tags`
- The above after 'vgchange -ay --select vg_tags = imgbased:vg'
- Remove all failed upgrade LVs (something basically like "for lv in `lvs --noheadings -o lv_name`; do echo $lv | grep -q `imgbase layer --current | sed -e 's/\+1//'` || lvremove rhvh/$lv; done"; see the expanded loop below)
- See what `imgbase layer --current` says now

If it correctly points to 20170706, please re-try the upgrade.
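
In case it helps, here is the cleanup from the list above expanded into a more readable form. This is only a sketch under the same assumptions (rhvh VG, layer LV names containing the output of `imgbase layer --current` minus the "+1" suffix), and it only prints what it would remove, because non-layer LVs such as the pool, swap or var will also show up in the listing and must not be touched:

  vgchange -ay --select 'vg_tags = imgbased:vg'   # make sure all imgbased LVs are activated first
  current=$(imgbase layer --current | sed -e 's/+1$//')
  for lv in $(lvs --noheadings -o lv_name rhvh); do
      echo "$lv" | grep -q "$current" || echo "would remove rhvh/$lv"
  done
  # once the printed list looks right, re-run with the echo replaced by: lvremove rhvh/$lv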

Unfortunately, I cannot say how it got to the current state, but it definitely looks like LVM is not ok on the system.

Ultimately, this comes from:

2018-04-14 12:12:12,008 [DEBUG] (MainThread) Fetching image for '/'
2018-04-14 12:12:12,008 [DEBUG] (MainThread) Calling binary: (['findmnt', '--noheadings', '-o', 'SOURCE', '/'],) {}
2018-04-14 12:12:12,008 [DEBUG] (MainThread) Calling: (['findmnt', '--noheadings', '-o', 'SOURCE', '/'],) {'close_fds': True, 'stderr': -2}
2018-04-14 12:12:12,016 [DEBUG] (MainThread) Returned: /dev/mapper/rhvh-rhvh--4.1--0.20170706.0+1
2018-04-14 12:12:12,017 [DEBUG] (MainThread) Found '/dev/mapper/rhvh-rhvh--4.1--0.20170706.0+1'

But later, LVM appears to go haywire. A patch is up to work around this.
Comment 3 Ryan Barry 2018-05-16 10:06:50 EDT
*** Bug 1578857 has been marked as a duplicate of this bug. ***
Comment 5 Ryan Barry 2018-05-29 10:16:15 EDT
*** Bug 1583700 has been marked as a duplicate of this bug. ***
Comment 18 Ryan Barry 2018-06-05 05:34:12 EDT
Reproducing this requires an RHHI environment with custom LVM filtering.

In general, RHVH attempts to ensure that all RHVH LVs are activated before starting an upgrade. However, an upgrade that failed for other reasons can leave behind an activated LV with no actual upgrade data in it.
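
For reference, a leftover LV like that can usually be spotted by hand before retrying the upgrade. A minimal sketch, again assuming the rhvh VG (the LV name below is a placeholder):

  lvs -o lv_name,lv_tags,lv_attr,lv_size rhvh   # layer LVs, their imgbased tags, and whether they are active
  blkid /dev/rhvh/<leftover-layer-lv>           # a half-created layer may report no usable filesystem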

Neither engineering nor Virt QE has a reproducer, and the patch was written on the basis of log output.
Comment 20 Ryan Barry 2018-06-07 08:16:53 EDT
VERIFIED on the basis of logs and patch review.

If this is encountered again, please re-open.
Comment 22 errata-xmlrpc 2018-06-11 02:56:53 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1820
