Bug 1573334
Summary: | RHV-H update to latest version fails on RHV 4.1 due to yum transaction failure | | |
---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Robert McSwain <rmcswain> |
Component: | imgbased | Assignee: | Ryan Barry <rbarry> |
Status: | CLOSED ERRATA | QA Contact: | Yaning Wang <yaniwang> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | | |
Version: | 4.1.10 | CC: | cshao, dfediuck, huzhao, inetkach, jiaczhan, kshukla, lsurette, mkalinin, obockows, pstehlik, qiyuan, rbarry, rmcswain, sasundar, srevivo, weiwang, yaniwang, ycui, ykaul, yzhao |
Target Milestone: | ovirt-4.2.3-1 | Keywords: | Rebase, ZStream |
Target Release: | --- | Flags: | lsvaty: testing_plan_complete- |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | imgbased-1.0.17 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2018-06-11 06:56:53 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | Node | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1582433 | | |
Description Robert McSwain 2018-04-30 21:14:56 UTC
This is actually a failure case we haven't seen before. Is LVM OK on this system? lvmdiskscan shows LVs, but the sosreport shows nothing under:

    # cat sos_commands/lvm2/lvs_-a_-o_lv_tags_devices_--config_global_locking_type_0
    WARNING: Locking disabled. Be careful! This could corrupt your metadata.
    #

imgbased is very dependent on LVM. Here's what I'm seeing from the logs: there have been a number of failed upgrades. Those logs are gone, so I can't tell what happened there. What's happening now is:

- imgbased believes the running layer is rhvh-4.1-0-20171101.0+1 (possibly due to LVM problems).
- In updating, it tries to grab fstab from /dev/rhvh/rhvh-4.1-0-20171101.0+1. fstab on that layer does not have /var (maybe it was never migrated because of a previously failed upgrade?), so we look for /etc/systemd/system/var.mount, which doesn't exist, because /var is actually in fstab.
- Since that fails, we can't ensure the partition layout is NIST 800-53 compliant, and we fail.
- Successive upgrades fail because the new LV is already there.

What I would ask so we can find a root cause is:

- The output of `imgbase layer --current`
- The output of `lvs -o lv_name,tags`
- The above again after `vgchange -ay --select 'vg_tags = imgbased:vg'`
- Remove all failed upgrade LVs (something basically like):

      for lv in `lvs --noheadings -o lv_name`; do echo $lv | grep -q `imgbase layer --current | sed -e 's/\+1//'` || lvremove rhvh/$lv; done

- See what `imgbase layer --current` says now.

If it correctly points to 20170706, please re-try the upgrade. Unfortunately, I cannot say how it got into the current state, but it definitely looks like LVM is not OK on the system.

Ultimately, this comes from:

    2018-04-14 12:12:12,008 [DEBUG] (MainThread) Fetching image for '/'
    2018-04-14 12:12:12,008 [DEBUG] (MainThread) Calling binary: (['findmnt', '--noheadings', '-o', 'SOURCE', '/'],) {}
    2018-04-14 12:12:12,008 [DEBUG] (MainThread) Calling: (['findmnt', '--noheadings', '-o', 'SOURCE', '/'],) {'close_fds': True, 'stderr': -2}
    2018-04-14 12:12:12,016 [DEBUG] (MainThread) Returned: /dev/mapper/rhvh-rhvh--4.1--0.20170706.0+1
    2018-04-14 12:12:12,017 [DEBUG] (MainThread) Found '/dev/mapper/rhvh-rhvh--4.1--0.20170706.0+1'

But later, LVM appears to go haywire. A patch is up to work around this.

*** Bug 1578857 has been marked as a duplicate of this bug. ***

*** Bug 1583700 has been marked as a duplicate of this bug. ***

Reproducing this requires a RHHI environment with custom LVM filtering. In general, RHVH attempts to ensure all RHVH LVs are activated before starting upgrades. However, an upgrade that failed for other reasons can leave behind an activated LV that contains no actual upgrade data. Neither engineering nor Virt QE has a reproducer, and the patch was written on the basis of log output.

VERIFIED on the basis of logs and patch review. If this is encountered again, please re-open.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1820
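For anyone hitting a similar state, below is a minimal shell sketch that consolidates the diagnostic and cleanup steps requested above. It assumes the default "rhvh" volume group and the imgbased:vg VG tag, uses the lv_tags field in place of the "tags" shorthand, and only prints removal candidates instead of removing anything; it is a convenience sketch based on the comments in this bug, not an official imgbased tool.

    #!/bin/bash
    # Sketch of the diagnostics/cleanup requested in the comments above.
    # Assumptions: default "rhvh" VG, imgbased:vg VG tag; not an official tool.
    set -eu

    echo "== Current layer =="
    imgbase layer --current

    echo "== LVs and tags =="
    lvs -o lv_name,lv_tags rhvh

    echo "== Activating imgbased-tagged VGs =="
    vgchange -ay --select 'vg_tags = imgbased:vg'

    echo "== LVs and tags after activation =="
    lvs -o lv_name,lv_tags rhvh

    # Base name of the running layer, with the trailing "+1" stripped
    current=$(imgbase layer --current | sed -e 's/+1$//')

    echo "== LVs that do not match the running layer (removal candidates) =="
    for lv in $(lvs --noheadings -o lv_name rhvh); do
        echo "$lv" | grep -q "$current" || echo "rhvh/$lv"
    done

    # After reviewing the list, remove stale layer LVs one at a time with:
    #   lvremove rhvh/<lv_name>
    # then re-check `imgbase layer --current` before retrying the upgrade.

Printing candidates first, rather than piping straight into lvremove, matters because the rough one-liner in the comment above matches every LV whose name does not contain the running layer, which would also catch the thin pool, swap, and data LVs on a typical RHV-H install.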