Bug 1770893 - Manager shows successful upgrade even when upgrade failed in imgbased
Summary: Manager shows successful upgrade even when upgrade failed in imgbased
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: redhat-virtualization-host
Version: 4.3.6
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.4.1
Assignee: Nir Levy
QA Contact: peyu
URL:
Whiteboard:
Duplicates: 1859792
Depends On:
Blocks:
 
Reported: 2019-11-11 12:50 UTC by nijin ashok
Modified: 2023-10-06 18:45 UTC
CC: 16 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-08-04 16:22:08 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 5243561 0 None None None 2020-07-23 05:26:35 UTC
Red Hat Product Errata RHEA-2020:3316 0 None None None 2020-08-04 16:22:28 UTC
oVirt gerrit 105581 0 master MERGED core: added image-updated file under /var/imgbased 2021-02-19 06:48:52 UTC
oVirt gerrit 108303 0 master MERGED ansible: check image update status for node-ng 2021-02-19 06:48:52 UTC
oVirt gerrit 110088 0 master MERGED core: Fix undefined variable issue in host deploy 2021-02-19 06:48:51 UTC

Description nijin ashok 2019-11-11 12:50:03 UTC
Description of problem:

Errors in imgbased are not taken into account when the upgrade is initiated by the Manager. The Manager reports a "successful upgrade" even if the upgrade fails in the imgbased phase.

Version-Release number of selected component (if applicable):

rhvm-4.3.6.7-0.1.el7.noarch

How reproducible:

100%

Steps to Reproduce:

1. Create an LV with the same name as the new layer, so that imgbased fails while creating the LV.

====
2019-11-11 17:39:32,837 [DEBUG] (MainThread) Calling binary: (['lvcreate', '--thin', '--virtualsize', u'25941770240B', '--name', 'rhvh-4.3.6.2-0.20191022.0', u'rhvh/pool00'],) {}
2019-11-11 17:39:32,837 [DEBUG] (MainThread) Calling: (['lvcreate', '--thin', '--virtualsize', u'25941770240B', '--name', 'rhvh-4.3.6.2-0.20191022.0', u'rhvh/pool00'],) {'close_fds': True, 'stderr': -2}
2019-11-11 17:39:32,884 [DEBUG] (MainThread) Exception!   Logical Volume "rhvh-4.3.6.2-0.20191022.0" already exists in volume group "rhvh"
====


2. The Manager shows a successful upgrade and even reboots the machine.

===
2019-11-11 07:10:13,783-05 INFO  [org.ovirt.engine.core.bll.SshHostRebootCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [7ebbc04d] Opening SSH reboot session on host 192.168.0.13

2019-11-11 07:10:15,798-05 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [10ae0e2b-8a3b-494d-9983-aa23ee0162aa] EVENT_ID: HOST_UPGRADE_FINISHED(842), Host 192.168.0.13 upgrade was completed successfully.
====


Actual results:

RHV-M is not showing the correct status of the upgrade.

Expected results:

RHV-M should capture the correct error message and show it to the user if the upgrade failed. 

Additional info:

Similar bug 1496371 was opened which was fixed in 4.1.

Comment 2 Yuval Turgeman 2019-11-11 14:59:36 UTC
I just went over bug 1496371. I'm not sure how it was verified, but rpm completely ignores scriptlet error codes except in %pre, %preun and %pretrans. Perhaps we could touch a file when the script finishes and have the playbook check whether that file exists?
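The marker-file idea above can be sketched in Python. This is a minimal illustration, not imgbased code: the function names are hypothetical, and the marker name matches the /var/imgbased/.image-updated file discussed later in this bug.

```python
import os

MARKER = ".image-updated"  # marker file name; lives under /var/imgbased on a real host


def mark_update_complete(state_dir):
    """Touch the marker once the update script has finished successfully."""
    os.makedirs(state_dir, exist_ok=True)
    with open(os.path.join(state_dir, MARKER), "w"):
        pass  # an empty file is enough; only its existence matters


def update_succeeded(state_dir):
    """What the playbook-side check boils down to: does the marker exist?"""
    return os.path.exists(os.path.join(state_dir, MARKER))
```

Because rpm swallows the scriptlet's exit code, existence of the file is the only signal that survives for the playbook to inspect.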

Comment 6 peyu 2019-11-28 02:21:40 UTC
I reproduced this bug.

Version-Release number of selected component (if applicable):
RHVM 4.3.4.3-0.1.el7
rhvh-4.3.6.2-0.20191022.0


Steps to Reproduce:
1. Install RHVH-4.2-20191108.0-RHVH-x86_64-dvd1.iso and reboot
2. Login to RHVH
3. Create an LV named rhvh-4.3.6.2-0.20191022.0
   # lvcreate -V 2G --thin -n rhvh-4.3.6.2-0.20191022.0 rhvh/pool00
   # lvs
4. Set up a local repository on RHVH pointing to "redhat-virtualization-host-image-update-4.3.6-20191023.0.el7_7.noarch.rpm"
5. Add RHVH to RHVM
6. Upgrade RHVH from RHVM UI
7. Focus on the status of RHVH in RHVM and /var/log/imgbased.log

Actual results:
1. RHVH upgrade failed and the error in /var/log/imgbased.log is as follows:

2019-11-28 01:44:55,928 [DEBUG] (MainThread) Returned: 
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/__main__.py", line 53, in <module>
    CliApplication()
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/__init__.py", line 82, in CliApplication
    app.hooks.emit("post-arg-parse", args)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/hooks.py", line 120, in emit
    cb(self.context, *args)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 75, in post_argparse
    six.reraise(*exc_info)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 65, in post_argparse
    base, _ = LiveimgExtractor(app.imgbase).extract(args.FILENAME)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 137, in extract
    new_base = add_tree(rootfs.target, "%s" % size, nvr)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 104, in add_base_with_tree
    new_base = self.imgbase.add_base(size, nvr, lvs)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/imgbase.py", line 324, in add_base
    new_base_lv = pool.create_thinvol(new_base.lv_name, size)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/lvm.py", line 365, in create_thinvol
    self.lvm_name])
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/utils.py", line 419, in lvcreate
    return self.call(["lvcreate"] + args, **kwargs)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/utils.py", line 407, in call
    stdout = call(*args, **kwargs)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/utils.py", line 171, in call
    return subprocess.check_output(*args, **kwargs).strip()
  File "/usr/lib64/python2.7/subprocess.py", line 575, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['lvcreate', '--thin', '--virtualsize', u'2543526608896B', '--name', 'rhvh-4.3.6.2-0.20191022.0', u'rhvh/pool00']' returned non-zero exit status 5


2. The RHVM UI shows that the RHVH upgrade was successful, and the RHVH status is “Up” after reboot.


Expected results:
RHVM should capture the error message and show it to the user if the upgrade failed.
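For context, the CalledProcessError in the traceback above comes from imgbased's thin subprocess wrapper, which raises as soon as lvcreate returns a non-zero exit status; the error then dies inside the rpm scriptlet instead of reaching the Manager. A simplified sketch (not the actual imgbased code; `false` stands in for the failing lvcreate):

```python
import subprocess


def call(args):
    # imgbased-style wrapper (simplified): check_output raises
    # CalledProcessError on any non-zero exit status
    return subprocess.check_output(args, stderr=subprocess.STDOUT).strip()


def try_update(args):
    """Return (ok, detail) instead of letting the error die inside a scriptlet."""
    try:
        call(args)
        return True, ""
    except subprocess.CalledProcessError as e:
        return False, "exit status %d" % e.returncode


# 'false' stands in for the failing command; on the real host it would be
# ['lvcreate', '--thin', '--virtualsize', ..., 'rhvh/pool00']
ok, detail = try_update(["false"])
```

The fix tracked by this bug surfaces exactly this failure to the Manager rather than discarding it.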

Comment 7 Sandro Bonazzola 2020-03-18 11:49:10 UTC
Can this be reproduced on 4.4/el8?

Comment 9 peyu 2020-03-23 09:19:16 UTC
This bug can be reproduced on 4.4/el8.

Version-Release number of selected component (if applicable):
RHVM: 4.4.0-0.26.master.el8ev
RHVH: redhat-virtualization-host-4.4.0-20200212.0.el8_2, redhat-virtualization-host-4.4.0-20200318.0.el8_2

Test steps:
1. Install redhat-virtualization-host-4.4.0-20200212.0.el8_2 and reboot
2. Login to RHVH
3. Create an LV named rhvh-4.4.0.14-0.20200318.0
   # lvcreate -V 2G --thin -n rhvh-4.4.0.14-0.20200318.0 rhvh-unsigned-iso/pool00
   # lvs
4. Set up a local repository on RHVH pointing to "redhat-virtualization-host-4.4.0-20200318.0.el8_2"
5. Add RHVH to RHVM
6. Upgrade RHVH from RHVM UI
7. Focus on /var/log/imgbased.log and the host status in RHVM

Test result:
1. RHVH upgrade failed and the error log in /var/log/imgbased.log is as follows:
2020-03-23 08:49:42,349 [ERROR] (MainThread) Update failed, resetting registered LVs
2020-03-23 08:49:42,349 [DEBUG] (MainThread) Calling: (['sync'],) {'close_fds': True, 'stderr': -2}
Traceback (most recent call last):
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/__main__.py", line 53, in <module>
    CliApplication()
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/__init__.py", line 82, in CliApplication
    app.hooks.emit("post-arg-parse", args)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/hooks.py", line 120, in emit
    cb(self.context, *args)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/plugins/update.py", line 75, in post_argparse
    six.reraise(*exc_info)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/six.py", line 675, in reraise
    raise value
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/plugins/update.py", line 66, in post_argparse
    base, _ = LiveimgExtractor(app.imgbase).extract(args.FILENAME)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/plugins/update.py", line 148, in extract
    "%s" % size, nvr)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/plugins/update.py", line 114, in add_base_with_tree
    new_base = self.imgbase.add_base(size, nvr, lvs)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/imgbase.py", line 331, in add_base
    new_base_lv = pool.create_thinvol(new_base.lv_name, size)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/lvm.py", line 369, in create_thinvol
    self.lvm_name])
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/utils.py", line 333, in lvcreate
    return self.call(["lvcreate"] + args, **kwargs)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/utils.py", line 321, in call
    stdout = command.call(*args, **kwargs)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/command.py", line 14, in call
    return subprocess.check_output(*args, **kwargs).strip()
  File "/usr/lib64/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib64/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['lvcreate', '--thin', '--virtualsize', '446588518400B', '--name', 'rhvh-4.4.0.14-0.20200318.0', 'rhvh-unsigned-iso/pool00']' returned non-zero exit status 5.

2. RHVM shows that the RHVH upgrade was successful and the host status is “Up” after reboot.

Comment 10 Yuval Turgeman 2020-03-23 09:24:56 UTC
Yes, it's not done yet - the patch in https://gerrit.ovirt.org/105581 is the first step, the next one would be to change the ansible playbook to check if the /var/imgbased/.image-updated file exists

Comment 11 Michal Skrivanek 2020-07-03 09:28:29 UTC
(In reply to Yuval Turgeman from comment #10)
> Yes, it's not done yet - the patch in https://gerrit.ovirt.org/105581 is the
> first step, the next one would be to change the ansible playbook to check if
> the /var/imgbased/.image-updated file exists

AFAICT the check incorrectly looks at cluster level 4.4 when trying to see whether imgbased touches /var/imgbased/.image-updated. This is not correct: it is very likely that after an upgrade you are running hosts at an older cluster compatibility level, sometimes for years.
It must check the imgbased version or something similar instead.
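A version gate of the kind suggested here could look like the following. This is a minimal sketch: `parse_version`, the function names, and the "1.2.0" threshold are all hypothetical; the real condition would be whichever imgbased release first touches /var/imgbased/.image-updated.

```python
def parse_version(version):
    """Naive dotted-version parser; good enough for plain x.y.z strings."""
    return tuple(int(part) for part in version.split("."))


def writes_image_updated_marker(imgbased_version, minimum="1.2.0"):
    # hypothetical threshold: gate the playbook's marker-file check on the
    # imgbased version installed on the host, not on the cluster
    # compatibility level, which can lag behind for years
    return parse_version(imgbased_version) >= parse_version(minimum)
```

Tuple comparison handles multi-digit components correctly (e.g. 1.10.0 > 1.2.0), which naive string comparison would not.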

Sandro, do you want to reopen this and fix in 4.4.2?

Comment 12 Sandro Bonazzola 2020-07-03 09:33:03 UTC
(In reply to Michal Skrivanek from comment #11)
> (In reply to Yuval Turgeman from comment #10)
> > Yes, it's not done yet - the patch in https://gerrit.ovirt.org/105581 is the
> > first step, the next one would be to change the ansible playbook to check if
> > the /var/imgbased/.image-updated file exists
> 
> AFAICT the check is incorrectly looking at cluster level 4.4 when trying to
> see if imgbased is touching /var/imgbased/.image-updated. This is not
> correct, it's very likely that after upgrade you are running hosts in older
> cluster level compatibility. Sometimes for years.
> It must check for imgbased version or something similar instead.
> 
> Sandro, do you want to reopen this and fix in 4.4.2?

Yes, reopening.

Comment 15 peyu 2020-07-06 10:22:32 UTC
QE tested this issue: when the upgrade failed, RHVM detected and reported the upgrade error message.

Test version:
RHVH: redhat-virtualization-host-4.4.1-20200705.0.el8_2
RHVM: 4.4.1.7-0.3.el8ev

Test steps:
The test steps are the same as in Comment 9 (upgrade RHVH from "redhat-virtualization-host-4.4.0-20200518.0.el8_2" to "redhat-virtualization-host-4.4.1-20200705.0.el8_2").

Test results:
1. Host status is "InstallFailed" in RHVM; the information in Events is as follows:
~~~~~~
Failed to upgrade Host peyu-verify-upgrade-bug-01 (User: admin@internal-authz).
Update of host peyu-verify-upgrade-bug-01. Check if image-updated file exists.
Update of host peyu-verify-upgrade-bug-01. Check if image was updated.
......
~~~~~~

So I will move bug status to VERIFIED.

Comment 16 Sandro Bonazzola 2020-07-30 07:31:52 UTC
*** Bug 1859792 has been marked as a duplicate of this bug. ***

Comment 18 errata-xmlrpc 2020-08-04 16:22:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHV Host (redhat-virtualization-host) 4.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:3316

