Description of problem:
An error in imgbased is not taken into account when the upgrade is initiated by the manager. The manager reports a "successful upgrade" even if the upgrade fails in the imgbased phase.

Version-Release number of selected component (if applicable):
rhvm-4.3.6.7-0.1.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. I created an LV with the same name as the new image layer, so that imgbased fails during LV creation:
====
2019-11-11 17:39:32,837 [DEBUG] (MainThread) Calling binary: (['lvcreate', '--thin', '--virtualsize', u'25941770240B', '--name', 'rhvh-4.3.6.2-0.20191022.0', u'rhvh/pool00'],) {}
2019-11-11 17:39:32,837 [DEBUG] (MainThread) Calling: (['lvcreate', '--thin', '--virtualsize', u'25941770240B', '--name', 'rhvh-4.3.6.2-0.20191022.0', u'rhvh/pool00'],) {'close_fds': True, 'stderr': -2}
2019-11-11 17:39:32,884 [DEBUG] (MainThread) Exception! Logical Volume "rhvh-4.3.6.2-0.20191022.0" already exists in volume group "rhvh"
====
2. The manager shows a successful upgrade and even reboots the machine:
====
2019-11-11 07:10:13,783-05 INFO [org.ovirt.engine.core.bll.SshHostRebootCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [7ebbc04d] Opening SSH reboot session on host 192.168.0.13
2019-11-11 07:10:15,798-05 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [10ae0e2b-8a3b-494d-9983-aa23ee0162aa] EVENT_ID: HOST_UPGRADE_FINISHED(842), Host 192.168.0.13 upgrade was completed successfully.
====

Actual results:
RHV-M does not show the correct status of the upgrade.

Expected results:
RHV-M should capture the correct error message and show it to the user if the upgrade failed.

Additional info:
A similar bug, bug 1496371, was opened and fixed in 4.1.
I just went over bug 1496371. I'm not sure how it was verified, but rpm will completely ignore any scriptlet error codes other than in %prein, %preun, and %pretrans. Perhaps we could touch a file when the script ends successfully and have the playbook check whether that file exists?
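A minimal sketch of that marker-file idea, assuming the update script can touch a file only on success and the caller treats a missing file as failure. The paths and helper names below are illustrative for the demo; the file the later patches actually use is /var/imgbased/.image-updated.

```shell
#!/bin/sh
# Sketch of the marker-file approach discussed above (demo paths only).
# The update step touches a marker only when it finishes successfully;
# the caller (e.g. the ansible playbook) checks for the marker instead
# of trusting rpm's ignored scriptlet exit codes.
MARKER="${TMPDIR:-/tmp}/demo-imgbased/.image-updated"   # assumption: demo path

run_update() {
    # Stand-in for the real imgbased update work; return 0 on success.
    return 0
}

mkdir -p "$(dirname "$MARKER")"
rm -f "$MARKER"

if run_update; then
    touch "$MARKER"
fi

# Caller-side check, analogous to what the playbook would do:
if [ -e "$MARKER" ]; then
    result="upgrade-ok"
else
    result="upgrade-failed"
fi
echo "$result"
```

If run_update returned non-zero, the marker would never be created and the caller would report the failure instead of a spurious success.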
I reproduced this bug.

Version-Release number of selected component (if applicable):
RHVM 4.3.4.3-0.1.el7
rhvh-4.3.6.2-0.20191022.0

Steps to Reproduce:
1. Install RHVH-4.2-20191108.0-RHVH-x86_64-dvd1.iso and reboot
2. Log in to RHVH
3. Create an LV named rhvh-4.3.6.2-0.20191022.0
   # lvcreate -V 2G --thin -n rhvh-4.3.6.2-0.20191022.0 rhvh/pool00
   # lvs
4. Set a local repository in RHVH and point it to "redhat-virtualization-host-image-update-4.3.6-20191023.0.el7_7.noarch.rpm"
5. Add RHVH to RHVM
6. Upgrade RHVH from the RHVM UI
7. Watch the status of RHVH in RHVM and /var/log/imgbased.log

Actual results:
1. The RHVH upgrade failed and the error in /var/log/imgbased.log is as follows:
====
2019-11-28 01:44:55,928 [DEBUG] (MainThread) Returned:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/__main__.py", line 53, in <module>
    CliApplication()
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/__init__.py", line 82, in CliApplication
    app.hooks.emit("post-arg-parse", args)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/hooks.py", line 120, in emit
    cb(self.context, *args)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 75, in post_argparse
    six.reraise(*exc_info)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 65, in post_argparse
    base, _ = LiveimgExtractor(app.imgbase).extract(args.FILENAME)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 137, in extract
    new_base = add_tree(rootfs.target, "%s" % size, nvr)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 104, in add_base_with_tree
    new_base = self.imgbase.add_base(size, nvr, lvs)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/imgbase.py", line 324, in add_base
    new_base_lv = pool.create_thinvol(new_base.lv_name, size)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/lvm.py", line 365, in create_thinvol
    self.lvm_name])
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/utils.py", line 419, in lvcreate
    return self.call(["lvcreate"] + args, **kwargs)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/utils.py", line 407, in call
    stdout = call(*args, **kwargs)
  File "/tmp/tmp.cibD0vHQB2/usr/lib/python2.7/site-packages/imgbased/utils.py", line 171, in call
    return subprocess.check_output(*args, **kwargs).strip()
  File "/usr/lib64/python2.7/subprocess.py", line 575, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['lvcreate', '--thin', '--virtualsize', u'2543526608896B', '--name', 'rhvh-4.3.6.2-0.20191022.0', u'rhvh/pool00']' returned non-zero exit status 5
====
2. The RHVM UI shows that the RHVH upgrade was successful and the RHVH status is "Up" after reboot.

Expected results:
RHVM should capture the error message and show it to the user if the upgrade failed.
Can this be reproduced on 4.4/el8?
This bug can be reproduced on 4.4/el8.

Version-Release number of selected component (if applicable):
RHVM: 4.4.0-0.26.master.el8ev
RHVH: redhat-virtualization-host-4.4.0-20200212.0.el8_2, redhat-virtualization-host-4.4.0-20200318.0.el8_2

Test steps:
1. Install redhat-virtualization-host-4.4.0-20200212.0.el8_2 and reboot
2. Log in to RHVH
3. Create an LV named rhvh-4.4.0.14-0.20200318.0
   # lvcreate -V 2G --thin -n rhvh-4.4.0.14-0.20200318.0 rhvh-unsigned-iso/pool00
   # lvs
4. Set a local repository in RHVH and point it to "redhat-virtualization-host-4.4.0-20200318.0.el8_2"
5. Add RHVH to RHVM
6. Upgrade RHVH from the RHVM UI
7. Watch /var/log/imgbased.log and the host status in RHVM

Test result:
1. The RHVH upgrade failed and the error log in /var/log/imgbased.log is as follows:
====
2020-03-23 08:49:42,349 [ERROR] (MainThread) Update failed, resetting registered LVs
2020-03-23 08:49:42,349 [DEBUG] (MainThread) Calling: (['sync'],) {'close_fds': True, 'stderr': -2}
Traceback (most recent call last):
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/__main__.py", line 53, in <module>
    CliApplication()
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/__init__.py", line 82, in CliApplication
    app.hooks.emit("post-arg-parse", args)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/hooks.py", line 120, in emit
    cb(self.context, *args)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/plugins/update.py", line 75, in post_argparse
    six.reraise(*exc_info)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/six.py", line 675, in reraise
    raise value
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/plugins/update.py", line 66, in post_argparse
    base, _ = LiveimgExtractor(app.imgbase).extract(args.FILENAME)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/plugins/update.py", line 148, in extract
    "%s" % size, nvr)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/plugins/update.py", line 114, in add_base_with_tree
    new_base = self.imgbase.add_base(size, nvr, lvs)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/imgbase.py", line 331, in add_base
    new_base_lv = pool.create_thinvol(new_base.lv_name, size)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/lvm.py", line 369, in create_thinvol
    self.lvm_name])
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/utils.py", line 333, in lvcreate
    return self.call(["lvcreate"] + args, **kwargs)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/utils.py", line 321, in call
    stdout = command.call(*args, **kwargs)
  File "/tmp/tmp.efunBBMtqZ/usr/lib/python3.6/site-packages/imgbased/command.py", line 14, in call
    return subprocess.check_output(*args, **kwargs).strip()
  File "/usr/lib64/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib64/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['lvcreate', '--thin', '--virtualsize', '446588518400B', '--name', 'rhvh-4.4.0.14-0.20200318.0', 'rhvh-unsigned-iso/pool00']' returned non-zero exit status 5.
====
2. RHVM shows that the RHVH upgrade was successful and the host status is "Up" after reboot.
Yes, it's not done yet. The patch in https://gerrit.ovirt.org/105581 is the first step; the next one would be to change the ansible playbook to check if the /var/imgbased/.image-updated file exists.
(In reply to Yuval Turgeman from comment #10)
> Yes, it's not done yet - the patch in https://gerrit.ovirt.org/105581 is the
> first step, the next one would be to change the ansible playbook to check if
> the /var/imgbased/.image-updated file exists

AFAICT the check incorrectly looks at cluster level 4.4 when trying to see whether imgbased touches /var/imgbased/.image-updated. This is not correct: it is very likely that after an upgrade you are running hosts at an older cluster compatibility level, sometimes for years. It must check the imgbased version or something similar instead.

Sandro, do you want to reopen this and fix it in 4.4.2?
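The suggested fix can be sketched as gating the marker-file check on the installed imgbased version rather than on the cluster compatibility level. A minimal shell sketch follows; both version strings are placeholder assumptions, not the real cut-off, and on a host the installed version would come from `rpm -q --qf '%{VERSION}' imgbased`.

```shell
#!/bin/sh
# Gate the .image-updated check on the imgbased version, not the cluster
# level. Both version values below are illustrative placeholders.
required="1.2.12"     # hypothetical first imgbased version that touches the marker
installed="1.2.13"    # in practice: rpm -q --qf '%{VERSION}' imgbased

# sort -V orders version strings; if the lowest of the two is "required",
# then installed >= required and the marker-file check is meaningful.
lowest=$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n1)
if [ "$lowest" = "$required" ]; then
    check_marker="yes"   # new imgbased: trust the marker file
else
    check_marker="no"    # old imgbased: marker never created, skip the check
fi
echo "$check_marker"
```

Unlike the cluster level, this condition stays correct even when upgraded hosts keep running at an older cluster compatibility level for years.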
(In reply to Michal Skrivanek from comment #11)
> (In reply to Yuval Turgeman from comment #10)
> > Yes, it's not done yet - the patch in https://gerrit.ovirt.org/105581 is the
> > first step, the next one would be to change the ansible playbook to check if
> > the /var/imgbased/.image-updated file exists
>
> AFAICT the check is incorrectly looking at cluster level 4.4 when trying to
> see if imgbased is touching /var/imgbased/.image-updated. This is not
> correct, it's very likely that after upgrade you are running hosts in older
> cluster level compatibility. Sometimes for years.
> It must check for imgbased version or something similar instead.
>
> Sandro, do you want to reopen this and fix in 4.4.2?

Yes, reopening.
QE tested this issue: when the upgrade failed, RHVM detected the upgrade error message.

Test version:
RHVH: redhat-virtualization-host-4.4.1-20200705.0.el8_2
RHVM: 4.4.1.7-0.3.el8ev

Test steps:
The test steps are the same as comment 9 (upgrade RHVH from "redhat-virtualization-host-4.4.0-20200518.0.el8_2" to "redhat-virtualization-host-4.4.1-20200705.0.el8_2").

Test results:
1. Host status is "InstallFailed" in RHVM, and the information in Events is as follows:
~~~~~~
Failed to upgrade Host peyu-verify-upgrade-bug-01 (User: admin@internal-authz).
Update of host peyu-verify-upgrade-bug-01. Check if image-updated file exists.
Update of host peyu-verify-upgrade-bug-01. Check if image was updated.
......
~~~~~~

Moving the bug status to VERIFIED.
*** Bug 1859792 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (RHV Host (redhat-virtualization-host) 4.4), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:3316