Description of problem:
The manager only checks whether installation of the package "redhat-virtualization-host-image-update" succeeds; it does not take into account failures in imgbased. The package installation completes successfully even if the imgbased process fails for some reason. The manager therefore considers the upgrade successful, reports "upgrade was completed successfully" to the user in the event tab, and even reboots the host. However, the host is still using the old layer.

Version-Release number of selected component (if applicable):
rhevm-4.1.6.2-0.1.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Make the imgbased upgrade process fail in some way. I created an LV with the same name that the upgrade script would create, so that it fails while creating the LV. For the customer, the upgrade was failing with an I/O error during the imgbased copy operation.
2. RHV-M shows "upgrade was completed successfully" although the upgrade failed.

Actual results:
RHV-M does not show the correct status of the upgrade.

Expected results:
RHV-M should capture the correct message and show it to the user if the upgrade fails.

Additional info:
We can intentionally fail the %post scriptlet if something goes wrong, but I suspect that RHVM will not capture the complete output from imgbased-copy-bootfiles, since that is not visible from yum. RHVM only knows whether yum completed; it does not know the inner workings of imgbased/rhvh. However, in cases where manual intervention is necessary (duplicate LV names or I/O errors), logging into the system to get the logs will be required in any case.
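To illustrate why a failing step inside a scriptlet can go unnoticed, here is a minimal sketch in Python that runs two small shell snippets through `sh`. The snippet bodies are hypothetical stand-ins, not the real %post of the package: `false` plays the role of a failing imgbased step. It assumes a POSIX `sh` is available on the machine.

```python
import subprocess


def run_snippet(body: str) -> int:
    """Run a shell snippet with sh and return its exit status."""
    result = subprocess.run(
        ["sh", "-c", body],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


# Hypothetical scriptlet body: the failing step is followed by another
# command, so the snippet's exit status is that of the last command (0).
UNCHECKED = "false\necho 'post finished'"

# Same body, but the failure is propagated explicitly.
CHECKED = "false || exit 1\necho 'post finished'"

unchecked_status = run_snippet(UNCHECKED)
checked_status = run_snippet(CHECKED)

print("unchecked:", unchecked_status)
print("checked:", checked_status)
```

In the unchecked case the caller sees exit status 0 and has no way of knowing that an earlier step failed, which matches the behaviour described in this report. In the checked case the caller sees a non-zero status.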
Running a similar flow with `set -e` in %post sets the status in the UI to "Install Failed", and in the ovirt-host-deploy logs you can see something like: "Yum Script sink: warning: %post(pkgname...) scriptlet failed, exit status 1". Is that an acceptable fix? (It's already like that in upstream, btw.)
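For anyone who wants to scan logs for this condition, the following is a rough sketch in Python. The regular expression is based only on the single log line quoted above, so the exact format is an assumption, and `find_scriptlet_failures` is a hypothetical helper, not part of ovirt-host-deploy or yum.

```python
import re

# Pattern inferred from the one log line quoted in this comment; the real
# log format may vary between versions.
SCRIPTLET_FAILED = re.compile(
    r"%(?P<phase>\w+)\((?P<package>[^)]*)\) scriptlet failed, "
    r"exit status (?P<status>\d+)"
)


def find_scriptlet_failures(log_text: str):
    """Return (phase, package, exit_status) for each scriptlet failure found."""
    return [
        (m.group("phase"), m.group("package"), int(m.group("status")))
        for m in SCRIPTLET_FAILED.finditer(log_text)
    ]


sample = (
    "Yum Script sink: warning: %post(pkgname...) scriptlet failed, "
    "exit status 1"
)
failures = find_scriptlet_failures(sample)
print(failures)
```

The package name in the sample is left as the elided `pkgname...` from the quoted log line.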
That definitely works for me
Created attachment 1333179
Comment 8: All logs from host
Moving back to assigned according to comment #11
Moving back to modified after discussing the status with Yuval
According to comment 11, this bug is fixed in rhvh-4.1-0.20171002.0. A separate bug, Bug 1502681, will track the new issue from comment 11, namely that the upgrade from a non-nist system to another nist system failed after an upgrade from a non-nist system to a nist system. So I will verify this bug.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3140