Created attachment 1496304 [details]
All logs from host (sosreport and all files in /var/log)

Description of problem:
After a first successful upgrade, upgrading again from the older layer to a newer build fails. For example: upgrading from build1 to build2 succeeds, but a subsequent upgrade from build1 to build3 fails.

Version-Release number of selected component (if applicable):
Build1: rhvh-4.1-0.20180410.0
Build2: rhvh-4.2.5.0-0.20180724.0
Build3: redhat-virtualization-host-4.2-20181017.2

How reproducible:
100%

Steps to Reproduce:
1. Install redhat-virtualization-host-4.1-20180410.1
2. Upgrade RHVH from 4.1 to 4.2.5 (rhvh-4.2.5.0-0.20180724.0)
3. Reboot RHVH into the old layer rhvh-4.1-0.20180410.0
4. Upgrade RHVH from 4.1 to 4.2.7 (redhat-virtualization-host-4.2-20181017.2)

Actual results:
1. After step 2, the upgrade succeeds.

# imgbase layout
rhvh-4.1-0.20180410.0
 +- rhvh-4.1-0.20180410.0+1
rhvh-4.2.5.0-0.20180724.0
 +- rhvh-4.2.5.0-0.20180724.0+1

2. After step 4, the upgrade fails.

------
Running transaction
  Updating   : redhat-virtualization-host-image-update-4.2-20181017.2.el7_6.noarch   1/2
warning: %post(redhat-virtualization-host-image-update-4.2-20181017.2.el7_6.noarch) scriptlet failed, exit status 1
Non-fatal POSTIN scriptlet failure in rpm package redhat-virtualization-host-image-update-4.2-20181017.2.el7_6.noarch
  Cleanup    : redhat-virtualization-host-image-update-4.2-20180724.0.el7_5.noarch   2/2
Uploading Package Profile
  Verifying  : redhat-virtualization-host-image-update-4.2-20181017.2.el7_6.noarch   1/2
  Verifying  : redhat-virtualization-host-image-update-4.2-20180724.0.el7_5.noarch   2/2

Updated:
  redhat-virtualization-host-image-update.noarch 0:4.2-20181017.2.el7_6

Complete!
------

------
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/__main__.py", line 53, in <module>
    CliApplication()
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/__init__.py", line 82, in CliApplication
    app.hooks.emit("post-arg-parse", args)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/hooks.py", line 120, in emit
    cb(self.context, *args)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 74, in post_argparse
    six.reraise(*exc_info)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 64, in post_argparse
    base, _ = LiveimgExtractor(app.imgbase).extract(args.FILENAME)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 129, in extract
    new_base = add_tree(rootfs.target, "%s" % size, nvr)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 111, in add_base_with_tree
    new_layer_lv = self.imgbase.add_layer(new_base)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/imgbase.py", line 192, in add_layer
    self.hooks.emit("new-layer-added", prev_lv, new_lv)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/hooks.py", line 120, in emit
    cb(self.context, *args)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/plugins/osupdater.py", line 133, in on_new_layer
    raise ConfigMigrationError()
imgbased.plugins.osupdater.ConfigMigrationError
-------

# imgbase layout
rhvh-4.1-0.20180410.0
 +- rhvh-4.1-0.20180410.0+1
rhvh-4.2.5.0-0.20180724.0
 +- rhvh-4.2.5.0-0.20180724.0+1

Expected results:
After step 4, the upgrade should succeed.

Additional info:
What do we expect to happen here? /var/crash was created during the first upgrade but is not mounted in the old layer, so when we try to update again, the update fails because /var/crash already exists as a volume.
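The failure condition described above can be sketched as follows. This is an illustrative example only, not imgbased's actual code: the function name and the `name -> mountpoint` mapping (`var_crash` -> `/var/crash`) are assumptions made for the sketch.

```python
def conflicting_volumes(existing_lvs, mounted_paths):
    """Return LV names that already exist in the thinpool but whose
    mountpoint is absent from the currently booted (rolled-back) layer.

    Hypothetical helper: assumes the imgbased convention of deriving
    the mountpoint from the LV name by mapping '_' to '/'.
    """
    conflicts = []
    for lv in existing_lvs:
        mountpoint = "/" + lv.replace("_", "/")
        if mountpoint not in mounted_paths:
            conflicts.append(lv)
    return conflicts


# In the reported scenario, var_crash was created by the first upgrade,
# but /var/crash is not mounted in the rolled-back 4.1 layer:
print(conflicting_volumes(["var_crash", "var_log"],
                          ["/", "/var", "/var/log"]))
# -> ['var_crash']
```

A second upgrade attempt then tries to create `var_crash` again, finds the volume already present, and aborts with `ConfigMigrationError`.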
(In reply to Yuval Turgeman from comment #1)
> What do we expect to happen here ? /var/crash was created during the first
> upgrade, but is not mounted in the old layer, so when trying to update
> again, it will fail because /var/crash is already a volume

I am not sure whether this scenario occurs in customers' environments: rolling back to the old layer (build1) to collect some information, then upgrading to the newer build3 from that old layer. In that case we would expect the middle layer (build2) to be deleted.
Update from qin: Customers may want to upgrade from the old layer, for example because the new layer is broken. If they hit the /var/crash issue, they can use the tool provided in https://bugzilla.redhat.com/show_bug.cgi?id=1613931
Posted a proposal that should put an end to this exception. Since the user would like to sync from the current layer, whenever a volume exists but is not mounted, imgbase renames the existing volume (var_crash -> var_crash.$timestamp) and untags it from imgbased. This lets the upgrade continue and sync from the current layer without losing any data stored on the original volume.
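The renaming scheme in the proposal can be sketched like this. This is a minimal illustration of the `var_crash -> var_crash.$timestamp` naming only; the function name is hypothetical, and the actual fix also untags the volume via LVM, which is not shown here.

```python
import datetime


def timestamped_name(lv_name, now=None):
    """Build the new name for a conflicting volume by appending a
    timestamp, so the old data is kept aside instead of deleted.

    Hypothetical helper illustrating the var_crash.$timestamp scheme;
    `now` can be injected for testing, otherwise the current time is used.
    """
    now = now or datetime.datetime.now()
    return "%s.%s" % (lv_name, now.strftime("%Y%m%d%H%M%S"))


# Example matching the var_crash.20181025092922 volume seen in the
# verification output:
print(timestamped_name("var_crash",
                       datetime.datetime(2018, 10, 25, 9, 29, 22)))
# -> var_crash.20181025092922
```

The upgrade can then recreate a fresh `var_crash` volume and sync it from the current layer, while the renamed volume preserves the original contents.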
The bug is fixed in redhat-virtualization-host-4.2-20181024.1.el7_6.

Test version:
Build1: rhvh-4.1-0.20180410.0
Build2: rhvh-4.2.5.0-0.20180724.0
Build3: redhat-virtualization-host-4.2-20181024.1.el7_6

Test steps:
Same as comment 0

Test results:
After step 4, the upgrade succeeds.

# imgbase layout
rhvh-4.1-0.20180410.0
 +- rhvh-4.1-0.20180410.0+1
rhvh-4.2.7.3-0.20181024.0
 +- rhvh-4.2.7.3-0.20181024.0+1

# lvs -a
  LV                          VG                  Attr       LSize    Pool   Origin                    Data%  Meta%  Move Log Cpy%Sync Convert
  home                        rhvh_dell-per730-34 Vwi-aotz--    1.00g pool00                           65.00
  [lvol0_pmspare]             rhvh_dell-per730-34 ewi-------  228.00m
  pool00                      rhvh_dell-per730-34 twi-aotz--  441.23g                                   4.61   2.88
  [pool00_tdata]              rhvh_dell-per730-34 Twi-ao----  441.23g
  [pool00_tmeta]              rhvh_dell-per730-34 ewi-ao----    1.00g
  rhvh-4.1-0.20180410.0       rhvh_dell-per730-34 Vwi---tz-k  414.23g pool00 root
  rhvh-4.1-0.20180410.0+1     rhvh_dell-per730-34 Vwi---tz--  414.23g pool00 rhvh-4.1-0.20180410.0
  rhvh-4.2.7.3-0.20181024.0   rhvh_dell-per730-34 Vri---tz-k  414.23g pool00
  rhvh-4.2.7.3-0.20181024.0+1 rhvh_dell-per730-34 Vwi-aotz--  414.23g pool00 rhvh-4.2.7.3-0.20181024.0  2.07
  root                        rhvh_dell-per730-34 Vwi---tz--  414.23g pool00
  swap                        rhvh_dell-per730-34 -wi-ao----  <15.69g
  tmp                         rhvh_dell-per730-34 Vwi-aotz--    1.00g pool00                            4.85
  var                         rhvh_dell-per730-34 Vwi-aotz--   15.00g pool00                            2.47
  var_crash                   rhvh_dell-per730-34 Vwi-aotz--   10.00g pool00                            2.86
  var_crash.20181025092922    rhvh_dell-per730-34 Vwi---tz--   10.00g pool00
  var_log                     rhvh_dell-per730-34 Vwi-aotz--    8.00g pool00                            3.32
  var_log_audit               rhvh_dell-per730-34 Vwi-aotz--    2.00g pool00                            4.81

So I am changing the status to VERIFIED.
This bugzilla is included in the oVirt 4.2.7 release, published on November 2nd, 2018. Since the problem described in this bug report should be resolved in the oVirt 4.2.7 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.