Created attachment 1496304 [details]
All logs from host (sosreport and all files in /var/log)

Description of problem:
After a first successful upgrade, upgrading again from the older layer to a newer build fails. For example: upgrading from build1 to build2 succeeds, but a subsequent upgrade from build1 to build3 fails.

Version-Release number of selected component (if applicable):
Build1: rhvh-4.1-0.20180410.0
Build2: rhvh-4.2.5.0-0.20180724.0
Build3: redhat-virtualization-host-4.2-20181017.2

How reproducible:
100%

Steps to Reproduce:
1. Install redhat-virtualization-host-4.1-20180410.1
2. Upgrade RHVH from 4.1 to 4.2.5 (rhvh-4.2.5.0-0.20180724.0)
3. Reboot RHVH into the old layer rhvh-4.1-0.20180410.0
4. Upgrade RHVH from 4.1 to 4.2.7 (redhat-virtualization-host-4.2-20181017.2)

Actual results:
1. After step 2, the upgrade succeeds.

# imgbase layout
rhvh-4.1-0.20180410.0
 +- rhvh-4.1-0.20180410.0+1
rhvh-4.2.5.0-0.20180724.0
 +- rhvh-4.2.5.0-0.20180724.0+1

2. After step 4, the upgrade fails.

------
Running transaction
  Updating   : redhat-virtualization-host-image-update-4.2-20181017.2.el7_6.noarch   1/2
warning: %post(redhat-virtualization-host-image-update-4.2-20181017.2.el7_6.noarch) scriptlet failed, exit status 1
Non-fatal POSTIN scriptlet failure in rpm package redhat-virtualization-host-image-update-4.2-20181017.2.el7_6.noarch
  Cleanup    : redhat-virtualization-host-image-update-4.2-20180724.0.el7_5.noarch   2/2
Uploading Package Profile
  Verifying  : redhat-virtualization-host-image-update-4.2-20181017.2.el7_6.noarch   1/2
  Verifying  : redhat-virtualization-host-image-update-4.2-20180724.0.el7_5.noarch   2/2

Updated:
  redhat-virtualization-host-image-update.noarch 0:4.2-20181017.2.el7_6

Complete!
------

------
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/__main__.py", line 53, in <module>
    CliApplication()
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/__init__.py", line 82, in CliApplication
    app.hooks.emit("post-arg-parse", args)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/hooks.py", line 120, in emit
    cb(self.context, *args)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 74, in post_argparse
    six.reraise(*exc_info)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 64, in post_argparse
    base, _ = LiveimgExtractor(app.imgbase).extract(args.FILENAME)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 129, in extract
    new_base = add_tree(rootfs.target, "%s" % size, nvr)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/plugins/update.py", line 111, in add_base_with_tree
    new_layer_lv = self.imgbase.add_layer(new_base)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/imgbase.py", line 192, in add_layer
    self.hooks.emit("new-layer-added", prev_lv, new_lv)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/hooks.py", line 120, in emit
    cb(self.context, *args)
  File "/tmp/tmp.qyPKyiiQyq/usr/lib/python2.7/site-packages/imgbased/plugins/osupdater.py", line 133, in on_new_layer
    raise ConfigMigrationError()
imgbased.plugins.osupdater.ConfigMigrationError
-------

# imgbase layout
rhvh-4.1-0.20180410.0
 +- rhvh-4.1-0.20180410.0+1
rhvh-4.2.5.0-0.20180724.0
 +- rhvh-4.2.5.0-0.20180724.0+1

Expected results:
After step 4, the upgrade should succeed.

Additional info:
What do we expect to happen here? /var/crash was created during the first upgrade but is not mounted in the old layer, so when we try to update again, the update fails because /var/crash already exists as a volume.
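The failure condition described above can be sketched as follows. This is an illustrative example only, not imgbased's actual code: the function name and the `name -> mountpoint` mapping (`var_crash` -> `/var/crash`) are assumptions made for the sketch.

```python
def conflicting_volumes(existing_lvs, mounted_paths):
    """Return LV names that already exist in the thinpool but whose
    mountpoint is absent from the currently booted (rolled-back) layer.

    Hypothetical helper: assumes the imgbased convention of deriving
    the mountpoint from the LV name by mapping '_' to '/'.
    """
    conflicts = []
    for lv in existing_lvs:
        mountpoint = "/" + lv.replace("_", "/")
        if mountpoint not in mounted_paths:
            conflicts.append(lv)
    return conflicts


# In the reported scenario, var_crash was created by the first upgrade,
# but /var/crash is not mounted in the rolled-back 4.1 layer:
print(conflicting_volumes(["var_crash", "var_log"],
                          ["/", "/var", "/var/log"]))
# -> ['var_crash']
```

A second upgrade attempt then tries to create `var_crash` again, finds the volume already present, and aborts with `ConfigMigrationError`.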
(In reply to Yuval Turgeman from comment #1)
> What do we expect to happen here ? /var/crash was created during the first
> upgrade, but is not mounted in the old layer, so when trying to update
> again, it will fail because /var/crash is already a volume

I am not sure whether this scenario occurs in customers' environments: rolling back to the old layer (build1) to collect some information, then upgrading to the newer build3 from that old layer. In that case we would expect the middle layer (build2) to be deleted.
Update from qin: Customers may want to upgrade from the old layer, for example because the new layer is broken. If they hit the /var/crash issue, they can use the tool provided in https://bugzilla.redhat.com/show_bug.cgi?id=1613931
Posted a proposal that should put an end to this exception. Since the user would like to sync from the current layer, whenever a volume exists but is not mounted, imgbase renames the existing volume (var_crash -> var_crash.$timestamp) and untags it from imgbased. This lets the upgrade continue and sync from the current layer without losing any data stored on the original volume.
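The renaming scheme in the proposal can be sketched like this. This is a minimal illustration of the `var_crash -> var_crash.$timestamp` naming only; the function name is hypothetical, and the actual fix also untags the volume via LVM, which is not shown here.

```python
import datetime


def timestamped_name(lv_name, now=None):
    """Build the new name for a conflicting volume by appending a
    timestamp, so the old data is kept aside instead of deleted.

    Hypothetical helper illustrating the var_crash.$timestamp scheme;
    `now` can be injected for testing, otherwise the current time is used.
    """
    now = now or datetime.datetime.now()
    return "%s.%s" % (lv_name, now.strftime("%Y%m%d%H%M%S"))


# Example matching the var_crash.20181025092922 volume seen in the
# verification output:
print(timestamped_name("var_crash",
                       datetime.datetime(2018, 10, 25, 9, 29, 22)))
# -> var_crash.20181025092922
```

The upgrade can then recreate a fresh `var_crash` volume and sync it from the current layer, while the renamed volume preserves the original contents.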
The bug is fixed in redhat-virtualization-host-4.2-20181024.1.el7_6.

Test version:
Build1: rhvh-4.1-0.20180410.0
Build2: rhvh-4.2.5.0-0.20180724.0
Build3: redhat-virtualization-host-4.2-20181024.1.el7_6

Test steps:
Same as comment 0

Test results:
After step 4, the upgrade succeeds.

# imgbase layout
rhvh-4.1-0.20180410.0
 +- rhvh-4.1-0.20180410.0+1
rhvh-4.2.7.3-0.20181024.0
 +- rhvh-4.2.7.3-0.20181024.0+1

# lvs -a
  LV                          VG                  Attr       LSize    Pool   Origin                    Data%  Meta%  Move Log Cpy%Sync Convert
  home                        rhvh_dell-per730-34 Vwi-aotz--    1.00g pool00                           65.00
  [lvol0_pmspare]             rhvh_dell-per730-34 ewi-------  228.00m
  pool00                      rhvh_dell-per730-34 twi-aotz--  441.23g                                   4.61   2.88
  [pool00_tdata]              rhvh_dell-per730-34 Twi-ao----  441.23g
  [pool00_tmeta]              rhvh_dell-per730-34 ewi-ao----    1.00g
  rhvh-4.1-0.20180410.0       rhvh_dell-per730-34 Vwi---tz-k  414.23g pool00 root
  rhvh-4.1-0.20180410.0+1     rhvh_dell-per730-34 Vwi---tz--  414.23g pool00 rhvh-4.1-0.20180410.0
  rhvh-4.2.7.3-0.20181024.0   rhvh_dell-per730-34 Vri---tz-k  414.23g pool00
  rhvh-4.2.7.3-0.20181024.0+1 rhvh_dell-per730-34 Vwi-aotz--  414.23g pool00 rhvh-4.2.7.3-0.20181024.0  2.07
  root                        rhvh_dell-per730-34 Vwi---tz--  414.23g pool00
  swap                        rhvh_dell-per730-34 -wi-ao----  <15.69g
  tmp                         rhvh_dell-per730-34 Vwi-aotz--    1.00g pool00                            4.85
  var                         rhvh_dell-per730-34 Vwi-aotz--   15.00g pool00                            2.47
  var_crash                   rhvh_dell-per730-34 Vwi-aotz--   10.00g pool00                            2.86
  var_crash.20181025092922    rhvh_dell-per730-34 Vwi---tz--   10.00g pool00
  var_log                     rhvh_dell-per730-34 Vwi-aotz--    8.00g pool00                            3.32
  var_log_audit               rhvh_dell-per730-34 Vwi-aotz--    2.00g pool00                            4.81

So I am changing the status to VERIFIED.
This bugzilla is included in the oVirt 4.2.7 release, published on November 2nd, 2018. Since the problem described in this bug report should be resolved in the oVirt 4.2.7 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.