Bug 1936972 - [RHVH] Failed to reinstall persisted RPMs
Summary: [RHVH] Failed to reinstall persisted RPMs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: imgbased
Version: 4.4.4
Hardware: All
OS: Linux
Priority: high
Severity: medium
Target Milestone: ovirt-4.4.6
Target Release: ---
Assignee: Lev Veyde
QA Contact: peyu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-03-09 15:14 UTC by Ulhas Surse
Modified: 2021-06-03 10:25 UTC

Fixed In Version: imgbase-1.2.19
Doc Type: Bug Fix
Doc Text:
Previously, old RPM files were not properly removed during package removal (uninstall) or upgrade. As a result, removed packages were reinstalled, or, during an upgrade, the system tried to install two or more different versions at once, causing the upgrade to fail. In this release, the dnf plugin has been fixed, and RPM packages are now properly removed. The new version also auto-heals a broken system by removing RPM packages that are not supposed to be in the persisted-rpms directory.
Clone Of:
Environment:
Last Closed: 2021-06-03 10:24:29 UTC
oVirt Team: Node
Target Upstream Version:
Embargoed:
peyu: testing_plan_complete+


Attachments
host logs from /var/log (3.43 MB, application/gzip)
2021-04-01 06:47 UTC, peyu


Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 5672751 0 None None None 2021-03-09 15:17:25 UTC
Red Hat Product Errata RHSA-2021:2239 0 None None None 2021-06-03 10:25:14 UTC
oVirt gerrit 113865 0 master MERGED Fixed dnf plugin and rpmpersistence imgbased plugin 2021-03-16 19:27:44 UTC
oVirt gerrit 114298 0 master MERGED Fixed problems with rpm persistence (added auto-healing) 2021-04-20 12:33:19 UTC

Description Ulhas Surse 2021-03-09 15:14:43 UTC
Description of problem:
Upgrade of host failed with persisted rpms. 

Version-Release number of selected component (if applicable):
RHV 4.4.4

How reproducible:
If there are persisted rpms installed. 

Steps to Reproduce:
1. Install some RPMs other than those from the standard repo
2. Make sure there are multiple versions of the same package
3. Try to upgrade the host.


Actual results:
It fails with the error:
2021-02-19 16:59:09,328 [ERROR] (MainThread) Failed to reinstall persisted RPMs
Traceback (most recent call last):
  File "/tmp/tmp.SkqM3nc7NF/usr/lib/python3.6/site-packages/imgbased/plugins/rpmpersistence.py", line 94, in install
    subprocess.check_output(cmd, stderr=subprocess.STDOUT)
  File "/usr/lib64/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib64/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command.....

/var/imgbased/persisted-rpms/nagios-plugins-disk-2.3.3-4.el8.x86_64.rpm']' returned non-zero exit status 1.



Expected results:
The upgrade should complete successfully.

Additional info:

The workaround is to remove the duplicate old-version packages from /var/imgbased/persisted-rpms/.
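
For illustration, the duplicates that the workaround refers to can be spotted by grouping the file names in /var/imgbased/persisted-rpms/ by package name. The following is a minimal sketch, not part of imgbased: the helper names split_rpm_filename and find_duplicates are hypothetical, and it assumes the conventional name-version-release.arch.rpm file naming.

```python
from collections import defaultdict


def split_rpm_filename(filename):
    """Split 'name-version-release.arch.rpm' into (name, version, release, arch).

    Hypothetical helper; relies only on the conventional RPM file naming.
    """
    if not filename.endswith(".rpm"):
        raise ValueError("not an RPM file name: %r" % filename)
    stem = filename[:-len(".rpm")]
    # The architecture is the last dot-separated field.
    stem, arch = stem.rsplit(".", 1)
    # Version and release are the last two dash-separated fields;
    # the package name itself may contain dashes.
    name, version, release = stem.rsplit("-", 2)
    return name, version, release, arch


def find_duplicates(filenames):
    """Return {(name, arch): [files]} for packages present in more than one version."""
    groups = defaultdict(list)
    for filename in filenames:
        name, _version, _release, arch = split_rpm_filename(filename)
        groups[(name, arch)].append(filename)
    return {key: sorted(files) for key, files in groups.items() if len(files) > 1}


if __name__ == "__main__":
    import os

    # Assumed location, taken from the log output in this report.
    directory = "/var/imgbased/persisted-rpms"
    if os.path.isdir(directory):
        rpms = [f for f in os.listdir(directory) if f.endswith(".rpm")]
        for (name, arch), files in find_duplicates(rpms).items():
            print("%s.%s has several persisted versions: %s" % (name, arch, files))
```

Any file it reports would still have to be reviewed and removed by hand, keeping only the version that is actually installed.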

Comment 2 peyu 2021-03-10 05:32:05 UTC
QE reproduced this issue.

Test Version:
RHVM: 4.4.4.6-0.1.el8ev
RHVH: redhat-virtualization-host-4.4.4-20210307.0.el8_3

Test Steps:
1. Install RHVH-4.4-20210202.0-RHVH-x86_64-dvd1.iso
2. Add host to RHVM
3. Set up local repo and point to "redhat-virtualization-host-4.4.4-20210307.0.el8_3"
4. Install more than one version of rpm packages, such as:
   # yum install vdsm-hook-nestedvt-4.40.35-1.el8ev.noarch.rpm
   # yum install vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch.rpm
5. Upgrade the host
   # yum update
6. Check the result of RHVH upgrade 

Test results:
RHVH upgrade failed.

~~~~~~
# ll /var/imgbased/persisted-rpms/
total 16
-rw-r--r--. 1 root root 8148 Mar 10 00:05 vdsm-hook-nestedvt-4.40.35-1.el8ev.noarch.rpm
-rw-r--r--. 1 root root 8152 Mar 10 00:06 vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch.rpm


# yum update
Updating Subscription Management repositories.
Unable to read consumer identity

This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.

Last metadata expiration check: 0:03:52 ago on Wed 10 Mar 2021 12:03:53 AM EST.
Dependencies resolved.
=============================================================================================================================
 Package                                           Architecture     Version                           Repository        Size
=============================================================================================================================
Installing:
 redhat-virtualization-host-image-update           noarch           4.4.4-20210307.0.el8_3            update           821 M
     replacing  redhat-virtualization-host-image-update-placeholder.noarch 4.4.3-2.el8ev

Transaction Summary
=============================================================================================================================
Install  1 Package

Total download size: 821 M
Is this ok [y/N]: y
Downloading Packages:
redhat-virtualization-host-image-update-latest.rpm                                            82 MB/s | 821 MB     00:10    
-----------------------------------------------------------------------------------------------------------------------------
Total                                                                                         82 MB/s | 821 MB     00:10     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                     1/1 
  Running scriptlet: redhat-virtualization-host-image-update-4.4.4-20210307.0.el8_3.noarch                               1/2 
  Installing       : redhat-virtualization-host-image-update-4.4.4-20210307.0.el8_3.noarch                               1/2 
  Running scriptlet: redhat-virtualization-host-image-update-4.4.4-20210307.0.el8_3.noarch                               1/2 
warning: %post(redhat-virtualization-host-image-update-4.4.4-20210307.0.el8_3.noarch) scriptlet failed, exit status 1

Error in POSTIN scriptlet in rpm package redhat-virtualization-host-image-update
  Obsoleting       : redhat-virtualization-host-image-update-placeholder-4.4.3-2.el8ev.noarch                            2/2 
  Verifying        : redhat-virtualization-host-image-update-4.4.4-20210307.0.el8_3.noarch                               1/2 
  Verifying        : redhat-virtualization-host-image-update-placeholder-4.4.3-2.el8ev.noarch                            2/2 
Unpersisting: redhat-virtualization-host-image-update-placeholder-4.4.3-2.el8ev.noarch.rpm
Installed products updated.

Installed:
  redhat-virtualization-host-image-update-4.4.4-20210307.0.el8_3.noarch                                                      

Complete!
~~~~~~


The error message in /var/log/imgbased.log
~~~~~~
2021-03-10 00:10:50,103 [ERROR] (MainThread) Failed to reinstall persisted RPMs
Traceback (most recent call last):
  File "/tmp/tmp.w6Pfysiwl7/usr/lib/python3.6/site-packages/imgbased/plugins/rpmpersistence.py", line 94, in install
    subprocess.check_output(cmd, stderr=subprocess.STDOUT)
  File "/usr/lib64/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib64/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['systemd-nspawn', '--uuid', '302cc76d43d74c09a40106ae7e0a8324', '--machine', 'dell-per730-35.lab.eng.pek2.redhat.com', '-D', '/tmp/mnt.q3Z8n//', 'yum', 'install', '-y', '--noplugins', '/var/imgbased/persisted-rpms/vdsm-hook-nestedvt-4.40.35-1.el8ev.noarch.rpm', '/var/imgbased/persisted-rpms/vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch.rpm']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/tmp.w6Pfysiwl7/usr/lib/python3.6/site-packages/imgbased/plugins/rpmpersistence.py", line 58, in on_os_upgraded
    reinstall_rpms(imgbase, new_lv, previous_layer_lv)
  File "/tmp/tmp.w6Pfysiwl7/usr/lib/python3.6/site-packages/imgbased/plugins/rpmpersistence.py", line 75, in reinstall_rpms
    install_rpms(new_fs)
  File "/tmp/tmp.w6Pfysiwl7/usr/lib/python3.6/site-packages/imgbased/plugins/rpmpersistence.py", line 105, in install_rpms
    install(["yum", "install", "-y", "--noplugins"] + rpms)
  File "/tmp/tmp.w6Pfysiwl7/usr/lib/python3.6/site-packages/imgbased/plugins/rpmpersistence.py", line 97, in install
    log.info("Result: " + e.output)
TypeError: must be str, not bytes
2021-03-10 00:10:50,122 [ERROR] (MainThread) Failed to update OS
Traceback (most recent call last):
  File "/tmp/tmp.w6Pfysiwl7/usr/lib/python3.6/site-packages/imgbased/plugins/rpmpersistence.py", line 94, in install
    subprocess.check_output(cmd, stderr=subprocess.STDOUT)
  File "/usr/lib64/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/lib64/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['systemd-nspawn', '--uuid', '302cc76d43d74c09a40106ae7e0a8324', '--machine', 'dell-per730-35.lab.eng.pek2.redhat.com', '-D', '/tmp/mnt.q3Z8n//', 'yum', 'install', '-y', '--noplugins', '/var/imgbased/persisted-rpms/vdsm-hook-nestedvt-4.40.35-1.el8ev.noarch.rpm', '/var/imgbased/persisted-rpms/vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch.rpm']' returned non-zero exit status 1.
~~~~~~

Comment 3 Sandro Bonazzola 2021-03-10 07:24:37 UTC
(In reply to peyu from comment #2)

> 4. Install more than one version of rpm packages, such as:
>    # yum install vdsm-hook-nestedvt-4.40.35-1.el8ev.noarch.rpm
>    # yum install vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch.rpm

Does this reproduce even with the following?
    # yum install vdsm-hook-nestedvt-4.40.35-1.el8ev.noarch.rpm
    # yum update vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch.rpm

Comment 4 peyu 2021-03-10 08:21:30 UTC
(In reply to Sandro Bonazzola from comment #3)
> (In reply to peyu from comment #2)
> 
> > 4. Install more than one version of rpm packages, such as:
> >    # yum install vdsm-hook-nestedvt-4.40.35-1.el8ev.noarch.rpm
> >    # yum install vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch.rpm
> 
> Does this reproduce even with the following?
>     # yum install vdsm-hook-nestedvt-4.40.35-1.el8ev.noarch.rpm
>     # yum update vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch.rpm

Yes.

Comment 5 RHEL Program Management 2021-03-10 08:32:14 UTC
The documentation text flag should only be set after 'doc text' field is provided. Please provide the documentation text and set the flag to '?' again.

Comment 6 Sandro Bonazzola 2021-03-10 08:33:09 UTC
Lev, this seems to be a bug in the yum plugin we use for persisting RPMs across upgrades.

Comment 7 Lev Veyde 2021-03-11 23:01:26 UTC
(In reply to Sandro Bonazzola from comment #6)
> Lev, this seems to be a bug in yum plugin we use for persisting rpms across
> upgrades.

Indeed.

The reason it failed in such a way, though, is a combination of issues:

- the root cause is a bug in the dnf plugin, which fails to remove the persisted RPM files during package removal/upgrade

  This causes old/removed RPM package(s) to be left on disk, i.e. alongside the new one(s), as happens during an update.
  The side effect of this bug is that a removed RPM package would be re-installed on RHV-H update.
  And in case the package was updated (as we see in this case), the re-installation of the packages during the RHV-H update will fail, as it will attempt to install both the old and the new packages at once.


- the additional issue is improper exception-handling code, which triggered another exception.


Without the second issue the RHV-H update would still continue, only logging an error message about the inability to re-install the persisted RPMs, something like:


2021-03-11 13:32:47,472 [DEBUG] (MainThread) Running ['systemd-nspawn', '--uuid', '79f332bacb584173af7e0ebecab83d09', '--machine', '<hostname>', '-D', '/tmp/mnt.SZBRx//', 'yum', 'install', '-y', '--noplugins', '/var/imgbased/persisted-rpms/vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch.rpm', '/var/imgbased/persisted-rpms/vdsm-hook-nestedvt-4.40.35-1.el8ev.noarch.rpm']
2021-03-11 13:32:50,327 [INFO] (MainThread) Failed to reinstall persisted RPMs!
2021-03-11 13:32:50,328 [INFO] (MainThread) Result: Spawning container <machine_fqdn> on /tmp/mnt.SZBRx.
Press ^] three times within 1s to kill container.
Last metadata expiration check: 1 day, 13:28:57 ago on Wed Mar 10 00:03:53 2021.
Error: 
 Problem: cannot install both vdsm-hook-nestedvt-4.40.35-1.el8ev.noarch and vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch
  - conflicting requests
(try to add '--allowerasing' to command line to replace conflicting packages or '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)
Container <hostname> failed with error code 1.


Note that in that case, we could see the actual cause of the issue.
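
The second issue can be illustrated with a short sketch. The traceback in comment 2 shows that line 97 of rpmpersistence.py concatenated a str with e.output, which is bytes when subprocess.check_output is called without text mode, hence the TypeError. The code below is not the merged patch; run_logged is a hypothetical helper that only shows one way to log the command output without raising a second exception.

```python
import logging
import subprocess

log = logging.getLogger("rpmpersistence-sketch")


def run_logged(cmd):
    """Run cmd and return True on success.

    On failure, log the captured output as text and return False
    instead of letting the logging call itself raise a TypeError.
    """
    try:
        subprocess.check_output(cmd, stderr=subprocess.STDOUT)
        return True
    except subprocess.CalledProcessError as e:
        output = e.output
        # check_output returns bytes unless text mode is requested,
        # so decode before building the log message.
        if isinstance(output, bytes):
            output = output.decode("utf-8", errors="replace")
        log.info("Failed to reinstall persisted RPMs!")
        log.info("Result: %s", output)
        return False
```

With handling along these lines the failure is reported in the log, and the caller can decide whether the upgrade should continue.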

Comment 13 peyu 2021-03-18 04:46:54 UTC
pending on new build to verify

Comment 15 Eli Marcus 2021-03-25 17:09:19 UTC
Hi Lev, please review the doc text for release notes and errata: 

Previously, old RPM files were not properly removed during package removal (uninstall) or upgrade. As a result, removed packages were reinstalled, or, during an upgrade, the system tried to install two or more different versions at once, causing the upgrade to fail.
In this release, the dnf plugin has been fixed, and RPM packages are now properly removed.
Note, however, that any previously removed or upgraded packages (old version) need to be manually removed prior to the upgrade, because the system still runs the old code until the upgrade completes and the system is rebooted.

Comment 16 Lev Veyde 2021-03-25 17:57:20 UTC
(In reply to Eli Marcus from comment #15)
> Hi Lev, please review the doc text for release notes and errata: 
> 
> Previously, old RPM files were not properly removed during package removal
> (uninstall) or upgrade. As a result, removed packages were reinstalled, or,
> during an upgrade, the system tried to install two or more different
> versions at once, causing the upgrade to fail.
> In this release, the dnf plugin has been fixed, and  RPM packages are now
> properly removed.
> Note however, that any previously removed or upgraded packages (old version)
> need to be manually removed prior to the upgrade, as before the upgrade
> completes and the system is rebooted, it still runs the old code.

Yes, looks fine.

Comment 17 peyu 2021-04-01 06:42:23 UTC
Tested this issue on redhat-virtualization-host-4.4.5-20210330.0.el8_3: the host upgrade was successful, but the installed RPM disappeared from the new layer.

Test Version:
RHVM: 4.4.4.6-0.1.el8ev
RHVH: redhat-virtualization-host-4.4.5-20210330.0.el8_3 

Test Steps:
1. Install RHVH-4.4-20210307.0-RHVH-x86_64-dvd1.iso
2. Add host to RHVM
3. Set up local repo and point to "redhat-virtualization-host-4.4.5-20210330.0.el8_3"
4. Install more than one version of rpm packages, such as:
   # yum install vdsm-hook-nestedvt-4.40.35-1.el8ev.noarch.rpm
   # yum install vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch.rpm
5. Upgrade the host via RHVM
6. After upgrade, check the installed rpm
   # rpm -qa | grep vdsm-hook-nestedvt


Test results:
RHVH upgrade was successful, but the installed RPM cannot be found in the new layer rhvh-4.4.5.4-0.20210330.0+1

Before upgrade:
~~~~~~
# imgbase w
You are on rhvh-4.4.4.2-0.20210307.0+1

# imgbase layout
rhvh-4.4.4.2-0.20210307.0
 +- rhvh-4.4.4.2-0.20210307.0+1

# rpm -qa | grep vdsm-hook-nestedvt
vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch

# ll /var/imgbased/persisted-rpms/
total 16
-rw-r--r--. 1 root root 8156 Apr  1 01:44 vdsm-hook-nestedvt-4.40.35.1-1.el8ev.noarch.rpm
-rw-r--r--. 1 root root 8152 Apr  1 01:44 vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch.rpm
~~~~~~


After upgrade:
~~~~~~
# imgbase w
You are on rhvh-4.4.5.4-0.20210330.0+1

# imgbase layout
rhvh-4.4.4.2-0.20210307.0
 +- rhvh-4.4.4.2-0.20210307.0+1
rhvh-4.4.5.4-0.20210330.0
 +- rhvh-4.4.5.4-0.20210330.0+1


[root@dell-per730-35 ~]# rpm -qa | grep vdsm-hook-nestedvt
[root@dell-per730-35 ~]# 
[root@dell-per730-35 ~]# ll /var/imgbased/persisted-rpms/
total 16
-rw-r--r--. 1 root root 8156 Apr  1 01:44 vdsm-hook-nestedvt-4.40.35.1-1.el8ev.noarch.rpm
-rw-r--r--. 1 root root 8152 Apr  1 01:44 vdsm-hook-nestedvt-4.40.39-1.el8ev.noarch.rpm
~~~~~~

So I will move the bug back to “ASSIGNED”.

Comment 18 peyu 2021-04-01 06:47:41 UTC
Created attachment 1768203 [details]
host logs from /var/log

Comment 19 Sandro Bonazzola 2021-04-02 08:28:06 UTC
Re-targeting to 4.4.6, as this is not a blocker for 4.4.5.

Comment 21 Lev Veyde 2021-04-14 17:13:09 UTC
As previously described, the original patch fixed the broken code.

Thus, in order to test the fix, one had to install the version of RHV-H with the new code and then perform any of the following actions using yum/dnf:

- a) install any RPM, and then remove it

- b) install any RPM, and then upgrade it

- c) install any RPM, then upgrade it, then downgrade it


With the RHV-H that includes the *current* fixed version, the /var/imgbased/persisted-rpms/ directory should contain only the currently installed RPMs.

Thus the following states were expected in a broken vs. a fixed environment:

- a) old RPM(s) still being located in the /var/imgbased/persisted-rpms/ vs. no RPM file in the directory

- b) both old and new versions being located in the /var/imgbased/persisted-rpms/ vs. only the latest version being located there

- c) all ever installed versions being located in the /var/imgbased/persisted-rpms/ vs. only the currently installed version being located there
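
The three cases above can be reproduced with a toy in-memory model. This is only a sketch of the behaviour described in this comment, not the dnf plugin itself: the class PersistedRpmsModel and its unpersist_on_remove flag are hypothetical, and an upgrade or downgrade is modelled as installing a different version of an already installed package.

```python
class PersistedRpmsModel:
    """Toy model of the persisted-rpms directory (hypothetical, for illustration)."""

    def __init__(self, unpersist_on_remove=True):
        # False models the broken plugin, True models the fixed one.
        self.unpersist_on_remove = unpersist_on_remove
        self.installed = {}     # package name -> installed version
        self.persisted = set()  # (name, version) pairs kept on disk

    def install(self, name, version):
        """Install, upgrade or downgrade a package."""
        old_version = self.installed.get(name)
        if old_version is not None:
            self._unpersist(name, old_version)
        self.installed[name] = version
        self.persisted.add((name, version))

    def remove(self, name):
        """Remove an installed package."""
        old_version = self.installed.pop(name)
        self._unpersist(name, old_version)

    def _unpersist(self, name, version):
        # The broken plugin skipped this step, leaving old files behind.
        if self.unpersist_on_remove:
            self.persisted.discard((name, version))


if __name__ == "__main__":
    for label, flag in (("broken", False), ("fixed", True)):
        model = PersistedRpmsModel(unpersist_on_remove=flag)
        # Case "c": install, upgrade, then downgrade.
        model.install("vdsm-hook-nestedvt", "4.40.35")
        model.install("vdsm-hook-nestedvt", "4.40.39")
        model.install("vdsm-hook-nestedvt", "4.40.35")
        print(label, sorted(model.persisted))
```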


However, since the fixed code only affected the RPM installation logic on a new, fixed version of RHV-H, it couldn't fix already broken system(s).
Thus a broken system required a manual cleanup; in case "a" this could be done either before or after the upgrade, in order to get the expected configuration.
In the worst cases, i.e. cases "b" and/or "c", it required manual removal of the files before a successful upgrade could be performed.
That caused confusion with QA and got the BZ flagged as FailedQA, as there was an expectation that the new version would also contain self-healing
(or auto-healing) abilities to fix an already broken system.

Upon further investigation, it was decided to add more code to allow this auto-healing during the upgrade process.

The patch that includes these new abilities has been sent, and we now need to verify that we're not just handling the install/remove/upgrade processes
appropriately, but are indeed able to fix a broken RHV-H system as well.
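
As a rough idea of what such auto-healing involves, the sketch below removes persisted RPM files that do not correspond to a currently installed package. It is not the code from the patch: prune_persisted_rpms is a hypothetical helper, and it assumes that the installed packages are given in the name-version-release.arch form printed by "rpm -qa", which matches the persisted file names minus the .rpm suffix.

```python
import os


def prune_persisted_rpms(directory, installed_packages, dry_run=True):
    """Return the persisted RPM files that are not installed, deleting them unless dry_run.

    installed_packages: iterable of 'name-version-release.arch' strings,
    e.g. the lines printed by `rpm -qa` (assumption for this sketch).
    """
    installed = set(installed_packages)
    stale = []
    for entry in sorted(os.listdir(directory)):
        if not entry.endswith(".rpm"):
            continue  # ignore anything that is not an RPM file
        if entry[:-len(".rpm")] in installed:
            continue  # still installed, keep it persisted
        stale.append(entry)
        if not dry_run:
            os.remove(os.path.join(directory, entry))
    return stale
```

Defaulting to dry_run=True is a deliberate choice for the sketch, so the list of files can be reviewed before anything is deleted.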

Comment 22 Lev Veyde 2021-04-14 17:17:32 UTC
Auto-healing capabilities are supposed to be included in imgbase-1.2.19, which will be built once the patch goes through the review process.

Comment 23 peyu 2021-04-15 01:48:12 UTC
(In reply to Lev Veyde from comment #22)
> Auto-healing capabilities are supposed to be included in the imgbase-1.2.19
> which is going to be built once the patch goes through the review process.

Got it, thank you. QE will use imgbase-1.2.19 to verify again.

Comment 26 peyu 2021-04-23 01:49:38 UTC
Pending a new build of imgbase-1.2.19 to verify this bug.

Comment 27 peyu 2021-04-27 06:52:31 UTC
Verified this issue on "redhat-virtualization-host-4.4.6-20210426.0.el8_4".


Test1 Steps:
1. Install redhat-virtualization-host-4.4.5-20210330.0.el8_3 
2. Add host to RHVM
3. Set up local repo and point to "redhat-virtualization-host-4.4.6-20210426.0.el8_4"
4. Install more than one version of rpm packages, such as:
   # yum install vdsm-hook-nestedvt-4.40.35-1.el8ev.noarch.rpm
   # yum install vdsm-hook-nestedvt-4.40.40-1.el8ev.noarch.rpm
5. Upgrade the host via RHVM
6. After upgrade, check the installed rpm
   # rpm -qa | grep vdsm-hook-nestedvt


Test2 Steps:
1. Install redhat-virtualization-host-4.4.4-20210307.0.el8_3
2. Add host to RHVM
3. Set up local repo and point to "redhat-virtualization-host-4.4.6-20210426.0.el8_4"
4. Install more than one version of rpm packages, such as:
   # yum install vdsm-hook-nestedvt-4.40.35-1.el8ev.noarch.rpm
   # yum install vdsm-hook-nestedvt-4.40.40-1.el8ev.noarch.rpm
5. Upgrade the host via RHVM
6. After upgrade, check the installed rpm
   # rpm -qa | grep vdsm-hook-nestedvt


Test results:
RHVH upgrade was successful, and the installed RPM can be found in the new layer rhvh-4.4.6.1-0.20210426.0+1

So moving the bug status to “VERIFIED”.

Comment 39 errata-xmlrpc 2021-06-03 10:24:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Virtualization Host security update [ovirt-4.4.6]), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2239

