Bug 1277269 - Installing large number of errata updates causes rpmdb failures
Status: CLOSED ERRATA
Product: Red Hat Satellite 6
Classification: Red Hat
Component: Errata Management
Version: 6.1.4
Hardware: Unspecified  OS: Unspecified
Priority: high  Severity: high
Target Milestone: 6.1.5
Target Release: --
Assigned To: Chris Duryee
QA Contact: sthirugn@redhat.com
Keywords: Triaged
Duplicates: 1289229
Depends On: 1289229
Blocks:
Reported: 2015-11-02 16:06 EST by Mike McCune
Modified: 2017-07-26 15:40 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-15 04:20:23 EST
Type: Bug


Attachments
gofer 2.6.7 tarball (302.47 KB, application/x-gzip), 2015-11-20 19:59 EST, Jeff Ortel
gofer 2.6.7 srpm (347.85 KB, application/x-rpm), 2015-11-20 20:02 EST, Jeff Ortel
client /var/log/messages (10.61 KB, text/plain), 2015-12-10 17:08 EST, sthirugn@redhat.com
gofer-2.6.8.tar.gz (302.53 KB, application/x-gzip), 2015-12-10 18:44 EST, Jeff Ortel


External Trackers
Tracker: Pulp Redmine
ID: 1364
Priority: Urgent
Status: CLOSED - CURRENTRELEASE
Summary: gofer sets __debug__ flag in python interpreter, resulting in yum exception
Last Updated: 2016-03-23 14:30 EDT

Description Mike McCune 2015-11-02 16:06:16 EST
Installing a large number of errata updates on a host causes rpmdb errors of undetermined origin:

"Rpmdb checksum is invalid: dCDPT(pkg checksums): tzdata.noarch 0:2015a-1.el6 - u"

This can happen regardless of which errata are selected during the install; we have seen customers experience it on different packages.

It only seems to occur when more than 100 errata are updated at a time; selecting individual errata to update does not trigger the error.
Comment 2 Chris Duryee 2015-11-02 19:50:05 EST
yum contains code that performs rpmdb cache cleanup if an error condition occurs:

        for txmbr in precache:
            (n, a, e, v, r) = txmbr.pkgtup
            pkg = self.searchNevra(n, e, v, r, a)
            if not pkg:
                # Wibble?
                self._deal_with_bad_rpmdbcache("dCDPT(pkg checksums)")
                continue


The _deal_with_bad_rpmdbcache() method performs some cleanup, and then does:

  if __debug__:
    raise Errors.PackageSackError, 'Rpmdb checksum is invalid: %s' % caller

However, Python sets __debug__ to True by default unless -O is given[1].

I was able to avoid raising the exception by changing the shebang in /usr/bin/goferd to:

#!/usr/bin/python -O

This sets __debug__ to False and lets the yum transaction complete. I believe this is safe because the error was happening in dropCachedDataPostTransaction.

[1] https://docs.python.org/2/library/constants.html
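
The effect of -O on a guarded raise like the one above can be reproduced without yum. The following is only a sketch: deal_with_bad_cache and the RuntimeError are illustrative stand-ins (not yum's actual method or its Errors.PackageSackError), and it runs the snippet in child interpreters rather than in goferd.

```python
import os
import subprocess
import sys

# Stand-in for yum's guarded raise. The function name and exception type
# are hypothetical; only the "if __debug__: raise" shape mirrors the excerpt.
GUARDED_RAISE = """
def deal_with_bad_cache(caller):
    # (cache cleanup would happen here)
    if __debug__:
        raise RuntimeError('Rpmdb checksum is invalid: %s' % caller)

deal_with_bad_cache('dCDPT(pkg checksums)')
print('transaction completed')
"""


def run_child(*interpreter_flags):
    """Run GUARDED_RAISE in a fresh interpreter; return (exit_code, stdout)."""
    env = dict(os.environ)
    env.pop("PYTHONOPTIMIZE", None)  # keep the parent's setting from leaking in
    proc = subprocess.Popen(
        [sys.executable] + list(interpreter_flags) + ["-c", GUARDED_RAISE],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        env=env,
    )
    out, _ = proc.communicate()
    return proc.returncode, out.decode().strip()


default_result = run_child()        # __debug__ is True: the raise fires
optimized_result = run_child("-O")  # __debug__ is False: the raise is skipped
```

Without -O the child exits with a non-zero status before printing anything; with -O it prints "transaction completed" and exits cleanly, which matches the behaviour described for the modified goferd shebang.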
Comment 3 pulp-infra@redhat.com 2015-11-10 08:42:29 EST
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.
Comment 4 pulp-infra@redhat.com 2015-11-10 08:42:31 EST
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.
Comment 5 pulp-infra@redhat.com 2015-11-17 14:00:15 EST
The Pulp upstream bug priority is at High. Updating the external tracker on this bug.
Comment 8 pulp-infra@redhat.com 2015-11-18 12:00:20 EST
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.
Comment 11 pulp-infra@redhat.com 2015-11-19 14:00:19 EST
The Pulp upstream bug priority is at Urgent. Updating the external tracker on this bug.
Comment 12 pulp-infra@redhat.com 2015-11-20 16:30:17 EST
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.
Comment 13 Jeff Ortel 2015-11-20 19:59 EST
Created attachment 1097377
gofer 2.6.7 tarball
Comment 14 Jeff Ortel 2015-11-20 20:02 EST
Created attachment 1097379
gofer 2.6.7 srpm
Comment 16 Mike McCune 2015-11-23 14:01:47 EST
See the following comment for HOTFIX instructions on applying a large number of errata with Satellite 6:

https://bugzilla.redhat.com/show_bug.cgi?id=1269509#c18
Comment 17 pulp-infra@redhat.com 2015-11-30 12:31:48 EST
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
Comment 19 sthirugn@redhat.com 2015-12-07 13:41:11 EST
Failed - the failure reason is shown in https://bugzilla.redhat.com/show_bug.cgi?id=1289229
Comment 20 sthirugn@redhat.com 2015-12-07 13:49:30 EST
*** Bug 1289229 has been marked as a duplicate of this bug. ***
Comment 22 sthirugn@redhat.com 2015-12-10 17:07:43 EST
Failed on a RHEL 6 content host:
I attempted to install 329 errata on a content host; the task ran for a while and then failed. The error is the same as in the original description of this bug. The relevant /var/log/messages output is attached.

The result looks odd, though, because yum history shows:
# yum history
Loaded plugins: package_upload, product-id, security, subscription-manager
ID     | Login user               | Date and time    | Action(s)      | Altered
-------------------------------------------------------------------------------
     8 | root <root>              | 2015-12-10 16:37 | I, U           |  554 **
     7 | root <root>              | 2015-12-10 16:00 | Update         |    2   
     6 | root <root>              | 2015-12-10 11:30 | Install        |   11 E<
     5 | System <unset>           | 2015-12-09 17:08 | Install        |    1 > 
     4 | System <unset>           | 2015-12-09 17:07 | Install        |    1   
     3 | System <unset>           | 2015-12-09 17:03 | Install        |    1   
     2 | System <unset>           | 2015-12-09 17:01 | Install        |   12   
     1 | System <unset>           | 2015-12-09 16:51 | Install        |  641   

Note:
1. Looking at id=8, 554 packages were altered, and they are in fact installed on the content host.
2. Satellite also shows success for this task and no longer lists these errata as applicable to the content host.

So it looks like the package installation completed on the content host and Satellite was informed of the completion, but the rpmdb checksum error happened just after that.
Comment 23 sthirugn@redhat.com 2015-12-10 17:08 EST
Created attachment 1104508
client /var/log/messages
Comment 25 Jeff Ortel 2015-12-10 18:44 EST
Created attachment 1104515
gofer-2.6.8.tar.gz
Comment 26 Jeff Ortel 2015-12-10 18:45:13 EST
Comment on attachment 1104515
gofer-2.6.8.tar.gz

Export PYTHONOPTIMIZE in the init script.
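
Exporting PYTHONOPTIMIZE has the same effect on __debug__ as passing -O on the command line. A small sketch to confirm this in a child interpreter follows; the value "1" is an assumption, since the comment does not say what value the init script exports, and child_debug_flag is a hypothetical helper name.

```python
import os
import subprocess
import sys


def child_debug_flag(optimize_value=None):
    """Return the value of __debug__ seen by a fresh child interpreter.

    optimize_value, if given, is exported to the child as PYTHONOPTIMIZE.
    """
    env = dict(os.environ)
    env.pop("PYTHONOPTIMIZE", None)  # start from a clean state
    if optimize_value is not None:
        env["PYTHONOPTIMIZE"] = optimize_value
    out = subprocess.check_output(
        [sys.executable, "-c", "print(__debug__)"], env=env
    )
    return out.decode().strip() == "True"


flag_without_env = child_debug_flag()     # variable unset
flag_with_env = child_debug_flag("1")     # assumed value; any non-empty string enables -O
```

Setting the variable in the environment of the daemon avoids editing the interpreter line of the script itself, and the setting is inherited by any Python process started from that environment.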
Comment 30 sthirugn@redhat.com 2015-12-14 15:52:07 EST
Verified in Sat 6.1.5 - all passed.

rhel7 - PASS - I performed this test twice and noticed a performance issue on one Satellite (see https://bugzilla.redhat.com/show_bug.cgi?id=1290867) but not on the other, which had slightly better CPU specifications.

rhel6 - PASS
Tested RHEL 6.5 with 329 errata - PASS
Test completed in 25 minutes

rhel5 - PASS
RHEL 5.11 - PASS
70 errata - 11:51 AM - took 20 min
RHEL 5.8 - FAIL (see https://bugzilla.redhat.com/show_bug.cgi?id=1291424) - errata applicability is not shown in Satellite

Version tested:
Content host:
# rpm -qa | grep katello-agent
katello-agent-2.2.6-1.el5

# rpm -qa | grep gofer
python-gofer-2.6.8-1.el5
python-gofer-proton-2.6.8-1.el5
gofer-2.6.8-1.el5

Satellite:
* candlepin-0.9.49.9-1.el7.noarch
* candlepin-common-1.0.22-1.el7.noarch
* candlepin-guice-3.0-2_redhat_1.el7.noarch
* candlepin-scl-1-5.el7.noarch
* candlepin-scl-quartz-2.1.5-6.el7.noarch
* candlepin-scl-rhino-1.7R3-3.el7.noarch
* candlepin-scl-runtime-1-5.el7.noarch
* candlepin-selinux-0.9.49.9-1.el7.noarch
* candlepin-tomcat-0.9.49.9-1.el7.noarch
* docker-1.8.2-8.el7.x86_64
* docker-selinux-1.8.2-8.el7.x86_64
* elasticsearch-0.90.10-7.el7.noarch
* foreman-1.7.2.48-1.el7sat.noarch
* foreman-compute-1.7.2.48-1.el7sat.noarch
* foreman-debug-1.7.2.48-1.el7sat.noarch
* foreman-discovery-image-3.0.5-3.el7sat.noarch
* foreman-gce-1.7.2.48-1.el7sat.noarch
* foreman-libvirt-1.7.2.48-1.el7sat.noarch
* foreman-ovirt-1.7.2.48-1.el7sat.noarch
* foreman-postgresql-1.7.2.48-1.el7sat.noarch
* foreman-proxy-1.7.2.7-1.el7sat.noarch
* foreman-selinux-1.7.2.17-1.el7sat.noarch
* foreman-vmware-1.7.2.48-1.el7sat.noarch
* katello-2.2.0.16-1.el7sat.noarch
* katello-certs-tools-2.2.1-1.el7sat.noarch
* katello-common-2.2.0.16-1.el7sat.noarch
* katello-debug-2.2.0.16-1.el7sat.noarch
* katello-default-ca-1.0-1.noarch
* katello-installer-2.3.22-1.el7sat.noarch
* katello-installer-base-2.3.22-1.el7sat.noarch
* katello-server-ca-1.0-1.noarch
* katello-service-2.2.0.16-1.el7sat.noarch
* openldap-2.4.40-8.el7.x86_64
* pulp-docker-plugins-0.2.5-1.el7sat.noarch
* pulp-katello-0.5-1.el7sat.noarch
* pulp-nodes-common-2.6.0.17-1.el7sat.noarch
* pulp-nodes-parent-2.6.0.17-1.el7sat.noarch
* pulp-puppet-plugins-2.6.0.17-1.el7sat.noarch
* pulp-puppet-tools-2.6.0.17-1.el7sat.noarch
* pulp-rpm-plugins-2.6.0.17-1.el7sat.noarch
* pulp-selinux-2.6.0.17-1.el7sat.noarch
* pulp-server-2.6.0.17-1.el7sat.noarch
* python-ldap-2.4.15-2.el7.x86_64
* python-pulp-docker-common-0.2.5-1.el7sat.noarch
* ruby193-rubygem-docker-api-1.17.0-1.1.el7sat.noarch
* ruby193-rubygem-foreman_docker-1.2.0.24-1.el7sat.noarch
* ruby193-rubygem-ldap_fluff-0.3.2-1.el7.noarch
* ruby193-rubygem-net-ldap-0.3.1-3.el7sat.noarch
* ruby193-rubygem-runcible-1.3.5-1.el7sat.noarch
* rubygem-hammer_cli_foreman_docker-0.0.3.10-1.el7sat.noarch
Comment 32 errata-xmlrpc 2015-12-15 04:20:23 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:2622
Comment 33 pulp-infra@redhat.com 2016-02-11 16:00:32 EST
The Pulp upstream bug status is at ON_QA. Updating the external tracker on this bug.
Comment 34 pulp-infra@redhat.com 2016-03-23 14:30:37 EDT
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.
