Bug 1482551 - repoquery reports "Check uncompressed DB failed" during openshift-ansible upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.7.0
Assignee: Luke Meyer
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-08-17 14:15 UTC by Justin Pierce
Modified: 2017-11-28 22:07 UTC (History)
Fixed In Version: openshift-ansible-3.7.0-0.126.4
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-28 22:07:41 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHSA-2017:3188
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update
Last Updated: 2017-11-29 02:34:54 UTC

Description Justin Pierce 2017-08-17 14:15:28 UTC
Description of problem:
Encountered this error during a node upgrade on a large cluster:

fatal: [starter-us-west-2-node-compute-07d37]: FAILED! => {"changed": false, "failed": true, "msg": {"cmd": "/usr/bin/repoquery --plugins --quiet --pkgnarrow=repos --queryformat=%{version}|%{release}|%{arch}|%{repo}|%{version}-%{release} --config=/tmp/tmphbrdnv atomic-openshift-excluder", "package_found": false, "results": {}, "returncode": 1, "stderr": "rhel-7-server-rpms: Check uncompressed DB failed\n", "stdout": ""}}


Version-Release number of selected component (if applicable):
3.6.173.0.5

How reproducible:
Low

Steps to Reproduce:
1. Run node upgrade on large cluster & hope

Additional info:
After encountering this error, I ran repoquery from the same node and it did not report an error.

Comment 1 Scott Dodson 2017-08-21 12:49:13 UTC
This bug is a general class of problems associated with the need to retry operations related to rpmdb, yum, and repoquery. Luke is proposing a strategy for adding a retry pattern that we could apply as a general solution to this problem.

https://github.com/openshift/openshift-ansible/pull/5125
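For illustration only, the retry idea described in this comment can be sketched in Python as below. This is not the code from the linked PR; the helper name run_with_retries and its parameters (retries, delay, runner) are hypothetical, and it assumes that a non-zero exit code is a good enough signal to retry.

```python
import subprocess
import time


def run_with_retries(cmd, retries=3, delay=1.0, runner=subprocess.run):
    """Run cmd up to `retries` times, stopping at the first zero exit code.

    Hypothetical helper: `runner` is injectable so the retry logic can be
    exercised without calling a real command. Returns the result of the
    last attempt, whether or not it succeeded.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    result = None
    for attempt in range(1, retries + 1):
        result = runner(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            break  # success, no further attempts needed
        if attempt < retries:
            time.sleep(delay)  # brief pause before the next attempt
    return result
```

A caller could wrap the failing command from the description, for example run_with_retries(["/usr/bin/repoquery", "--plugins", "--quiet", "atomic-openshift-excluder"]), and inspect returncode and stderr on the returned object. Since the reporter saw the same repoquery succeed when rerun by hand, a small number of attempts with a short delay seems a reasonable starting point, but the right values are an assumption.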

Comment 2 Luke Meyer 2017-09-08 13:33:59 UTC
That PR is waiting on some package spec issues to be worked out in https://github.com/openshift/openshift-ansible/pull/4264 so that the new action_plugin path can be added (otherwise tasks would be totally broken under the RPM).

Comment 3 Luke Meyer 2017-09-13 19:05:18 UTC
With https://github.com/openshift/openshift-ansible/pull/5125 stalled at the moment, I disentangled the repoquery fixes and made a new PR: https://github.com/openshift/openshift-ansible/pull/5401

Comment 4 Johnny Liu 2017-09-19 07:03:51 UTC
This bug is very hard to reproduce in QE's cluster, so QE can only verify it via code review and by making sure no regression is introduced.

Re-tested with openshift-ansible-3.7.0-0.126.4.git.0.3fc2b9b.el7.noarch. The PR is merged and does not introduce any regression.

However, retries have not been added for repoquery_cmd in playbooks/common/openshift-cluster/upgrades/docker/upgrade_check.yml.

Comment 5 Luke Meyer 2017-09-19 13:12:07 UTC
You are right, I missed that one because I wasn't looking in the playbooks for tasks. Thank you for catching this.

Comment 6 Luke Meyer 2017-09-20 19:13:32 UTC
https://github.com/openshift/openshift-ansible/pull/5401 merged to fix this further.

Comment 7 Luke Meyer 2017-09-20 19:14:26 UTC
My apologies. The follow-on PR was https://github.com/openshift/openshift-ansible/pull/5464

Comment 8 Johnny Liu 2017-09-22 09:40:40 UTC
Verified this bug with openshift-ansible-3.7.0-0.127.0.git.0.b9941e4.el7.noarch: PASS.

The PR is merged, and no regression was found.

Comment 9 Luke Meyer 2017-10-06 17:07:19 UTC
I don't think this change needs to be documented as it really only addresses the issue partially. I'd rather say something about it once the yum retries PR merges.

Comment 12 errata-xmlrpc 2017-11-28 22:07:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

