Bug 1781109 - [aws] Cluster operator cloud-credential is reporting a failure: 1 of 4 credentials requests are failing to sync
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Credential Operator
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.4.0
Assignee: Joel Diaz
QA Contact: Xiaoli Tian
URL:
Whiteboard:
Depends On:
Blocks: 1776700 1783963
Reported: 2019-12-09 10:52 UTC by Vadim Rutkovsky
Modified: 2020-05-13 21:54 UTC
CC List: 0 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The CredentialsRequests metrics kept reporting errors after the underlying conditions had cleared.
Consequence: Alerts continued to fire after the condition was resolved.
Fix: Metrics publishing now always starts from a count of zero before looking for any items with conditions.
Result: When conditions clear, the metrics reflect the actual state, which clears any alerts.
Clone Of:
Clones: 1783963
Environment:
Last Closed: 2020-05-13 21:54:13 UTC
Target Upstream Version:
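
The fix described in the Doc Text can be illustrated with a minimal sketch. This is not the operator's actual code (the operator is not written in Python); the condition names, the `gauge` dictionary standing in for a labelled metric, and both function names are hypothetical and exist only to show why starting from zero matters. A labelled gauge keeps its last value until something overwrites it, so a publisher that only writes labels it currently sees never resets a cleared condition.

```python
from collections import Counter

# Hypothetical condition names, chosen for illustration only.
CONDITION_TYPES = ("ProvisionFailed", "InsufficientCloudCreds")

# Stand-in for a labelled gauge: the last value set for a label
# persists until something overwrites it.
gauge = {}


def publish_buggy(requests):
    """Writes only the labels that currently have a failing request."""
    counts = Counter(c for r in requests for c in r["conditions"])
    for cond, n in counts.items():
        gauge[cond] = n


def publish_fixed(requests):
    """Starts every known condition at zero, then counts and publishes."""
    counts = {cond: 0 for cond in CONDITION_TYPES}
    for r in requests:
        for cond in r["conditions"]:
            counts[cond] = counts.get(cond, 0) + 1
    for cond, n in counts.items():
        gauge[cond] = n
```

With the first variant, a request that fails once and then recovers leaves its label stuck at the old count, which matches the "continuing alerts" consequence above. The second variant overwrites every known label on each pass, so a cleared condition drops back to zero.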


Links
System | ID | Priority | Status | Summary | Last Updated
Github | openshift cloud-credential-operator pull 146 | None | closed | Bug 1781109: pre-populate conditions with count of zero | 2020-09-02 11:33:54 UTC
Red Hat Product Errata | RHBA-2020:0581 | None | None | None | 2020-05-13 21:54:16 UTC

Comment 2 Oleg Nesterov 2019-12-16 09:53:50 UTC
I tested this issue during an upgrade from 4.4.0-0.nightly-2019-12-14-103510 to 4.4.0-0.nightly-2019-12-14-103510.
So far I have not observed any failures reported by the CCO. We will leave the cluster running for a few days to check whether failures appear over that period.

Comment 4 Oleg Nesterov 2019-12-16 10:24:00 UTC
As far as I can see, the target release for this fix is 4.4. Could you please check it on 4.4 too?

Comment 5 Vadim Rutkovsky 2019-12-16 10:50:24 UTC
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_release/6396/rehearse-6396-pull-ci-openshift-cluster-kube-apiserver-operator-master-e2e-aws-upgrade/2 is 4.4 and so is https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/12666 (this is 4.4 nightly -> 4.4 nightly upgrade).

However, both ran on Dec 13 payloads, so the PR might not have merged by then. Let's give it a few more days to run.

Comment 6 Oleg Nesterov 2019-12-19 08:57:42 UTC
Verified on 4.4.0-0.nightly-2019-12-14-103510. I checked the logs on the CCO pod two days after install and did not observe this issue.

Comment 9 errata-xmlrpc 2020-05-13 21:54:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

