Bug 1742753 - Upgrade from 4.1 to 4.2 fails due to cloud-credential-operator
Summary: Upgrade from 4.1 to 4.2 fails due to cloud-credential-operator
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Credential Operator
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.2.0
Assignee: Joel Diaz
QA Contact: Oleg Nesterov
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-08-16 20:13 UTC by Clayton Coleman
Modified: 2019-10-16 06:36 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-16 06:36:19 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift cloud-credential-operator pull 103 0 None closed Bug 1742753: clear other conditions when ignoring a cred request 2020-08-25 14:10:10 UTC
Red Hat Product Errata RHBA-2019:2922 0 None None None 2019-10-16 06:36:30 UTC

Description Clayton Coleman 2019-08-16 20:13:23 UTC
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.1-to-4.2/275

https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.1-to-4.2/275/artifacts/e2e-aws-upgrade/e2e.log

A 4.1 to 4.2 upgrade failed in CI due to cloud-credential-operator being wedged.  The upgrade job eventually timed out.

Aug 16 04:45:12.752: INFO: cluster upgrade is Failing: Cluster operator cloud-credential is still updating
Aug 16 04:45:22.752: INFO: cluster upgrade is Progressing: Unable to apply 4.2.0-0.ci-2019-08-16-001726: the cluster operator cloud-credential has not yet successfully rolled out

Urgent because upgrades must always work.

Comment 1 Joel Diaz 2019-08-19 19:47:36 UTC
It appears that some CredentialsRequest objects are set to Ignored==True (because they are for a different cloud/platform), but still carry a ProvisionFailure==True condition from an earlier reconcile. That stale condition is causing cloud-credential-operator to report as degraded.

PR https://github.com/openshift/cloud-credential-operator/pull/103 addresses this case by clearing other conditions when deciding to ignore a CredentialsRequest. Will need to backport this to 4.2 once merged.
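
A minimal Go sketch of the idea behind the fix, not the code from the PR. The Condition type, the markIgnored and hasFailure helpers, and the rule that any non-Ignored condition set to true counts as a failure are all assumptions made for illustration; the operator's real types and degraded logic may differ.

```go
package main

import "fmt"

// ConditionType and Condition are simplified, hypothetical stand-ins for
// the conditions stored on a CredentialsRequest.
type ConditionType string

const (
	Ignored          ConditionType = "Ignored"
	ProvisionFailure ConditionType = "ProvisionFailure"
)

type Condition struct {
	Type   ConditionType
	Status bool
}

// markIgnored sets Ignored to true and clears every other condition, so a
// stale ProvisionFailure cannot outlive the decision to ignore the request.
func markIgnored(conds []Condition) []Condition {
	out := make([]Condition, 0, len(conds)+1)
	found := false
	for _, c := range conds {
		if c.Type == Ignored {
			c.Status = true
			found = true
		} else {
			c.Status = false
		}
		out = append(out, c)
	}
	if !found {
		out = append(out, Condition{Type: Ignored, Status: true})
	}
	return out
}

// hasFailure is an assumed, simplified degraded check: any condition other
// than Ignored that is still true counts as a failure.
func hasFailure(conds []Condition) bool {
	for _, c := range conds {
		if c.Type != Ignored && c.Status {
			return true
		}
	}
	return false
}

func main() {
	stale := []Condition{{Type: ProvisionFailure, Status: true}}
	fmt.Println(hasFailure(stale))              // true: stale failure is still set
	fmt.Println(hasFailure(markIgnored(stale))) // false: cleared when ignoring
}
```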

Comment 3 Oleg Nesterov 2019-08-20 09:17:42 UTC
Verified. The cloud-credential cluster operator upgraded successfully.

[onest@localhost bugzilla]$ oc get clusteroperator
NAME                                 VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                       4.1.11    True        False         False      7m57s
cloud-credential                     4.1.11    True        False         False      20m
cluster-autoscaler                   4.1.11    True        False         False      20m
console                              4.1.11    True        False         False      11m
dns                                  4.1.11    True        False         False      19m
image-registry                       4.1.11    True        False         False      14m
ingress                              4.1.11    True        False         False      14m
kube-apiserver                       4.1.11    True        False         False      17m
kube-controller-manager              4.1.11    True        False         False      19m
kube-scheduler                       4.1.11    True        False         False      17m
machine-api                          4.1.11    True        False         False      20m
machine-config                       4.1.11    True        False         False      19m
marketplace                          4.1.11    True        False         False      14m
monitoring                           4.1.11    True        False         False      13m
network                              4.1.11    True        False         False      19m
node-tuning                          4.1.11    True        False         False      16m
openshift-apiserver                  4.1.11    True        False         False      16m
openshift-controller-manager         4.1.11    True        False         False      19m
openshift-samples                    4.1.11    True        False         False      14m
operator-lifecycle-manager           4.1.11    True        False         False      19m
operator-lifecycle-manager-catalog   4.1.11    True        False         False      19m
service-ca                           4.1.11    True        False         False      20m
service-catalog-apiserver            4.1.11    True        False         False      16m
service-catalog-controller-manager   4.1.11    True        False         False      16m
storage                              4.1.11    True        False         False      15m

[onest@localhost bugzilla]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.11    True        False         7m49s   Cluster version is 4.1.11
[onest@localhost bugzilla]$ oc adm upgrade --to-image registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-08-14-005846
Updating to release image registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-08-14-005846

[onest@localhost bugzilla]$ oc get clusteroperator
NAME                                 VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                       4.1.0-0.nightly-2019-08-14-005846   True        False         False      48m
cloud-credential                     4.1.0-0.nightly-2019-08-14-005846   True        False         False      60m
cluster-autoscaler                   4.1.0-0.nightly-2019-08-14-005846   True        False         False      61m
console                              4.1.0-0.nightly-2019-08-14-005846   True        False         False      51m
dns                                  4.1.0-0.nightly-2019-08-14-005846   True        False         False      60m
image-registry                       4.1.0-0.nightly-2019-08-14-005846   True        False         False      12m
ingress                              4.1.0-0.nightly-2019-08-14-005846   True        False         False      54m
kube-apiserver                       4.1.0-0.nightly-2019-08-14-005846   True        False         False      58m
kube-controller-manager              4.1.0-0.nightly-2019-08-14-005846   True        False         False      59m
kube-scheduler                       4.1.0-0.nightly-2019-08-14-005846   True        False         False      57m
machine-api                          4.1.0-0.nightly-2019-08-14-005846   True        False         False      60m
machine-config                       4.1.0-0.nightly-2019-08-14-005846   True        False         False      60m
marketplace                          4.1.0-0.nightly-2019-08-14-005846   True        False         False      9m46s
monitoring                           4.1.0-0.nightly-2019-08-14-005846   True        False         False      12m
network                              4.1.0-0.nightly-2019-08-14-005846   True        False         False      60m
node-tuning                          4.1.0-0.nightly-2019-08-14-005846   True        False         False      30m
openshift-apiserver                  4.1.0-0.nightly-2019-08-14-005846   True        False         False      12m
openshift-controller-manager         4.1.0-0.nightly-2019-08-14-005846   True        False         False      60m
openshift-samples                    4.1.0-0.nightly-2019-08-14-005846   True        False         False      31m
operator-lifecycle-manager           4.1.0-0.nightly-2019-08-14-005846   True        False         False      59m
operator-lifecycle-manager-catalog   4.1.0-0.nightly-2019-08-14-005846   True        False         False      59m
service-ca                           4.1.0-0.nightly-2019-08-14-005846   True        False         False      60m
service-catalog-apiserver            4.1.0-0.nightly-2019-08-14-005846   True        False         False      57m
service-catalog-controller-manager   4.1.0-0.nightly-2019-08-14-005846   True        False         False      57m
storage                              4.1.0-0.nightly-2019-08-14-005846   True        False         False      31m
[onest@localhost bugzilla]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-08-14-005846   True        False         7m46s   Cluster version is 4.1.0-0.nightly-2019-08-14-005846

Comment 4 Oleg Nesterov 2019-08-20 09:19:27 UTC
Sorry, those are the wrong results. I will provide the results for the upgrade to 4.2 a bit later.

Comment 5 Oleg Nesterov 2019-08-20 15:19:24 UTC
The issue with the cloud-credential operator upgrade is fixed.
The upgrade is still in progress because the dns, machine-config, marketplace, and network operators have not been upgraded yet.

[onest@localhost bugzilla]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.11    True        False         42m     Cluster version is 4.1.11
[onest@localhost bugzilla]$ oc get clusteroperator
NAME                                 VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                       4.1.11    True        False         False      42m
cloud-credential                     4.1.11    True        False         False      53m
cluster-autoscaler                   4.1.11    True        False         False      53m
console                              4.1.11    True        False         False      45m
dns                                  4.1.11    True        False         False      53m
image-registry                       4.1.11    True        False         False      47m
ingress                              4.1.11    True        False         False      47m
kube-apiserver                       4.1.11    True        False         False      52m
kube-controller-manager              4.1.11    True        False         False      51m
kube-scheduler                       4.1.11    True        False         False      51m
machine-api                          4.1.11    True        False         False      53m
machine-config                       4.1.11    True        False         False      52m
marketplace                          4.1.11    True        False         False      47m
monitoring                           4.1.11    True        False         False      46m
network                              4.1.11    True        False         False      53m
node-tuning                          4.1.11    True        False         False      42m
openshift-apiserver                  4.1.11    True        False         False      50m
openshift-controller-manager         4.1.11    True        False         False      52m
openshift-samples                    4.1.11    True        False         False      48m
operator-lifecycle-manager           4.1.11    True        False         False      52m
operator-lifecycle-manager-catalog   4.1.11    True        False         False      52m
service-ca                           4.1.11    True        False         False      53m
service-catalog-apiserver            4.1.11    True        False         False      50m
service-catalog-controller-manager   4.1.11    True        False         False      50m
storage                              4.1.11    True        False         False      48m

[onest@localhost bugzilla]$ oc get clusteroperator
NAME                                       VERSION                        AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.2.0-0.ci-2019-08-20-015949   True        False         False      5h20m
cloud-credential                           4.2.0-0.ci-2019-08-20-015949   True        False         False      5h31m
cluster-autoscaler                         4.2.0-0.ci-2019-08-20-015949   True        False         False      5h31m
console                                    4.2.0-0.ci-2019-08-20-015949   True        False         False      4h25m
dns                                        4.1.11                         True        False         False      5h31m
image-registry                             4.2.0-0.ci-2019-08-20-015949   True        False         False      5h25m
ingress                                    4.2.0-0.ci-2019-08-20-015949   True        False         False      5h25m
insights                                   4.2.0-0.ci-2019-08-20-015949   True        False         False      4h30m
kube-apiserver                             4.2.0-0.ci-2019-08-20-015949   True        False         False      5h30m
kube-controller-manager                    4.2.0-0.ci-2019-08-20-015949   True        False         False      5h29m
kube-scheduler                             4.2.0-0.ci-2019-08-20-015949   True        False         False      5h29m
machine-api                                4.2.0-0.ci-2019-08-20-015949   True        False         False      5h31m
machine-config                             4.1.11                         True        False         False      5h31m
marketplace                                4.1.11                         True        False         False      5h26m
monitoring                                 4.2.0-0.ci-2019-08-20-015949   True        False         False      5h24m
network                                    4.1.11                         True        False         False      5h31m
node-tuning                                4.2.0-0.ci-2019-08-20-015949   True        False         False      4h29m
openshift-apiserver                        4.2.0-0.ci-2019-08-20-015949   True        False         False      5h28m
openshift-controller-manager               4.2.0-0.ci-2019-08-20-015949   True        False         False      5h31m
openshift-samples                          4.2.0-0.ci-2019-08-20-015949   True        False         False      4h20m
operator-lifecycle-manager                 4.2.0-0.ci-2019-08-20-015949   True        False         False      5h30m
operator-lifecycle-manager-catalog         4.2.0-0.ci-2019-08-20-015949   True        False         False      5h30m
operator-lifecycle-manager-packageserver   4.2.0-0.ci-2019-08-20-015949   True        False         False      4h29m
service-ca                                 4.2.0-0.ci-2019-08-20-015949   True        False         False      5h31m
service-catalog-apiserver                  4.2.0-0.ci-2019-08-20-015949   True        False         False      5h28m
service-catalog-controller-manager         4.2.0-0.ci-2019-08-20-015949   True        False         False      5h28m
storage                                    4.2.0-0.ci-2019-08-20-015949   True        False         False      4h30m
[onest@localhost bugzilla]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.11    True        True          4h38m   Working towards registry.svc.ci.openshift.org/ocp/release:4.2.0-0.ci-2019-08-20-015949: downloading update
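
The check done by eye on the table above (which operators are still on the old version) could be scripted roughly as follows. This is a sketch that assumes the plain-text column layout shown in this comment, with NAME first and VERSION second; laggingOperators is a hypothetical helper, not part of oc, and the sample input is abbreviated from the output above.

```go
package main

import (
	"fmt"
	"strings"
)

// laggingOperators returns the names of operators whose VERSION column
// differs from target, given the text output of `oc get clusteroperator`.
func laggingOperators(output, target string) []string {
	var lagging []string
	lines := strings.Split(strings.TrimSpace(output), "\n")
	for i, line := range lines {
		fields := strings.Fields(line)
		// Skip the header row and any row without a version column.
		if i == 0 || len(fields) < 2 {
			continue
		}
		if fields[1] != target {
			lagging = append(lagging, fields[0])
		}
	}
	return lagging
}

func main() {
	// Abbreviated rows taken from the table in this comment.
	out := `NAME               VERSION                        AVAILABLE   PROGRESSING   DEGRADED   SINCE
cloud-credential   4.2.0-0.ci-2019-08-20-015949   True        False         False      5h31m
dns                4.1.11                         True        False         False      5h31m
machine-config     4.1.11                         True        False         False      5h31m
marketplace        4.1.11                         True        False         False      5h26m
network            4.1.11                         True        False         False      5h31m`

	fmt.Println(laggingOperators(out, "4.2.0-0.ci-2019-08-20-015949"))
	// prints: [dns machine-config marketplace network]
}
```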

Comment 6 errata-xmlrpc 2019-10-16 06:36:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

