Bug 1689042 - Upgrade from 4.0.0-0.alpha-2019-03-14-164644 to 4.0.0-0.alpha-2019-03-14-014544 fails
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Credential Operator
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.1.0
Assignee: Devan Goodwin
QA Contact: Oleg Nesterov
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-03-15 02:38 UTC by Miciah Dashiel Butler Masters
Modified: 2019-06-04 10:46 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:45:52 UTC
Target Upstream Version:



Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0758 None None None 2019-06-04 10:45:59 UTC

Description Miciah Dashiel Butler Masters 2019-03-15 02:38:25 UTC
Created attachment 1544249
cloud-credential-operator logs

Description of problem:
I installed 4.0.0-0.alpha-2019-03-14-164644 and attempted an "upgrade" to 4.0.0-0.alpha-2019-03-14-014544, and the upgrade failed on openshift-cloud-credential-operator.

Version-Release number of selected component (if applicable):
openshift-install unreleased-master-522-g88623e799b30b728e640f9c9c631d598cffcaa2c
release payload 4.0.0-0.alpha-2019-03-14-164644 (starting version)
release payload 4.0.0-0.alpha-2019-03-14-014544 (target version)

How reproducible:
2/2 times.

Steps to Reproduce:
1. AWS_PROFILE=openshift-dev bin/openshift-install create cluster --dir=./clusters/openshift-dev-mmasters
2. oc get clusterversion/version
3. oc adm upgrade --to-image=registry.svc.ci.openshift.org/openshift/origin-release:4.0.0-0.alpha-2019-03-14-014544
4. oc get clusterversion/version
5. oc -n openshift-cloud-credential-operator logs deploy/cloud-credential-operator | sed -e 's/accessKeyID=\S\+/accessKeyID=[REDACTED]/g'
6. oc get clusteroperator/openshift-cloud-credential-operator -o yaml
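
Step 5 redacts the AWS access key ID from the operator logs with sed before they are attached. As a minimal sketch, the same redaction could be done in Python. The function name `redact_access_keys` and the sample line are hypothetical; only the pattern is taken from the sed expression above.

```python
import re

# Same pattern as the sed expression in Step 5: the literal "accessKeyID="
# followed by one or more non-whitespace characters.
ACCESS_KEY_RE = re.compile(r"accessKeyID=\S+")


def redact_access_keys(text: str) -> str:
    """Replace every accessKeyID value in text with a fixed placeholder."""
    return ACCESS_KEY_RE.sub("accessKeyID=[REDACTED]", text)


if __name__ == "__main__":
    # Hypothetical input line, not taken from the attached logs.
    sample = "level=info accessKeyID=EXAMPLEKEY controller=secretannotator"
    print(redact_access_keys(sample))
```

Text without an accessKeyID field passes through unchanged, so the function can be applied to the whole log.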

Actual results:
After Step 2, I get the following output:

    % oc get clusterversion/version
    NAME      VERSION                           AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.0.0-0.alpha-2019-03-14-164644   True        False         62m     Cluster version is 4.0.0-0.alpha-2019-03-14-164644

After Step 4, I get the following output:

    % oc get clusterversion/version
    NAME      VERSION                           AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.0.0-0.alpha-2019-03-14-014544   True        True          78m     Unable to apply 4.0.0-0.alpha-2019-03-14-014544: the cluster operator openshift-cloud-credential-operator is failing

Output from Step 5 is attached.

After Step 6, I get the following output:

    % oc get clusteroperator/openshift-cloud-credential-operator -o yaml
    apiVersion: config.openshift.io/v1
    kind: ClusterOperator
    metadata:
      creationTimestamp: 2019-03-14T23:58:05Z
      generation: 1
      name: openshift-cloud-credential-operator
      resourceVersion: "76143"
      selfLink: /apis/config.openshift.io/v1/clusteroperators/openshift-cloud-credential-operator
      uid: 02739e44-46b5-11e9-930e-0a4ffbc0776e
    spec: {}
    status:
      conditions:
      - lastTransitionTime: 2019-03-15T01:38:00Z
        message: 4 of 4 credentials requests are failing to sync.
        reason: CredentialsFailing
        status: "True"
        type: Failing
      - lastTransitionTime: 2019-03-15T01:38:00Z
        message: 0 of 4 credentials requests provisioned, 4 reporting errors.
        reason: Reconciling
        status: "True"
        type: Progressing
      - lastTransitionTime: 2019-03-14T23:58:05Z
        status: "True"
        type: Available
      extension: null
      version: ""
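
The Failing condition in the status above matches the "is failing" message that clusterversion reports after Step 4. As a rough sketch, assuming the object has been fetched with `-o json` and parsed into a Python dict, the relevant condition can be pulled out like this. The helper name `failing_reason` is hypothetical, and the sample dict is trimmed from the Step 6 output.

```python
from typing import Optional


def failing_reason(cluster_operator: dict) -> Optional[str]:
    """Return "reason: message" if the Failing condition is "True", else None."""
    conditions = cluster_operator.get("status", {}).get("conditions") or []
    for cond in conditions:
        if cond.get("type") == "Failing" and cond.get("status") == "True":
            return f'{cond.get("reason", "")}: {cond.get("message", "")}'
    return None


# Conditions copied from the Step 6 output, trimmed to the relevant fields.
SAMPLE = {
    "status": {
        "conditions": [
            {"type": "Failing", "status": "True",
             "reason": "CredentialsFailing",
             "message": "4 of 4 credentials requests are failing to sync."},
            {"type": "Progressing", "status": "True",
             "reason": "Reconciling",
             "message": "0 of 4 credentials requests provisioned, 4 reporting errors."},
            {"type": "Available", "status": "True"},
        ]
    }
}

if __name__ == "__main__":
    print(failing_reason(SAMPLE))
```

Note that the condition status is the string "True", not a boolean, which is why the comparison is against a string.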


Expected results:
openshift-cloud-credential-operator should succeed and the upgrade should proceed.

Comment 1 Devan Goodwin 2019-04-16 17:21:31 UTC
Credentials sync failure is a result of:

time="2019-03-15T02:26:22Z" level=error msg="error while validating cloud credentials: failed checking create cloud creds: error querying current username: RequestError: send request failed\ncaused by: Post https://iam.amazonaws.com/: dial tcp: lookup iam.amazonaws.com on 172.30.0.10:53: read udp 10.128.0.68:40839->172.30.0.10:53: i/o timeout" controller=secretannotator
time="2019-03-15T02:27:22Z" level=error msg="error getting user: {\n\n}" actuator=aws cr=openshift-cloud-credential-operator/openshift-image-registry error="RequestError: send request failed\ncaused by: Post https://iam.amazonaws.com/: dial tcp: i/o timeout"
time="2019-03-15T02:27:22Z" level=error msg="error determining whether a credentials update is needed" actuator=aws cr=openshift-cloud-credential-operator/openshift-image-registry error="unable to read info for username {\n\n}: RequestError: send request failed\ncaused by: Post https://iam.amazonaws.com/: dial tcp: i/o timeout"
time="2019-03-15T02:27:22Z" level=error msg="error syncing credentials: <nil>" controller=credreq cr=openshift-cloud-credential-operator/openshift-image-registry secret=openshift-image-registry/installer-cloud-credentials
time="2019-03-15T02:27:22Z" level=error msg="errored with condition: CredentialsProvisionFailure" controller=credreq cr=openshift-cloud-credential-operator/openshift-image-registry secret=openshift-image-registry/installer-cloud-credentials
time="2019-03-15T02:27:22Z" level=debug msg="updating credentials request status" controller=credreq cr=openshift-cloud-credential-operator/openshift-image-registry secret=openshift-image-registry/installer-cloud-credentials
time="2019-03-15T02:27:22Z" level=info msg="status has changed, updating" controller=credreq cr=openshift-cloud-credential-operator/openshift-image-registry secret=openshift-image-registry/installer-cloud-credentials

Given that this bug is a month old and was likely fixed by the network upgrade fixes made around that time, I am setting it to ON_QA; this upgrade problem should have been resolved by now. (It was likely not caused by anything in the cred operator itself.)
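
To make it easier to check logs for the same pattern, here is a minimal sketch that parses key=value log lines like the ones quoted in this comment and flags error entries that mention an i/o timeout (a network symptom rather than a credentials problem). The regular expression, the function names, and the "i/o timeout" heuristic are assumptions for illustration and are not part of the operator.

```python
import re

# key=value pairs where the value is either a double-quoted string
# (backslash escapes allowed) or a run of non-whitespace characters.
FIELD_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_fields(line: str) -> dict:
    """Split one log line into a dict of its key=value fields."""
    fields = {}
    for key, value in FIELD_RE.findall(line):
        if value.startswith('"'):
            value = value[1:-1]  # drop the surrounding quotes
        fields[key] = value
    return fields


def is_network_timeout(fields: dict) -> bool:
    """True for error-level entries whose msg or error mentions an i/o timeout."""
    if fields.get("level") != "error":
        return False
    text = fields.get("msg", "") + " " + fields.get("error", "")
    return "i/o timeout" in text


# One of the log lines quoted above, kept verbatim in raw strings.
SAMPLE_LINE = (
    r'time="2019-03-15T02:27:22Z" level=error msg="error getting user: {\n\n}" '
    r'actuator=aws cr=openshift-cloud-credential-operator/openshift-image-registry '
    r'error="RequestError: send request failed\ncaused by: '
    r'Post https://iam.amazonaws.com/: dial tcp: i/o timeout"'
)

if __name__ == "__main__":
    print(is_network_timeout(parse_fields(SAMPLE_LINE)))
```

The quoted-value branch of the pattern is tried first, so spaces and colons inside msg and error do not split a field.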

Comment 2 Oleg Nesterov 2019-04-17 08:51:52 UTC
I successfully upgraded from 4.0.0-0.nightly-2019-04-10-182914 to 4.0.0-0.nightly-2019-04-10-141956.

Here is the output of step 6 after the upgrade:

[cloud-user@preserve-olnester-workstation openshift]$ oc get clusteroperator/cloud-credential -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: 2019-04-15T10:19:02Z
  generation: 1
  name: cloud-credential
  resourceVersion: "1290970"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/cloud-credential
  uid: e3f6b5f3-5f67-11e9-8f5d-028e0071033a
spec: {}
status:
  conditions:
  - lastTransitionTime: 2019-04-15T10:19:04Z
    message: No credentials requests reporting errors.
    reason: NoCredentialsFailing
    status: "False"
    type: Failing
  - lastTransitionTime: 2019-04-17T08:13:57Z
    message: 4 of 4 credentials requests provisioned and reconciled.
    reason: ReconcilingComplete
    status: "False"
    type: Progressing
  - lastTransitionTime: 2019-04-15T10:19:02Z
    status: "True"
    type: Available
  extension: null
  versions:
  - name: operator
    version: 4.0.0-0.nightly-2019-04-10-141956

Comment 4 errata-xmlrpc 2019-06-04 10:45:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
