Bug 1694219 - cluster upgrade was reported as canceled
Summary: cluster upgrade was reported as canceled
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.1.0
Assignee: Abhinav Dahiya
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-03-29 19:49 UTC by Ben Parees
Modified: 2019-06-04 10:46 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:46:40 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0758 None None None 2019-06-04 10:46:47 UTC

Description Ben Parees 2019-03-29 19:49:08 UTC
Description of problem:
Mar 29 16:10:13.719: INFO: cluster upgrade is failing: update was cancelled at 150/315

https://openshift-gce-devel.appspot.com/build/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.0/745

It's not clear what would have canceled the update; I don't think the upgrade test/job does that.

Comment 2 Scott Dodson 2019-04-12 13:04:23 UTC
Better error messages should now be presented to the user; the user will need to investigate the failing operators.
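As a hedged sketch, "investigating the failing operators" usually means inspecting the ClusterOperator resources with standard `oc` invocations. The operator name `authentication` is the one that turns up later in this bug; the log path is an arbitrary choice so the sketch degrades gracefully when no cluster is reachable:

```shell
#!/bin/sh
# Sketch: investigate failing operators after an upgrade stalls.
# All commands are standard oc invocations; output is captured to a log
# file so the script still completes when no cluster is available.
LOG=/tmp/operator-check.log
{
    if command -v oc >/dev/null 2>&1; then
        # Overall Available/Progressing/Degraded state of every operator.
        oc get clusteroperators || true
        # Drill into one degraded operator's conditions for the real error.
        oc describe clusteroperator authentication || true
        # The top-level failure message lives in the ClusterVersion conditions.
        oc get clusterversion version -o yaml || true
    else
        echo "oc not found; run these commands against a live cluster"
    fi
} > "$LOG" 2>&1
cat "$LOG"
```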

Comment 4 Johnny Liu 2019-05-08 11:18:20 UTC
Verified this bug by upgrading from 4.1.0-0.nightly-2019-05-07-233329 to 4.1.0-0.nightly-2019-05-08-012425, and it PASSED.


1. Before triggering the upgrade from 4.1.0-0.nightly-2019-05-07-233329 to 4.1.0-0.nightly-2019-05-08-012425, remove the cluster tag from your hosted zone and delete the *.apps DNS record.
# aws route53 list-tags-for-resource --resource-type hostedzone --resource-id  ZM5GW91LZO60L
{
    "ResourceTagSet": {
        "ResourceType": "hostedzone", 
        "ResourceId": "ZM5GW91LZO60L", 
        "Tags": [
            {
                "Value": "2019-05-08T09:45:02.332342+00:00", 
                "Key": "openshift_creationDate"
            }, 
            {
                "Value": "owned", 
                "Key": "kubernetes.io/cluster/jialiu-upi2-2bt5d"
            }, 
            {
                "Value": "2019-05-10T09:45:02.332342+00:00", 
                "Key": "openshift_expirationDate"
            }
        ]
    }
}
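The tag removal and DNS deletion in step 1 can be sketched with the AWS CLI. The zone ID and tag key come from the list-tags output above; the *.apps record body is hypothetical, since Route 53 requires a DELETE change to match the existing record exactly (look it up with `aws route53 list-resource-record-sets` first):

```shell
#!/bin/sh
# Sketch of step 1, assuming the zone ID and cluster tag key shown in
# the list-tags-for-resource output. AWS calls are guarded so the sketch
# is a no-op where the CLI or credentials are unavailable.
ZONE_ID="ZM5GW91LZO60L"
CLUSTER_TAG_KEY="kubernetes.io/cluster/jialiu-upi2-2bt5d"

# 1a. Remove the cluster ownership tag from the hosted zone.
if command -v aws >/dev/null 2>&1; then
    aws route53 change-tags-for-resource \
        --resource-type hostedzone \
        --resource-id "$ZONE_ID" \
        --remove-tag-keys "$CLUSTER_TAG_KEY" || true
fi

# 1b. Delete the *.apps wildcard record with a DELETE change batch.
# The AliasTarget values below are hypothetical placeholders; a real
# DELETE must reproduce the live record verbatim.
cat > /tmp/delete-apps.json <<'EOF'
{
  "Changes": [
    {
      "Action": "DELETE",
      "ResourceRecordSet": {
        "Name": "*.apps.jialiu-upi2.qe1.devcluster.openshift.com.",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "HYPOTHETICALZONE",
          "DNSName": "hypothetical-elb.us-east-1.elb.amazonaws.com.",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}
EOF

if command -v aws >/dev/null 2>&1; then
    aws route53 change-resource-record-sets \
        --hosted-zone-id "$ZONE_ID" \
        --change-batch file:///tmp/delete-apps.json || true
fi
```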

2. Trigger upgrade.
3. Watch the clusterversion output:
[root@preserve-jialiu-ansible 20190508]# oc adm upgrade --to-image=registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-05-08-012425  --force
Updating to release image registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-05-08-012425

[root@preserve-jialiu-ansible 20190508]# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             True        True          5s      Working towards registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-05-08-012425: downloading update

[root@preserve-jialiu-ansible 20190508]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-05-08-012425   True        True          17s     Working towards 4.1.0-0.nightly-2019-05-08-012425: 1% complete

[root@preserve-jialiu-ansible 20190508]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-05-08-012425   True        True          7m29s   Unable to apply 4.1.0-0.nightly-2019-05-08-012425: an unknown error has occurred

[root@preserve-jialiu-ansible 20190508]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-05-08-012425   True        True          14m     Working towards 4.1.0-0.nightly-2019-05-08-012425: 83% complete, waiting on authentication, openshift-controller-manager

[root@preserve-jialiu-ansible 20190508]# oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version             True        True          19m     Working towards registry.svc.ci.openshift.org/ocp/release:4.1.0-0.nightly-2019-05-08-012425: downloading update

[root@preserve-jialiu-ansible 20190508]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.1.0-0.nightly-2019-05-08-012425   True        True          22m     Working towards 4.1.0-0.nightly-2019-05-08-012425: 83% complete, waiting on authentication

[root@preserve-jialiu-ansible 20190508]# oc describe clusteroperator authentication
<--snip-->
Status:
  Conditions:
    Last Transition Time:  2019-05-08T10:55:32Z
    Message:               Degraded: error checking current version: unable to check route health: failed to GET route: dial tcp: lookup oauth-openshift.apps.jialiu-upi2.qe1.devcluster.openshift.com on 172.30.0.10:53: no such host
    Reason:                DegradedOperatorSyncLoopError
    Status:                True
    Type:                  Degraded
    Last Transition Time:  2019-05-08T10:35:02Z
    Reason:                AsExpected
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2019-05-08T10:00:30Z
    Reason:                AsExpected
    Status:                True
    Type:                  Available
    Last Transition Time:  2019-05-08T09:43:47Z
    Reason:                NoData
    Status:                Unknown
    Type:                  Upgradeable
<--snip-->

Comment 6 errata-xmlrpc 2019-06-04 10:46:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

