Bug 1886131
| Summary: | [upgrade] 4.4 -> 4.5 GCP upgrades failing | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | slowrie |
| Component: | Test Infrastructure | Assignee: | W. Trevor King <wking> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | |
| Severity: | low | Docs Contact: | |
| Priority: | low | ||
| Version: | 4.5 | CC: | skuznets, wking |
| Target Milestone: | --- | Keywords: | Upgrades |
| Target Release: | 4.7.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | operator install service-catalog-apiserver |
| Last Closed: | 2020-11-02 16:31:04 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
slowrie
2020-10-07 17:34:37 UTC
From Suresh:

> 2020-10-02 01:50:24.807234 C | etcdserver/membership: cluster cannot be downgraded (current version: 3.3.22 is lower than determined cluster version: 3.4).
> ...
> Did someone try to roll back on master-1? Otherwise this doesn't make much sense.

And indeed, this was a 4.4 -> 4.5 -> 4.4 round-trip attempt:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade-4.4-stable-to-4.5-ci/1311793781384679424/artifacts/e2e-gcp-upgrade/gather-extra/clusterversion.json | jq -r '.items[].status.history[] | .startedTime + " " + .completionTime + " " + .version + " " + .state + " " + (.verified | tostring)'
  2020-10-01T23:49:45Z 4.4.27 Partial true
  2020-10-01T23:08:11Z 2020-10-01T23:49:35Z 4.5.0-0.ci-2020-10-01-174117 Completed false
  2020-10-01T22:31:41Z 2020-10-01T23:05:26Z 4.4.27 Completed false

Assigning to test-infra, because we'll need an openshift/release PR to make these release informers a one-way 4.4->4.5 test.

Took a stab at a PR^, but here are some notes:

Some informer jobs are setting abort-at, e.g. release-openshift-origin-installer-e2e-aws-upgrade-rollback-4.4-to-4.5 [1]. But that's explicitly a rollback test, so that's ok. release-openshift-origin-installer-e2e-gcp-upgrade-4.4-stable-to-4.5-ci is not setting abort-at [2], so I'm not sure why it's rolling back, but:

  $ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade-4.4-stable-to-4.5-ci/1311793781384679424/build-log.txt | grep aborted
  Oct 1 23:08:00.578: INFO: Upgrade will be aborted and the cluster will roll back to the current version after 100% of operators have upgraded

Code [3,4] certainly reads to me like the default is zero, i.e. "no abort", but I don't see anything in [5,6] about abort-at. [7] looks like multi-step, though. [8] seems to be selecting a multi-step target. Only one update workflow for the release repo:

  $ git --no-pager grep 'workflow.*upgrade' ci-operator/config/openshift/release
  ci-operator/config/openshift/release/openshift-release-master__ocp-4.5-ci.yaml: workflow: openshift-upgrade-aws

Which is the e2e-44-stable-to-45-ci entry [9] only used for the periodic-ci-openshift-release-master-ocp-4.5-ci-e2e-44-stable-to-45-ci job [10]. So I'm still a bit fuzzy on how abort-at is being set for release-openshift-origin-installer-e2e-gcp-upgrade-4.4-stable-to-4.5-ci.

[1]: https://github.com/openshift/release/blob/f180f1a8fb7d7fc4d1a64f29e97c8c4f4f37f5b2/ci-operator/jobs/openshift/release/openshift-release-release-4.5-periodics.yaml#L5775-L5841
[2]: https://github.com/openshift/release/blob/f180f1a8fb7d7fc4d1a64f29e97c8c4f4f37f5b2/ci-operator/jobs/openshift/release/openshift-release-release-4.5-periodics.yaml#L7303-L7369
[3]: https://github.com/openshift/origin/blob/3b3a17d295c52bb6712eaeea2682e682524b83cc/test/e2e/upgrade/upgrade.go#L88-L113
[4]: https://github.com/openshift/origin/blob/3b3a17d295c52bb6712eaeea2682e682524b83cc/test/e2e/upgrade/upgrade.go#L261-L271
[5]: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade-4.4-stable-to-4.5-ci/1311793781384679424/podinfo.json
[6]: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade-4.4-stable-to-4.5-ci/1311793781384679424/prowjob.json
[7]: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-upgrade-4.4-stable-to-4.5-ci/1311793781384679424/artifacts/e2e-gcp-upgrade/
[8]: https://github.com/openshift/release/blob/f180f1a8fb7d7fc4d1a64f29e97c8c4f4f37f5b2/ci-operator/jobs/openshift/release/openshift-release-release-4.5-periodics.yaml#L7313
[9]: https://github.com/openshift/release/blob/f180f1a8fb7d7fc4d1a64f29e97c8c4f4f37f5b2/ci-operator/config/openshift/release/openshift-release-master__ocp-4.5-ci.yaml#L17-L21
[10]: https://github.com/openshift/release/blob/f180f1a8fb7d7fc4d1a64f29e97c8c4f4f37f5b2/ci-operator/jobs/openshift/release/openshift-release-master-periodics.yaml#L455-L465

Ah, it had been using workflows, but was recently moved back to templates [1].

[1]: https://github.com/openshift/release/pull/12395
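
For anyone triaging similar runs, here is a rough sketch of the round-trip check that was done by eye on the clusterversion.json history in this report. It is written in Python rather than jq. The names summarize_history, is_round_trip, and SAMPLE are illustrative only and are not part of any OpenShift tooling. SAMPLE mirrors the three history rows quoted in the description, assuming the Partial entry simply has no completionTime, and the field names follow the jq expression used there.

```python
# Illustrative sample: the three history entries quoted in this bug, newest first.
# The missing completionTime on the Partial entry is an assumption.
SAMPLE = {
    "items": [{
        "status": {
            "history": [
                {"startedTime": "2020-10-01T23:49:45Z",
                 "version": "4.4.27", "state": "Partial", "verified": True},
                {"startedTime": "2020-10-01T23:08:11Z",
                 "completionTime": "2020-10-01T23:49:35Z",
                 "version": "4.5.0-0.ci-2020-10-01-174117",
                 "state": "Completed", "verified": False},
                {"startedTime": "2020-10-01T22:31:41Z",
                 "completionTime": "2020-10-01T23:05:26Z",
                 "version": "4.4.27", "state": "Completed", "verified": False},
            ]
        }
    }]
}


def summarize_history(cluster_version_list):
    """Flatten .items[].status.history[] into tuples, keeping the file's order.

    Each tuple is (startedTime, completionTime, version, state, verified);
    a missing or null completionTime becomes an empty string.
    """
    rows = []
    for item in cluster_version_list.get("items", []):
        for entry in item.get("status", {}).get("history", []):
            rows.append((
                entry.get("startedTime", ""),
                entry.get("completionTime") or "",
                entry.get("version", ""),
                entry.get("state", ""),
                bool(entry.get("verified", False)),
            ))
    return rows


def is_round_trip(rows):
    """Return True if the newest version reappears after a different version.

    Rows are expected newest first, as in the history quoted above, so
    A, B, A (for example 4.4 -> 4.5 -> 4.4) counts as a round trip.
    """
    versions = [row[2] for row in rows]
    if len(versions) < 3:
        return False
    newest = versions[0]
    for index, version in enumerate(versions[1:], start=1):
        if version != newest:
            # Found the intermediate version; look for the newest one before it.
            return newest in versions[index + 1:]
    return False
```

Calling is_round_trip(summarize_history(SAMPLE)) flags this run as a round trip. To check a real run, the downloaded clusterversion.json could be parsed with json.load and passed in place of SAMPLE.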