We see the Kube API become unavailable during upgrades on AWS. This is not supposed to happen if graceful termination and LB endpoint reconciliation by the cloud provider work correctly. Note: openshift-apiserver APIs are unavailable too if the kube-apiserver is not serving correctly. This is an umbrella bug, cloned into releases and closed when we are happy with the upgrade stability.
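One way to put a number on this kind of unavailability is to poll the API at a fixed interval and total the time during which the probes failed. The following is a minimal sketch in Python that works on already-collected samples. The function name and the sample format are hypothetical, and this is not how the origin disruption test is implemented.

```python
def unreachable_seconds(samples):
    """Return the total time (in seconds) during which the API was unreachable.

    `samples` is a list of (timestamp_seconds, ok) tuples sorted by timestamp
    (hypothetical format). An interval between two consecutive samples counts
    as unreachable when the sample at its start failed.
    """
    total = 0.0
    for (t0, ok0), (t1, _) in zip(samples, samples[1:]):
        if not ok0:
            total += t1 - t0
    return total


# Probes at one-second intervals; the probes at t=1 and t=2 failed.
probes = [(0, True), (1, False), (2, False), (3, True), (4, True)]
outage = unreachable_seconds(probes)
```

With graceful termination and LB endpoint reconciliation working as intended, the total for an upgrade would be expected to stay at or near zero.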
Bug 1801885 gave [1] as a 4.3.1 -> 4.4.0-0.ci-2020-02-11-153441 example that failed on:

  [Disruptive] Cluster upgrade [Top Level] [Disruptive] Cluster upgrade should maintain a functioning cluster [Feature:ClusterUpgrade] [Serial] [Suite:openshift]

with:

  fail [github.com/openshift/origin/test/extended/util/disruption/controlplane/controlplane.go:56]: Feb 11 17:07:56.230: API was unreachable during upgrade for at least 2m3s:

The error message has evolved since, with "during upgrade" -> "during disruption". CI search turns up a number of recent hits [2], but the bulk are in release-openshift-origin-installer-e2e-aws-upgrade, which is used by cluster-bot for update tests launched with all sorts of source and target versions. I don't see anything serious that's obviously 4.6-specific:

  $ w3m -dump -cols 200 'https://search.svc.ci.openshift.org/?name=release-openshift-.*aws.*upgrade&search=API%20was%20unreachable%20during%20disruption%20for%20at%20least' | grep upgrade
  release-openshift-origin-installer-e2e-aws-upgrade - 384 runs, 41% failed, 66% of failures match
  release-openshift-origin-installer-e2e-aws-upgrade-rollback-4.3 - 2 runs, 0% failed, 50% of runs match
  release-openshift-origin-installer-e2e-aws-upgrade-4.4-stable-to-4.4-ci - 2 runs, 0% failed, 50% of runs match
  release-openshift-origin-installer-e2e-aws-upgrade-rollback-4.2-to-4.3 - 3 runs, 100% failed, 67% of failures match
  release-openshift-origin-installer-e2e-aws-upgrade-4.2-to-4.3 - 4 runs, 0% failed, 75% of runs match
  release-openshift-origin-installer-e2e-aws-upgrade-4.2-nightly-to-4.3-nightly - 4 runs, 0% failed, 75% of runs match
  release-openshift-origin-installer-e2e-aws-upgrade-4.5-stable-to-4.6-ci - 12 runs, 42% failed, 80% of failures match
  release-openshift-origin-installer-e2e-aws-upgrade-4.1-to-4.2-to-4.3-nightly - 3 runs, 100% failed, 67% of failures match
  release-openshift-okd-installer-e2e-aws-upgrade - 32 runs, 28% failed, 189% of failures match
  release-openshift-origin-installer-e2e-aws-upgrade-rollback-4.4-to-4.5 - 1 runs, 100% failed, 100% of failures match
  release-openshift-origin-installer-e2e-aws-upgrade-rollback-4.4-to-4.4 - 1 runs, 0% failed, 100% of runs match
  release-openshift-origin-installer-e2e-aws-upgrade-4.1-to-4.2-to-4.3-to-4.4-nightly - 3 runs, 100% failed, 33% of failures match
  release-openshift-origin-installer-e2e-aws-upgrade-4.2-to-4.3-to-4.4-to-4.5-ci - 3 runs, 100% failed, 33% of failures match
  release-openshift-origin-installer-e2e-aws-upgrade-rollback-4.4 - 3 runs, 0% failed, 33% of runs match

And even then, a fair number of those hits are the non-fatal informer flavor, with:

  ...this is currently sufficient to pass the test/job but not considered completely correct...

Would be good to have folks link a 4.6 job that failed on this, if this is in fact still happening.

[1]: https://deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/17130
[2]: https://search.svc.ci.openshift.org/?name=release-openshift-.*aws.*upgrade&search=API%20was%20unreachable%20during%20disruption%20for%20at%20least
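The per-job lines in the CI search output above follow a regular shape ("JOB - N runs, P% failed, Q% of failures match" or "... Q% of runs match"), so they can be tallied with a short script. The following is a minimal sketch in Python. The helper names are hypothetical, the regular expression is inferred only from the lines quoted in this comment, and the estimate is rough because the percentages in the output are already rounded.

```python
import re

# Shape inferred from the quoted output, for example:
#   release-...-aws-upgrade - 384 runs, 41% failed, 66% of failures match
LINE_RE = re.compile(
    r"^(?P<job>\S+) - (?P<runs>\d+) runs, (?P<failed>\d+)% failed, "
    r"(?P<match>\d+)% of (?P<basis>failures|runs) match$"
)


def parse_line(line):
    """Parse one result line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "job": m["job"],
        "runs": int(m["runs"]),
        "failed_pct": int(m["failed"]),
        "match_pct": int(m["match"]),
        "basis": m["basis"],  # "failures" or "runs"
    }


def estimated_matching_failures(row):
    """Rough count of failed runs that matched the search string.

    Only meaningful when the match percentage is relative to failures.
    """
    if row["basis"] != "failures":
        return 0
    return round(row["runs"] * row["failed_pct"] * row["match_pct"] / 10000)


SAMPLE = [
    "release-openshift-origin-installer-e2e-aws-upgrade - 384 runs, 41% failed, 66% of failures match",
    "release-openshift-origin-installer-e2e-aws-upgrade-rollback-4.3 - 2 runs, 0% failed, 50% of runs match",
    "release-openshift-origin-installer-e2e-aws-upgrade-4.5-stable-to-4.6-ci - 12 runs, 42% failed, 80% of failures match",
]

rows = [r for r in map(parse_line, SAMPLE) if r is not None]
estimates = {r["job"]: estimated_matching_failures(r) for r in rows}
```

Sorting the estimates by job makes it easier to see whether the hits cluster in the general cluster-bot job or in a version-specific one such as 4.5-stable-to-4.6-ci.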
*** Bug 1801885 has been marked as a duplicate of this bug. ***
Work in progress.
*** Bug 1865857 has been marked as a duplicate of this bug. ***
*** Bug 1868496 has been marked as a duplicate of this bug. ***
This is an umbrella bug for AWS API disruption. Labelling with UpcomingSprint.
This bug hasn't had any activity in the last 30 days. Maybe the problem got resolved, was a duplicate of something else, or became less pressing for some reason - or maybe it's still relevant but just hasn't been looked at yet. As such, we're marking this bug as "LifecycleStale" and decreasing the severity/priority. If you have further information on the current state of the bug, please update it, otherwise this bug can be closed in about 7 days. The information can be, for example, that the problem still occurs, that you still want the feature, that more information is needed, or that the bug is (for whatever reason) no longer relevant. Additionally, you can add LifecycleFrozen into Keywords if you think this bug should never be marked as stale. Please consult with bug assignee before you do that.
Should this be closed as a dup of bug 1943804?
The LifecycleStale keyword was removed because the bug got commented on recently. The bug assignee was notified.
*** This bug has been marked as a duplicate of bug 1943804 ***