Bug 1868786
| Summary: | upgrades from 4.5.6 -> 4.6.0.0-nightly failing: Failed to upgrade openshift-apiserver, operator was degraded | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Micah Abbott <miabbott> |
| Component: | Etcd | Assignee: | Dan Mace <dmace> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | ge liu <geliu> |
| Severity: | low | Docs Contact: | |
| Priority: | low | ||
| Version: | 4.6 | CC: | aos-bugs, btenneti, mfojtik, mifiedle, rpattath, wking |
| Target Milestone: | --- | ||
| Target Release: | 4.6.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | [sig-arch][Feature:ClusterUpgrade] Cluster should remain functional during upgrade [Disruptive] [Serial] [Suite:openshift] |
| Last Closed: | 2020-09-11 18:14:22 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Micah Abbott
2020-08-13 19:35:58 UTC
Network checker shows plenty of outages. Two of the three openshift-apiservers (those which are not available) get connection refused from etcd.

Any update on this? This is a blocker for verification needed for https://bugzilla.redhat.com/show_bug.cgi?id=1868750

I'm confused as to how this bug is evidence of any current blocker. It's referring to a couple of release promotion failures from mid August. As far as I can tell, the most recent CI and nightly promotion jobs are not being rejected in the last day due to any upgrade issue resembling the report:

https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/#4.6.0-0.ci
https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/#4.6.0-0.nightly

The nearest equivalent AWS periodic jobs are not failing for any apparently related reason:

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.6-informing#release-openshift-origin-installer-e2e-aws-upgrade-4.5-stable-to-4.6-ci

What about the two failures from August supports the claim that there's a blocker today? Am I misunderstanding the current state of the promotion or periodic jobs?

FWIW, I took a look at those two old jobs reported in August, and:

1. The apiserver, etcd, and MCO all report success and health by the end of the 4.6 rollout
2. The test seems to time out because of a perceived lack of MCO progress that I couldn't associate with any errors from the MCO, but I didn't dig deeper

So even in a historical context, it's not clear these failures are of much interest re: etcd, except perhaps in the context of ongoing disruption flakes, which are already extensively covered elsewhere.

I'm going to downgrade the severity and priority of this bug pending more detailed information about any perceived ongoing issue, and intend to close the bug soon lacking any such details. In the meantime I'll continue looking around in the release job pipeline to see if there's anything I'm missing.

Yeah, I was just trying to do my best reporting problems while playing build watcher. Looking at the same job (release-openshift-origin-installer-e2e-aws-upgrade) I linked in the original report, it has been moderately healthy lately. I'm not aware of any underlying problems with the promotion jobs and not sure why this was tagged as an UpgradeBlocker without additional details.

Understood, thanks! Going to go ahead and close this one out.