Bug 1967388
Summary: | 4.7 network operator degrades pushing v1alpha1 FlowSchema to 4.8 API-servers | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | W. Trevor King <wking> |
Component: | Networking | Assignee: | Alexander Constantinescu <aconstan> |
Networking sub component: | openshift-sdn | QA Contact: | Ying Wang <yingwang> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | urgent | ||
Priority: | urgent | CC: | aconstan, skordas |
Version: | 4.7 | Keywords: | Upgrades |
Target Milestone: | --- | ||
Target Release: | 4.7.z | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2021-06-29 04:19:45 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
Bug Depends On: | 1913399 | ||
Bug Blocks: |
Description
W. Trevor King
2021-06-03 04:43:08 UTC
Seems popular:

  $ w3m -dump -cols 200 'https://search.ci.openshift.org/?maxAge=48h&type=junit&search=flowcontrol.apiserver.k8s.io/v1alpha1.*FlowSchema.*the+server+could+not+find+the+requested+resource' | grep 'failures match' | sort
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-from-stable-4.7-from-stable-4.6-e2e-aws-upgrade (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-compact-upgrade (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-ovn-upgrade (all) - 37 runs, 100% failed, 95% of failures match = 95% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-ovn-upgrade-rollback (all) - 2 runs, 100% failed, 50% of failures match = 50% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-upgrade (all) - 34 runs, 94% failed, 103% of failures match = 97% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-azure-ovn-upgrade (all) - 8 runs, 100% failed, 50% of failures match = 50% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-azure-upgrade (all) - 2 runs, 100% failed, 100% of failures match = 100% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-gcp-ovn-upgrade (all) - 8 runs, 100% failed, 100% of failures match = 100% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-gcp-upgrade (all) - 2 runs, 100% failed, 100% of failures match = 100% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-ovirt-upgrade (all) - 5 runs, 100% failed, 80% of failures match = 80% impact
  periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-vsphere-upgrade (all) - 2 runs, 100% failed, 50% of failures match = 50% impact
  periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-from-stable-4.7-e2e-aws-upgrade (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
  periodic-ci-openshift-release-master-nightly-4.8-upgrade-from-stable-4.7-e2e-aws-upgrade (all) - 12 runs, 100% failed, 92% of failures match = 92% impact
  periodic-ci-openshift-release-master-nightly-4.8-upgrade-from-stable-4.7-e2e-metal-ipi-upgrade (all) - 12 runs, 100% failed, 67% of failures match = 67% impact
  pull-ci-openshift-ovn-kubernetes-master-4.8-upgrade-from-stable-4.7-e2e-aws-ovn-upgrade (all) - 34 runs, 88% failed, 97% of failures match = 85% impact
  rehearse-18937-periodic-ci-openshift-release-master-ci-4.9-upgrade-from-stable-4.8-from-stable-4.7-e2e-aws-upgrade (all) - 2 runs, 100% failed, 50% of failures match = 50% impact
  release-openshift-ocp-installer-upgrade-remote-libvirt-ppc64le-4.7-to-4.8 (all) - 4 runs, 100% failed, 100% of failures match = 100% impact
  release-openshift-ocp-installer-upgrade-remote-libvirt-s390x-4.7-to-4.8 (all) - 4 runs, 100% failed, 100% of failures match = 100% impact
  release-openshift-origin-installer-e2e-aws-upgrade (all) - 22 runs, 23% failed, 20% of failures match = 5% impact
  release-openshift-origin-installer-launch-aws (all) - 86 runs, 49% failed, 2% of failures match = 1% impact
  release-openshift-origin-installer-launch-gcp (all) - 220 runs, 32% failed, 3% of failures match = 1% impact
  release-openshift-origin-installer-old-rhcos-e2e-aws-4.8 (all) - 1 runs, 100% failed, 100% of failures match = 100% impact

And just confirming that all of those^ are the network operator:

  $ curl -s 'https://search.ci.openshift.org/search?maxAge=48h&type=junit&search=flowcontrol.apiserver.k8s.io/v1alpha1.*FlowSchema.*the+server+could+not+find+the+requested+resource' | jq -r 'to_entries[].value | to_entries[].value[].context[]' | sed -n 's|.*v1alpha1.*Kind=FlowSchema) /\([^:]*\):.*could not find the requested resource.*|\1|p' | sort | uniq -c
      154 openshift-ovn-kubernetes
      155 openshift-sdn

Marking this as a blocker for 4.8 since it leads to a degraded network operator and a failed upgrade.

I have a PR up already: https://github.com/openshift/cluster-network-operator/pull/1118 which will need to get in on 4.7, but I need the API server people's input on that.

Setting the target release to 4.7 and lowering the urgency, since upgrades are not really blocked by this bug. I've confirmed the behavior with the API server and CVO teams. The CNO does go degraded due to this error, and that blocks reconciliation; however, the CVO should force the CNO to upgrade to its 4.8 version after a while. That should have the CNO push the right version of this resource and unblock it. The only concern might be if the general upgrade gets hung, in which case the CVO would not force-update the CNO. In any case, the fix should get in on 4.7, so the target release needs to change.

Tried upgrading from 4.7.0-0.nightly-2021-06-10-082247 to 4.8.0-0.nightly-2021-06-11-024306 with both sdn and ovn. Both work fine; the upgrades succeeded without the operator going degraded.

https://mastern-jenkins-csb-openshift-qe.apps.ocp4.prod.psi.redhat.com/job/upgrade_CI/15078/console
https://mastern-jenkins-csb-openshift-qe.apps.ocp4.prod.psi.redhat.com/job/upgrade_CI/15079/console

*** Bug 1971835 has been marked as a duplicate of this bug. ***

OpenShift engineering has decided not to ship Red Hat OpenShift Container Platform 4.7.17 due to a regression (https://bugzilla.redhat.com/show_bug.cgi?id=1973006). All the fixes that were part of 4.7.17 will now be part of 4.7.18, which is planned to be available in the candidate channel on June 23, 2021 and in the fast channel on June 28.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.7.18 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2502
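
For anyone who wants to redo the per-namespace tally from the description without jq and sed, here is a rough Python sketch of the same idea. It assumes the search endpoint returns JSON nested the way the jq filter in the description walks it (an object of objects of arrays, each array element carrying a `context` list of log lines). The function name `tally_namespaces`, the `sample` payload, its keys, and the log lines in it are all made up for illustration; only the regular expression is taken from the sed expression above.

```python
import re
from collections import Counter

# Same capture as the sed expression in the description: the text between
# "Kind=FlowSchema) /" and the next colon, on lines that also contain
# "could not find the requested resource".
PATTERN = re.compile(
    r"v1alpha1.*Kind=FlowSchema\) /([^:]*):.*could not find the requested resource"
)


def tally_namespaces(search_result):
    """Count matching context lines per captured name.

    Walks the structure the way the jq filter does: top-level values,
    then their values (lists of matches), then each match's "context".
    """
    counts = Counter()
    for per_job in search_result.values():
        for matches in per_job.values():
            for match in matches:
                for line in match.get("context", []):
                    found = PATTERN.search(line)
                    if found:
                        counts[found.group(1)] += 1
    return counts


# Hypothetical payload: keys and log lines are invented to fit the pattern,
# not copied from a real CI run.
sample = {
    "job-a": {
        "query": [
            {"context": [
                "(flowcontrol.apiserver.k8s.io/v1alpha1, Kind=FlowSchema) "
                "/openshift-sdn: the server could not find the requested resource",
                "some unrelated log line",
            ]},
        ],
    },
    "job-b": {
        "query": [
            {"context": [
                "(flowcontrol.apiserver.k8s.io/v1alpha1, Kind=FlowSchema) "
                "/openshift-ovn-kubernetes: the server could not find the requested resource",
            ]},
            {"context": [
                "(flowcontrol.apiserver.k8s.io/v1alpha1, Kind=FlowSchema) "
                "/openshift-sdn: the server could not find the requested resource",
            ]},
        ],
    },
}

if __name__ == "__main__":
    # Print in the same "count name" layout as `sort | uniq -c`.
    for name, count in sorted(tally_namespaces(sample).items()):
        print(f"{count:7d} {name}")
```

To run it against live data, the `sample` dict would be replaced with the parsed JSON body of the search URL from the description; the live response shape is not verified here.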