Description of problem:
After upgrading a cluster from 4.5.4 to 4.6.0-0.nightly-2020-08-01-202952, co/ingress is degraded. The new router pods report the following error:

2020-08-02T02:32:27.113714092Z E0802 02:32:27.113682 1 reflector.go:178] github.com/openshift/router/pkg/router/controller/factory/factory.go:126: Failed to list *v1beta1.EndpointSlice: endpointslices.discovery.k8s.io is forbidden: User "system:serviceaccount:openshift-ingress:router" cannot list resource "endpointslices" in API group "discovery.k8s.io" at the cluster scope

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-08-01-202952

How reproducible:
100%

Steps to Reproduce:
1. Upgrade a cluster from 4.5 to 4.6.
2. The upgrade fails and ingress is degraded.

Actual results:
Name:         ingress
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  config.openshift.io/v1
Kind:         ClusterOperator
Metadata:
  Creation Timestamp:  2020-08-01T23:03:20Z
  Generation:          1
  Managed Fields:
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
      f:status:
        .:
        f:extension:
    Manager:      cluster-version-operator
    Operation:    Update
    Time:         2020-08-01T23:03:20Z
    API Version:  config.openshift.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:conditions:
        f:relatedObjects:
        f:versions:
    Manager:         ingress-operator
    Operation:       Update
    Time:            2020-08-02T01:39:56Z
  Resource Version:  177747
  Self Link:         /apis/config.openshift.io/v1/clusteroperators/ingress
  UID:               a84e3534-ad21-48df-9aa8-adb075fff9d1
Spec:
Status:
  Conditions:
    Last Transition Time:  2020-08-02T00:39:54Z
    Message:               desired and current number of IngressControllers are equal
    Reason:                AsExpected
    Status:                True
    Type:                  Available
    Last Transition Time:  2020-08-02T00:39:55Z
    Message:               desired and current number of IngressControllers are equal
    Reason:                AsExpected
    Status:                False
    Type:                  Progressing
    Last Transition Time:  2020-08-02T01:39:56Z
    Message:               Some ingresscontrollers are degraded: default
    Reason:                IngressControllersDegraded
    Status:                True
    Type:                  Degraded
  Extension:               <nil>
  Related Objects:
    Group:
    Name:       openshift-ingress-operator
    Resource:   namespaces
    Group:      operator.openshift.io
    Name:
    Namespace:  openshift-ingress-operator
    Resource:   IngressController
    Group:      ingress.operator.openshift.io
    Name:
    Namespace:  openshift-ingress-operator
    Resource:   DNSRecord
    Group:
    Name:       openshift-ingress
    Resource:   namespaces
  Versions:
    Name:     operator
    Version:  4.6.0-0.nightly-2020-08-01-202952
    Name:     ingress-controller
    Version:  quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:5b0b898b920fda44e72396f8d0e8b2fcabf260c2a150af2d667260c094017fad
Events:  <none>

#### router pod logs:
2020-08-02T02:32:27.113714092Z E0802 02:32:27.113682 1 reflector.go:178] github.com/openshift/router/pkg/router/controller/factory/factory.go:126: Failed to list *v1beta1.EndpointSlice: endpointslices.discovery.k8s.io is forbidden: User "system:serviceaccount:openshift-ingress:router" cannot list resource "endpointslices" in API group "discovery.k8s.io" at the cluster scope
2020-08-02T02:32:30.652847537Z I0802 02:32:30.652813 1 healthz.go:200] [+]backend-proxy-http ok
2020-08-02T02:32:30.652847537Z [-]has-synced failed: reason withheld
2020-08-02T02:32:30.652847537Z [+]process-running ok
2020-08-02T02:32:30.652847537Z healthz check failed
2020-08-02T02:32:40.65278788Z I0802 02:32:40.652755 1 healthz.go:200] [+]backend-proxy-http ok
2020-08-02T02:32:40.65278788Z [-]has-synced failed: reason withheld
2020-08-02T02:32:40.65278788Z [+]process-running ok
2020-08-02T02:32:40.65278788Z healthz check failed

Expected results:
Ingress should not report degraded during the upgrade.

Additional info:
PRs that added endpointslices (merged):
- https://github.com/openshift/openshift-apiserver/pull/125
- https://github.com/openshift/cluster-ingress-operator/pull/426

Looking to see if I missed something.
Seems we're missing some reconciliation logic in the ingress operator. Will put a PR up shortly.
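To illustrate what "reconciliation logic" could mean here, below is a minimal Go sketch, not the actual fix from the operator. It assumes the router's permissions are expressed as a list of RBAC-style rules and that an upgrade has to detect when the rules already on the cluster lack a permission the new router needs (list/watch on endpointslices in discovery.k8s.io, as in the error above). PolicyRule, allows, and reconcileRules are hypothetical names; PolicyRule is a simplified local stand-in for the Kubernetes RBAC type, and the "current" rules in main are invented for the example.

```go
package main

import "fmt"

// PolicyRule is a simplified, hypothetical stand-in for an RBAC policy rule.
type PolicyRule struct {
	APIGroups []string
	Resources []string
	Verbs     []string
}

// contains reports whether list holds want or the "*" wildcard.
func contains(list []string, want string) bool {
	for _, s := range list {
		if s == want || s == "*" {
			return true
		}
	}
	return false
}

// allows reports whether any rule grants verb on resource in group.
func allows(rules []PolicyRule, group, resource, verb string) bool {
	for _, r := range rules {
		if contains(r.APIGroups, group) && contains(r.Resources, resource) && contains(r.Verbs, verb) {
			return true
		}
	}
	return false
}

// reconcileRules compares the rules found on the cluster with the desired
// ones. If any desired permission is missing, it returns the desired rules
// and true so the caller knows an update is required; otherwise it returns
// the current rules unchanged and false.
func reconcileRules(current, desired []PolicyRule) ([]PolicyRule, bool) {
	for _, d := range desired {
		for _, g := range d.APIGroups {
			for _, res := range d.Resources {
				for _, v := range d.Verbs {
					if !allows(current, g, res, v) {
						return desired, true
					}
				}
			}
		}
	}
	return current, false
}

func main() {
	// Illustrative rules left over from before the upgrade (assumed, not
	// copied from a real cluster).
	current := []PolicyRule{
		{APIGroups: []string{""}, Resources: []string{"endpoints", "services"}, Verbs: []string{"list", "watch"}},
	}
	// Desired rules: the old ones plus the permission the error asks for.
	desired := append([]PolicyRule{}, current...)
	desired = append(desired, PolicyRule{
		APIGroups: []string{"discovery.k8s.io"},
		Resources: []string{"endpointslices"},
		Verbs:     []string{"list", "watch"},
	})

	rules, changed := reconcileRules(current, desired)
	fmt.Println("update needed:", changed)
	fmt.Println("router can list endpointslices:", allows(rules, "discovery.k8s.io", "endpointslices", "list"))
}
```

The point of the sketch is only that creating a resource once at install time is not enough: on upgrade, the existing object has to be compared against the desired state and updated when it falls short.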
Upgraded from 4.5.4 to 4.6.0-0.nightly-2020-08-04-035157 and the issue has been fixed. Moving to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196
Removing UpgradeBlocker from this older bug, to remove it from the suspect queue described in [1]. If you feel like this bug still needs to be a suspect, please add the keyword again.

[1]: https://github.com/openshift/enhancements/pull/475