Bug 2095229
| Summary: | ingress-operator pod in CrashLoopBackOff in 4.11 after upgrade starting in 4.6 due to go panic | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jon Uriarte <juriarte> |
| Component: | Networking | Assignee: | Miciah Dashiel Butler Masters <mmasters> |
| Networking sub component: | router | QA Contact: | Arvind iyengar <aiyengar> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | | |
| Priority: | high | CC: | aos-bugs, hongli, mmasters |
| Version: | 4.11 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.11.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-08-10 11:17:14 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Jon Uriarte
2022-06-09 10:48:18 UTC
Verified in "4.11.0-0.nightly-2022-06-21-040754". Upgrading a cluster from 4.10 to the fixed nightly release on an OSP16 environment completes successfully, with no failures or ingress operator pod crashes:

--------
Pre-upgrade:

    NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.10.18   True        False         12m     Cluster version is 4.10.18

    oc get infrastructures.config.openshift.io cluster -ojsonpath='{.spec}' | jq .
    {
      "cloudConfig": {
        "key": "config",
        "name": "cloud-provider-config"
      },
      "platformSpec": {
        "type": "OpenStack"
      }
    }

    oc -n openshift-ingress-operator get ingresscontroller default -ojsonpath='{.status.endpointPublishingStrategy}' | jq .
    {
      "hostNetwork": {
        "protocol": "TCP"
      },
      "type": "HostNetwork"
    }

Post upgrade:

    oc get clusterversion
    NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
    version   4.11.0-0.nightly-2022-06-21-040754   True        False         14s     Cluster version is 4.11.0-0.nightly-2022-06-21-040754

    oc -n openshift-ingress-operator get all
    NAME                                    READY   STATUS    RESTARTS   AGE
    pod/ingress-operator-5d548f9467-bmflw   2/2     Running   0          14m

    NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
    service/metrics   ClusterIP   172.30.15.98   <none>        9393/TCP   125m

    NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/ingress-operator   1/1     1            1           125m

    NAME                                          DESIRED   CURRENT   READY   AGE
    replicaset.apps/ingress-operator-5d548f9467   1         1         1       43m
    replicaset.apps/ingress-operator-7899578f6    0         0         0       125m

    oc -n openshift-ingress-operator logs pod/ingress-operator-5d548f9467-bmflw -c ingress-operator
    2022-06-22T05:53:20.053Z INFO operator.main ingress-operator/start.go:63 using operator namespace {"namespace": "openshift-ingress-operator"}
    I0622 05:53:24.193309 1 request.go:665] Waited for 1.045122844s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/snapshot.storage.k8s.io/v1?timeout=32s
    2022-06-22T05:53:25.610Z INFO operator.main ingress-operator/start.go:63 registering Prometheus metrics for canary_controller
    2022-06-22T05:53:25.610Z INFO operator.main ingress-operator/start.go:63 registering Prometheus metrics for ingress_controller
    2022-06-22T05:53:25.610Z INFO operator.init runtime/asm_amd64.s:1571 starting metrics listener {"addr": "127.0.0.1:60000"}
    2022-06-22T05:53:25.610Z INFO operator.main ingress-operator/start.go:63 watching file {"filename": "/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem"}
    2022-06-22T05:53:28.120Z INFO operator.init.controller-runtime.metrics metrics/listener.go:44 Metrics server is starting to listen {"addr": ":8080"}
    I0622 05:53:28.121036 1 base_controller.go:67] Waiting for caches to sync for spread-default-router-pods
    2022-06-22T05:53:28.148Z ERROR operator.init ingress-operator/start.go:197 failed to handle single node 4.11 upgrade logic {"error": "unable to update ingress config \"cluster\": ingresses.config.openshift.io \"cluster\" is forbidden: User \"system:serviceaccount:openshift-ingress-operator:ingress-operator\" cannot patch resource \"ingresses/status\" in API group \"config.openshift.io\" at the cluster scope"}
    2022-06-22T05:53:28.149Z INFO operator.init runtime/asm_amd64.s:1571 Starting server {"path": "/metrics", "kind": "metrics", "addr": "[::]:8080"}
    I0622 05:53:28.221804 1 base_controller.go:73] Caches are synced for spread-default-router-pods
--------

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069
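
For anyone repeating the verification above on their own cluster, the two manual checks (pod state after the upgrade, and ERROR-level entries in the operator log) could be sketched roughly as follows. The function names `unhealthy_pods` and `error_lines` are hypothetical helpers, not part of `oc`; the sketch assumes the default `oc get pods` column order (NAME READY STATUS RESTARTS AGE) and the log layout shown in the verification output, where the level is the second whitespace-separated field.

```shell
#!/bin/sh
# Hypothetical helper: reads `oc get pods` table output on stdin and prints
# "NAME STATUS RESTARTS" for every pod that is not Running or has restarted.
# Assumes the default column order NAME READY STATUS RESTARTS AGE.
unhealthy_pods() {
  awk 'NR > 1 && NF >= 4 && ($3 != "Running" || $4 != "0") { print $1, $3, $4 }'
}

# Hypothetical helper: reads operator log lines on stdin and prints only those
# whose second field is ERROR (the level position in the log lines above).
error_lines() {
  awk '$2 == "ERROR"'
}

# Example usage against a live cluster (not executed here):
#   oc -n openshift-ingress-operator get pods | unhealthy_pods
#   oc -n openshift-ingress-operator logs deployment/ingress-operator -c ingress-operator | error_lines
```

Empty output from `unhealthy_pods` corresponds to the "2/2 Running 0" state reported in the verification. Note that `error_lines` would still print the "failed to handle single node 4.11 upgrade logic" entry from the log above, so an ERROR line by itself does not mean the pod is crashing.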