Created attachment 1714973 [details]
An example of kube-apiserver log

Description of problem:
- kube-apiserver logs from the masters often show concerning messages containing "APIServer panic'd: net/http: abort Handler".

Version-Release number of selected component (if applicable):
- Seen on a customer's 4.4 cluster.
- Seen on a freshly built vanilla AWS default-settings cluster, latest 4.5.8.

How reproducible:
- Always present; no specific steps are needed to reproduce it.

The messages:
- For example, on a freshly built vanilla AWS cluster:

[alchan-redhat.com@clientvm 0 ~/must-gather.local.321446042447830151/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-ff3841ef6e1c9ba9c24f935894a882acbdbaba03c4ab51eb4a30581592daf5f3/audit_logs/kube-apiserver]$ zgrep 'panic' * | less

ip-10-0-169-170.us-east-2.compute.internal-audit-2020-09-14T17-27-11.192.log.gz:{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"d2cc30db-9307-4137-8a02-fa3820498931","stage":"Panic","requestURI":"/apis/image.openshift.io/v1/imagestreams?allowWatchBookmarks=true\u0026resourceVersion=7758\u0026timeout=7m53s\u0026timeoutSeconds=473\u0026watch=true","verb":"watch","user":{"username":"system:kube-controller-manager","groups":["system:authenticated"]},"sourceIPs":["::1"],"userAgent":"cluster-policy-controller/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"imagestreams","apiGroup":"image.openshift.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"APIServer panic'd: net/http: abort Handler","reason":"InternalError","code":500},"requestReceivedTimestamp":"2020-09-14T14:57:52.282146Z","stageTimestamp":"2020-09-14T14:58:57.470755Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"system:kube-controller-manager\" of ClusterRole \"system:kube-controller-manager\" to User \"system:kube-controller-manager\""}}
ip-10-0-169-170.us-east-2.compute.internal-audit-2020-09-14T17-27-11.192.log.gz:{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"6f921b41-c5e5-4b56-b0b0-df60639b04e7","stage":"Panic","requestURI":"/apis/apps.openshift.io/v1/deploymentconfigs?allowWatchBookmarks=true\u0026resourceVersion=10957\u0026timeout=9m6s\u0026timeoutSeconds=546\u0026watch=true","verb":"watch","user":{"username":"system:kube-controller-manager","groups":["system:authenticated"]},"sourceIPs":["::1"],"userAgent":"cluster-policy-controller/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"deploymentconfigs","apiGroup":"apps.openshift.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"APIServer panic'd: net/http: abort Handler","reason":"InternalError","code":500},"requestReceivedTimestamp":"2020-09-14T14:59:01.473711Z","stageTimestamp":"2020-09-14T14:59:41.913442Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"system:kube-controller-manager\" of ClusterRole \"system:kube-controller-manager\" to User \"system:kube-controller-manager\""}}
ip-10-0-169-170.us-east-2.compute.internal-audit-2020-09-14T17-27-11.192.log.gz:{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"52193b04-a3a2-4532-b4e2-dc58c63df49c","stage":"Panic","requestURI":"/apis/image.openshift.io/v1/imagestreams?allowWatchBookmarks=true\u0026resourceVersion=10961\u0026timeout=8m29s\u0026timeoutSeconds=509\u0026watch=true","verb":"watch","user":{"username":"system:kube-controller-manager","groups":["system:authenticated"]},"sourceIPs":["::1"],"userAgent":"cluster-policy-controller/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"imagestreams","apiGroup":"image.openshift.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"APIServer panic'd: net/http: abort Handler","reason":"InternalError","code":500},"requestReceivedTimestamp":"2020-09-14T14:59:01.849100Z","stageTimestamp":"2020-09-14T14:59:41.913727Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"system:kube-controller-manager\" of ClusterRole \"system:kube-controller-manager\" to User \"system:kube-controller-manager\""}}
ip-10-0-169-170.us-east-2.compute.internal-audit-2020-09-14T17-27-11.192.log.gz:{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"aecf4a61-0d10-42aa-b750-390fd1cc2b8a","stage":"Panic","requestURI":"/apis/route.openshift.io/v1/routes?allowWatchBookmarks=true\u0026resourceVersion=10957\u0026timeout=5m17s\u0026timeoutSeconds=317\u0026watch=true","verb":"watch","user":{"username":"system:kube-controller-manager","groups":["system:authenticated"]},"sourceIPs":["::1"],"userAgent":"cluster-policy-controller/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"routes","apiGroup":"route.openshift.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"APIServer panic'd: net/http: abort Handler","reason":"InternalError","code":500},"requestReceivedTimestamp":"2020-09-14T14:59:01.644885Z","stageTimestamp":"2020-09-14T14:59:41.914357Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"system:kube-controller-manager\" of ClusterRole \"system:kube-controller-manager\" to User \"system:kube-controller-manager\""}}
ip-10-0-169-170.us-east-2.compute.internal-audit-2020-09-14T17-27-11.192.log.gz:{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"d21a81fb-d4ad-4c01-ab5f-9157f97578c7","stage":"Panic","requestURI":"/apis/build.openshift.io/v1/buildconfigs?allowWatchBookmarks=true\u0026resourceVersion=10961\u0026timeout=6m43s\u0026timeoutSeconds=403\u0026watch=true","verb":"watch","user":{"username":"system:kube-controller-manager","groups":["system:authenticated"]},"sourceIPs":["::1"],"userAgent":"cluster-policy-controller/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"buildconfigs","apiGroup":"build.openshift.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"APIServer panic'd: net/http: abort Handler","reason":"InternalError","code":500},"requestReceivedTimestamp":"2020-09-14T14:59:01.643149Z","stageTimestamp":"2020-09-14T14:59:41.914614Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"system:kube-controller-manager\" of ClusterRole \"system:kube-controller-manager\" to User \"system:kube-controller-manager\""}}
ip-10-0-169-170.us-east-2.compute.internal-audit-2020-09-14T17-27-11.192.log.gz:{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"d898db13-f1ef-4b2b-b6c8-932adc68bd73","stage":"Panic","requestURI":"/apis/build.openshift.io/v1/builds?allowWatchBookmarks=true\u0026resourceVersion=10961\u0026timeout=6m59s\u0026timeoutSeconds=419\u0026watch=true","verb":"watch","user":{"username":"system:kube-controller-manager","groups":["system:authenticated"]},"sourceIPs":["::1"],"userAgent":"cluster-policy-controller/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"builds","apiGroup":"build.openshift.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"APIServer panic'd: net/http: abort Handler","reason":"InternalError","code":500},"requestReceivedTimestamp":"2020-09-14T14:59:01.501924Z","stageTimestamp":"2020-09-14T14:59:41.915985Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"system:kube-controller-manager\" of ClusterRole \"system:kube-controller-manager\" to User \"system:kube-controller-manager\""}}
ip-10-0-169-170.us-east-2.compute.internal-audit-2020-09-14T17-27-11.192.log.gz:{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"2492eacb-82ea-4a75-af7e-f9cbdb11d53e","stage":"Panic","requestURI":"/apis/build.openshift.io/v1/builds?allowWatchBookmarks=true\u0026resourceVersion=8865\u0026timeout=6m44s\u0026timeoutSeconds=404\u0026watch=true","verb":"watch","user":{"username":"system:serviceaccount:openshift-controller-manager:openshift-controller-manager-sa","uid":"31cd5a32-3463-4430-a4d4-697530a883e1","groups":["system:serviceaccounts","system:serviceaccounts:openshift-controller-manager","system:authenticated"]},"sourceIPs":["10.129.0.28"],"userAgent":"openshift-controller-manager/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"builds","apiGroup":"build.openshift.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"APIServer panic'd: net/http: abort Handler","reason":"InternalError","code":500},"requestReceivedTimestamp":"2020-09-14T14:58:50.148403Z","stageTimestamp":"2020-09-14T14:59:50.740518Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"system:openshift:openshift-controller-manager\" of ClusterRole \"system:openshift:openshift-controller-manager\" to ServiceAccount \"openshift-controller-manager-sa/openshift-controller-manager\""}}
ip-10-0-169-170.us-east-2.compute.internal-audit-2020-09-14T17-27-11.192.log.gz:{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"bc373a4b-19d3-4fde-9531-c18d8ddb982b","stage":"Panic","requestURI":"/apis/route.openshift.io/v1/routes?allowWatchBookmarks=true\u0026resourceVersion=8865\u0026timeout=5m21s\u0026timeoutSeconds=321\u0026watch=true","verb":"watch","user":{"username":"system:serviceaccount:openshift-controller-manager:openshift-controller-manager-sa","uid":"31cd5a32-3463-4430-a4d4-697530a883e1","groups":["system:serviceaccounts","system:serviceaccounts:openshift-controller-manager","system:authenticated"]},"sourceIPs":["10.129.0.28"],"userAgent":"openshift-controller-manager/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"routes","apiGroup":"route.openshift.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"APIServer panic'd: net/http: abort Handler","reason":"InternalError","code":500},"requestReceivedTimestamp":"2020-09-14T14:58:50.148910Z","stageTimestamp":"2020-09-14T14:59:50.741345Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"system:openshift:openshift-controller-manager\" of ClusterRole \"system:openshift:openshift-controller-manager\" to ServiceAccount \"openshift-controller-manager-sa/openshift-controller-manager\""}}
*** This bug has been marked as a duplicate of bug 1875046 ***
Hi Jaspreet, I'm going to backport https://bugzilla.redhat.com/show_bug.cgi?id=1875046, which will fix the panic messages you are seeing (https://bugzilla.redhat.com/show_bug.cgi?id=1879208#c6).
I installed the latest OCP 4.5 on OpenStack. The cluster works well, but I checked the kube-apiserver audit log and still found some panic logs; see below.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2021-01-30-163415   True        False         125m    Cluster version is 4.5.0-0.nightly-2021-01-30-163415

Confirm the PR has been loaded in the latest 4.5 payload:

$ oc adm release info --commits registry.ci.openshift.org/ocp/release:4.5.0-0.nightly-2021-01-30-163415 | grep 'origin'
  hyperkube   https://github.com/openshift/origin   e574db2c66c6fa00a3d99ac8fb208b0bc8fd65e5

$ git log --date=local --pretty="%h %an %cd - %s" e574db2 | grep '#25824'
e574db2c66 OpenShift Merge Robot Thu Jan 28 08:16:33 2021 - Merge pull request #25824 from p0lyn0mial/release-4.5-race-serialize-object

$ oc get no
NAME                               STATUS   ROLES    AGE    VERSION
kewang0251-k2hhl-compute-0         Ready    worker   145m   v1.18.3+e574db2
kewang0251-k2hhl-compute-1         Ready    worker   145m   v1.18.3+e574db2
kewang0251-k2hhl-compute-2         Ready    worker   145m   v1.18.3+e574db2
kewang0251-k2hhl-control-plane-0   Ready    master   152m   v1.18.3+e574db2
kewang0251-k2hhl-control-plane-1   Ready    master   153m   v1.18.3+e574db2
kewang0251-k2hhl-control-plane-2   Ready    master   152m   v1.18.3+e574db2

$ oc get co
NAME                                       VERSION                             AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.5.0-0.nightly-2021-01-30-163415   True        False         False      135m
cloud-credential                           4.5.0-0.nightly-2021-01-30-163415   True        False         False      166m
cluster-autoscaler                         4.5.0-0.nightly-2021-01-30-163415   True        False         False      145m
config-operator                            4.5.0-0.nightly-2021-01-30-163415   True        False         False      145m
console                                    4.5.0-0.nightly-2021-01-30-163415   True        False         False      134m
csi-snapshot-controller                    4.5.0-0.nightly-2021-01-30-163415   True        False         False      145m
dns                                        4.5.0-0.nightly-2021-01-30-163415   True        False         False      150m
etcd                                       4.5.0-0.nightly-2021-01-30-163415   True        False         False      150m
image-registry                             4.5.0-0.nightly-2021-01-30-163415   True        False         False      143m
ingress                                    4.5.0-0.nightly-2021-01-30-163415   True        False         False      143m
insights                                   4.5.0-0.nightly-2021-01-30-163415   True        False         False      145m
kube-apiserver                             4.5.0-0.nightly-2021-01-30-163415   True        False         False      150m
kube-controller-manager                    4.5.0-0.nightly-2021-01-30-163415   True        False         False      145m
kube-scheduler                             4.5.0-0.nightly-2021-01-30-163415   True        False         False      147m
kube-storage-version-migrator              4.5.0-0.nightly-2021-01-30-163415   True        False         False      144m
machine-api                                4.5.0-0.nightly-2021-01-30-163415   True        False         False      145m
machine-approver                           4.5.0-0.nightly-2021-01-30-163415   True        False         False      149m
machine-config                             4.5.0-0.nightly-2021-01-30-163415   True        False         False      144m
marketplace                                4.5.0-0.nightly-2021-01-30-163415   True        False         False      144m
monitoring                                 4.5.0-0.nightly-2021-01-30-163415   True        False         False      138m
network                                    4.5.0-0.nightly-2021-01-30-163415   True        False         False      151m
node-tuning                                4.5.0-0.nightly-2021-01-30-163415   True        False         False      151m
openshift-apiserver                        4.5.0-0.nightly-2021-01-30-163415   True        False         False      146m
openshift-controller-manager               4.5.0-0.nightly-2021-01-30-163415   True        False         False      145m
openshift-samples                          4.5.0-0.nightly-2021-01-30-163415   True        False         False      144m
operator-lifecycle-manager                 4.5.0-0.nightly-2021-01-30-163415   True        False         False      151m
operator-lifecycle-manager-catalog         4.5.0-0.nightly-2021-01-30-163415   True        False         False      151m
operator-lifecycle-manager-packageserver   4.5.0-0.nightly-2021-01-30-163415   True        False         False      147m
service-ca                                 4.5.0-0.nightly-2021-01-30-163415   True        False         False      151m
storage                                    4.5.0-0.nightly-2021-01-30-163415   True        False         False      145m

$ oc debug node/kewang0251-k2hhl-control-plane-0
[root@kewang0251-k2hhl-control-plane-0 kube-apiserver]# grep 'panic' audit.log
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"0845d6fb-b0c0-4c60-b58b-5d1f48cf32e4","stage":"Panic","requestURI":"/apis/user.openshift.io/v1/groups?allowWatchBookmarks=true\u0026resourceVersion=12770\u0026timeout=7m48s\u0026timeoutSeconds=468\u0026watch=true","verb":"watch","user":{"username":"system:apiserver","uid":"fbceb3f9-fae1-4d4e-bf01-cf38caf5952b","groups":["system:masters"]},"sourceIPs":["::1"],"userAgent":"kube-apiserver/v1.18.3+e574db2 (linux/amd64) kubernetes/e574db2","objectRef":{"resource":"groups","apiGroup":"user.openshift.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"APIServer panic'd: net/http: abort Handler","reason":"InternalError","code":500},"requestReceivedTimestamp":"2021-02-01T04:19:17.020068Z","stageTimestamp":"2021-02-01T04:20:52.578003Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"ad0a89e0-8ce5-4cb5-954e-88c79d0cbe7b","stage":"Panic","requestURI":"/apis/user.openshift.io/v1/groups?allowWatchBookmarks=true\u0026resourceVersion=12632\u0026timeout=8m2s\u0026timeoutSeconds=482\u0026watch=true","verb":"watch","user":{"username":"system:apiserver","uid":"fbceb3f9-fae1-4d4e-bf01-cf38caf5952b","groups":["system:masters"]},"sourceIPs":["::1"],"userAgent":"kube-apiserver/v1.18.3+e574db2 (linux/amd64) kubernetes/e574db2","objectRef":{"resource":"groups","apiGroup":"user.openshift.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"APIServer panic'd: net/http: abort Handler","reason":"InternalError","code":500},"requestReceivedTimestamp":"2021-02-01T04:19:15.999490Z","stageTimestamp":"2021-02-01T04:20:52.578615Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"7306224a-b1d2-47bf-9258-d3efe62a02ab","stage":"Panic","requestURI":"/apis/user.openshift.io/v1/groups?allowWatchBookmarks=true\u0026resourceVersion=19225\u0026timeout=5m10s\u0026timeoutSeconds=310\u0026watch=true","verb":"watch","user":{"username":"system:apiserver","uid":"7723f949-729e-47bf-b4fc-c3e60a15123e","groups":["system:masters"]},"sourceIPs":["::1"],"userAgent":"kube-apiserver/v1.18.3+e574db2 (linux/amd64) kubernetes/e574db2","objectRef":{"resource":"groups","apiGroup":"user.openshift.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"APIServer panic'd: net/http: abort Handler","reason":"InternalError","code":500},"requestReceivedTimestamp":"2021-02-01T04:23:09.529699Z","stageTimestamp":"2021-02-01T04:24:10.660682Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"9b1ffd51-76f4-4a81-a391-2a145ef6fd17","stage":"Panic","requestURI":"/apis/user.openshift.io/v1/groups?allowWatchBookmarks=true\u0026resourceVersion=19225\u0026timeout=8m54s\u0026timeoutSeconds=534\u0026watch=true","verb":"watch","user":{"username":"system:apiserver","uid":"7723f949-729e-47bf-b4fc-c3e60a15123e","groups":["system:masters"]},"sourceIPs":["::1"],"userAgent":"kube-apiserver/v1.18.3+e574db2 (linux/amd64) kubernetes/e574db2","objectRef":{"resource":"groups","apiGroup":"user.openshift.io","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Failure","message":"APIServer panic'd: net/http: abort Handler","reason":"InternalError","code":500},"requestReceivedTimestamp":"2021-02-01T04:23:09.512846Z","stageTimestamp":"2021-02-01T04:24:10.660941Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":""}}

From the above results, PR 25824 does not fix the bug, so assigning back.
Hi Ke,

This BZ was marked as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1875046. The actual panic errors that were affecting a customer cluster can be found in https://bugzilla.redhat.com/show_bug.cgi?id=1879208#c6. You should find the following:

runtime.go:76] Observed a panic: runtime error: index out of range [43] with length 30
runtime.go:76] Observed a panic: index > windowEnd
runtime.go:76] Observed a panic: runtime error: invalid memory address or nil pointer dereference

Regarding "net/http: abort Handler": it is a sentinel panic value used to abort a handler. The server wants to close the connection with the client and fires this panic, which is then recorded in the audit log. I would consider it "normal".
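To make the mechanism concrete, here is a minimal Go sketch assuming only documented net/http behavior (the /watch path, the 5-second delay standing in for a watch timeout, and the listen address are made up for the example): a handler that wants to drop the connection panics with the sentinel value http.ErrAbortHandler, whose message is exactly "net/http: abort Handler". The server recovers that panic, closes the client connection, and suppresses the usual stack-trace logging; in the apiserver's case the recovered value is what shows up in the audit events quoted above.

package main

import (
	"net/http"
	"time"
)

func main() {
	http.HandleFunc("/watch", func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(5 * time.Second):
			// Abort the handler: net/http recovers ErrAbortHandler, closes
			// the connection, and does not log a stack trace for it.
			panic(http.ErrAbortHandler)
		case <-r.Context().Done():
			// The client disconnected first; return normally.
			return
		}
	})
	// Hypothetical local address for the sketch.
	_ = http.ListenAndServe("127.0.0.1:8080", nil)
}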
Hi Lukasz, thanks for your confirmation. Since the bug is related only to the error 'runtime error: invalid memory address or nil pointer dereference', I did a quick search over the past 7 days:

https://search.ci.openshift.org/?search=kube-apiserver.*panic.*invalid+memory+address+or+nil+pointer+dereference&maxAge=168h&context=2&type=junit&name=&maxMatches=5&maxBytes=20971520&groupBy=job

The search results no longer show any relevant hits for OCP 4.5, so we can conclude that the PR works well. Moving the bug to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.5.33 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:0428