Bug 2001763 - [OCP on RHV] panic detected in openshift-oauth-apiserver pods
Summary: [OCP on RHV] panic detected in openshift-oauth-apiserver pods
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 4.9
Hardware: Unspecified
OS: Unspecified
urgent
urgent
Target Milestone: ---
: ---
Assignee: Nobody
QA Contact: ge liu
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2021-09-07 06:57 UTC by Ke Wang
Modified: 2023-09-18 04:25 UTC (History)
9 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-10-10 09:13:20 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Github openshift kubernetes pull 893 0 None Merged Bugzilla 2001763: UPSTREAM: <carry>: 104437: run etcd healthcheck in the background and… 2021-09-13 12:38:37 UTC

Description Ke Wang 2021-09-07 06:57:01 UTC
Description of problem:
panic detected in openshift-oauth-apiserver pods

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-08-31-115131   True        False         5d18h   Cluster version is 4.9.0-0.nightly-2021-08-31-115131

$ oc get infrastructures.config.openshift.io -o json | jq .items[].status.platform
"oVirt"

How reproducible:
once

Reproduced steps:
The steps of reproduction are not yet known

Actual results:
$ oc get pod -n openshift-oauth-apiserver
NAME                        READY   STATUS    RESTARTS        AGE
apiserver-d6f9b59c4-4pb5p   1/1     Running   0               5d18h
apiserver-d6f9b59c4-558mx   1/1     Running   1 (3d20h ago)   5d18h
apiserver-d6f9b59c4-sddtc   1/1     Running   0               5h

masters=$(oc get no -l node-role.kubernetes.io/master | sed '1d' | awk '{print $1}')
$ for node in $masters; do echo $node;oc debug no/$node -- chroot /host bash -c "grep -ir 'panic' /var/log/pods/openshift-oauth-apiserver_apiserver* ";done
ge2n1-xp525-master-0
...
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-4pb5p_800ae3bb-c103-4071-9d49-905f6d87326f/oauth-apiserver/0.log:2021-09-03T10:38:21.610927014+00:00 stderr F panic(0x220fa60, 0x2908620)
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-4pb5p_800ae3bb-c103-4071-9d49-905f6d87326f/oauth-apiserver/0.log:2021-09-03T10:38:21.610927014+00:00 stderr F 	runtime/panic.go:965 +0x1b9
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-4pb5p_800ae3bb-c103-4071-9d49-905f6d87326f/oauth-apiserver/0.log:2021-09-03T10:38:21.768588128+00:00 stderr F panic(0x220fa60, 0x2908620)
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-4pb5p_800ae3bb-c103-4071-9d49-905f6d87326f/oauth-apiserver/0.log:2021-09-03T10:38:21.768588128+00:00 stderr F 	runtime/panic.go:965 +0x1b9

Removing debug pod ...
ge2n1-xp525-master-1
...
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-558mx_647614f1-5aaf-4423-902c-ee22b1f3b957/oauth-apiserver/0.log:2021-09-03T10:38:19.181907449+00:00 stderr F panic(0x220fa60, 0x2908620)
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-558mx_647614f1-5aaf-4423-902c-ee22b1f3b957/oauth-apiserver/0.log:2021-09-03T10:38:19.181907449+00:00 stderr F 	runtime/panic.go:965 +0x1b9
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-558mx_647614f1-5aaf-4423-902c-ee22b1f3b957/oauth-apiserver/0.log:2021-09-03T10:38:19.197628195+00:00 stderr F panic(0x220fa60, 0x2908620)
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-558mx_647614f1-5aaf-4423-902c-ee22b1f3b957/oauth-apiserver/0.log:2021-09-03T10:38:19.197628195+00:00 stderr F 	runtime/panic.go:965 +0x1b9
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-558mx_647614f1-5aaf-4423-902c-ee22b1f3b957/oauth-apiserver/0.log:2021-09-03T10:38:28.619506068+00:00 stderr F panic(0x220fa60, 0x2908620)
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-558mx_647614f1-5aaf-4423-902c-ee22b1f3b957/oauth-apiserver/0.log:2021-09-03T10:38:28.619506068+00:00 stderr F 	runtime/panic.go:965 +0x1b9
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-558mx_647614f1-5aaf-4423-902c-ee22b1f3b957/oauth-apiserver/0.log:2021-09-03T10:38:28.634713177+00:00 stderr F panic(0x220fa60, 0x2908620)
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-558mx_647614f1-5aaf-4423-902c-ee22b1f3b957/oauth-apiserver/0.log:2021-09-03T10:38:28.634713177+00:00 stderr F 	runtime/panic.go:965 +0x1b9
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-558mx_647614f1-5aaf-4423-902c-ee22b1f3b957/oauth-apiserver/0.log:2021-09-03T10:38:32.476346457+00:00 stderr F panic(0x220fa60, 0x2908620)
/var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-558mx_647614f1-5aaf-4423-902c-ee22b1f3b957/oauth-apiserver/0.log:2021-09-03T10:38:32.476346457+00:00 stderr F 	runtime/panic.go:965 +0x1b9

Removing debug pod ...
ge2n1-xp525-master-2

$ oc debug node/ge2n1-xp525-master-0

sh-4.4# grep -n panic /var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-4pb5p_800ae3bb-c103-4071-9d49-905f6d87326f/oauth-apiserver/0.log
81366:2021-09-03T10:38:21.610927014+00:00 stderr F panic(0x220fa60, 0x2908620)
81367:2021-09-03T10:38:21.610927014+00:00 stderr F 	runtime/panic.go:965 +0x1b9
81462:2021-09-03T10:38:21.768588128+00:00 stderr F panic(0x220fa60, 0x2908620)
81463:2021-09-03T10:38:21.768588128+00:00 stderr F 	runtime/panic.go:965 +0x1b9

sh-4.4# sed -n '81366,81461p' /var/log/pods/openshift-oauth-apiserver_apiserver-d6f9b59c4-4pb5p_800ae3bb-c103-4071-9d49-905f6d87326f/oauth-apiserver/0.log
2021-09-03T10:38:21.610927014+00:00 stderr F panic(0x220fa60, 0x2908620)
2021-09-03T10:38:21.610927014+00:00 stderr F 	runtime/panic.go:965 +0x1b9
2021-09-03T10:38:21.610927014+00:00 stderr F golang.org/x/net/http2.(*responseWriter).Header(0xc000833810, 0x3941100)
2021-09-03T10:38:21.610927014+00:00 stderr F 	golang.org/x/net.0-20210224082022-3d97a244fca7/http2/server.go:2601 +0x99
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.Error(0x2969fc0, 0xc000833810, 0xc00130c1a0, 0x18b, 0x1f4)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2059 +0x3b
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/server/healthz.handleRootHealth.func1(0x29747f0, 0xc0002bbe28, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/server/healthz/healthz.go:245 +0x6c5
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/metrics.InstrumentHandlerFunc.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/metrics/metrics.go:453 +0x2be
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc000563e00, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/server/mux.(*pathHandler).ServeHTTP(0xc0018fe8c0, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/server/mux/pathrecorder.go:241 +0x77a
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/server/mux.(*PathRecorderMux).ServeHTTP(0xc000211180, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/server/mux/pathrecorder.go:234 +0x8c
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/server.director.ServeHTTP(0x261c5f1, 0xf, 0xc000a66ab0, 0xc000211180, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/server/handler.go:154 +0x914
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filterlatency.trackCompleted.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filterlatency/filterlatency.go:95 +0x193
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc00102ca50, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filters.WithAuthorization.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filters/authorization.go:64 +0x603
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc0008a9bc0, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filterlatency.trackStarted.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filterlatency/filterlatency.go:71 +0x186
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc0008a9c00, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filterlatency.trackCompleted.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filterlatency/filterlatency.go:95 +0x193
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc00102ca80, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfc00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/server/filters.WithPriorityAndFairness.func1.4()
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/server/filters/priority-and-fairness.go:127 +0x1ba
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/util/flowcontrol.(*configController).Handle.func2()
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/util/flowcontrol/apf_filter.go:176 +0x222
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/util/flowcontrol.immediateRequest.Finish(...)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/util/flowcontrol/apf_controller.go:752
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/util/flowcontrol.(*configController).Handle(0xc000758000, 0x297a2f8, 0xc001a9edb0, 0xc001b45d90, 0x297a8e0, 0xc00137f840, 0xc0011ecc40, 0xc0011ecc50, 0xc001843980)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/util/flowcontrol/apf_filter.go:166 +0x907
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/server/filters.WithPriorityAndFairness.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/server/filters/priority-and-fairness.go:130 +0x606
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc00102cab0, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filterlatency.trackStarted.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filterlatency/filterlatency.go:71 +0x186
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc0008a9c80, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filterlatency.trackCompleted.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filterlatency/filterlatency.go:95 +0x193
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc00102cae0, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filters.WithImpersonation.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filters/impersonation.go:50 +0x240d
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc0008a9d00, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filterlatency.trackStarted.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filterlatency/filterlatency.go:71 +0x186
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc0008a9d40, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filterlatency.trackCompleted.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filterlatency/filterlatency.go:95 +0x193
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc00102cb10, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filters.WithAudit.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filters/audit.go:55 +0x7d5
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc0008a9d80, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filterlatency.trackStarted.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filterlatency/filterlatency.go:71 +0x186
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc0008a9dc0, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filterlatency.trackCompleted.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filterlatency/filterlatency.go:95 +0x193
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc00102cb70, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filters.withAuthentication.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cfb00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filters/authentication.go:80 +0x75c
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc00067be00, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cfa00)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/endpoints/filterlatency.trackStarted.func1(0x7f98ab162950, 0xc0002bbe18, 0xc0017cf900)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/endpoints/filterlatency/filterlatency.go:80 +0x38c
2021-09-03T10:38:21.610927014+00:00 stderr F net/http.HandlerFunc.ServeHTTP(0xc0008a9e40, 0x7f98ab162950, 0xc0002bbe18, 0xc0017cf900)
2021-09-03T10:38:21.610927014+00:00 stderr F 	net/http/server.go:2049 +0x44
2021-09-03T10:38:21.610927014+00:00 stderr F k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP.func1(0xc001ad5aa0, 0xc00089d7e8, 0x297a988, 0xc0002bbe18, 0xc0017cf900)
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/server/filters/timeout.go:107 +0xb8
2021-09-03T10:38:21.610927014+00:00 stderr F created by k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP
2021-09-03T10:38:21.610927014+00:00 stderr F 	k8s.io/apiserver.0/pkg/server/filters/timeout.go:93 +0x1f4
2021-09-03T10:38:21.767177170+00:00 stderr F I0903 10:38:21.767142       1 healthz.go:244] etcd check failed: readyz
2021-09-03T10:38:21.767177170+00:00 stderr F [-]etcd failed: error getting data from etcd: etcdserver: request timed out
2021-09-03T10:38:21.768588128+00:00 stderr F E0903 10:38:21.767445       1 timeout.go:128] Header called after Handler finished
2021-09-03T10:38:21.768588128+00:00 stderr F goroutine 1524708 [running]:
2021-09-03T10:38:21.768588128+00:00 stderr F k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP.func1.1(0xc002a2b500)
2021-09-03T10:38:21.768588128+00:00 stderr F 	k8s.io/apiserver.0/pkg/server/filters/timeout.go:102 +0x125


Expected results:
Should not see panic error in oauth-apiserver pod logs.

Additional info:

Comment 2 Sebastian Łaskawiec 2021-09-07 08:57:57 UTC
It's worth to mention that the API Server recovered from the panic. In case of `openshift-oauth-apiserver/pods/apiserver-d6f9b59c4-4pb5p/oauth-apiserver/oauth-apiserver/logs/current.log`, you can clearly see that it it keeps serving the requests.

Nevertheless, all panics happen after the Etcd failure:

2021-09-03T10:38:21.610550225Z I0903 10:38:21.610498       1 healthz.go:244] etcd check failed: healthz
2021-09-03T10:38:21.610550225Z [-]etcd failed: error getting data from etcd: etcdserver: request timed out
2021-09-03T10:38:21.610927014Z E0903 10:38:21.610872       1 timeout.go:128] Header called after Handler finished
2021-09-03T10:38:21.610927014Z goroutine 1524727 [running]:
2021-09-03T10:38:21.610927014Z k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP.func1.1(0xc001ad5aa0)
...


Looking at this particular 2021-09-03T10:38:21 and Etcd health logs (namespaces/openshift-etcd/pods/etcd-ge2n1-xp525-master-0/etcd-health-monitor/etcd-health-monitor/logs/current.log), I can see that something was wrong:


2021-09-03T10:38:10.136161345Z {"level":"warn","ts":"2021-09-03T10:38:10.136Z","logger":"etcd-client","caller":"v3.0/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000641c00/#initially=[https://10.37.142.152:2379;https://10.37.142.161:2379;https://10.37.142.228:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
2021-09-03T10:38:10.136186499Z {"level":"info","ts":"2021-09-03T10:38:10.136Z","caller":"health/log.go:29","msg":"health check","addresses":["https://10.37.142.228:2379"],"check":"QuorumReadSingleTarget","health":false,"start":"2021-09-03T10:38:07.134Z","error":"client timeout exceeded: 1s"}
2021-09-03T10:38:10.383664827Z {"level":"warn","ts":"2021-09-03T10:38:10.383Z","logger":"etcd-client","caller":"v3.0/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000641c00/#initially=[https://10.37.142.152:2379;https://10.37.142.161:2379;https://10.37.142.228:2379]","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
2021-09-03T10:38:10.383706461Z {"level":"info","ts":"2021-09-03T10:38:10.383Z","caller":"health/log.go:29","msg":"health check","addresses":["https://10.37.142.228:2379"],"check":"SerializedReadSingleTarget","health":false,"start":"2021-09-03T10:38:07.381Z","error":"client timeout exceeded: 1s"}

The Etcd logs also seems wrong (namespaces/openshift-etcd/pods/etcd-ge2n1-xp525-master-0/etcd/etcd/logs/current.log):

2021-09-03T10:38:32.492057632Z {"level":"warn","ts":"2021-09-03T10:38:32.483Z","caller":"etcdserver/util.go:123","msg":"failed to apply request","took":"11.297318ms","request":"header:<ID:9968853707752541488 > lease_revoke:<id:0a587ba1446d0bfd>","response":"size:29","error":"lease not found"}
2021-09-03T10:38:32.492355421Z {"level":"warn","ts":"2021-09-03T10:38:32.492Z","caller":"etcdserver/util.go:123","msg":"failed to apply request","took":"35.115µs","request":"header:<ID:9968853707752541493 > lease_revoke:<id:267b7ba1464878a2>","response":"size:29","error":"lease not found"}
2021-09-03T10:38:32.492534359Z {"level":"warn","ts":"2021-09-03T10:38:32.492Z","caller":"etcdserver/util.go:123","msg":"failed to apply request","took":"22.689µs","request":"header:<ID:9968853707752541498 > lease_revoke:<id:0a587ba1446d0b5e>","response":"size:29","error":"lease not found"}
2021-09-03T10:38:32.492668020Z {"level":"warn","ts":"2021-09-03T10:38:32.492Z","caller":"etcdserver/util.go:123","msg":"failed to apply request","took":"45.336µs","request":"header:<ID:9968853707752541500 > lease_revoke:<id:0a587ba1446d0cc4>","response":"size:29","error":"lease not found"}
2021-09-03T10:38:32.493464003Z {"level":"warn","ts":"2021-09-03T10:38:32.493Z","caller":"etcdserver/util.go:123","msg":"failed to apply request","took":"13.585µs","request":"header:<ID:9968853707752541501 > lease_revoke:<id:0a587ba1446d0bfd>","response":"size:29","error":"lease not found"}
2021-09-03T10:38:32.493506491Z {"level":"warn","ts":"2021-09-03T10:38:32.493Z","caller":"etcdserver/util.go:123","msg":"failed to apply request","took":"4.822µs","request":"header:<ID:9968853707752541503 > lease_revoke:<id:267b7ba1464878a2>","response":"size:29","error":"lease not found"}
2021-09-03T10:38:32.494580758Z {"level":"warn","ts":"2021-09-03T10:38:32.494Z","caller":"etcdserver/util.go:123","msg":"failed to apply request","took":"40.204µs","request":"header:<ID:9968853707752541505 > lease_revoke:<id:0a587ba1446d0b5e>","response":"size:29","error":"lease not found"}
2021-09-03T10:38:32.496841162Z {"level":"warn","ts":"2021-09-03T10:38:32.495Z","caller":"etcdserver/server.go:1130","msg":"failed to revoke lease","lease-id":"0a587ba1446d0cc4","error":"lease not found"}
2021-09-03T10:38:32.496921944Z {"level":"warn","ts":"2021-09-03T10:38:32.496Z","caller":"etcdserver/server.go:1130","msg":"failed to revoke lease","lease-id":"0a587ba1446d0bfd","error":"lease not found"}
2021-09-03T10:38:32.496948975Z {"level":"warn","ts":"2021-09-03T10:38:32.496Z","caller":"etcdserver/server.go:1130","msg":"failed to revoke lease","lease-id":"267b7ba1464878a2","error":"lease not found"}
2021-09-03T10:38:32.496997062Z {"level":"info","ts":"2021-09-03T10:38:32.496Z","caller":"traceutil/trace.go:171","msg":"trace[735177488] transaction","detail":"{read_only:false; number_of_response:1; response_revision:878625; }","duration":"3.7077888s","start":"2021-09-03T10:38:28.789Z","end":"2021-09-03T10:38:32.496Z","steps":["trace[735177488] 'process raft request'  (duration: 3.705272134s)"],"step_count":1}

Sending to the Etcd Team to assess what's going on. I'm also requesting feedback from @sttts to check if we can do something about the panic (but I don't think it's necessary, the output message was quite obvious: "etcd check failed: healthz").

Comment 8 Lucas López Montero 2022-10-10 08:34:11 UTC
Cloning bug on the following Jira ticket as agreed with Thomas:

https://issues.redhat.com/browse/OCPBUGS-2136

This ticket can be closed again.

Comment 10 Red Hat Bugzilla 2023-09-18 04:25:47 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days


Note You need to log in before you can comment on or make changes to this bug.