Bug 1874583

Summary:	apiserver tries and fails to log an event when shutting down
Product:	OpenShift Container Platform	Reporter:	Dan Winship <danw>
Component:	kube-apiserver	Assignee:	Stefan Schimanski <sttts>
Status:	CLOSED ERRATA	QA Contact:	Xingxing Xia <xxia>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	4.6	CC:	aos-bugs, kewang, mfojtik, nstielau, sttts, xxia
Target Milestone:	---
Target Release:	4.7.0
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	No Doc Update
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2021-02-24 15:17:02 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1885619

Description Dan Winship 2020-09-01 16:57:06 UTC

Seen in kube-apiserver logs (in a failed IPv6 bare metal run but I don't think that part is relevant):

I0831 23:50:21.751294       1 secure_serving.go:241] Stopped listening on [::]:6443
I0831 23:50:21.751373       1 genericapiserver.go:665] Event(v1.ObjectReference{Kind:"Namespace", Namespace:"default", Name:"kube-system", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationGracefulTerminationFinished' All pending requests processed
W0831 23:50:21.751444       1 genericapiserver.go:668] failed to create event default/kube-system.16307f49968807b6: Event "kube-system.16307f49968807b6" is invalid: [eventTime: Required value, source: Invalid value: "": needs to be unset]

dunno if this failure mode might affect other events too...

Comment 1 Venkata Siva Teja Areti 2020-09-09 17:46:47 UTC

Event validation logic changed recently in upstream

https://github.com/kubernetes/kubernetes/commit/75612c17461fe81fb969768e12ca52bfb22f745d#diff-b08aadf812d8dba261a29b585894d1e0

For the events we are creating, validation function should bail out at this point. https://github.com/kubernetes/kubernetes/commit/75612c17461fe81fb969768e12ca52bfb22f745d#diff-b08aadf812d8dba261a29b585894d1e0R46

But as group and version are not set for this request, it slips into the new validation logic and fails here https://github.com/kubernetes/kubernetes/commit/75612c17461fe81fb969768e12ca52bfb22f745d#diff-b08aadf812d8dba261a29b585894d1e0R53

But even after fixing the validation, I still see "grpc error: code = Unavailable desc = transport is closing" errors.

Changes I tried are here.

https://github.com/openshift/kubernetes/pull/336/files

I was trying to see the events in a integration test but instead found grpc errors after the above fix.

Comment 8 Xingxing Xia 2020-10-28 06:31:48 UTC

First, checked the not-fixed 4.6.1 env, can reproduce the issue:
ssh to master, check:
[root@xxia1028bm-qdtfs-control-plane-0 ~]# crictl ps --name='kube-apiserver$'                                                           
CONTAINER           IMAGE                                                              CREATED             STATE               NAME     
           ATTEMPT             POD ID                                                                                                  
7985fb1b750fd       ee1ad20f0063bb88b3bffd9cfd0ceddeca5ef14d9fb7e1b033fac4d8e61bc140   58 minutes ago      Running             kube-apis
erver      0                   558ca25d06f43

[root@xxia1028bm-qdtfs-control-plane-0 ~]# crictl logs -f 7985fb1b750fd &> 7985fb1b750fd_kube-apiserver.log &

[root@xxia1028bm-qdtfs-control-plane-0 ~]# crictl stop 7985fb1b750fd # make kube-apiserver shutdown
[root@xxia1028bm-qdtfs-control-plane-0 ~]# vi 7985fb1b750fd_kube-apiserver.log # reproduced 'failed to create event ... Invalid value: "": needs to be unset'
...
I1028 03:49:13.236470       1 main.go:176] Received signal terminated. Forwarding to sub-process "hyperkube".
I1028 03:49:13.237278      18 controller.go:181] Shutting down kubernetes service endpoint reconciler
I1028 03:49:13.237289      18 genericapiserver.go:667] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-apiserver", Name:"kube-apiserver-xxia1028bm-qdtfs-control-plane-0", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationStart' Received signal to terminate, becoming unready, but keeping serving
W1028 03:49:13.237647      18 genericapiserver.go:670] failed to create event openshift-kube-apiserver/kube-apiserver-xxia1028bm-qdtfs-control-plane-0.16420b66c03dcb8f: Event "kube-apiserver-xxia1028bm-qdtfs-control-plane-0.16420b66c03dcb8f" is invalid: [eventTime: Required value, source: Invalid value: "": needs to be unset]
...

In latest 4.7.0-0.nightly-2020-10-27-051128 env, repeat above steps, the shutdown kube-apiserver container logs do not reproduce it:
...
I1028 02:57:06.183558      18 genericapiserver.go:667] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-apiserver", Name:"kube-apiserver-ip-10-0-156-192.ap-northeast-1.compute.internal", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationStart' Received signal to terminate, becoming unready, but keeping serving
I1028 02:57:06.184397       1 main.go:176] Received signal terminated. Forwarding to sub-process "hyperkube".
I1028 02:57:06.286774       1 main.go:198] Termination finished with exit code 1
I1028 02:57:06.286868       1 main.go:151] Deleting termination lock file "/var/log/kube-apiserver/.terminating"
...

Comment 12 errata-xmlrpc 2021-02-24 15:17:02 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633