Seen in kube-apiserver logs (in a failed IPv6 bare metal run but I don't think that part is relevant): I0831 23:50:21.751294 1 secure_serving.go:241] Stopped listening on [::]:6443 I0831 23:50:21.751373 1 genericapiserver.go:665] Event(v1.ObjectReference{Kind:"Namespace", Namespace:"default", Name:"kube-system", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationGracefulTerminationFinished' All pending requests processed W0831 23:50:21.751444 1 genericapiserver.go:668] failed to create event default/kube-system.16307f49968807b6: Event "kube-system.16307f49968807b6" is invalid: [eventTime: Required value, source: Invalid value: "": needs to be unset] dunno if this failure mode might affect other events too...
Event validation logic changed recently in upstream https://github.com/kubernetes/kubernetes/commit/75612c17461fe81fb969768e12ca52bfb22f745d#diff-b08aadf812d8dba261a29b585894d1e0 For the events we are creating, validation function should bail out at this point. https://github.com/kubernetes/kubernetes/commit/75612c17461fe81fb969768e12ca52bfb22f745d#diff-b08aadf812d8dba261a29b585894d1e0R46 But as group and version are not set for this request, it slips into the new validation logic and fails here https://github.com/kubernetes/kubernetes/commit/75612c17461fe81fb969768e12ca52bfb22f745d#diff-b08aadf812d8dba261a29b585894d1e0R53 But even after fixing the validation, I still see "grpc error: code = Unavailable desc = transport is closing" errors. Changes I tried are here. https://github.com/openshift/kubernetes/pull/336/files I was trying to see the events in a integration test but instead found grpc errors after the above fix.
First, checked the not-fixed 4.6.1 env, can reproduce the issue: ssh to master, check: [root@xxia1028bm-qdtfs-control-plane-0 ~]# crictl ps --name='kube-apiserver$' CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID 7985fb1b750fd ee1ad20f0063bb88b3bffd9cfd0ceddeca5ef14d9fb7e1b033fac4d8e61bc140 58 minutes ago Running kube-apis erver 0 558ca25d06f43 [root@xxia1028bm-qdtfs-control-plane-0 ~]# crictl logs -f 7985fb1b750fd &> 7985fb1b750fd_kube-apiserver.log & [root@xxia1028bm-qdtfs-control-plane-0 ~]# crictl stop 7985fb1b750fd # make kube-apiserver shutdown [root@xxia1028bm-qdtfs-control-plane-0 ~]# vi 7985fb1b750fd_kube-apiserver.log # reproduced 'failed to create event ... Invalid value: "": needs to be unset' ... I1028 03:49:13.236470 1 main.go:176] Received signal terminated. Forwarding to sub-process "hyperkube". I1028 03:49:13.237278 18 controller.go:181] Shutting down kubernetes service endpoint reconciler I1028 03:49:13.237289 18 genericapiserver.go:667] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-apiserver", Name:"kube-apiserver-xxia1028bm-qdtfs-control-plane-0", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationStart' Received signal to terminate, becoming unready, but keeping serving W1028 03:49:13.237647 18 genericapiserver.go:670] failed to create event openshift-kube-apiserver/kube-apiserver-xxia1028bm-qdtfs-control-plane-0.16420b66c03dcb8f: Event "kube-apiserver-xxia1028bm-qdtfs-control-plane-0.16420b66c03dcb8f" is invalid: [eventTime: Required value, source: Invalid value: "": needs to be unset] ... In latest 4.7.0-0.nightly-2020-10-27-051128 env, repeat above steps, the shutdown kube-apiserver container logs do not reproduce it: ... I1028 02:57:06.183558 18 genericapiserver.go:667] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-apiserver", Name:"kube-apiserver-ip-10-0-156-192.ap-northeast-1.compute.internal", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'TerminationStart' Received signal to terminate, becoming unready, but keeping serving I1028 02:57:06.184397 1 main.go:176] Received signal terminated. Forwarding to sub-process "hyperkube". I1028 02:57:06.286774 1 main.go:198] Termination finished with exit code 1 I1028 02:57:06.286868 1 main.go:151] Deleting termination lock file "/var/log/kube-apiserver/.terminating" ...
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633