test: [sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully

is failing frequently in CI, see search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-api-machinery%5C%5D%5C%5BFeature%3AAPIServer%5C%5D%5C%5BLate%5C%5D+kubelet+terminates+kube-apiserver+gracefully

Example failure:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-compact-4.6/1314250018076495872

fail [github.com/onsi/ginkgo.0-origin.1+incompatible/internal/leafnodes/runner.go:64]: kube-apiserver reports a non-graceful termination: v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"kube-apiserver-master-1.ci-op-jwq0wjcr-35904.origin-ci-int-aws.dev.rhcloud.com.163c1542b8f9d61a", GenerateName:"", Namespace:"openshift-kube-apiserver", SelfLink:"/api/v1/namespaces/openshift-kube-apiserver/events/kube-apiserver-master-1.ci-op-jwq0wjcr-35904.origin-ci-int-aws.dev.rhcloud.com.163c1542b8f9d61a", UID:"a55424d2-daf0-4ce1-8de1-1d414b5233da", ResourceVersion:"21297", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63737775743, loc:(*time.Location)(0x9003460)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:"watch-termination", Operation:"Update", APIVersion:"v1", Time:(*v1.Time)(0xc001ebea00), FieldsType:"FieldsV1", FieldsV1:(*v1.FieldsV1)(0xc001ebea20)}}}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-apiserver", Name:"kube-apiserver-master-1.ci-op-jwq0wjcr-35904.origin-ci-int-aws.dev.rhcloud.com", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}, Reason:"NonGracefulTermination", Message:"Previous pod kube-apiserver-master-1.ci-op-jwq0wjcr-35904.origin-ci-int-aws.dev.rhcloud.com started at 2020-10-08 17:33:31.267794933 +0000 UTC did not terminate gracefully", Source:v1.EventSource{Component:"apiserver", Host:"master-1.ci-op-jwq0wjcr-35904.origin-ci-int-aws.dev.rhcloud.com"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63737775743, loc:(*time.Location)(0x9003460)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63737775743, loc:(*time.Location)(0x9003460)}}, Count:1, Type:"Warning", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}

Probably kubelet or CRI-O is not giving the process time to shut down cleanly. This can lead to "connection refused" and network I/O timeout errors in other components.
What's noteworthy is less the failure itself than the sudden jump in failure frequency for some jobs:

release-openshift-ocp-installer-e2e-azure-4.6                     88.00% (0.00%) (25 runs)  ->  100.00% (0.00%) (39 runs)
release-openshift-origin-installer-e2e-remote-libvirt-s390x-4.6   90.00% (0.00%) (10 runs)  ->  100.00% (0.00%) (6 runs)
release-openshift-ocp-installer-e2e-aws-4.6                       92.00% (0.00%) (50 runs)  ->   97.06% (0.00%) (68 runs)
periodic-ci-openshift-release-master-ocp-4.6-e2e-vsphere          93.94% (0.00%) (33 runs)  ->  100.00% (0.00%) (35 runs)

@sjenning This suddenly got a lot more severe and would explain the 10% increase in availability downtime we see during upgrades.
*** This bug has been marked as a duplicate of bug 1882750 ***