Bug 1931624 - [sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully
Keywords:
Status: CLOSED DUPLICATE of bug 1928946
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Ryan Phillips
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Duplicates: 1931668
Depends On:
Blocks:
Reported: 2021-02-22 20:04 UTC by Adam Kaplan
Modified: 2021-03-02 21:24 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
[sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully
Last Closed: 2021-03-02 21:24:41 UTC
Target Upstream Version:
Embargoed:


Description Adam Kaplan 2021-02-22 20:04:17 UTC
Test:
[sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully

This test is failing frequently in CI; see the search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-api-machinery%5C%5D%5C%5BFeature%3AAPIServer%5C%5D%5C%5BLate%5C%5D+kubelet+terminates+kube-apiserver+gracefully

This has a low pass rate across all variants (52.46% in the past week).

Example failures:

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ocp-4.8-e2e-vsphere/1363904051699257344

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-serial-4.8/1363884602472534016

Common failure reason:

```
fail [github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/runner.go:64]: kube-apiserver reports a non-graceful termination: v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"kube-apiserver-ip-10-0-144-144.us-west-1.compute.internal.16661f85ae9e25f3", GenerateName:"", Namespace:"openshift-kube-apiserver", SelfLink:"", UID:"b9a89b71-e995-4e9e-9e81-74f1c4cd688a", ResourceVersion:"26744", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63749608975, loc:(*time.Location)(0x95105e0)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:"watch-termination", Operation:"Update", APIVersion:"v1", Time:(*v1.Time)(0xc000a988c0), FieldsType:"FieldsV1", FieldsV1:(*v1.FieldsV1)(0xc000a988e0)}}}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-apiserver", Name:"kube-apiserver-ip-10-0-144-144.us-west-1.compute.internal", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}, Reason:"NonGracefulTermination", Message:"Previous pod kube-apiserver-ip-10-0-144-144.us-west-1.compute.internal started at 2021-02-22 16:39:40.464539636 +0000 UTC did not terminate gracefully", Source:v1.EventSource{Component:"apiserver", Host:"ip-10-0-144-144"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63749608975, loc:(*time.Location)(0x95105e0)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63749608975, loc:(*time.Location)(0x95105e0)}}, Count:1, Type:"Warning", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}. Probably kubelet or CRI-O is not giving the time to cleanly shut down. 
This can lead to connection refused and network I/O timeout errors in other components.
```
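
As a rough illustration of what the failing check looks for, the sketch below filters an event list for the `NonGracefulTermination` reason seen in the failure above. It is not the test's actual implementation (the real test is part of the Go e2e suite). The function name `non_graceful_events` and the second sample event are hypothetical, and the input is assumed to have the shape produced by `oc get events -n openshift-kube-apiserver -o json`.

```python
import json


def non_graceful_events(events_json):
    """Return pod name and message for each NonGracefulTermination event.

    Assumes `events_json` is a Kubernetes List object serialized as JSON,
    with the individual events under the "items" key.
    """
    doc = json.loads(events_json)
    hits = []
    for event in doc.get("items", []):
        if event.get("reason") == "NonGracefulTermination":
            hits.append({
                "pod": event.get("involvedObject", {}).get("name", ""),
                "message": event.get("message", ""),
            })
    return hits


# Sample input: the first event mirrors the failure above,
# the second is a made-up event that should be ignored.
sample = json.dumps({
    "items": [
        {
            "reason": "NonGracefulTermination",
            "type": "Warning",
            "involvedObject": {
                "kind": "Pod",
                "name": "kube-apiserver-ip-10-0-144-144.us-west-1.compute.internal",
            },
            "message": "Previous pod did not terminate gracefully",
        },
        {
            "reason": "Started",
            "type": "Normal",
            "involvedObject": {"kind": "Pod", "name": "example-pod"},
            "message": "Started container",
        },
    ]
})

for hit in non_graceful_events(sample):
    print(hit["pod"] + ": " + hit["message"])
```

Against a live cluster, the same function could be fed the output of `oc get events -n openshift-kube-apiserver -o json` instead of the embedded sample.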

Comment 1 Gabe Montero 2021-02-23 13:43:18 UTC
*** Bug 1931668 has been marked as a duplicate of this bug. ***

Comment 3 Ryan Phillips 2021-03-02 21:24:41 UTC

*** This bug has been marked as a duplicate of bug 1928946 ***

