1938353 – [sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully

Keywords:
Status:	CLOSED DUPLICATE of bug 1928946
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Node
Sub Component:
Version:	4.8
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	4.8.0
Assignee:	Sascha Grunert
QA Contact:	Weinan Liu
Docs Contact:
URL:
Whiteboard:	buildcop
Duplicates (1):	1943364
Depends On:
Blocks:

Reported:	2021-03-12 21:06 UTC by Cesar Wong
Modified:	2021-05-14 07:08 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:	[sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully [sig-node] kubelet terminates kube-apiserver gracefully
Last Closed:	2021-05-14 07:08:41 UTC
Target Upstream Version:
Embargoed:

Description Cesar Wong 2021-03-12 21:06:34 UTC

test:
[sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully 

is failing frequently in CI, see search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-api-machinery%5C%5D%5C%5BFeature%3AAPIServer%5C%5D%5C%5BLate%5C%5D+kubelet+terminates+kube-apiserver+gracefully
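
For readers who want to see what the link above actually queries, the following is a minimal sketch that decodes its query string with the Python standard library. The helper name `decode_search` is hypothetical and used only for illustration; the URL is copied from the report.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied verbatim from the bug description.
SEARCH_URL = (
    "https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit"
    "&name=&maxMatches=5&maxBytes=20971520&groupBy=job"
    "&search=%5C%5Bsig-api-machinery%5C%5D%5C%5BFeature%3AAPIServer%5C%5D"
    "%5C%5BLate%5C%5D+kubelet+terminates+kube-apiserver+gracefully"
)


def decode_search(url):
    """Return the query parameters of a CI search URL as a flat dict.

    Hypothetical helper: each parameter appears once in the URL, so only
    the first value per key is kept. Blank values (e.g. name=) are preserved.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return {key: values[0] for key, values in query.items()}


params = decode_search(SEARCH_URL)
for key, value in params.items():
    print(f"{key} = {value!r}")
```

The decoded `search` value is a regular expression with the square brackets of the test name escaped, and `maxAge=168h` limits the results to the last seven days.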


https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-serial-4.8/1370423926697496576

[BeforeEach] [Top Level]
  github.com/openshift/origin/test/extended/util/framework.go:1440
[BeforeEach] [Top Level]
  github.com/openshift/origin/test/extended/util/framework.go:1440
[BeforeEach] [Top Level]
  github.com/openshift/origin/test/extended/util/test.go:59
[BeforeEach] [sig-api-machinery][Feature:APIServer][Late]
  github.com/openshift/origin/test/extended/util/client.go:140
STEP: Creating a kubernetes client
[It] kubelet terminates kube-apiserver gracefully [Suite:openshift/conformance/parallel]
  github.com/openshift/origin/test/extended/apiserver/graceful_termination.go:20
[AfterEach] [sig-api-machinery][Feature:APIServer][Late]
  github.com/openshift/origin/test/extended/util/client.go:138
[AfterEach] [sig-api-machinery][Feature:APIServer][Late]
  github.com/openshift/origin/test/extended/util/client.go:139
fail [github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/runner.go:64]: kube-apiserver reports a non-graceful termination: v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"kube-apiserver-ip-10-0-180-113.us-west-1.compute.internal.166ba96b94d19c97", GenerateName:"", Namespace:"openshift-kube-apiserver", SelfLink:"", UID:"a9ed49e9-4218-4555-9820-2be18f2078d2", ResourceVersion:"24835", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63751167970, loc:(*time.Location)(0x951a6c0)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry{v1.ManagedFieldsEntry{Manager:"watch-termination", Operation:"Update", APIVersion:"v1", Time:(*v1.Time)(0xc001911160), FieldsType:"FieldsV1", FieldsV1:(*v1.FieldsV1)(0xc0019111e0)}}}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-apiserver", Name:"kube-apiserver-ip-10-0-180-113.us-west-1.compute.internal", UID:"", APIVersion:"v1", ResourceVersion:"", FieldPath:""}, Reason:"NonGracefulTermination", Message:"Previous pod kube-apiserver-ip-10-0-180-113.us-west-1.compute.internal started at 2021-03-12 17:43:30.008231143 +0000 UTC did not terminate gracefully", Source:v1.EventSource{Component:"apiserver", Host:"ip-10-0-180-113"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63751167970, loc:(*time.Location)(0x951a6c0)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63751167970, loc:(*time.Location)(0x951a6c0)}}, Count:1, Type:"Warning", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}. Probably kubelet or CRI-O is not giving the time to cleanly shut down. 
This can lead to connection refused and network I/O timeout errors in other components.
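
The failure above is triggered by an event whose Reason is "NonGracefulTermination" in the openshift-kube-apiserver namespace. The following is a minimal sketch of that kind of check, not the test's actual implementation (which lives in graceful_termination.go in openshift/origin). It assumes the events have been exported as a JSON list, for example with `oc get events -n openshift-kube-apiserver -o json`. The function name and the second sample event are hypothetical.

```python
# Reason string taken from the event in the failure output above.
NON_GRACEFUL_REASON = "NonGracefulTermination"


def non_graceful_terminations(event_list):
    """Return the events in a Kubernetes EventList dict that report a
    non-graceful kube-apiserver termination.

    Assumption: event_list has the shape of `oc get events -o json`
    output, i.e. a dict with an "items" list of event objects.
    """
    return [
        event
        for event in event_list.get("items", [])
        if event.get("reason") == NON_GRACEFUL_REASON
    ]


# Sample data: the first event mirrors fields from the failure output;
# the second is a hypothetical unrelated event added for contrast.
SAMPLE_EVENTS = {
    "items": [
        {
            "reason": "NonGracefulTermination",
            "type": "Warning",
            "involvedObject": {
                "kind": "Pod",
                "namespace": "openshift-kube-apiserver",
                "name": "kube-apiserver-ip-10-0-180-113.us-west-1.compute.internal",
            },
            "message": (
                "Previous pod kube-apiserver-ip-10-0-180-113.us-west-1.compute.internal "
                "started at 2021-03-12 17:43:30.008231143 +0000 UTC "
                "did not terminate gracefully"
            ),
        },
        {
            "reason": "Started",  # hypothetical event, not from the report
            "type": "Normal",
            "involvedObject": {"kind": "Pod", "name": "example-pod"},
            "message": "Started container",
        },
    ]
}

offenders = non_graceful_terminations(SAMPLE_EVENTS)
for event in offenders:
    print(event["involvedObject"]["name"], "-", event["message"])
```

Any non-empty result corresponds to the condition the test reports as a failure.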


This bug was presumably addressed by: 
https://bugzilla.redhat.com/show_bug.cgi?id=1882750 and
https://bugzilla.redhat.com/show_bug.cgi?id=1926484

However, it has kept happening since those fixes were merged.

Comment 1 Petr Muller 2021-03-18 14:01:19 UTC

This bug is apparently also linked to the [sig-node] kubelet terminates kube-apiserver gracefully failures we see in upgrade jobs, e.g.:

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#periodic-ci-openshift-release-master-ci-4.8-e2e-gcp-upgrade

Comment 3 Ryan Phillips 2021-03-31 15:11:41 UTC

*** Bug 1943364 has been marked as a duplicate of this bug. ***

Comment 5 Sascha Grunert 2021-04-28 08:54:13 UTC

(In reply to Petr Muller from comment #1)
> This bug is apparently also linked to the [sig-node] kubelet terminates
> kube-apiserver gracefully failures we see in upgrade jobs, e.g.:
> 
> https://testgrid.k8s.io/redhat-openshift-ocp-release-4.8-informing#periodic-ci-openshift-release-master-ci-4.8-e2e-gcp-upgrade

Testgrid shows that the tests have been green for some time now, so I assume that a fix has been merged.

Cesar, can you confirm this?

Comment 6 Sascha Grunert 2021-05-10 09:20:54 UTC

I assume that the test is now green and there is no more action to be taken from us.

Comment 9 Cesar Wong 2021-05-11 13:20:19 UTC

It looks like this is still occurring; see the search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-api-machinery%5C%5D%5C%5BFeature%3AAPIServer%5C%5D%5C%5BLate%5C%5D+kubelet+terminates+kube-apiserver+gracefully

It is also reported in a different bz: https://bugzilla.redhat.com/show_bug.cgi?id=1928946
This bug should likely be marked as a duplicate of that earlier one.

Comment 10 Weinan Liu 2021-05-13 05:48:26 UTC

FailedQA as per comment 8. I'm also OK with marking it as a duplicate, as per comment 9.

Comment 11 Sascha Grunert 2021-05-14 07:08:01 UTC

Sounds good, let's close as duplicate.

Comment 12 Sascha Grunert 2021-05-14 07:08:41 UTC


*** This bug has been marked as a duplicate of bug 1928946 ***