Bug 1828606
| Summary: | Graceful termination broken, was: Managed cluster should have no crashlooping pods in core namespaces over two minutes: kube-apiserver pod is failing to start | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Abhinav Dahiya <adahiya> |
| Component: | Node | Assignee: | Harshal Patil <harpatil> |
| Status: | CLOSED DUPLICATE | QA Contact: | Sunil Choudhary <schoudha> |
| Severity: | high | Priority: | unspecified |
| Version: | 4.5 | CC: | aos-bugs, estroczy, jokerman, kgarriso, mfojtik, sttts, zyu |
| Target Milestone: | --- | Target Release: | 4.5.0 |
| Hardware: | Unspecified | OS: | Unspecified |
| Type: | Bug | Last Closed: | 2020-05-22 15:28:23 UTC |
Looked into the logs of the first 4 linked runs and always found this:

```
Error: failed to create listener: failed to listen on 0.0.0.0:6443: listen tcp 0.0.0.0:6443: bind: address already in use
```

The kube-apiserver pods share the same name per node. So either the kernel does not release the port, or graceful termination logic is broken again: the kubelet or CRI-O hides terminating pods such that the kubelet starts the new pod too early.

Investigated https://deck-ci.apps.ci.l2s4.p1.openshiftapps.com/view/gcs/origin-ci-test/pr-logs/pull/openshift_release/8582/rehearse-8582-pull-ci-openshift-installer-master-e2e-vsphere/1 further:

The kube-apiserver on master-1 fails to bind to the port:

```
I0427 19:45:02.128271 1 server.go:194] Version: v1.18.0-rc.1
Error: failed to create listener: failed to listen on 0.0.0.0:6443: listen tcp 0.0.0.0:6443: bind: address already in use
```

Compare https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_release/8582/rehearse-8582-pull-ci-openshift-installer-master-e2e-vsphere/1/artifacts/e2e-vsphere/gather-extra/pods/openshift-kube-apiserver_kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1_kube-apiserver_previous.log.

From events.json we see this:

```
Normal  apiserver  TerminationStart                            Received signal to terminate, becoming unready, but keeping serving  2020-04-27T19:44:02Z  2020-04-27T19:44:02Z  kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1  1
Normal  apiserver  TerminationPreShutdownHooksFinished         All pre-shutdown hooks have been finished                            2020-04-27T19:44:02Z  2020-04-27T19:44:02Z  kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1  1
Normal  apiserver  TerminationMinimalShutdownDurationFinished  The minimal shutdown duration of 1m10s finished                      2020-04-27T19:45:12Z  2020-04-27T19:45:12Z  kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1  1
Normal  apiserver  TerminationStoppedServing                   Server has stopped listening                                         2020-04-27T19:45:12Z  2020-04-27T19:45:12Z  kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1  1
Normal  apiserver  TerminationGracefulTerminationFinished      All pending requests processed                                       2020-04-27T19:45:12Z  2020-04-27T19:45:12Z  kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1  1
```

I.e. the old kube-apiserver is asked to shut down at 19:44:02Z. It has a graceful termination period of 135s, so the kubelet is supposed to wait up to 135s. Yet the kubelet starts launching the new pod exactly 60s later (19:45:02). The old kube-apiserver needs at least 70s to shut down, and according to the events it does. The kubelet does not wait for that.

Moving to the Node team for deeper investigation into why graceful termination is broken.

*** Bug 1832511 has been marked as a duplicate of this bug. ***

*** Bug 1835375 has been marked as a duplicate of this bug. ***
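For context on why the old process holds the port for so long: the events above trace the apiserver's graceful shutdown sequence, which keeps the listener open for a minimal shutdown duration (1m10s here) after SIGTERM, and only then closes it and drains in-flight requests. A minimal Go sketch of that pattern, with illustrative durations taken from the events (this is not the actual kube-apiserver code):

```go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: "0.0.0.0:6443"}

	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, syscall.SIGTERM)

	done := make(chan struct{})
	go func() {
		<-sigCh // TerminationStart: become unready, but keep serving
		log.Println("received signal to terminate, becoming unready, but keeping serving")

		// TerminationMinimalShutdownDurationFinished: hold the listener
		// open for a fixed drain window (1m10s in the events above).
		time.Sleep(70 * time.Second)

		// TerminationStoppedServing / TerminationGracefulTerminationFinished:
		// close the listener, then wait for pending requests to finish
		// within the remainder of the 135s grace period.
		ctx, cancel := context.WithTimeout(context.Background(), 65*time.Second)
		defer cancel()
		if err := srv.Shutdown(ctx); err != nil {
			log.Printf("graceful shutdown: %v", err)
		}
		close(done)
	}()

	// The port stays bound until Shutdown completes; a replacement
	// process cannot bind 0.0.0.0:6443 until then.
	if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
		log.Fatal(err)
	}
	<-done
}
```

Until `Shutdown` returns, the socket stays bound, so any replacement process started early fails with `bind: address already in use`.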
Looking at kubelet_service.log in the must-gather logs from https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_release/8582/rehearse-8582-pull-ci-openshift-installer-master-e2e-vsphere/1/artifacts/e2e-vsphere/gather-must-gather/must-gather.tar:

The static pod for kube-apiserver is removed first.

```
Apr 27 19:44:02.206818 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.206717 1468 file.go:201] Reading config file "/etc/kubernetes/manifests/kube-apiserver-pod.yaml"
Apr 27 19:44:02.208928 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.208847 1468 kubelet.go:1924] SyncLoop (REMOVE, "file"): "kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1_openshift-kube-apiserver(098ca1cabf32e4a852a68b61bffcb235)"
```

The static pod for kube-apiserver is added back immediately.

```
Apr 27 19:44:02.211239 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.211189 1468 kubelet.go:1914] SyncLoop (ADD, "file"): "kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1_openshift-kube-apiserver(402659332c36cc00dc64ab1a77d7714e)"
```

The existing kube-apiserver pod is killed.

```
Apr 27 19:44:02.212544 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.212470 1468 event.go:278] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-apiserver", Name:"kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1", UID:"098ca1cabf32e4a852a68b61bffcb235", APIVersion:"v1", ResourceVersion:"", FieldPath:"spec.containers{kube-apiserver}"}): type: 'Normal' reason: 'Killing' Stopping container kube-apiserver
Apr 27 19:44:02.212544 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.212496 1468 event.go:278] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-apiserver", Name:"kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1", UID:"098ca1cabf32e4a852a68b61bffcb235", APIVersion:"v1", ResourceVersion:"", FieldPath:"spec.containers{kube-apiserver-insecure-readyz}"}): type: 'Normal' reason: 'Killing' Stopping container kube-apiserver-insecure-readyz
Apr 27 19:44:02.212544 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.212513 1468 event.go:278] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-apiserver", Name:"kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1", UID:"098ca1cabf32e4a852a68b61bffcb235", APIVersion:"v1", ResourceVersion:"", FieldPath:"spec.containers{kube-apiserver-cert-regeneration-controller}"}): type: 'Normal' reason: 'Killing' Stopping container kube-apiserver-cert-regeneration-controller
Apr 27 19:44:02.212544 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.212527 1468 event.go:278] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kube-apiserver", Name:"kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1", UID:"098ca1cabf32e4a852a68b61bffcb235", APIVersion:"v1", ResourceVersion:"", FieldPath:"spec.containers{kube-apiserver-cert-syncer}"}): type: 'Normal' reason: 'Killing' Stopping container kube-apiserver-cert-syncer
Apr 27 19:44:02.230507 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.229150 1468 kubelet_pods.go:1355] Generating status for "kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1_openshift-kube-apiserver(402659332c36cc00dc64ab1a77d7714e)"
</snip>
Apr 27 19:44:02.254413 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.253599 1468 manager.go:1007] Destroyed container: "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod098ca1cabf32e4a852a68b61bffcb235.slice/crio-2e19679098aa828fa4e81c8ca58135c191feb048f4126cd9e40c12a37cfa7166.scope" (aliases: [k8s_kube-apiserver-insecure-readyz_kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1_openshift-kube-apiserver_098ca1cabf32e4a852a68b61bffcb235_0 2e19679098aa828fa4e81c8ca58135c191feb048f4126cd9e40c12a37cfa7166], namespace: "crio")
Apr 27 19:44:02.255408 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.255368 1468 manager.go:1007] Destroyed container: "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod098ca1cabf32e4a852a68b61bffcb235.slice/crio-conmon-2e19679098aa828fa4e81c8ca58135c191feb048f4126cd9e40c12a37cfa7166.scope" (aliases: [], namespace: "")
Apr 27 19:44:02.273151 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.265756 1468 kubelet.go:1653] Trying to delete pod kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1_openshift-kube-apiserver da77b5ea-3ab8-4c22-bfbb-be120a277564
```
Looking at the timestamps you can see that the mirror pod is deleted immediately too.

```
Apr 27 19:44:02.273151 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.265785 1468 mirror_client.go:125] Deleting a mirror pod "kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1_openshift-kube-apiserver" (uid (*types.UID)(0xc001c2bc60))
Apr 27 19:44:02.273151 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.273104 1468 manager.go:1007] Destroyed container: "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod098ca1cabf32e4a852a68b61bffcb235.slice/crio-conmon-04033c95f5fdd8354de951e77a87068d783362c1d7fbcb649f7614bad3d247a9.scope" (aliases: [], namespace: "")
Apr 27 19:44:02.284601 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.283827 1468 manager.go:1007] Destroyed container: "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod098ca1cabf32e4a852a68b61bffcb235.slice/crio-04033c95f5fdd8354de951e77a87068d783362c1d7fbcb649f7614bad3d247a9.scope" (aliases: [k8s_kube-apiserver-cert-syncer_kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1_openshift-kube-apiserver_098ca1cabf32e4a852a68b61bffcb235_0 04033c95f5fdd8354de951e77a87068d783362c1d7fbcb649f7614bad3d247a9], namespace: "crio")
```

Without waiting for the terminationGracePeriodSeconds (135s), a new kube-apiserver pod is created.

```
Apr 27 19:44:02.599048 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.599016 1468 kubelet.go:1914] SyncLoop (ADD, "api"): "kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1_openshift-kube-apiserver(83ac4da1-4907-4354-b08d-15049e074a0e)"
Apr 27 19:44:02.741383 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.741351 1468 manager.go:950] Added container: "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod402659332c36cc00dc64ab1a77d7714e.slice/crio-f4fd1828ce54142bedf7749417e1464a96be8f531c77b6fd8b400e0be43ccdab.scope" (aliases: [k8s_POD_kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1_openshift-kube-apiserver_402659332c36cc00dc64ab1a77d7714e_0 f4fd1828ce54142bedf7749417e1464a96be8f531c77b6fd8b400e0be43ccdab], namespace: "crio")
```

The old kube-apiserver pod has not yet exited gracefully within its terminationGracePeriodSeconds.

```
Apr 27 19:44:02.815018 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.814978 1468 kubelet_pods.go:1731] Orphaned pod "098ca1cabf32e4a852a68b61bffcb235" found, but volumes not yet removed. Reducing cpu to minimum
Apr 27 19:44:02.847649 ci-op-cyr77ydz-3b021-lctgx-master-1 hyperkube[1468]: I0427 19:44:02.847609 1468 status_manager.go:563] Pod "kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1_openshift-kube-apiserver(83ac4da1-4907-4354-b08d-15049e074a0e)" was deleted and then recreated, skipping status update; old UID "098ca1cabf32e4a852a68b61bffcb235", new UID "402659332c36cc00dc64ab1a77d7714e"
```

As a result, the old kube-apiserver pod is still listening on 0.0.0.0:6443 while the new pod is trying to bind to the same address.
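The crashloop then follows directly: without socket options like SO_REUSEPORT, the kernel refuses a second bind on a port with an active listener. A minimal, hypothetical Go sketch reproducing the failure mode on a local port:

```go
package main

import (
	"fmt"
	"net"
)

func main() {
	// The "old" listener stands in for the terminating kube-apiserver,
	// which keeps the socket open until graceful shutdown finishes.
	old, err := net.Listen("tcp", "127.0.0.1:6443")
	if err != nil {
		panic(err)
	}
	defer old.Close()

	// The "new" listener stands in for the replacement pod that the
	// kubelet started without waiting for the old one to exit.
	if _, err := net.Listen("tcp", "127.0.0.1:6443"); err != nil {
		fmt.Println(err) // listen tcp 127.0.0.1:6443: bind: address already in use
	}
}
```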
From https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_release/8582/rehearse-8582-pull-ci-openshift-installer-master-e2e-vsphere/1/artifacts/e2e-vsphere/gather-extra/pods/openshift-kube-apiserver_kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1_kube-apiserver_previous.log, you can see the following:

```
I0427 19:45:02.128271 1 server.go:194] Version: v1.18.0-rc.1
Error: failed to create listener: failed to listen on 0.0.0.0:6443: listen tcp 0.0.0.0:6443: bind: address already in use
Usage:
  kube-apiserver [flags]
```

The events in the above logs strongly correlate with https://bugzilla.redhat.com/show_bug.cgi?id=1700279, specifically with:

1. the description Stefan provided, https://bugzilla.redhat.com/show_bug.cgi?id=1700279#c0
2. Seth's comment, https://bugzilla.redhat.com/show_bug.cgi?id=1700279#c2
3. Ted's comment, https://bugzilla.redhat.com/show_bug.cgi?id=1700279#c8

My bad: instead of simply respecting the terminationGracePeriodSeconds, as Ted puts it in this comment (https://github.com/kubernetes/kubernetes/issues/90999#issuecomment-632742253), we need a conditional, cancellable grace period (see the sketch below).

*** This bug has been marked as a duplicate of bug 1700279 ***
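To illustrate the difference between a fixed delay and the conditional, cancellable grace period referenced above: the grace period should be an upper bound that ends as soon as the old container actually exits, never a wait the kubelet may skip. A hypothetical Go sketch of that kubelet-side behavior (waitForContainerExit is a made-up stand-in, not a real kubelet or CRI API):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitForContainerExit is a made-up stand-in for the runtime call that
// blocks until the container's processes are gone. Here it simulates a
// container that needs 75s to finish its graceful shutdown.
func waitForContainerExit(ctx context.Context, containerID string) error {
	select {
	case <-time.After(75 * time.Second):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// stopPod signals the container to terminate and then waits *up to*
// the grace period, returning as soon as the container actually exits:
// a cancellable upper bound, not a fixed delay.
func stopPod(containerID string, gracePeriod time.Duration) error {
	// ... send the termination signal to the container here ...

	ctx, cancel := context.WithTimeout(context.Background(), gracePeriod)
	defer cancel()

	if err := waitForContainerExit(ctx, containerID); err != nil {
		fmt.Println("grace period expired; escalating to SIGKILL")
		return err
	}
	fmt.Println("container exited within grace period; safe to start the replacement pod")
	return nil
}

func main() {
	// 135s matches the kube-apiserver static pod's terminationGracePeriodSeconds.
	_ = stopPod("old-kube-apiserver", 135*time.Second)
}
```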
Description of problem:

[sig-arch] Managed cluster should have no crashlooping pods in core namespaces over two minutes [Suite:openshift/conformance/parallel] 2m9s

```
fail [github.com/openshift/origin/test/extended/operators/cluster.go:114]: Expected
    <[]string | len:2, cap:2>: [
        "Pod openshift-kube-apiserver/kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1 is not healthy: back-off 40s restarting failed container=kube-apiserver pod=kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-1_openshift-kube-apiserver(402659332c36cc00dc64ab1a77d7714e)",
        "Pod openshift-kube-apiserver/kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-0 is not healthy: back-off 20s restarting failed container=kube-apiserver pod=kube-apiserver-ci-op-cyr77ydz-3b021-lctgx-master-0_openshift-kube-apiserver(402659332c36cc00dc64ab1a77d7714e)",
    ]
to be empty
```

Version-Release number of selected component (if applicable):

How reproducible:

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_release/8582/rehearse-8582-pull-ci-openshift-installer-master-e2e-vsphere/1
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_telemeter/328/pull-ci-openshift-telemeter-master-e2e-aws/846
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_console-operator/419/pull-ci-openshift-console-operator-master-e2e-gcp/460
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_cluster-etcd-operator/322/pull-ci-openshift-cluster-etcd-operator-master-e2e-aws/1386
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.4/1481
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-azure-4.4/1545
https://search.svc.ci.openshift.org/?search=back-off+40s+restarting+failed+container%3Dkube-apiserver&maxAge=48h&context=1&type=junit&name=&maxMatches=5&maxBytes=20971520&groupBy=job

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
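For reference, the failing conformance check amounts to listing pods in core namespaces and flagging containers stuck in CrashLoopBackOff. A rough client-go sketch of that kind of check, assuming client-go v0.18+ and a local kubeconfig (this is not the actual origin test code at cluster.go:114):

```go
package main

import (
	"context"
	"fmt"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load ~/.kube/config; in-cluster config would work the same way.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	pods, err := client.CoreV1().Pods("").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}

	for _, pod := range pods.Items {
		// Restrict the check to core (openshift-*) namespaces.
		if !strings.HasPrefix(pod.Namespace, "openshift-") {
			continue
		}
		for _, cs := range pod.Status.ContainerStatuses {
			if cs.State.Waiting != nil && cs.State.Waiting.Reason == "CrashLoopBackOff" {
				fmt.Printf("Pod %s/%s is not healthy: %s\n",
					pod.Namespace, pod.Name, cs.State.Waiting.Message)
			}
		}
	}
}
```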