Bug 1948052

Summary: Kubelet got stuck on worker, causing thrashing and node not-ready
Product: OpenShift Container Platform
Reporter: Gabriel Diotte <gdiotte>
Component: Node
Assignee: Harshal Patil <harpatil>
Node sub component: Kubelet
QA Contact: Sunil Choudhary <schoudha>
Status: CLOSED CURRENTRELEASE
Docs Contact:
Severity: urgent
Priority: urgent
CC: abeekhof, akaris, aos-bugs, bhershbe, dcbw, eminguez, fbaudin, harpatil, mfojtik, msluiter, nagrawal, rphillips, smilner, trozet
Version: 4.5
Target Milestone: ---
Target Release: 4.8.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-05-03 14:35:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Comment 1 Maciej Szulik 2021-04-12 15:07:19 UTC
Moving this over to the node/kubelet team to investigate further, since this doesn't look like a scheduler problem.

Comment 9 Andrew Beekhof 2021-04-19 00:37:22 UTC
A reasonably simple improvement to kubelet's healthz checks would have prevented the node from becoming wedged.
Unfortunately there doesn't seem to be much interest in merging the PR that implements it.

   https://github.com/kubernetes/kubernetes/issues/98981
   https://github.com/kubernetes/kubernetes/pull/94210

Comment 26 Red Hat Bugzilla 2023-09-15 01:04:55 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days