Bug 1939732

Summary: Resource Pressure Causes Dropped Probes, Leading to Server Death
Product: OpenShift Container Platform
Component: kube-apiserver
Version: 4.7
Status: CLOSED DUPLICATE
Severity: urgent
Priority: high
Hardware: Unspecified
OS: Unspecified
Type: Bug
Reporter: Steve Kuznetsov <skuznets>
Assignee: Abu Kashem <akashem>
QA Contact: Ke Wang <kewang>
CC: aos-bugs, mfojtik, mharri, pmuller, wking, xxia
Last Closed: 2021-03-29 11:57:04 UTC

Description Steve Kuznetsov 2021-03-16 22:07:04 UTC
The Kube and OpenShift API servers have a bug where, under resource pressure, flowstates drop requests for /readyz and /healthz. The failed probes then cause the process to be killed. Even under pressure, the servers should continue responding to these probes.

Please see the data in this post-mortem:
https://docs.google.com/document/d/1VfwmECbpCnDTOb0JVE37wcEQm4KnGwbatgIynTa6Wvg/edit#heading=h.te8j1fpvmstb

Comment 1 W. Trevor King 2021-03-18 23:52:11 UTC
Dup of bug 1937916?

Comment 2 Michal Fojtik 2021-03-29 11:57:04 UTC

*** This bug has been marked as a duplicate of bug 1937916 ***