Bug 1810652 - Node should not delete pods until all container status is available
Summary: Node should not delete pods until all container status is available
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.5.0
Assignee: Clayton Coleman
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Duplicates: 1726934 1780386
Depends On:
Blocks: 1810722 1926546
Reported: 2020-03-05 16:39 UTC by Clayton Coleman
Modified: 2021-02-09 02:37 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1810722 1926546
Environment:
Last Closed: 2020-07-13 17:18:35 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift origin pull 24627 | 0 | None | closed | Bug 1810652: Kubelet should not remove restart never pods until all status is reported | 2021-02-20 08:57:37 UTC
Red Hat Product Errata | RHBA-2020:2409 | 0 | None | None | None | 2020-07-13 17:18:54 UTC

Description Clayton Coleman 2020-03-05 16:39:17 UTC
The kubelet does not properly terminate pods that are RestartNever: upstream it reports success (even if the pod actually failed), and in OpenShift since 4.1 we provide a synthetic status (a fake 137 exit code). Now that we have fixed the issue upstream, we should backport it to 4.4 at least, and possibly 4.3.

The upstream e2e reproduces the issue by:

1. Creating a RestartNever pod that should always exit with status code 1
2. Waiting 0-4s
3. Deleting the pod
4. Observing the status written by the kubelet - no container should report exit code 0
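
The check in step 4 can be sketched roughly as below. This is not the upstream e2e code, only a minimal illustration that assumes the pod has been fetched as JSON (for example the shape returned by `kubectl get pod -o json`) and loaded into a Python dict. The function name and the sample pod are hypothetical.

```python
def containers_reporting_success(pod):
    """Return the names of containers whose terminated state has exit code 0.

    `pod` is a dict in the shape of the Kubernetes Pod JSON. For the
    reproducer above, any name returned here indicates the bug, because
    every container was expected to exit with status code 1.
    """
    offenders = []
    statuses = pod.get("status", {}).get("containerStatuses") or []
    for cs in statuses:
        terminated = (cs.get("state") or {}).get("terminated")
        if terminated is not None and terminated.get("exitCode") == 0:
            offenders.append(cs["name"])
    return offenders


# Hypothetical status for a pod whose only container ran `exit 1`.
pod = {
    "metadata": {"name": "fail-once", "namespace": "e2e-test"},
    "status": {
        "containerStatuses": [
            {"name": "main",
             "state": {"terminated": {"exitCode": 1, "reason": "Error"}}}
        ]
    },
}

print(containers_reporting_success(pod))
```

An empty list means the kubelet reported the failure correctly for this pod.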

To test this in Origin, the e2e test is sufficient, and we can verify in upgrade jobs (which terminate lots of pods) that no pod in an openshift-* namespace exits with code 137 and reason ContainerStatusUnknown.
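
One way to run that upgrade-job check, sketched under the assumption that the pod list was captured as JSON (for example from `oc get pods --all-namespaces -o json`) and loaded into a dict. The function name and sample data are hypothetical, and checking `lastState` in addition to `state` is an assumption about where the synthetic status might show up.

```python
def unknown_status_containers(pod_list, namespace_prefix="openshift-"):
    """Return (namespace, pod, container) tuples for containers that
    terminated with exit code 137 and reason ContainerStatusUnknown."""
    hits = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        namespace = meta.get("namespace", "")
        if not namespace.startswith(namespace_prefix):
            continue
        for cs in pod.get("status", {}).get("containerStatuses") or []:
            # Assumption: inspect both the current and the previous state.
            for key in ("state", "lastState"):
                term = (cs.get(key) or {}).get("terminated")
                if (term
                        and term.get("exitCode") == 137
                        and term.get("reason") == "ContainerStatusUnknown"):
                    hits.append((namespace, meta.get("name"), cs.get("name")))
                    break
    return hits


def _pod(namespace, name, container, exit_code, reason):
    """Build a minimal, hypothetical pod dict for the example below."""
    return {
        "metadata": {"namespace": namespace, "name": name},
        "status": {"containerStatuses": [
            {"name": container,
             "state": {"terminated": {"exitCode": exit_code, "reason": reason}}}
        ]},
    }


# Hypothetical sample: only the first pod should be flagged.
pods = {"items": [
    _pod("openshift-etcd", "etcd-0", "etcd", 137, "ContainerStatusUnknown"),
    _pod("default", "app-0", "app", 137, "ContainerStatusUnknown"),
    _pod("openshift-dns", "dns-0", "dns", 137, "OOMKilled"),
]}

for hit in unknown_status_containers(pods):
    print(hit)
```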

Comment 3 Ryan Phillips 2020-03-26 15:30:28 UTC
*** Bug 1780386 has been marked as a duplicate of this bug. ***

Comment 5 Ryan Phillips 2020-04-06 21:07:39 UTC
*** Bug 1726934 has been marked as a duplicate of this bug. ***

Comment 7 errata-xmlrpc 2020-07-13 17:18:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

