+++ This bug was initially created as a clone of Bug #1810652 +++

The kubelet does not properly terminate pods that are RestartNever - upstream it reports success (even if the pod actually failed), and in OpenShift since 4.1 we provide synthetic status (a fake 137 exit code). Now that we have fixed the issue upstream, we should backport it to 4.4 at least, possibly 4.3.

The upstream e2e reproduces the issue by:
1. Creating a RestartNever pod that should always exit with status code 1
2. Waiting 0-4s
3. Deleting the pod
4. Observing the status written by the kubelet - no container should report exit code 0

To test this in Origin the e2e test is sufficient, and we can verify in upgrade jobs (which terminate lots of pods) that no openshift-* namespace pod exits with code 137 reason ContainerStatusUnknown.
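For anyone verifying by hand, the two checks described above (step 4 of the reproducer, and the upgrade-job check) could be sketched roughly as follows. This is not the upstream e2e test. It is a minimal illustration that assumes the pod has already been fetched as a plain dict in the shape of `oc get pod -o json` output. The function name `find_bad_terminations` and the `expect_failure` parameter are hypothetical; only the field names (`containerStatuses`, `state.terminated.exitCode`, `state.terminated.reason`) come from the pod status API.

```python
from typing import Any, Dict, List


def find_bad_terminations(pod: Dict[str, Any], expect_failure: bool = False) -> List[str]:
    """Return a list of problems found in one pod's container statuses.

    pod            -- pod object as a dict (assumed shape: `oc get pod -o json`)
    expect_failure -- hypothetical flag: True for the reproducer pod, whose
                      containers should always exit with status code 1
    """
    problems: List[str] = []
    meta = pod.get("metadata") or {}
    namespace = meta.get("namespace", "")
    pod_name = meta.get("name", "<unknown>")
    statuses = (pod.get("status") or {}).get("containerStatuses") or []

    for cs in statuses:
        terminated = (cs.get("state") or {}).get("terminated")
        if not terminated:
            # Container is still running or waiting; nothing to check yet.
            continue
        code = terminated.get("exitCode")
        reason = terminated.get("reason", "")
        container = cs.get("name", "<unknown>")

        # Reproducer check: a container that must fail should never report 0.
        if expect_failure and code == 0:
            problems.append(
                f"{namespace}/{pod_name}/{container}: reported exit code 0"
            )

        # Upgrade-job check: no synthetic 137 status in openshift-* namespaces.
        if (
            namespace.startswith("openshift-")
            and code == 137
            and reason == "ContainerStatusUnknown"
        ):
            problems.append(
                f"{namespace}/{pod_name}/{container}: "
                "exit code 137 reason ContainerStatusUnknown"
            )
    return problems
```

An empty list means neither symptom was observed for that pod. Running it over every pod collected from an upgrade job gives a quick pass/fail signal.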
*** Bug 1734524 has been marked as a duplicate of this bug. ***
*** Bug 1821576 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0581
Scott had added UpgradeBlocker to this bug way back, but I don't think we ever ended up blocking update recommendations on this series. The fix has been out for almost a year, and 4.4 is now end-of-life. Removing the keyword to get it out of our suspect queue [1].

[1]: https://github.com/openshift/enhancements/pull/475