Bug 1515935

Summary: Hard to tell exit code and OOM status of containers when another process deletes pods
Product: OpenShift Container Platform Reporter: Jim Minter <jminter>
Component: Node Assignee: Seth Jennings <sjenning>
Status: CLOSED DUPLICATE QA Contact: DeShuai Ma <dma>
Severity: urgent Docs Contact:
Priority: urgent    
Version: unspecified CC: aos-bugs, erich, jminter, jokerman, mmccomas, rromerom, sjenning, sreber
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-01-16 21:11:16 UTC Type: Bug

Description Jim Minter 2017-11-21 16:21:28 UTC
It's hard to tell the exit status (exit code, OOM status) of a container when it exits due to another process deleting the pod.

This was seen in a customer issue where a child process (gradle build) in a Jenkins slave pod was killed by the OOM killer, and the Jenkins master then deleted the pod before it exited naturally.

Had the pod exited naturally, the termination reason (OOMKilled) and the exit code would have been visible in the pod status.
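For reference, a minimal sketch of where that information lives while the pod object still exists. It assumes the pod JSON (for example from `oc get pod <name> -o json`) has been saved, and reads the `state.terminated` and `lastState.terminated` entries under `status.containerStatuses`. The sample document, the container name `build-container`, and the helper `terminated_containers` are hypothetical and only illustrate the shape of the data; once the pod is deleted, none of this can be queried any more.

```python
import json

# Hypothetical, trimmed pod JSON. Only the fields relevant to exit status
# are included; the container name and values are illustrative.
SAMPLE_POD_JSON = """
{
  "status": {
    "containerStatuses": [
      {
        "name": "build-container",
        "state": {
          "terminated": {"exitCode": 137, "reason": "OOMKilled"}
        },
        "lastState": {}
      }
    ]
  }
}
"""


def terminated_containers(pod):
    """Return (container name, exit code, reason) for each terminated state found."""
    results = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        # A terminated record can appear as the current state or, after a
        # restart, as the previous one under lastState.
        for key in ("state", "lastState"):
            term = cs.get(key, {}).get("terminated")
            if term:
                results.append((cs["name"], term.get("exitCode"), term.get("reason")))
    return results


pod = json.loads(SAMPLE_POD_JSON)
for name, code, reason in terminated_containers(pod):
    print(f"{name}: exit code {code}, reason {reason}")
```

This only works if the status is captured before the pod is deleted, which is the gap described here.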

Adding Kubernetes Events for all container lifecycle events, including the exit code and OOM indication, would have helped the customer and Red Hat diagnose the issue faster.

https://github.com/kubernetes/kubernetes/pull/45682 would be a foundation for this work, but would need to be further extended to make the OOM indication visible.

Comment 6 Seth Jennings 2018-01-16 21:11:16 UTC
This is a feature request. Duping to the RFE.

*** This bug has been marked as a duplicate of bug 1431824 ***