Bug 1822268

Summary: [4.4] pod/project stuck at terminating status: The container could not be located when the pod was terminated (Exit Code: 137)
Product: OpenShift Container Platform Reporter: Ryan Phillips <rphillips>
Component: Node Assignee: Ted Yu <zyu>
Status: CLOSED ERRATA QA Contact: Weinan Liu <weinliu>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 4.4 CC: aaleman, aos-bugs, hongkliu, jokerman, maszulik, mfojtik, schoudha, skuznets, umohnani, weinliu, wking, yinzhou
Target Milestone: --- Keywords: Reopened
Target Release: 4.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1820507
: 1822269 Environment:
Last Closed: 2020-05-04 11:48:39 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1820507    
Bug Blocks: 1822269    

Comment 4 Ryan Phillips 2020-04-15 18:43:49 UTC

*** This bug has been marked as a duplicate of bug 1819906 ***

Comment 11 Hongkai Liu 2020-04-22 19:02:44 UTC
@Weinan

May I ask how you made the pods stuck terminating before running the drain-node command?

Comment 12 Weinan Liu 2020-04-28 03:28:22 UTC
@Hongkai,

I'm following the steps in https://bugzilla.redhat.com/show_bug.cgi?id=1819954

Comment 13 W. Trevor King 2020-04-30 17:23:26 UTC
Also related to this bug series are the two post-fix mitigation bugs: bug 1829664 and bug 1829999.

Comment 15 errata-xmlrpc 2020-05-04 11:48:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581

Comment 16 Hongkai Liu 2020-05-04 13:53:48 UTC
Since we upgraded to 4.4, we have not hit the bug.

Current version:
oc --context build01 get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.3     True        False         2d5h    Cluster version is 4.4.3

Thanks for the fix.
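
For anyone who wants to script the version check from comment 16, the following is a minimal sketch, not part of the fix itself. It assumes the table layout shown above (one header line, one row, free text only in the last column). The function names parse_clusterversion and version_tuple and the SAMPLE constant are hypothetical and exist only for this illustration.

```python
# Sample copied from the `oc get clusterversion` output in comment 16.
SAMPLE = """\
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.3     True        False         2d5h    Cluster version is 4.4.3
"""


def parse_clusterversion(table: str) -> dict:
    """Map column names to values for the single row of the table.

    Assumption: every column except the last holds no spaces, so the
    row is split on whitespace a limited number of times and the rest
    of the line is kept as the STATUS text.
    """
    lines = [ln for ln in table.strip().splitlines() if ln.strip()]
    names = lines[0].split()
    values = lines[1].split(None, len(names) - 1)
    return dict(zip(names, values))


def version_tuple(version: str) -> tuple:
    """Turn '4.4.3' into (4, 4, 3); any '-suffix' part is ignored."""
    core = version.split("-")[0]
    return tuple(int(part) for part in core.split("."))


if __name__ == "__main__":
    info = parse_clusterversion(SAMPLE)
    at_least_4_4 = version_tuple(info["VERSION"]) >= (4, 4, 0)
    print(info["VERSION"], "is at or above 4.4.0:", at_least_4_4)
```

In practice the text passed to parse_clusterversion would come from running the oc command shown in comment 16; the sketch works on a pasted copy so that it runs without cluster access.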