Bug 1930960

Summary: After a disaster recovery, pods are stuck in "NodeAffinity" state and not running
Product: OpenShift Container Platform
Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Node
Assignee: Elana Hashman <ehashman>
Node sub component: Kubelet
QA Contact: Sunil Choudhary <schoudha>
Status: CLOSED ERRATA
Docs Contact:
Severity: urgent
Priority: urgent
CC: abeekhof, abodhe, aos-bugs, dblack, decarr, ehashman, iheim, jokerman, mfojtik, nagrawal, rphillips, tsweeney, yjoseph, yprokule
Version: 4.5
Keywords: Reopened
Target Milestone: ---
Target Release: 4.6.z
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: A node is marked as Ready and admits pods before it has had a chance to sync with the API servers.
Consequence: At startup of a node that is not cordoned, pod status may go out of sync, and sometimes many pods are stuck in NodeAffinity.
Fix: The node is not marked as Ready until it has synced with the API servers at least once.
Result: Pods should no longer get stuck in NodeAffinity after, for example, a cold cluster restart.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-03-25 04:45:13 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 1868645
Bug Blocks:
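
Illustration of the Doc Text: the Cause/Fix pair in the Doc Text can be pictured with a small toy model. This is only a sketch and not the kubelet's code. It assumes that the NodeAffinity rejection happens because the node's labels are not yet known locally before the first sync; the report does not spell this out. The names ToyNode, admit, and gate_on_sync are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyNode:
    """Hypothetical stand-in for a node's local view of itself."""
    name: str
    labels: Dict[str, str] = field(default_factory=dict)  # empty until first sync
    has_synced: bool = False

    def sync(self, labels_from_api: Dict[str, str]) -> None:
        # Assumption: the first sync is what makes the node's labels known locally.
        self.labels = dict(labels_from_api)
        self.has_synced = True

    def is_ready(self, gate_on_sync: bool) -> bool:
        # With the gate (the "Fix"), the node is Ready only after one sync.
        # Without it (the "Cause"), the node is Ready immediately.
        return self.has_synced or not gate_on_sync


def admit(node: ToyNode, required_labels: Dict[str, str], gate_on_sync: bool) -> str:
    """Return a toy pod state for a pod that requires the given node labels."""
    if not node.is_ready(gate_on_sync):
        return "Pending"  # not admitted yet, so nothing is rejected
    for key, value in required_labels.items():
        if node.labels.get(key) != value:
            return "NodeAffinity"  # rejected against stale or empty labels
    return "Running"


api_labels = {"zone": "a"}
pod_requires = {"zone": "a"}
node = ToyNode("worker-0")

print(admit(node, pod_requires, gate_on_sync=False))  # NodeAffinity
print(admit(node, pod_requires, gate_on_sync=True))   # Pending
node.sync(api_labels)
print(admit(node, pod_requires, gate_on_sync=True))   # Running
```

In this model the ungated node rejects a pod that would have matched once the labels arrived, while the gated node simply defers admission until after the first sync.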

Comment 5 errata-xmlrpc 2021-03-25 04:45:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.22 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0825