Bug 2002508

Summary: creating pods before sriovnetworknodepolicy sync up succeed will cause node unschedulable
Product: OpenShift Container Platform Reporter: Peng Liu <pliu>
Component: Networking Assignee: Peng Liu <pliu>
Networking sub component: SR-IOV QA Contact: Ying Wang <yingwang>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: unspecified CC: pliu, yingwang, zshi
Version: 4.9   
Target Milestone: ---   
Target Release: 4.9.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Known Issue
Doc Text:
Cause: Users add or delete a sriovnetworknodepolicy CR before the syncStatus of every sriovnetworknodestate CR has turned to 'Succeeded'.
Consequence: The sriov network config daemon pod cordons the node and leaves it marked unschedulable forever.
Workaround (if any): Before adding or deleting a sriovnetworknodepolicy CR, make sure the syncStatus of every sriovnetworknodestate CR is 'Succeeded'.
Result:
Story Points: ---
Clone Of: 1999079
: 2095210 Environment:
Last Closed: 2021-11-22 21:47:05 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1999079, 2086415    
Bug Blocks: 2095210    
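
The workaround in the Doc Text can be sketched as a small shell check. This is only an illustration under stated assumptions: the helper name all_synced is hypothetical and not part of the operator, and the oc invocation in the comment assumes a logged-in client and the openshift-sriov-network-operator namespace shown elsewhere in this report.

```shell
# Hypothetical helper (not part of the SR-IOV operator).
# Reads "<node> <syncStatus>" lines on stdin and returns 0 only if at least
# one line was read and every syncStatus is exactly "Succeeded".
all_synced() {
    awk '
        NF == 0 { next }                       # ignore blank lines
        { seen = 1 }
        $2 != "Succeeded" { print "not synced: " $0; bad = 1 }
        END { if (seen && !bad) exit 0; exit 1 }
    '
}

# Intended use before adding or deleting a sriovnetworknodepolicy CR
# (assumes a logged-in oc client):
#   oc get sriovnetworknodestates -n openshift-sriov-network-operator \
#     -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.syncStatus}{"\n"}{end}' \
#     | all_synced && echo "safe to change policies"
```

The helper treats empty input as "not synced", so a failed or empty query does not look like a green light.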

Comment 2 Ying Wang 2021-11-17 03:14:21 UTC
Verified on sriov operator 4.9.0-202111160310; this issue is fixed. After creating and deleting a sriovnetworknodepolicy before the sriovnetworknodestate sync completes, the nodes are still in Ready status.

# oc version
Client Version: 4.8.0-0.nightly-2021-04-23-131610
Server Version: 4.9.5
Kubernetes Version: v1.22.0-rc.0+a44d0f0
# oc get csv -n openshift-sriov-network-operator 
NAME                                        DISPLAY                      VERSION              REPLACES   PHASE
performance-addon-operator.v4.9.0           Performance Addon Operator   4.9.0                           Succeeded
sriov-network-operator.4.9.0-202111160310   SR-IOV Network Operator      4.9.0-202111160310              Succeeded
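
One way to confirm that no node was left cordoned, sketched under assumptions: the helper name cordoned_nodes is hypothetical, and it relies on the STATUS column of oc get nodes showing SchedulingDisabled for a cordoned node.

```shell
# Hypothetical helper (not from the original verification).
# Reads `oc get nodes --no-headers` output on stdin and prints the name of
# every node whose STATUS column contains SchedulingDisabled.
cordoned_nodes() {
    awk '$2 ~ /SchedulingDisabled/ { print $1 }'
}

# Intended use (assumes a logged-in oc client):
#   oc get nodes --no-headers | cordoned_nodes
# Empty output means no node is currently marked unschedulable.
```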

Comment 5 errata-xmlrpc 2021-11-22 21:47:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.9.8 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4712