Created attachment 1702323 [details]
sriov operator and config daemon logs

Description of problem:
After the 'default' sriovnetworknodepolicies object is deleted, the operator restores it. If a policy is then applied, the VFs are not initialized.

Version-Release number of selected component (if applicable):
4.5

How reproducible:

Steps to Reproduce:
1. Install the sriov operator.
2. Delete the 'default' policy:
   oc delete sriovnetworknodepolicies.sriovnetwork.openshift.io default
3. Check that the 'default' policy is restored:
   oc get sriovnetworknodepolicies.sriovnetwork.openshift.io
4. Apply a policy with the following content:

   apiVersion: sriovnetwork.openshift.io/v1
   kind: SriovNetworkNodePolicy
   metadata:
     name: intel-netdevice
     namespace: openshift-sriov-network-operator
   spec:
     deviceType: netdevice
     nicSelector:
       pfNames:
       - ens1f0
       rootDevices:
       - '0000:3b:00.0'
       vendor: '8086'
     nodeSelector:
       feature.node.kubernetes.io/sriov-capable: 'true'
     numVfs: 5
     priority: 99
     resourceName: intelnetdevice

5. Wait until the sriov-device-plugin and sriov-cni pods are 'running' and the configdaemon logs show 'setNodeStateStatus(): syncStatus: Succeeded, lastSyncError:"
6. Check the 'sriovnetworknodestates.sriovnetwork.openshift.io' object of the node:
   oc get sriovnetworknodestates.sriovnetwork.openshift.io -o yaml

Actual results:
Step 6: no VFs were initialized.
spec:
  dpConfigVersion: "1540830"
  interfaces:
  - name: ens1f0
    numVfs: 5
    pciAddress: 0000:3b:00.0
    vfGroups:
    - deviceType: netdevice
      policyName: intel-netdevice
      resourceName: intelnetdevice
      vfRange: 0-4
status:
  interfaces:
  - deviceID: "1521"
    driver: igb
    mtu: 1500
    name: eno1
    pciAddress: "0000:18:00.0"
    totalvfs: 7
    vendor: "8086"
  - deviceID: "1521"
    driver: igb
    mtu: 1500
    name: eno2
    pciAddress: "0000:18:00.1"
    totalvfs: 7
    vendor: "8086"
  - deviceID: "1521"
    driver: igb
    mtu: 1500
    name: eno3
    pciAddress: "0000:18:00.2"
    totalvfs: 7
    vendor: "8086"
  - deviceID: "1521"
    driver: igb
    mtu: 1500
    name: eno4
    pciAddress: "0000:18:00.3"
    totalvfs: 7
    vendor: "8086"
  - deviceID: 158b
    driver: i40e
    mtu: 9200
    name: ens1f0
    pciAddress: 0000:3b:00.0
    totalvfs: 64
    vendor: "8086"
  - deviceID: 158b
    driver: i40e
    mtu: 1500
    name: ens1f1
    pciAddress: 0000:3b:00.1
    totalvfs: 64

Expected results:
The VFs are initialized.

Additional info:
Deleting the configdaemon pod so that it is re-created resolves this issue.
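
The check in step 6 can be automated. Below is a minimal Python sketch, not part of the operator, that compares the requested numVfs in spec.interfaces with what status.interfaces reports for the same PCI address. It assumes that a successfully configured PF carries a numVfs field in its status entry; that field does not appear in the output above, so treat it as an assumption. The function name unconfigured_pfs and the trimmed REPORTED_STATE sample are hypothetical and used only for illustration.

```python
def unconfigured_pfs(node_state):
    """Return PCI addresses of PFs whose requested VF count is not reported in status.

    node_state is one SriovNetworkNodeState object parsed into a dict.
    """
    spec_ifaces = node_state.get("spec", {}).get("interfaces") or []
    status_ifaces = node_state.get("status", {}).get("interfaces") or []
    status_by_pci = {iface.get("pciAddress"): iface for iface in status_ifaces}

    missing = []
    for wanted in spec_ifaces:
        requested = int(wanted.get("numVfs", 0))
        if requested == 0:
            continue  # nothing was requested for this PF
        reported = status_by_pci.get(wanted.get("pciAddress"), {})
        # ASSUMPTION: status reports numVfs once the VFs exist; absent means 0.
        if int(reported.get("numVfs", 0)) != requested:
            missing.append(wanted.get("pciAddress"))
    return missing


# Trimmed copy of the node state from this report (only the relevant fields).
REPORTED_STATE = {
    "spec": {
        "interfaces": [
            {"name": "ens1f0", "numVfs": 5, "pciAddress": "0000:3b:00.0"},
        ]
    },
    "status": {
        "interfaces": [
            {"name": "eno1", "pciAddress": "0000:18:00.0", "totalvfs": 7},
            {"name": "ens1f0", "pciAddress": "0000:3b:00.0", "totalvfs": 64},
        ]
    },
}

print(unconfigured_pfs(REPORTED_STATE))
```

To run this against a live cluster, the dict could be built with json.loads from the JSON output of the oc get command in step 6 for a single node; requesting JSON instead of YAML is a choice made here, not something the report does.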
Verified this bug on 4.5.0-202008210149.p0. The VFs are initialized when a policy is created after the default policy has been deleted and restored.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5.8 bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3510