Created attachment 1857377 [details]
sriov-network-config-daemon.log

Description of problem:
SriovNetworkNodeState stuck InProgress reporting timed out waiting for the condition

Version-Release number of selected component (if applicable):
4.10.0-0.nightly-2022-01-26-234447
sriov-network-operator.4.10.0-202201261535

How reproducible:
100%

Steps to Reproduce:
1. Create the following SriovNetworkNodePolicy:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  annotations:
    ran.openshift.io/ztp-deploy-wave: "100"
  creationTimestamp: "2022-01-27T12:50:59Z"
  generation: 3
  name: sriov-nnp-du-mh
  namespace: openshift-sriov-network-operator
  resourceVersion: "628999"
  uid: 33c38e79-901c-4b8f-89a1-d4848335b74d
spec:
  deviceType: vfio-pci
  isRdma: false
  nicSelector:
    pfNames:
    - ens2f1#0-15
  nodeSelector:
    node-role.kubernetes.io/master: ""
  numVfs: 16
  priority: 10
  resourceName: du_mh

2. Check sriovnetworknodestates

Actual results:

oc -n openshift-sriov-network-operator get sriovnetworknodestates -o yaml
apiVersion: v1
items:
- apiVersion: sriovnetwork.openshift.io/v1
  kind: SriovNetworkNodeState
  metadata:
    creationTimestamp: "2022-01-27T12:30:04Z"
    generation: 4
    name: sno.kni-qe-1.lab.eng.rdu2.redhat.com
    namespace: openshift-sriov-network-operator
    ownerReferences:
    - apiVersion: sriovnetwork.openshift.io/v1
      blockOwnerDeletion: true
      controller: true
      kind: SriovNetworkNodePolicy
      name: default
      uid: 1f06d010-75b4-496d-949a-fe7e5a625d6e
    resourceVersion: "680717"
    uid: d0f1b61c-186f-4f64-b140-278412e0b93f
  spec:
    dpConfigVersion: "629000"
    interfaces:
    - name: ens2f1
      numVfs: 16
      pciAddress: 0000:b2:00.1
      vfGroups:
      - deviceType: vfio-pci
        policyName: sriov-nnp-du-mh
        resourceName: du_mh
        vfRange: 0-15
  status:
    interfaces:
    - deviceID: 158b
      driver: i40e
      linkSpeed: 10000 Mb/s
      linkType: ETH
      mac: d4:f5:ef:43:37:18
      mtu: 1500
      name: ens2f0
      pciAddress: 0000:b2:00.0
      totalvfs: 64
      vendor: "8086"
    - Vfs:
      - deviceID: 154c
        driver: iavf
        pciAddress: 0000:b2:0a.0
        vendor: "8086"
        vfID: 0
      - deviceID: 154c
        driver: iavf
        pciAddress: 0000:b2:0a.1
        vendor: "8086"
        vfID: 1
      - deviceID: 154c
        driver: iavf
        mac: da:8c:26:7b:ae:fe
        mtu: 1500
        name: ens2f1v10
        pciAddress: 0000:b2:0b.2
        vendor: "8086"
        vfID: 10
      - deviceID: 154c
        driver: iavf
        mac: 02:c3:4f:73:b2:84
        mtu: 1500
        name: ens2f1v11
        pciAddress: 0000:b2:0b.3
        vendor: "8086"
        vfID: 11
      - deviceID: 154c
        driver: iavf
        mac: e2:51:0f:07:f0:ca
        mtu: 1500
        name: ens2f1v12
        pciAddress: 0000:b2:0b.4
        vendor: "8086"
        vfID: 12
      - deviceID: 154c
        driver: iavf
        mac: 82:17:3a:a3:d8:f7
        mtu: 1500
        name: ens2f1v13
        pciAddress: 0000:b2:0b.5
        vendor: "8086"
        vfID: 13
      - deviceID: 154c
        driver: iavf
        mac: fa:fb:0d:bb:e4:eb
        mtu: 1500
        name: ens2f1v14
        pciAddress: 0000:b2:0b.6
        vendor: "8086"
        vfID: 14
      - deviceID: 154c
        driver: iavf
        mac: 02:35:2a:26:88:5f
        mtu: 1500
        name: ens2f1v15
        pciAddress: 0000:b2:0b.7
        vendor: "8086"
        vfID: 15
      - deviceID: 154c
        driver: iavf
        pciAddress: 0000:b2:0a.2
        vendor: "8086"
        vfID: 2
      - deviceID: 154c
        driver: iavf
        mac: 02:fc:e6:6e:63:ef
        mtu: 1500
        name: ens2f1v3
        pciAddress: 0000:b2:0a.3
        vendor: "8086"
        vfID: 3
      - deviceID: 154c
        driver: iavf
        mac: 26:c4:aa:f3:36:28
        mtu: 1500
        name: ens2f1v4
        pciAddress: 0000:b2:0a.4
        vendor: "8086"
        vfID: 4
      - deviceID: 154c
        driver: iavf
        mac: b6:d7:90:ed:77:9f
        mtu: 1500
        name: ens2f1v5
        pciAddress: 0000:b2:0a.5
        vendor: "8086"
        vfID: 5
      - deviceID: 154c
        driver: iavf
        mac: ea:1e:5e:d0:78:76
        mtu: 1500
        name: ens2f1v6
        pciAddress: 0000:b2:0a.6
        vendor: "8086"
        vfID: 6
      - deviceID: 154c
        driver: iavf
        mac: 02:fb:52:6b:cb:b4
        mtu: 1500
        name: ens2f1v7
        pciAddress: 0000:b2:0a.7
        vendor: "8086"
        vfID: 7
      - deviceID: 154c
        driver: iavf
        mac: 7a:12:17:40:8e:1f
        mtu: 1500
        name: ens2f1v8
        pciAddress: 0000:b2:0b.0
        vendor: "8086"
        vfID: 8
      - deviceID: 154c
        driver: iavf
        mac: f6:2c:9d:8d:c1:1c
        mtu: 1500
        name: ens2f1v9
        pciAddress: 0000:b2:0b.1
        vendor: "8086"
        vfID: 9
      deviceID: 158b
      driver: i40e
      linkSpeed: 10000 Mb/s
      linkType: ETH
      mac: d4:f5:ef:43:37:19
      mtu: 1500
      name: ens2f1
      numVfs: 16
      pciAddress: 0000:b2:00.1
      totalvfs: 64
      vendor: "8086"
    lastSyncError: timed out waiting for the condition
    syncStatus: InProgress
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

Expected results:
No sync errors

Additional info:
Attaching sriov-network-config-daemon and sriov-device-plugin logs.
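For anyone repeating step 2 on a cluster with more than one node, the check can be scripted. The following is only a minimal Python sketch, assuming the node states were fetched with `-o json` instead of `-o yaml`; the helper name `unsynced_nodes` is hypothetical, and the sample is trimmed to the fields reported above.

```python
import json


def unsynced_nodes(node_states):
    """Return (name, syncStatus, lastSyncError) for every item that is
    still InProgress or carries a non-empty lastSyncError."""
    problems = []
    for item in node_states.get("items", []):
        status = item.get("status", {})
        sync = status.get("syncStatus", "")
        err = status.get("lastSyncError", "")
        if err or sync == "InProgress":
            problems.append((item["metadata"]["name"], sync, err))
    return problems


# Trimmed sample mirroring the values in this report (assumed JSON shape
# of `oc ... get sriovnetworknodestates -o json`).
sample = json.loads("""
{
  "kind": "List",
  "items": [
    {
      "kind": "SriovNetworkNodeState",
      "metadata": {"name": "sno.kni-qe-1.lab.eng.rdu2.redhat.com"},
      "status": {
        "lastSyncError": "timed out waiting for the condition",
        "syncStatus": "InProgress"
      }
    }
  ]
}
""")

for name, sync, err in unsynced_nodes(sample):
    print(f"{name}: syncStatus={sync} lastSyncError={err!r}")
```

In practice the JSON would be read from the `oc` output (for example via stdin) rather than from the inline sample.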
Created attachment 1857378 [details] sriov-device-plugin.log
lspci -s b2:00.1 -vv
b2:00.1 Ethernet controller: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)
        Subsystem: Hewlett Packard Enterprise Ethernet Network Adapter XXV710-2
        Physical Slot: 2
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 42
        NUMA node: 1
        IOMMU group: 94
        Region 0: Memory at f5000000 (64-bit, prefetchable) [size=16M]
        Region 3: Memory at f6008000 (64-bit, prefetchable) [size=32K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
                Address: 0000000000000000 Data: 0000
                Masking: 00000000 Pending: 00000000
        Capabilities: [70] MSI-X: Enable+ Count=129 Masked-
                Vector table: BAR=3 offset=00000000
                PBA: BAR=3 offset=00001000
        Capabilities: [a0] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 2048 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq- RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset- MaxPayload 256 bytes, MaxReadReq 4096 bytes
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L1 <16us ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s (ok), Width x8 (ok) TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR- 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix- EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit- FRS- TPHComp- ExtTPHComp- AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled, AtomicOpsCtl: ReqEn-
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1- EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest- Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [100 v2] Advanced Error Reporting
                UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP- SDES+ TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                CEMsk: RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn- MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [140 v1] Device Serial Number 18-37-43-ff-ff-ef-f5-d4
        Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable+ Migration- Interrupt- MSE+ ARIHierarchy-
                IOVSta: Migration-
                Initial VFs: 64, Total VFs: 64, Number of VFs: 16, Function Dependency Link: 01
                VF offset: 79, stride: 1, Device ID: 154c
                Supported Page Size: 00000553, System Page Size: 00000001
                Region 0: Memory at 00000cffff600000 (64-bit, prefetchable)
                Region 3: Memory at 00000cffffe00000 (64-bit, prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [1a0 v1] Transaction Processing Hints
                Device specific mode supported
                No steering table available
        Capabilities: [1b0 v1] Access Control Services
                ACSCap: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
                ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
        Kernel driver in use: i40e
        Kernel modules: i40e
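The SR-IOV capability in the lspci output reports `Number of VFs: 16`, which matches `numVfs: 16` in the policy, so the VFs do appear to have been created on the PF. A minimal Python sketch for pulling those counters out of `lspci -vv` text when comparing against the policy; the function name `sriov_vf_counts` is hypothetical and the regular expression only assumes the line format shown in this output.

```python
import re


def sriov_vf_counts(lspci_text):
    """Extract the Initial/Total/Number of VFs counters from lspci -vv text.

    Returns a dict of ints, or None if no SR-IOV counter line is found.
    """
    match = re.search(
        r"Initial VFs: (\d+), Total VFs: (\d+), Number of VFs: (\d+)",
        lspci_text,
    )
    if match is None:
        return None
    initial, total, enabled = (int(group) for group in match.groups())
    return {"initial": initial, "total": total, "enabled": enabled}


# Line copied from the lspci output in this comment.
line = "Initial VFs: 64, Total VFs: 64, Number of VFs: 16, Function Dependency Link: 01"
counts = sriov_vf_counts(line)
print(counts)

# numVfs from the SriovNetworkNodePolicy in the description.
policy_num_vfs = 16
print("matches policy:", counts["enabled"] == policy_num_vfs)
```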
*** This bug has been marked as a duplicate of bug 2045087 ***