Bug 2047734 - SriovNetworkNodeState stuck InProgress reporting timed out waiting for the condition
Keywords:
Status: CLOSED DUPLICATE of bug 2045087
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: zenghui.shi
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-28 12:21 UTC by Marius Cornea
Modified: 2022-02-21 01:11 UTC (History)
CC List: 0 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-01-31 09:02:46 UTC
Target Upstream Version:
Embargoed:


Attachments
sriov-network-config-daemon.log (910.88 KB, text/plain), 2022-01-28 12:21 UTC, Marius Cornea
sriov-device-plugin.log (8.37 KB, text/plain), 2022-01-28 12:22 UTC, Marius Cornea

Description Marius Cornea 2022-01-28 12:21:56 UTC
Created attachment 1857377
sriov-network-config-daemon.log

Description of problem:

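After the SriovNetworkNodePolicy below is applied, the SriovNetworkNodeState for the node stays stuck with syncStatus: InProgress and reports lastSyncError: "timed out waiting for the condition".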

Version-Release number of selected component (if applicable):
4.10.0-0.nightly-2022-01-26-234447
sriov-network-operator.4.10.0-202201261535

How reproducible:
100%

Steps to Reproduce:
1. Create the following SriovNetworkNodePolicy:

apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  annotations:
    ran.openshift.io/ztp-deploy-wave: "100"
  creationTimestamp: "2022-01-27T12:50:59Z"
  generation: 3
  name: sriov-nnp-du-mh
  namespace: openshift-sriov-network-operator
  resourceVersion: "628999"
  uid: 33c38e79-901c-4b8f-89a1-d4848335b74d
spec:
  deviceType: vfio-pci
  isRdma: false
  nicSelector:
    pfNames:
    - ens2f1#0-15
  nodeSelector:
    node-role.kubernetes.io/master: ""
  numVfs: 16
  priority: 10
  resourceName: du_mh

2. Check the sriovnetworknodestates in the openshift-sriov-network-operator namespace.
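
For reference, the pfNames entry in the policy above (ens2f1#0-15) combines a PF name with a VF range, and the node state further down shows it expanded to name: ens2f1 with vfRange: 0-15. A minimal sketch of that split, assuming the "<pf>#<first>-<last>" form seen here; the helper names parse_pf_selector and range_fits are hypothetical and are not part of the operator:

```python
import re


def parse_pf_selector(selector):
    """Split a pfNames entry such as 'ens2f1#0-15' into (pf_name, first_vf, last_vf).

    An entry without a '#<first>-<last>' suffix returns (pf_name, None, None).
    """
    match = re.fullmatch(r"([^#]+)(?:#(\d+)-(\d+))?", selector)
    if match is None:
        raise ValueError(f"unrecognized pfNames entry: {selector!r}")
    name, first, last = match.groups()
    if first is None:
        return name, None, None
    return name, int(first), int(last)


def range_fits(selector, num_vfs):
    """Return True if the VF range in the selector lies within 0..num_vfs-1."""
    _, first, last = parse_pf_selector(selector)
    if first is None:
        return True  # no explicit range, nothing to check
    return 0 <= first <= last < num_vfs


# Values taken from the policy in this report: pfNames entry and numVfs.
pf, first, last = parse_pf_selector("ens2f1#0-15")
print(pf, first, last)
print(range_fits("ens2f1#0-15", 16))
```

With numVfs: 16 the range 0-15 covers every VF, so the policy itself looks internally consistent on this point.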

Actual results:

oc -n openshift-sriov-network-operator get sriovnetworknodestates -o yaml
apiVersion: v1
items:
- apiVersion: sriovnetwork.openshift.io/v1
  kind: SriovNetworkNodeState
  metadata:
    creationTimestamp: "2022-01-27T12:30:04Z"
    generation: 4
    name: sno.kni-qe-1.lab.eng.rdu2.redhat.com
    namespace: openshift-sriov-network-operator
    ownerReferences:
    - apiVersion: sriovnetwork.openshift.io/v1
      blockOwnerDeletion: true
      controller: true
      kind: SriovNetworkNodePolicy
      name: default
      uid: 1f06d010-75b4-496d-949a-fe7e5a625d6e
    resourceVersion: "680717"
    uid: d0f1b61c-186f-4f64-b140-278412e0b93f
  spec:
    dpConfigVersion: "629000"
    interfaces:
    - name: ens2f1
      numVfs: 16
      pciAddress: 0000:b2:00.1
      vfGroups:
      - deviceType: vfio-pci
        policyName: sriov-nnp-du-mh
        resourceName: du_mh
        vfRange: 0-15
  status:
    interfaces:
    - deviceID: 158b
      driver: i40e
      linkSpeed: 10000 Mb/s
      linkType: ETH
      mac: d4:f5:ef:43:37:18
      mtu: 1500
      name: ens2f0
      pciAddress: 0000:b2:00.0
      totalvfs: 64
      vendor: "8086"
    - Vfs:
      - deviceID: 154c
        driver: iavf
        pciAddress: 0000:b2:0a.0
        vendor: "8086"
        vfID: 0
      - deviceID: 154c
        driver: iavf
        pciAddress: 0000:b2:0a.1
        vendor: "8086"
        vfID: 1
      - deviceID: 154c
        driver: iavf
        mac: da:8c:26:7b:ae:fe
        mtu: 1500
        name: ens2f1v10
        pciAddress: 0000:b2:0b.2
        vendor: "8086"
        vfID: 10
      - deviceID: 154c
        driver: iavf
        mac: 02:c3:4f:73:b2:84
        mtu: 1500
        name: ens2f1v11
        pciAddress: 0000:b2:0b.3
        vendor: "8086"
        vfID: 11
      - deviceID: 154c
        driver: iavf
        mac: e2:51:0f:07:f0:ca
        mtu: 1500
        name: ens2f1v12
        pciAddress: 0000:b2:0b.4
        vendor: "8086"
        vfID: 12
      - deviceID: 154c
        driver: iavf
        mac: 82:17:3a:a3:d8:f7
        mtu: 1500
        name: ens2f1v13
        pciAddress: 0000:b2:0b.5
        vendor: "8086"
        vfID: 13
      - deviceID: 154c
        driver: iavf
        mac: fa:fb:0d:bb:e4:eb
        mtu: 1500
        name: ens2f1v14
        pciAddress: 0000:b2:0b.6
        vendor: "8086"
        vfID: 14
      - deviceID: 154c
        driver: iavf
        mac: 02:35:2a:26:88:5f
        mtu: 1500
        name: ens2f1v15
        pciAddress: 0000:b2:0b.7
        vendor: "8086"
        vfID: 15
      - deviceID: 154c
        driver: iavf
        pciAddress: 0000:b2:0a.2
        vendor: "8086"
        vfID: 2
      - deviceID: 154c
        driver: iavf
        mac: 02:fc:e6:6e:63:ef
        mtu: 1500
        name: ens2f1v3
        pciAddress: 0000:b2:0a.3
        vendor: "8086"
        vfID: 3
      - deviceID: 154c
        driver: iavf
        mac: 26:c4:aa:f3:36:28
        mtu: 1500
        name: ens2f1v4
        pciAddress: 0000:b2:0a.4
        vendor: "8086"
        vfID: 4
      - deviceID: 154c
        driver: iavf
        mac: b6:d7:90:ed:77:9f
        mtu: 1500
        name: ens2f1v5
        pciAddress: 0000:b2:0a.5
        vendor: "8086"
        vfID: 5
      - deviceID: 154c
        driver: iavf
        mac: ea:1e:5e:d0:78:76
        mtu: 1500
        name: ens2f1v6
        pciAddress: 0000:b2:0a.6
        vendor: "8086"
        vfID: 6
      - deviceID: 154c
        driver: iavf
        mac: 02:fb:52:6b:cb:b4
        mtu: 1500
        name: ens2f1v7
        pciAddress: 0000:b2:0a.7
        vendor: "8086"
        vfID: 7
      - deviceID: 154c
        driver: iavf
        mac: 7a:12:17:40:8e:1f
        mtu: 1500
        name: ens2f1v8
        pciAddress: 0000:b2:0b.0
        vendor: "8086"
        vfID: 8
      - deviceID: 154c
        driver: iavf
        mac: f6:2c:9d:8d:c1:1c
        mtu: 1500
        name: ens2f1v9
        pciAddress: 0000:b2:0b.1
        vendor: "8086"
        vfID: 9
      deviceID: 158b
      driver: i40e
      linkSpeed: 10000 Mb/s
      linkType: ETH
      mac: d4:f5:ef:43:37:19
      mtu: 1500
      name: ens2f1
      numVfs: 16
      pciAddress: 0000:b2:00.1
      totalvfs: 64
      vendor: "8086"
    lastSyncError: timed out waiting for the condition
    syncStatus: InProgress
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
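
In the output above, the spec requests deviceType: vfio-pci for VFs 0-15 of ens2f1, while every VF listed under status reports driver: iavf. Whether that is related to the sync timeout is not established in this report. The following is a minimal sketch for spotting the same pattern on other nodes, assuming the list has been fetched with -o json instead of -o yaml and that a VF bound as requested would report driver: vfio-pci (an assumption). The function name summarize_node_states and the trimmed SAMPLE data are hypothetical:

```python
def parse_range(vf_range):
    """Turn a vfRange string such as '0-15' into (first, last)."""
    first, last = vf_range.split("-")
    return int(first), int(last)


def summarize_node_states(state_list):
    """Return one report dict per SriovNetworkNodeState item in a List object."""
    reports = []
    for item in state_list.get("items", []):
        spec = item.get("spec", {})
        status = item.get("status", {})
        # PF PCI address -> VF ranges for which the spec requests vfio-pci.
        vfio_ranges = {}
        for iface in spec.get("interfaces", []):
            for group in iface.get("vfGroups", []):
                if group.get("deviceType") == "vfio-pci":
                    vfio_ranges.setdefault(iface["pciAddress"], []).append(
                        parse_range(group["vfRange"]))
        # VFs inside a requested range whose reported driver is not vfio-pci.
        not_bound = []
        for iface in status.get("interfaces", []):
            ranges = vfio_ranges.get(iface.get("pciAddress"), [])
            for vf in iface.get("Vfs", []):
                wanted = any(lo <= vf.get("vfID", -1) <= hi for lo, hi in ranges)
                if wanted and vf.get("driver") != "vfio-pci":
                    not_bound.append(vf.get("pciAddress"))
        reports.append({
            "node": item["metadata"]["name"],
            "syncStatus": status.get("syncStatus"),
            "lastSyncError": status.get("lastSyncError", ""),
            "vfsNotOnVfioPci": not_bound,
        })
    return reports


# Trimmed copy of the node state from this report (only two of the 16 VFs kept).
SAMPLE = {
    "items": [{
        "metadata": {"name": "sno.kni-qe-1.lab.eng.rdu2.redhat.com"},
        "spec": {"interfaces": [{
            "name": "ens2f1", "numVfs": 16, "pciAddress": "0000:b2:00.1",
            "vfGroups": [{"deviceType": "vfio-pci", "vfRange": "0-15"}],
        }]},
        "status": {
            "interfaces": [
                {"name": "ens2f0", "pciAddress": "0000:b2:00.0"},
                {"name": "ens2f1", "pciAddress": "0000:b2:00.1", "Vfs": [
                    {"driver": "iavf", "pciAddress": "0000:b2:0a.0", "vfID": 0},
                    {"driver": "iavf", "pciAddress": "0000:b2:0a.1", "vfID": 1},
                ]},
            ],
            "lastSyncError": "timed out waiting for the condition",
            "syncStatus": "InProgress",
        },
    }],
}

for report in summarize_node_states(SAMPLE):
    print(report)
```

Against a live cluster, the same function could be given json.load(sys.stdin) with the output of "oc -n openshift-sriov-network-operator get sriovnetworknodestates -o json" piped in.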

Expected results:
No sync errors

Additional info:

Attaching sriov-network-config-daemon and sriov-device-plugin logs.

Comment 1 Marius Cornea 2022-01-28 12:22:29 UTC
Created attachment 1857378
sriov-device-plugin.log

Comment 2 Marius Cornea 2022-01-28 12:23:36 UTC
lspci -s b2:00.1 -vv
b2:00.1 Ethernet controller: Intel Corporation Ethernet Controller XXV710 for 25GbE SFP28 (rev 02)
	Subsystem: Hewlett Packard Enterprise Ethernet Network Adapter XXV710-2
	Physical Slot: 2
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 42
	NUMA node: 1
	IOMMU group: 94
	Region 0: Memory at f5000000 (64-bit, prefetchable) [size=16M]
	Region 3: Memory at f6008000 (64-bit, prefetchable) [size=32K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
		Address: 0000000000000000  Data: 0000
		Masking: 00000000  Pending: 00000000
	Capabilities: [70] MSI-X: Enable+ Count=129 Masked-
		Vector table: BAR=3 offset=00000000
		PBA: BAR=3 offset=00001000
	Capabilities: [a0] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 2048 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
		DevCtl:	CorrErr- NonFatalErr+ FatalErr+ UnsupReq-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
			MaxPayload 256 bytes, MaxReadReq 4096 bytes
		DevSta:	CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 8GT/s, Width x8, ASPM L1, Exit Latency L1 <16us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 8GT/s (ok), Width x8 (ok)
			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR-
			 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS- TPHComp- ExtTPHComp-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
			 AtomicOpsCtl: ReqEn-
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
			 EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP- SDES+ TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
		CEMsk:	RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ AdvNonFatalErr+
		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
		HeaderLog: 00000000 00000000 00000000 00000000
	Capabilities: [140 v1] Device Serial Number 18-37-43-ff-ff-ef-f5-d4
	Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
		ARICap:	MFVC- ACS-, Next Function: 0
		ARICtl:	MFVC- ACS-, Function Group: 0
	Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
		IOVCap:	Migration-, Interrupt Message Number: 000
		IOVCtl:	Enable+ Migration- Interrupt- MSE+ ARIHierarchy-
		IOVSta:	Migration-
		Initial VFs: 64, Total VFs: 64, Number of VFs: 16, Function Dependency Link: 01
		VF offset: 79, stride: 1, Device ID: 154c
		Supported Page Size: 00000553, System Page Size: 00000001
		Region 0: Memory at 00000cffff600000 (64-bit, prefetchable)
		Region 3: Memory at 00000cffffe00000 (64-bit, prefetchable)
		VF Migration: offset: 00000000, BIR: 0
	Capabilities: [1a0 v1] Transaction Processing Hints
		Device specific mode supported
		No steering table available
	Capabilities: [1b0 v1] Access Control Services
		ACSCap:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
		ACSCtl:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
	Kernel driver in use: i40e
	Kernel modules: i40e
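
The relevant parts of the lspci output above are the SR-IOV capability (IOVCtl: Enable+, Number of VFs: 16 out of Total VFs: 64) and the PF driver (i40e), which agree with numVfs: 16 in the policy. A minimal sketch for pulling those fields out of "lspci -vv" text, assuming the line layout shown in this comment; the function name parse_sriov_capability is hypothetical:

```python
import re


def parse_sriov_capability(lspci_text):
    """Extract VF counts, the SR-IOV enable bit and the PF driver from lspci -vv text."""
    result = {}
    counts = re.search(
        r"Initial VFs: (\d+), Total VFs: (\d+), Number of VFs: (\d+)", lspci_text)
    if counts:
        initial, total, current = map(int, counts.groups())
        result["initial_vfs"] = initial
        result["total_vfs"] = total
        result["num_vfs"] = current
    ctl = re.search(r"IOVCtl:\s+Enable([+-])", lspci_text)
    if ctl:
        result["enabled"] = ctl.group(1) == "+"
    driver = re.search(r"Kernel driver in use: (\S+)", lspci_text)
    if driver:
        result["pf_driver"] = driver.group(1)
    return result


# Lines copied from the lspci output in this comment.
SAMPLE_LSPCI = (
    "\tCapabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)\n"
    "\t\tIOVCtl:\tEnable+ Migration- Interrupt- MSE+ ARIHierarchy-\n"
    "\t\tInitial VFs: 64, Total VFs: 64, Number of VFs: 16, "
    "Function Dependency Link: 01\n"
    "\tKernel driver in use: i40e\n"
)

info = parse_sriov_capability(SAMPLE_LSPCI)
print(info)
```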

Comment 3 Marius Cornea 2022-01-31 09:02:46 UTC

*** This bug has been marked as a duplicate of bug 2045087 ***

