Bug 1734626

Summary: [sriov] sriov-network-config-daemon pod enters CrashLoopBackOff when initializing XXV710 VFs
Product: OpenShift Container Platform
Component: Networking
Version: 4.2.0
Target Release: 4.2.0
Hardware: All
OS: All
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: zhaozhanqi <zzhao>
Assignee: Peng Liu <pliu>
QA Contact: zhaozhanqi <zzhao>
CC: aos-bugs, nagrawal, pliu
Type: Bug
Last Closed: 2019-10-16 06:34:11 UTC

Description zhaozhanqi 2019-07-31 06:36:47 UTC
Description of problem:
Given that the SR-IOV operator is installed from OperatorHub,
when a SriovNetworkNodePolicy CR is created for an XXV710 NIC, the sriov-network-config-daemon pod goes into CrashLoopBackOff.


Version-Release number of selected component (if applicable):
 Red Hat Enterprise Linux CoreOS 420.8.20190721.0 (Ootpa)

How reproducible:


Steps to Reproduce:
1. Set up the SR-IOV cluster.
2. Install the SR-IOV operator from the web console.
3. Create the following SriovNetworkNodePolicy CR for the XXV710:
   
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  creationTimestamp: '2019-07-30T08:28:20Z'
  generation: 4
  name: policy-1
  namespace: sriov-network-operator
  resourceVersion: '1994420'
  selfLink: >-
    /apis/sriovnetwork.openshift.io/v1/namespaces/sriov-network-operator/sriovnetworknodepolicies/policy-1
  uid: fd1f597a-b2a3-11e9-a8e4-2e808f6aa7e1
spec:
  deviceType: vfio-pci
  mtu: 1500
  nicSelector:
    pfNames:
      - ens1f0
    rootDevices:
      - '0000:3b:00.0'
    vendor: '8086'
  nodeSelector:
    feature.node.kubernetes.io/sriov-capable: 'true'
  numVfs: 6
  priority: 99
  resourceName: intelnics
**************
4. Check the config daemon pod; it goes into CrashLoopBackOff.

Actual results:

logs:

E0730 08:31:11.652714   44720 utils.go:118] SyncNodeState(): fail to config sriov interface 0000:3b:00.0: write /sys/bus/pci/drivers/vfio-pci/bind: no such device
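
For context, the failing write in the log above is the last step of a sysfs driver bind. The following is a minimal sketch of that sequence in Python, not the operator's actual code. The function names and the `sysfs_root` parameter are hypothetical, and the unbind / driver_override / bind ordering is an assumption based on the standard Linux PCI sysfs interface.

```python
from pathlib import Path


def vfio_bind_paths(pci_addr: str, sysfs_root: str = "/sys") -> dict:
    """Return the sysfs files involved in binding a PCI device to vfio-pci."""
    pci = Path(sysfs_root) / "bus" / "pci"
    return {
        # Detaches the device from whatever driver currently owns it.
        "unbind": pci / "devices" / pci_addr / "driver" / "unbind",
        # Tells the kernel which driver is allowed to claim the device.
        "override": pci / "devices" / pci_addr / "driver_override",
        # The file named in the error message of this bug.
        "bind": pci / "drivers" / "vfio-pci" / "bind",
    }


def bind_to_vfio(pci_addr: str, sysfs_root: str = "/sys") -> None:
    """Bind one PCI device to vfio-pci (assumed sequence, requires root)."""
    paths = vfio_bind_paths(pci_addr, sysfs_root)
    if paths["unbind"].exists():
        paths["unbind"].write_text(pci_addr)
    paths["override"].write_text("vfio-pci")
    # If vfio-pci cannot claim the device, this write raises OSError;
    # the log in this report shows it failing with "no such device".
    paths["bind"].write_text(pci_addr)
```

On a node where the vfio-pci driver cannot claim the device, the final write is expected to raise OSError, which is consistent with the "no such device" message above.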


Expected results:

The SriovNetworkNodePolicy should be applied and the config daemon pod should keep running.

Additional info:

This issue happens on RHCOS (Red Hat Enterprise Linux CoreOS).

Comment 1 Peng Liu 2019-07-31 07:13:50 UTC
Because IOMMU is not enabled via kernel args on RHCOS, the vfio-pci driver could not be bound.

PR https://github.com/openshift/sriov-network-operator/pull/46
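
A quick way to check this diagnosis on a node is to look at the kernel command line and at the IOMMU groups exposed in sysfs. The sketch below assumes an Intel platform where the relevant argument is intel_iommu=on; that argument name is an assumption, since the comment above does not say which kernel args are missing. The function names are hypothetical.

```python
from pathlib import Path


def iommu_arg_present(cmdline: str) -> bool:
    """Return True if the kernel command line contains intel_iommu=on.

    Assumption: Intel hardware; other platforms use different arguments.
    """
    return "intel_iommu=on" in cmdline.split()


def iommu_groups_present(sysfs_root: str = "/sys") -> bool:
    """Return True if the kernel exposes at least one IOMMU group."""
    groups = Path(sysfs_root) / "kernel" / "iommu_groups"
    return groups.is_dir() and any(groups.iterdir())
```

For example, `iommu_arg_present(Path("/proc/cmdline").read_text())` returning False on an affected node would be consistent with the analysis above.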

Comment 2 Peng Liu 2019-08-05 11:00:27 UTC
PR merged.

Comment 4 zhaozhanqi 2019-08-06 10:22:03 UTC
Moving this bug to MODIFIED since the new image has not been built yet.

Comment 6 zhaozhanqi 2019-08-08 03:28:49 UTC
Verified this bug with quay.io/openshift-release-dev/ocp-v4.0-art-dev:v4.2.0-201908061019-ose-sriov-network-operator

Comment 7 errata-xmlrpc 2019-10-16 06:34:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922