Bug 1860288 - [sriov] [4.4.z]VF cannot be inited when apply one policy if the 'default' policy is deleted and restored by operator
Summary: [sriov] [4.4.z]VF cannot be inited when apply one policy if the 'default' pol...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.4
Hardware: All
OS: All
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 4.4.z
Assignee: Peng Liu
QA Contact: zhaozhanqi
URL:
Whiteboard:
Duplicates: 1871681
Depends On: 1860286
Blocks:
Reported: 2020-07-24 08:57 UTC by zhaozhanqi
Modified: 2020-10-13 08:18 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1860286
Environment:
Last Closed: 2020-10-13 08:17:44 UTC
Target Upstream Version:
Embargoed:

Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift sriov-network-operator pull 330 | 0 | None | closed | [release-4.4] Bug 1860288: load plugins only when none has been loaded | 2020-10-09 03:17:10 UTC
Red Hat Product Errata | RHBA-2020:4063 | 0 | None | None | None | 2020-10-13 08:18:17 UTC

Description zhaozhanqi 2020-07-24 08:57:40 UTC
+++ This bug was initially created as a clone of Bug #1860286 +++

Description of problem:
After I deleted the 'default' sriovnetworknodepolicies object, the operator restored it. When I then applied a new policy, the VFs could not be initialized.

Version-Release number of selected component (if applicable):
4.5

How reproducible:


Steps to Reproduce:
1. Install the SR-IOV operator.
2. Delete the 'default' policy:
oc delete sriovnetworknodepolicies.sriovnetwork.openshift.io default
3. Check that the 'default' policy is restored:
oc get sriovnetworknodepolicies.sriovnetwork.openshift.io

4. Apply the following policy:
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: intel-netdevice
  namespace: openshift-sriov-network-operator
spec:
  deviceType: netdevice
  nicSelector:
    pfNames:
      - ens1f0
    rootDevices:
      - '0000:3b:00.0'
    vendor: '8086'
  nodeSelector:
    feature.node.kubernetes.io/sriov-capable: 'true'
  numVfs: 5
  priority: 99
  resourceName: intelnetdevice

5. Wait until the sriov-device-plugin and sriov-cni pods are 'running' and the config daemon logs show 'setNodeStateStatus(): syncStatus: Succeeded, lastSyncError:"
6. Check the 'sriovnetworknodestates.sriovnetwork.openshift.io' object for the node:
oc get sriovnetworknodestates.sriovnetwork.openshift.io -o yaml

Actual results:
Step 6: no VFs were initialized.

spec:
  dpConfigVersion: "1540830"
  interfaces:
  - name: ens1f0
    numVfs: 5
    pciAddress: 0000:3b:00.0
    vfGroups:
    - deviceType: netdevice
      policyName: intel-netdevice
      resourceName: intelnetdevice
      vfRange: 0-4
status:
  interfaces:
  - deviceID: "1521"
    driver: igb
    mtu: 1500
    name: eno1
    pciAddress: "0000:18:00.0"
    totalvfs: 7
    vendor: "8086"
  - deviceID: "1521"
    driver: igb
    mtu: 1500
    name: eno2
    pciAddress: "0000:18:00.1"
    totalvfs: 7
    vendor: "8086"
  - deviceID: "1521"
    driver: igb
    mtu: 1500
    name: eno3
    pciAddress: "0000:18:00.2"
    totalvfs: 7
    vendor: "8086"
  - deviceID: "1521"
    driver: igb
    mtu: 1500
    name: eno4
    pciAddress: "0000:18:00.3"
    totalvfs: 7
    vendor: "8086"
  - deviceID: 158b
    driver: i40e
    mtu: 9200
    name: ens1f0
    pciAddress: 0000:3b:00.0
    totalvfs: 64
    vendor: "8086"
  - deviceID: 158b
    driver: i40e
    mtu: 1500
    name: ens1f1
    pciAddress: 0000:3b:00.1
    totalvfs: 64
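
The mismatch in the output above (the spec asks for 5 VFs on ens1f0, but the status entry for that PF lists none) can be checked mechanically. The following is only a rough sketch, not part of the operator: the helper name `interfaces_missing_vfs` is hypothetical, and it assumes that an initialized PF reports its VFs under status as a `Vfs` list and/or a `numVfs` count, which this report does not show.

```python
def interfaces_missing_vfs(node_state):
    """Return names of spec interfaces whose requested VFs are absent from status."""
    spec_ifaces = node_state.get("spec", {}).get("interfaces") or []
    status_ifaces = node_state.get("status", {}).get("interfaces") or []
    # Match spec and status entries by PCI address.
    status_by_pci = {i.get("pciAddress"): i for i in status_ifaces}

    missing = []
    for iface in spec_ifaces:
        wanted = iface.get("numVfs", 0)
        if wanted <= 0:
            continue
        observed = status_by_pci.get(iface.get("pciAddress"), {})
        # Assumption: created VFs show up as a "Vfs" list and/or "numVfs" in status.
        created = max(len(observed.get("Vfs") or []), observed.get("numVfs") or 0)
        if created < wanted:
            missing.append(iface.get("name"))
    return missing


# Trimmed-down copy of the node state shown in this report.
state = {
    "spec": {
        "interfaces": [
            {"name": "ens1f0", "numVfs": 5, "pciAddress": "0000:3b:00.0"},
        ]
    },
    "status": {
        "interfaces": [
            {"name": "eno1", "pciAddress": "0000:18:00.0", "totalvfs": 7},
            {"name": "ens1f0", "pciAddress": "0000:3b:00.0", "totalvfs": 64},
        ]
    },
}

print(interfaces_missing_vfs(state))  # ['ens1f0']
```

With real data, each item returned by `oc get sriovnetworknodestates.sriovnetwork.openshift.io -o json` could be parsed with `json.loads` and passed to the helper.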


Expected results:

The VFs are initialized.

Additional info:

Deleting the config daemon pod so that it is re-created resolves the issue.
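
For anyone scripting this workaround, a minimal sketch follows. The namespace is taken from the policy in this report; the label selector `app=sriov-network-config-daemon` and the helper names are assumptions, so verify the label with `oc get pods --show-labels` before relying on it.

```python
import subprocess

NAMESPACE = "openshift-sriov-network-operator"    # namespace used by the policy in this report
DAEMON_LABEL = "app=sriov-network-config-daemon"  # assumption: label on the config daemon pods


def restart_config_daemon_cmd(node=None):
    """Build the `oc delete pod` command; optionally restrict it to one node."""
    cmd = ["oc", "delete", "pod", "-n", NAMESPACE, "-l", DAEMON_LABEL]
    if node:
        cmd.append(f"--field-selector=spec.nodeName={node}")
    return cmd


def restart_config_daemon(node=None):
    """Delete the config daemon pod(s); the DaemonSet then re-creates them."""
    subprocess.run(restart_config_daemon_cmd(node), check=True)


# Print the command rather than running it, so it can be reviewed first.
print(" ".join(restart_config_daemon_cmd()))
```

Calling `restart_config_daemon()` runs the command and raises if `oc` exits with an error.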

Comment 1 Peng Liu 2020-09-02 06:07:54 UTC
*** Bug 1871681 has been marked as a duplicate of this bug. ***

Comment 7 zhaozhanqi 2020-10-10 11:30:36 UTC
Verified this bug on 4.4

Comment 9 errata-xmlrpc 2020-10-13 08:17:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.4.27 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4063

