Bug 1969869 - virt-launcher pod stuck in Init:0/1 status: error adding container to network "bigip1ens4f0vf2": SRIOV-CNI failed to load netconf: LoadConf(): VF pci addr is required
Summary: virt-launcher pod stuck in Init:0/1 status: error adding container to network...
Keywords:
Status: CLOSED DUPLICATE of bug 1969870
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Networking
Version: 2.6.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Petr Horáček
QA Contact: Meni Yakove
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-06-09 11:26 UTC by Marius Cornea
Modified: 2021-06-16 12:24 UTC
CC List: 1 user

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-16 12:24:13 UTC
Target Upstream Version:
Embargoed:


Attachments:

Description Marius Cornea 2021-06-09 11:26:26 UTC
Description of problem:

Following a 4.6.17 -> 4.6.25 -> 4.7.11 OCP upgrade and a CNV 2.5.7 -> 2.6.5 upgrade, one of the virt-launcher pods remains in Init:0/1 status, with the describe output showing "error adding container to network "bigip1ens4f0vf2": SRIOV-CNI failed to load netconf: LoadConf(): VF pci addr is required"


[kni@ocp-edge18 ~]$ oc -n f5-lb get pods
NAME                         READY   STATUS     RESTARTS   AGE
virt-launcher-bigip0-5g7c7   1/1     Running    0          38m
virt-launcher-bigip1-gfx9q   0/1     Init:0/1   0          68m
[kni@ocp-edge18 ~]$ oc -n f5-lb describe pods virt-launcher-bigip1-gfx9q | grep -i fail
  Warning  FailedScheduling        69m                   default-scheduler  0/7 nodes are available: 2 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 2 node(s) were unschedulable, 3 node(s) didn't match Pod's node affinity.
  Warning  FailedScheduling        69m                   default-scheduler  0/7 nodes are available: 2 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 2 node(s) were unschedulable, 3 node(s) didn't match Pod's node affinity.
  Warning  FailedScheduling        68m                   default-scheduler  0/7 nodes are available: 1 node(s) were unschedulable, 3 node(s) didn't match Pod's node affinity, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling        62m                   default-scheduler  0/7 nodes are available: 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 4 node(s) didn't match Pod's node affinity.
  Warning  FailedCreatePodSandBox  61m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_virt-launcher-bigip1-gfx9q_f5-lb_28cb4897-3d43-4811-8787-c8609e0d5261_0(9b028b2436d825d3217322a0f93a94fc8e7d1b6af52af434b32edcaf2bd2b5be): [f5-lb/virt-launcher-bigip1-gfx9q:bigip1ens4f0vf2]: error adding container to network "bigip1ens4f0vf2": SRIOV-CNI failed to load netconf: LoadConf(): VF pci addr is required
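
The "VF pci addr is required" message is emitted by the SR-IOV CNI plugin when the network config handed to it by Multus lacks a deviceID (the VF PCI address). Below is a hedged diagnostic sketch; the net-attach-def namespace (f5-lb) and <node-name> are assumptions, since only the network name bigip1ens4f0vf2 appears in the event above.

# Check that the NetworkAttachmentDefinition rendered for this network still
# carries the resourceName annotation that the SR-IOV device plugin keys on
# (namespace f5-lb is an assumption):
$ oc -n f5-lb get net-attach-def bigip1ens4f0vf2 \
    -o jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/resourceName}'

# Check that the node the VM lands on still advertises the matching SR-IOV
# resource in its allocatable list (<node-name> is a placeholder):
$ oc get node <node-name> -o jsonpath='{.status.allocatable}'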


Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Deploy OCP 4.6.17 with CNV and SR-IOV operators

2. Create attached sriovnetworknodepolicy, sriovnetwork, NodeNetworkConfigurationPolicy and VirtualMachine

3. Upgrade OCP to 4.6.25 and then to 4.7.11

4. Upgrade CNV to 2.6.5

5. Upgrade SR-IOV network operator from 4.6.0-202106010807.p0.git.78e7139 to 4.7.0-202105211528.p0
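
Since the failure only shows up after the SR-IOV network operator upgrade in step 5, a hedged verification sketch is to confirm the operator finished re-syncing node state after that step (the namespace below is the operator's default install namespace; the actual policy and network names are in the attached manifests, not shown here):

# Confirm the installed operator version after the upgrade:
$ oc -n openshift-sriov-network-operator get csv

# Confirm the policies and networks are still present and the per-node state
# has finished syncing (syncStatus normally reports Succeeded when done):
$ oc -n openshift-sriov-network-operator get sriovnetworknodepolicies,sriovnetworks
$ oc -n openshift-sriov-network-operator get sriovnetworknodestates -o yaml | grep -B1 -A2 syncStatus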


Actual results:
The virt-launcher pod assigned to one of the two VirtualMachines is stuck in Init:0/1 status.

Expected results:
All virt-launcher pods are in Running status, as they were before the upgrade procedure.

Additional info:

After deleting the virt-launcher pod stuck in Init:0/1 status, it is recreated and reaches Running state.
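
For reference, the workaround described above amounts to the following (pod name taken from the oc get pods output earlier in this report):

# Delete the stuck pod; per the note above it is recreated and reaches Running:
$ oc -n f5-lb delete pod virt-launcher-bigip1-gfx9q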

Attaching must-gather and manifests used to create the VMs and networks.

Comment 1 Petr Horáček 2021-06-16 12:24:13 UTC

*** This bug has been marked as a duplicate of bug 1969870 ***

