Bug 1962634 - fail to get interface name for ... GetNetName(): no net directory under pci device during upgrade from OCP 4.5.16 to 4.6.17
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Federico Paolinelli
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-05-20 12:36 UTC by Andreas Karis
Modified: 2024-12-20 20:05 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-08 17:43:37 UTC
Target Upstream Version:
Embargoed:


Description Andreas Karis 2021-05-20 12:36:58 UTC
Description of problem:
The sriov-network-config-daemon logs "fail to get interface name for ... GetNetName(): no net directory under pci device" warnings during an upgrade from OCP 4.5.16 to 4.6.17.

The purpose of this BZ is to identify whether this is a known issue or one that warrants attention, and to track it through to resolution.

~~~
$ oc logs -n openshift-sriov-network-operator sriov-network-config-daemon-plkp6
W0511 17:52:18.398178   33692 utils.go:275] setNetdevMTU(): fail to get interface name for 0000:12:0a.0: GetNetName(): no net directory under pci device 0000:12:0a.0: "lstat /sys/bus/pci/devices/0000:12:0a.0/net: no such file or directory"
W0511 17:52:19.398672   33692 utils.go:275] setNetdevMTU(): fail to get interface name for 0000:12:0a.0: GetNetName(): no net directory under pci device 0000:12:0a.0: "lstat /sys/bus/pci/devices/0000:12:0a.0/net: no such file or directory"
W0511 17:52:20.398961   33692 utils.go:275] setNetdevMTU(): fail to get interface name for 0000:12:0a.0: GetNetName(): no net directory under pci device 0000:12:0a.0: "lstat /sys/bus/pci/devices/0000:12:0a.0/net: no such file or directory"
W0511 17:52:21.399302   33692 utils.go:275] setNetdevMTU(): fail to get interface name for 0000:12:0a.0: GetNetName(): no net directory under pci device 0000:12:0a.0: "lstat /sys/bus/pci/devices/0000:12:0a.0/net: no such file or directory"
W0511 17:52:22.399551   33692 utils.go:275] setNetdevMTU(): fail to get interface name for 0000:12:0a.0: GetNetName(): no net directory under pci device 0000:12:0a.0: "lstat /sys/bus/pci/devices/0000:12:0a.0/net: no such file or directory"
W0511 17:52:23.399774   33692 utils.go:275] setNetdevMTU(): fail to get interface name for 0000:12:0a.0: GetNetName(): no net directory under pci device 0000:12:0a.0: "lstat /sys/bus/pci/devices/0000:12:0a.0/net: no such file or directory"
W0511 17:52:24.989896   33692 utils.go:275] setNetdevMTU(): fail to get interface name for 0000:12:0f.3: GetNetName(): no net directory under pci device 0000:12:0f.3: "lstat /sys/bus/pci/devices/0000:12:0f.3/net: no such file or directory"
~~~
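Since the same warning repeats every second, it can help to reduce the log to the distinct PCI addresses it names. A minimal sketch, assuming the message format shown in the excerpt above; the function name `affected_pci_addresses` is hypothetical and not part of any tool:

```shell
# Hypothetical helper: read sriov-network-config-daemon log lines on stdin
# and print each PCI address named in a GetNetName() warning exactly once.
affected_pci_addresses() {
    grep 'GetNetName(): no net directory under pci device' |
        sed -E 's/.*fail to get interface name for ([0-9a-f:.]+): GetNetName.*/\1/' |
        sort -u
}
```

Usage: `oc logs -n openshift-sriov-network-operator sriov-network-config-daemon-plkp6 | affected_pci_addresses`. For the excerpt above this would print 0000:12:0a.0 and 0000:12:0f.3.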

These warnings appeared after the node was rebooted, and the `virt-launcher` pods on that node were found not to be running.
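The lstat failure in the log refers to the `net` directory under each device's sysfs entry. A rough sketch for checking, directly on the node, which devices under an address prefix lack that directory; the function name and its second argument (an alternative sysfs root, useful for trying the function out) are assumptions made for illustration:

```shell
# Hypothetical helper: list PCI devices whose sysfs entry has no net/ directory.
# $1: PCI address prefix, e.g. 0000:12:0
# $2: sysfs device root (assumed default: /sys/bus/pci/devices)
list_vfs_without_netdev() {
    prefix="$1"
    root="${2:-/sys/bus/pci/devices}"
    for dev in "$root"/"$prefix"*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$dev" ] || continue
        if [ ! -d "$dev/net" ]; then
            basename "$dev"
        fi
    done
}
```

Usage on the node: `list_vfs_without_netdev 0000:12:0`. Note that a missing `net` directory is not by itself an error: a VF bound to a driver that does not create a network interface (for example vfio-pci) has none either.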

This issue was resolved by unbinding and re-binding the SR-IOV VFs on that node, and then deleting its sriov-network-config-daemon pod.

Unbind and re-bind the node's SR-IOV VFs:
~~~
ssh core@node-06
sudo -i
cd /sys/bus/pci/devices/
# Unbind every device matching 0000:12:0* from the iavf driver
for x in 0000:12:0*; do echo "$x" > /sys/bus/pci/drivers/iavf/unbind; done
# Bind the same devices back to iavf
for x in 0000:12:0*; do echo "$x" > /sys/bus/pci/drivers/iavf/bind; done
~~~
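After re-binding, one way to confirm the result is to read each device's `driver` symlink in sysfs. A minimal sketch under the same assumptions as the commands above (devices matching 0000:12:0*); the function name is hypothetical, and the optional second argument only exists so the function can be pointed at a different directory:

```shell
# Hypothetical helper: print "<pci-address> <driver>" for each matching device,
# or "<pci-address> none" when the device is not bound to any driver.
# $1: PCI address prefix, e.g. 0000:12:0
# $2: sysfs device root (assumed default: /sys/bus/pci/devices)
show_vf_drivers() {
    prefix="$1"
    root="${2:-/sys/bus/pci/devices}"
    for dev in "$root"/"$prefix"*; do
        [ -e "$dev" ] || continue
        if [ -L "$dev/driver" ]; then
            # The driver symlink points at /sys/bus/pci/drivers/<name>.
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv=none
        fi
        echo "$(basename "$dev") $drv"
    done
}
```

Usage on the node: `show_vf_drivers 0000:12:0`. Every VF that was re-bound successfully should be reported with the iavf driver.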

Delete the node's sriov-network-config-daemon pod:
~~~
oc delete pod -n openshift-sriov-network-operator sriov-network-config-daemon-plkp6
~~~


Version-Release number of selected component (if applicable):
OCP 4.5.16 -> 4.6.17

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 11 Federico Paolinelli 2021-11-08 17:43:37 UTC
Closing this, feel free to reopen if you are able to reproduce it again.

Comment 12 Red Hat Bugzilla 2023-09-15 01:06:55 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.

