Bug 2250687 - [FFU 16.2 -> 17.1] Failed getting representor port for PF on RHEL 9 compute
Summary: [FFU 16.2 -> 17.1] Failed getting representor port for PF on RHEL 9 compute
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: os-net-config
Version: 17.1 (Wallaby)
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: z2
Target Release: 17.1
Assignee: Vijayalakshmi Candappa
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-11-20 14:50 UTC by Ricardo Diaz
Modified: 2024-01-16 14:31 UTC (History)

Fixed In Version: os-net-config-14.2.1-17.1.20230729001034.el9ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-01-16 14:31:44 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 902286 0 None MERGED Link_mode and vdpa is not updated in sriov_map 2023-12-06 06:53:09 UTC
Red Hat Issue Tracker OSP-30505 0 None None None 2023-11-20 14:53:53 UTC
Red Hat Product Errata RHBA-2024:0209 0 None None None 2024-01-16 14:31:46 UTC

Description Ricardo Diaz 2023-11-20 14:50:06 UTC
Description of problem:

We see an error like this after FFU when trying to instantiate a VM on the RHEL 9 compute:
~~~
2023-11-20 14:38:08.349 2 ERROR os_vif [req-c4ac4d07-7099-4c49-b51d-e44be8600d08 337bfc9a65aa46a6a39f01f02fdbfaa2 5bd6730c34a8449391627c115ee3f674 - default default] Failed to plug vif VIFHostDevice(active=False,address=fa:16:3e:e9:45:36,dev_address=0000:04:01.3,dev_type='ethernet',has_traffic_filtering=True,id=ebcb5c49-c124-4292-a1dd-cd289a928429,network=Network(8c93d973-3c89-4568-aae7-8970716ff52b),plugin='ovs',port_profile=VIFPortProfileOVSRepresentor,preserve_on_delete=True): vif_plug_ovs.exception.RepresentorNotFound: Failed getting representor port for PF enp4s0f0 with 9
2023-11-20 14:38:08.349 2 ERROR os_vif Traceback (most recent call last):
2023-11-20 14:38:08.349 2 ERROR os_vif   File "/usr/lib/python3.9/site-packages/os_vif/__init__.py", line 77, in plug
2023-11-20 14:38:08.349 2 ERROR os_vif     plugin.plug(vif, instance_info)
2023-11-20 14:38:08.349 2 ERROR os_vif   File "/usr/lib/python3.9/site-packages/vif_plug_ovs/ovs.py", line 345, in plug
2023-11-20 14:38:08.349 2 ERROR os_vif     self._plug_vf(vif, instance_info)
2023-11-20 14:38:08.349 2 ERROR os_vif   File "/usr/lib/python3.9/site-packages/vif_plug_ovs/ovs.py", line 311, in _plug_vf
2023-11-20 14:38:08.349 2 ERROR os_vif     representor = linux_net.get_representor_port(pf_ifname, vf_num)
2023-11-20 14:38:08.349 2 ERROR os_vif   File "/usr/lib/python3.9/site-packages/vif_plug_ovs/linux_net.py", line 296, in get_representor_port
2023-11-20 14:38:08.349 2 ERROR os_vif     raise exception.RepresentorNotFound(ifname=pf_ifname, vf_num=vf_num)
2023-11-20 14:38:08.349 2 ERROR os_vif vif_plug_ovs.exception.RepresentorNotFound: Failed getting representor port for PF enp4s0f0 with 9
2023-11-20 14:38:08.349 2 ERROR os_vif
2023-11-20 14:38:08.350 2 ERROR nova.virt.libvirt.driver [req-c4ac4d07-7099-4c49-b51d-e44be8600d08 337bfc9a65aa46a6a39f01f02fdbfaa2 5bd6730c34a8449391627c115ee3f674 - default default] [instance: d5bbdca7-bb5f-478e-a8a9-3dc1451c9288] Failed to start libvirt guest: nova.exception.InternalError: Failure running os_vif plugin plug method: Failed to plug VIF VIFHostDevice(active=False,address=fa:16:3e:e9:45:36,dev_address=0000:04:01.3,dev_type='ethernet',has_traffic_filtering=True,id=ebcb5c49-c124-4292-a1dd-cd289a928429,network=Network(8c93d973-3c89-4568-aae7-8970716ff52b),plugin='ovs',port_profile=VIFPortProfileOVSRepresentor,preserve_on_delete=True). Got error: Failed getting representor port for PF enp4s0f0 with 9
~~~

The networking configuration seems to be ok:
~~~
[root@computehwoffload-0 ~]# lshw -c network -businfo  |grep enp4
pci@0000:04:00.0  enp4s0f0    network        MT27800 Family [ConnectX-5]
pci@0000:04:00.1  enp4s0f1    network        MT27800 Family [ConnectX-5]
pci@0000:04:00.2  enp4s0f0v0  network        MT27800 Family [ConnectX-5 Virtual Function]
...
pci@0000:04:01.3  enp4s0f0v9  network        MT27800 Family [ConnectX-5 Virtual Function]
pci@0000:04:02.6  enp4s0f1v0  network        MT27800 Family [ConnectX-5 Virtual 
...

[root@computehwoffload-0 ~]# ip -details l show enp4s0f0
10: enp4s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master mx-bond state UP mode DEFAULT group default qlen 1000
    link/ether 04:3f:72:b8:bb:5e brd ff:ff:ff:ff:ff:ff promiscuity 1 minmtu 68 maxmtu 9978 
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr 04:3f:72:b8:bb:5e queue_id 0 addrgenmode eui64 numtxqueues 576 numrxqueues 80 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 portname p0 switchid 5ebbb80003723f04 parentbus pci parentdev 0000:04:00.0 
    vf 0     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 2     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 3     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 4     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 5     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 6     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 7     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 8     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 9     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    altname enp4s0f0np0
~~~

However, after doing some debugging in get_representor_port, we can see that the enp4s0f0_X names are not present on the RHEL 9 compute in /sys/class/net/enp4s0f0/subsystem/:
~~~
[root@computehwoffload-0 ~]# ls /sys/class/net/enp4s0f0/subsystem/
bond_api         eno1        enp130s0f1  enp4s0f0v1  enp4s0f0v6  enp4s0f1v0  enp4s0f1v5  enp6s0f0        lo          vlan162
bonding_masters  eno2        enp130s0f2  enp4s0f0v2  enp4s0f0v7  enp4s0f1v1  enp4s0f1v6  enp6s0f1        mx-bond
br-int           eno3        enp130s0f3  enp4s0f0v3  enp4s0f0v8  enp4s0f1v2  enp4s0f1v7  enp6s0f2        ovs-system
br-link0         eno4        enp4s0f0    enp4s0f0v4  enp4s0f0v9  enp4s0f1v3  enp4s0f1v8  enp6s0f3        vlan160
br-link1         enp130s0f0  enp4s0f0v0  enp4s0f0v5  enp4s0f1    enp4s0f1v4  enp4s0f1v9  genev_sys_6081  vlan161
~~~
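
For anyone repeating this debugging, the lookup can be approximated outside of nova. The following is a simplified sketch, not the os-vif implementation: it assumes a VF representor shares the PF's phys_switch_id and reports a phys_port_name of the form "pf0vf9" or a bare "9". The function name find_representor and the sysfs_net parameter are hypothetical. It scans /sys/class/net directly, which is the directory the subsystem link of any netdev points back to.

```python
import os
import re

# Assumed phys_port_name formats for a VF representor: "pf0vf9" or a bare "9".
_VF_PORT_RE = re.compile(r"^(?:pf\d+)?vf(\d+)$|^(\d+)$")


def _read(path):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def find_representor(pf_ifname, vf_num, sysfs_net="/sys/class/net"):
    """Return the netdev name of the representor for (PF, VF), or None."""
    pf_switch_id = _read(os.path.join(sysfs_net, pf_ifname, "phys_switch_id"))
    if not pf_switch_id:
        return None  # the PF exposes no switch ID, so nothing can match
    for dev in sorted(os.listdir(sysfs_net)):
        if dev == pf_ifname:
            continue
        dev_dir = os.path.join(sysfs_net, dev)
        # Skip devices that belong to a different switch (or to none).
        if _read(os.path.join(dev_dir, "phys_switch_id")) != pf_switch_id:
            continue
        port_name = _read(os.path.join(dev_dir, "phys_port_name"))
        match = _VF_PORT_RE.match(port_name or "")
        if match and int(match.group(1) or match.group(2)) == vf_num:
            return dev
    return None
```

Calling find_representor("enp4s0f0", 9) on the affected compute would be expected to return None if no representor netdev exists, which corresponds to the RepresentorNotFound error in the traceback.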

In contrast, on the RHEL 8 compute (the working compute) we can see that the enp4s0f0_X names exist in subsystem:
~~~
[root@computehwoffload-1 ~]# ls /sys/class/net/enp4s0f0/subsystem/
bond_api         eno1        enp130s0f1  enp4s0f0_1  enp4s0f0_6  enp4s0f0v8  enp4s0f1_2  enp4s0f1_7  enp4s0f1v9  genev_sys_6081  vlan161
bonding_masters  eno2        enp130s0f2  enp4s0f0_2  enp4s0f0_7  enp4s0f0v9  enp4s0f1_3  enp4s0f1_8  enp6s0f0    lo              vlan162
br-int           eno3        enp130s0f3  enp4s0f0_3  enp4s0f0_8  enp4s0f1    enp4s0f1_4  enp4s0f1_9  enp6s0f1    mx-bond
br-link0         eno4        enp4s0f0    enp4s0f0_4  enp4s0f0_9  enp4s0f1_0  enp4s0f1_5  enp4s0f1v7  enp6s0f2    ovs-system
br-link1         enp130s0f0  enp4s0f0_0  enp4s0f0_5  enp4s0f0v7  enp4s0f1_1  enp4s0f1_6  enp4s0f1v8  enp6s0f3    vlan160
~~~
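
The difference between the two listings can also be checked mechanically. This is a minimal sketch that relies only on the naming seen in the output above: VF netdevs appear as <pf>vN and representor netdevs as <pf>_N. That naming comes from this environment and may differ on other systems, and the helper name split_vf_and_representor_names is hypothetical.

```python
import re


def split_vf_and_representor_names(pf_ifname, names):
    """Split interface names into VF indexes and representor indexes for one PF.

    Assumes the naming seen in this report: "<pf>vN" for VF netdevs and
    "<pf>_N" for representor netdevs.
    """
    vf_re = re.compile(rf"^{re.escape(pf_ifname)}v(\d+)$")
    rep_re = re.compile(rf"^{re.escape(pf_ifname)}_(\d+)$")
    vfs, reps = set(), set()
    for name in names:
        vf_match = vf_re.match(name)
        if vf_match:
            vfs.add(int(vf_match.group(1)))
            continue
        rep_match = rep_re.match(name)
        if rep_match:
            reps.add(int(rep_match.group(1)))
    return sorted(vfs), sorted(reps)
```

Feeding it the output of ls /sys/class/net/enp4s0f0/subsystem/ split on whitespace gives an empty representor list for the RHEL 9 listing, while the RHEL 8 listing yields representor indexes 0 through 9.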

Version-Release number of selected component (if applicable):
FFU 17.1

How reproducible:
100%

Steps to Reproduce:
1. Perform FFU from 16.2 to 17.1
2. Instantiate a VM with HW Offload on the RHEL 9 compute

Actual results:


Expected results:


Additional info:

Comment 20 errata-xmlrpc 2024-01-16 14:31:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 17.1.2 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:0209

