Bug 2188324 - [17.1][E810 VF] No connectivity from a vm using a e810 vf
Summary: [17.1][E810 VF] No connectivity from a vm using a e810 vf
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: os-net-config
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: beta
Target Release: ---
Assignee: Karthik Sundaravel
QA Contact: Miguel Angel Nieto
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-04-20 13:25 UTC by Miguel Angel Nieto
Modified: 2023-08-07 05:32 UTC
CC List: 17 users

Fixed In Version: kernel-5.14.0-223.el9
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-04-25 14:07:42 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker NFV-2832 0 None None None 2023-04-20 14:24:01 UTC
Red Hat Issue Tracker OSP-24419 0 None None None 2023-04-20 13:26:55 UTC

Description Miguel Angel Nieto 2023-04-20 13:25:59 UTC
Description of problem:
I have configured a VM using an E810 VF and there is no connectivity from inside the VM. However, if I use another VF from the hypervisor, I can see that connectivity is OK. I have tried with two different guest images, RHEL 8.4 and RHEL 9.1, and ping did not work with either of them.

servers:
(overcloud) [stack@undercloud-0 ~]$ openstack server list --all-projects
+--------------------------------------+----------------------+--------+-----------------------------------------------------------------------------------------------------------------+--------------+-----------------------+
| ID                                   | Name                 | Status | Networks                                                                                                        | Image        | Flavor                |
+--------------------------------------+----------------------+--------+-----------------------------------------------------------------------------------------------------------------+--------------+-----------------------+
| a13a96a6-824e-4cee-8ad7-20621bb2f1d1 | testpmd-sriov-vf-dut | ACTIVE | management_net_management=10.10.107.165, 10.46.228.18; sriov_net_178=10.10.178.166; sriov_net_179=10.10.179.151 | trex_testpmd | perf_numa_0_sriov_dut |
| 43d60f26-e3ac-47e3-abba-4422a8c93922 | trex                 | ACTIVE | management_net_management=10.10.107.179, 10.46.228.21; sriov_net_178=10.10.178.181; sriov_net_179=10.10.179.155 | trex_testpmd | perf_numa_0_trex      |
+--------------------------------------+----------------------+--------+-----------------------------------------------------------------------------------------------------------------+--------------+-----------------------+

E810 VF port of one of the VMs:
(overcloud) [stack@undercloud-0 ~]$ openstack port show 6f541ba2-7d7b-4da7-abb0-4db785d5f64d
+-------------------------+----------------------------------------------------------------------------------+
| Field                   | Value                                                                            |
+-------------------------+----------------------------------------------------------------------------------+
| admin_state_up          | UP                                                                               |
| allowed_address_pairs   |                                                                                  |
| binding_host_id         | computeovsdpdksriov-r740.localdomain                                             |
| binding_profile         | pci_slot='0000:3b:11.1', pci_vendor_info='8086:1889', physical_network='sriov-2' |
| binding_vif_details     | connectivity='l2', port_filter='False', vlan='179'                               |
| binding_vif_type        | hw_veb                                                                           |
| binding_vnic_type       | direct                                                                           |
| created_at              | 2023-04-20T10:11:15Z                                                             |
| data_plane_status       | None                                                                             |
| description             |                                                                                  |
| device_id               | a13a96a6-824e-4cee-8ad7-20621bb2f1d1                                             |
| device_owner            | compute:nova                                                                     |
| device_profile          | None                                                                             |
| dns_assignment          | None                                                                             |
| dns_domain              | None                                                                             |
| dns_name                | None                                                                             |
| extra_dhcp_opts         |                                                                                  |
| fixed_ips               | ip_address='10.10.179.151', subnet_id='1e7ea9b7-5755-4aca-9dee-8cadbbf39390'     |
| id                      | 6f541ba2-7d7b-4da7-abb0-4db785d5f64d                                             |
| ip_allocation           | None                                                                             |
| mac_address             | fa:16:3e:f1:f1:35                                                                |
| name                    | sriov_net_nic1_179_dpdk_dut_port-1                                               |
| network_id              | 3c7fadce-0e01-4b12-a264-91be11f0696d                                             |
| numa_affinity_policy    | None                                                                             |
| port_security_enabled   | False                                                                            |
| project_id              | cd61188ce36c40aea44675b1b4d306e0                                                 |
| propagate_uplink_status | None                                                                             |
| qos_network_policy_id   | None                                                                             |
| qos_policy_id           | None                                                                             |
| resource_request        | None                                                                             |
| revision_number         | 5                                                                                |
| security_group_ids      |                                                                                  |
| status                  | ACTIVE                                                                           |
| tags                    |                                                                                  |
| trunk_details           | None                                                                             |
| updated_at              | 2023-04-20T10:15:55Z                                                             |
+-------------------------+----------------------------------------------------------------------------------+

[root@computeovsdpdksriov-r740 tripleo-admin]# lspci |grep 3b.11.1
3b:11.1 Ethernet controller: Intel Corporation Ethernet Adaptive Virtual Function (rev 02)

[root@computeovsdpdksriov-r740 tripleo-admin]# ip link | grep -B 3 fa:16:3e:f1:f1:35
10: ens1f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:18:e0:61 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 06:64:cf:a8:0e:4a brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1     link/ether fa:16:3e:f1:f1:35 brd ff:ff:ff:ff:ff:ff, vlan 179, spoof checking off, link-state enable, trust off

VM interface:
[cloud-user@testpmd-sriov-vf-dut ~]$ ip a s eth2
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
    link/ether fa:16:3e:f1:f1:35 brd ff:ff:ff:ff:ff:ff
    altname enp6s0
    inet 10.10.179.151/24 brd 10.10.179.255 scope global noprefixroute eth2
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fef1:f135/64 scope link 
       valid_lft forever preferred_lft forever

[cloud-user@testpmd-sriov-vf-dut ~]$ ping 10.10.179.254
PING 10.10.179.254 (10.10.179.254) 56(84) bytes of data.
From 10.10.179.151 icmp_seq=1 Destination Host Unreachable
From 10.10.179.151 icmp_seq=2 Destination Host Unreachable
From 10.10.179.151 icmp_seq=3 Destination Host Unreachable
^C
--- 10.10.179.254 ping statistics ---
4 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3067ms
pipe 3
[cloud-user@testpmd-sriov-vf-dut ~]$ ping -c 3 -w 1 10.10.179.254
PING 10.10.179.254 (10.10.179.254) 56(84) bytes of data.

--- 10.10.179.254 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

If I use another VF (vf 2) from the hypervisor:
10: ens1f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:18:e0:61 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 06:64:cf:a8:0e:4a brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1     link/ether fa:16:3e:f1:f1:35 brd ff:ff:ff:ff:ff:ff, vlan 179, spoof checking off, link-state enable, trust off
    vf 2     link/ether 5e:f6:cb:1a:de:4b brd ff:ff:ff:ff:ff:ff, vlan 179, spoof checking off, link-state auto, trust off

[root@computeovsdpdksriov-r740 tripleo-admin]# ip a s ens1f1v2
33: ens1f1v2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 5e:f6:cb:1a:de:4b brd ff:ff:ff:ff:ff:ff
    altname enp59s0f1v2
    inet 10.10.179.20/24 scope global ens1f1v2
       valid_lft forever preferred_lft forever
    inet6 fe80::4a0e:456:c94c:90a2/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

[root@computeovsdpdksriov-r740 tripleo-admin]# ping 10.10.179.254
PING 10.10.179.254 (10.10.179.254) 56(84) bytes of data.
64 bytes from 10.10.179.254: icmp_seq=1 ttl=64 time=19.4 ms
64 bytes from 10.10.179.254: icmp_seq=2 ttl=64 time=17.3 ms



Version-Release number of selected component (if applicable):
RHOS-17.1-RHEL-9-20230404.n.1

How reproducible:
1. Deploy VMs with E810 VFs. I used ospd-17.1-geneve-ovn-dpdk-sriov-ctlplane-dataplane-bonding-hybrid-e810-performance_sriov
2. Configure an L3 interface on the switch with IP 10.10.179.254 for VLAN 179 (see the sketch after these steps)
3. Ping from the VM to the interface on the switch
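
For reference, the switch side of step 2 would look roughly like the following SVI on a Cisco IOS-style switch; the switch vendor is not stated in this report, so treat this as an illustrative sketch rather than the actual config used:
interface Vlan179
 ip address 10.10.179.254 255.255.255.0
 no shutdown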


Actual results:
There is no connectivity from inside the VM using an E810 VF.


Expected results:
There should be connectivity


Additional info:

Comment 1 Karthik Sundaravel 2023-04-24 10:40:40 UTC
Can you set ``trust`` to ``on`` for the respective VFs and repeat the tests?
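
For reference, trust can be enabled on a VF from the hypervisor with a command along these lines (ens1f1 / vf 1 are taken from the output above; adjust them to the VF that backs the port):
ip link set dev ens1f1 vf 1 trust on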

Comment 2 Miguel Angel Nieto 2023-04-25 07:35:16 UTC
Hi

It didn't work, no ping. I set trust to on:
6: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:18:e0:60 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 5e:b0:a3:f8:f0:96 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1     link/ether b2:24:69:d8:55:7b brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 2     link/ether be:5e:f2:11:d9:fa brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 3     link/ether 92:4a:11:11:c8:0b brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 4     link/ether 26:02:28:a5:a4:2c brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 5     link/ether ea:b2:2f:39:f9:12 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 6     link/ether ea:61:b8:a6:0b:0a brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 7     link/ether 0e:dc:07:c3:0f:59 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 8     link/ether 52:d9:a5:55:2a:f9 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 9     link/ether fa:16:3e:40:4d:02 brd ff:ff:ff:ff:ff:ff, vlan 178, spoof checking off, link-state enable, trust on
    altname enp59s0f0
9: ens1f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 40:a6:b7:18:e0:61 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 6e:2e:c0:74:66:ce brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 1     link/ether 8e:23:63:4a:0a:cd brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 2     link/ether 8e:dc:9c:82:07:e8 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 3     link/ether fa:ea:d4:c0:51:91 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 4     link/ether 2e:d2:87:23:c5:df brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 5     link/ether 3a:e2:1d:91:08:e4 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 6     link/ether f2:55:4e:4c:66:0b brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 7     link/ether 6e:cb:06:4c:0c:32 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 8     link/ether a6:7e:3f:30:ec:56 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    vf 9     link/ether fa:16:3e:cd:52:82 brd ff:ff:ff:ff:ff:ff, vlan 179, spoof checking off, link-state enable, trust on

Comment 3 Karthik Sundaravel 2023-04-25 09:08:33 UTC
After clearing the VLAN on ens1f0 of the compute node hosting the testpmd VM, the ping works from the guest.

ip link set dev ens1f0 vf 9 vlan 0 spoofchk off trust off

[cloud-user@testpmd-sriov-vf-dut ~]$ ping 10.10.178.254 -c3
PING 10.10.178.254 (10.10.178.254) 56(84) bytes of data.
64 bytes from 10.10.178.254: icmp_seq=1 ttl=64 time=10.4 ms
64 bytes from 10.10.178.254: icmp_seq=2 ttl=64 time=120 ms
64 bytes from 10.10.178.254: icmp_seq=3 ttl=64 time=102 ms

--- 10.10.178.254 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 10.372/77.438/119.908/47.980 ms


Can you PTAL at the switch config?

Comment 4 Karthik Sundaravel 2023-04-25 11:20:53 UTC
The issue is reproduced after the switch config is restored.

But when I just removed the VLAN on the VF (vf 9) from the host and created a VLAN interface on the guest with the same VLAN, the ping works.
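
A sketch of that workaround, using the interface names from this setup (the guest address is assumed from the port's fixed IP on sriov_net_178):
# on the compute node: strip the VLAN tag from the VF
ip link set dev ens1f0 vf 9 vlan 0
# inside the guest: tag the traffic on the VF interface instead
ip link add link eth1 name eth1.178 type vlan id 178
ip link set dev eth1.178 up
ip addr add 10.10.178.166/24 dev eth1.178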


Also, from the guest's dmesg we see the following whenever the VLAN tag is added to the VF:

Apr 25 07:14:54 testpmd-sriov-vf-dut kernel: iavf 0000:05:00.0: Reset warning received from the PF
Apr 25 07:14:54 testpmd-sriov-vf-dut kernel: iavf 0000:05:00.0: Scheduling reset task
Apr 25 07:14:54 testpmd-sriov-vf-dut kernel: iavf 0000:05:00.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
Apr 25 07:14:55 testpmd-sriov-vf-dut NetworkManager[988]: <info>  [1682421295.0143] device (eth1): carrier: link connected
Apr 25 07:14:55 testpmd-sriov-vf-dut NetworkManager[988]: <info>  [1682421295.0146] device (eth1.178): carrier: link connected
Apr 25 07:14:55 testpmd-sriov-vf-dut kernel: iavf 0000:05:00.0 eth1: NIC Link is Up Speed is 10 Gbps Full Duplex

This indicates that the E810 driver path is not handling the VLAN set on the VFs.

It appears to be a known issue in older kernel versions for the E810. Can you please retest with a recent RHEL guest image and confirm?

At this point the guest uses a 4.18 kernel:
[cloud-user@testpmd-sriov-vf-dut ~]$ uname -a
Linux testpmd-sriov-vf-dut 4.18.0-305.16.1.el8_4.x86_64 #1 SMP Mon Aug 23 13:15:56 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

Comment 6 Miguel Angel Nieto 2023-04-25 13:22:15 UTC
I have tried with the RHEL 8.4 and 9.1 guest images and neither of them worked.

Now I have tried with RHEL 9.2 and this one works, so for E810 VFs only a RHEL 9.2 guest works.
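
For completeness, the guest kernel and VF driver can be confirmed with something like the following (eth2 is the VF interface name in this guest; the E810 VF is driven by iavf):
uname -r
ethtool -i eth2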

RHEL 8.4
Linux testpmd-sriov-vf-dut 4.18.0-305.16.1.el8_4.x86_64 #1 SMP Mon Aug 23 13:15:56 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux

RHEL 9.1
Red Hat Enterprise Linux release 9.1 (Plow)
Linux testpmd-sriov-vf-dut 5.14.0-162.6.1.el9_1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Sep 30 07:36:03 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux

RHEL 9.2
Red Hat Enterprise Linux release 9.2 Beta (Plow)
Linux trex 5.14.0-283.el9.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Feb 23 19:04:04 EST 2023 x86_64 x86_64 x86_64 GNU/Linux

Comment 7 Miguel Angel Nieto 2023-04-25 14:07:42 UTC
RHEL 9.2 is needed as the guest image; older RHEL versions will fail.

RHEL 9.2
Red Hat Enterprise Linux release 9.2 Beta (Plow)
Linux trex 5.14.0-283.el9.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Feb 23 19:04:04 EST 2023 x86_64 x86_64 x86_64 GNU/Linux

Comment 20 Miguel Angel Nieto 2023-05-25 09:22:53 UTC
It looks like the latest images, generated in April and later, are working fine.

RHEL 8
http://download.eng.tlv.redhat.com/rhel-8/rel-eng/RHEL-8/latest-RHEL-8.7.0/compose/BaseOS/x86_64/images/rhel-guest-image-8.7-1660.x86_64.qcow2
Red Hat Enterprise Linux release 8.7 (Ootpa)
Linux testpmd-sriov-vf-dut 4.18.0-425.3.1.el8.x86_64 #1 SMP Fri Sep 30 11:45:06 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
fail

http://download.eng.tlv.redhat.com/rhel-8/rel-eng/RHEL-8/latest-RHEL-8.8.0/compose/BaseOS/x86_64/images/rhel-guest-image-8.8-1482.x86_64.qcow2
Red Hat Enterprise Linux release 8.8 (Ootpa)
Linux trex 4.18.0-477.10.1.el8_8.x86_64 #1 SMP Wed Apr 5 13:35:01 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
working


RHEL 9
http://download.eng.tlv.redhat.com/rhel-9/rel-eng/RHEL-9/RHEL-9.2.0-20230306.4/compose/BaseOS/x86_64/images/rhel-guest-image-9.2-20230306.4.x86_64.qcow2
Red Hat Enterprise Linux release 9.2 (Plow)
There is connectivity, but ping latency is not stable; there is huge variability.


http://download.eng.tlv.redhat.com/rhel-9/rel-eng/RHEL-9/latest-RHEL-9.2.0/compose/BaseOS/x86_64/images/rhel-guest-image-9.2-20230414.17.x86_64.qcow2
Red Hat Enterprise Linux release 9.2 (Plow)
Linux testpmd-sriov-vf-dut 5.14.0-284.11.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Apr 12 10:45:03 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
working

