Bug 2188324
| Summary: | [17.1][E810 VF] No connectivity from a vm using a e810 vf | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Miguel Angel Nieto <mnietoji> |
| Component: | os-net-config | Assignee: | Karthik Sundaravel <ksundara> |
| Status: | VERIFIED | QA Contact: | Miguel Angel Nieto <mnietoji> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 17.1 (Wallaby) | CC: | bfournie, cfontain, dacampbe, ekuris, eshulman, hakhande, hbrock, ivecera, jjoyce, jschluet, jslagle, konguyen, ksundara, mburns, mschmidt, pgrist, poros |
| Target Milestone: | beta | Keywords: | Regression, Reopened, TestOnly, Triaged |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | kernel-5.14.0-223.el9 | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-04-25 14:07:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Can you set the ``trust`` of the respective VFs to ``on`` and repeat the tests?
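(For reference, a minimal sketch of the requested change, assuming the PF and VF index shown in the output below; the exact device names and VF indices are specific to this environment and only illustrative.)

```shell
# On the compute node: mark the VF attached to the guest as trusted.
ip link set dev ens1f0 vf 9 trust on

# Confirm the new VF settings.
ip link show dev ens1f0 | grep 'vf 9'
```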
Hi, it didn't work. There is still no ping after setting trust to on:
```
6: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 40:a6:b7:18:e0:60 brd ff:ff:ff:ff:ff:ff
vf 0 link/ether 5e:b0:a3:f8:f0:96 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 1 link/ether b2:24:69:d8:55:7b brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 2 link/ether be:5e:f2:11:d9:fa brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 3 link/ether 92:4a:11:11:c8:0b brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 4 link/ether 26:02:28:a5:a4:2c brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 5 link/ether ea:b2:2f:39:f9:12 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 6 link/ether ea:61:b8:a6:0b:0a brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 7 link/ether 0e:dc:07:c3:0f:59 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 8 link/ether 52:d9:a5:55:2a:f9 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 9 link/ether fa:16:3e:40:4d:02 brd ff:ff:ff:ff:ff:ff, vlan 178, spoof checking off, link-state enable, trust on
altname enp59s0f0
9: ens1f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 40:a6:b7:18:e0:61 brd ff:ff:ff:ff:ff:ff
vf 0 link/ether 6e:2e:c0:74:66:ce brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 1 link/ether 8e:23:63:4a:0a:cd brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 2 link/ether 8e:dc:9c:82:07:e8 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 3 link/ether fa:ea:d4:c0:51:91 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 4 link/ether 2e:d2:87:23:c5:df brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 5 link/ether 3a:e2:1d:91:08:e4 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 6 link/ether f2:55:4e:4c:66:0b brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 7 link/ether 6e:cb:06:4c:0c:32 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 8 link/ether a6:7e:3f:30:ec:56 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 9 link/ether fa:16:3e:cd:52:82 brd ff:ff:ff:ff:ff:ff, vlan 179, spoof checking off, link-state enable, trust on
```
After clearing the VLAN on ens1f0 of the compute node housing the testpmd guest, the ping works from the guest:

```
ip link set dev ens1f0 vf 9 vlan 0 spoofchk off trust off

[cloud-user@testpmd-sriov-vf-dut ~]$ ping 10.10.178.254 -c3
PING 10.10.178.254 (10.10.178.254) 56(84) bytes of data.
64 bytes from 10.10.178.254: icmp_seq=1 ttl=64 time=10.4 ms
64 bytes from 10.10.178.254: icmp_seq=2 ttl=64 time=120 ms
64 bytes from 10.10.178.254: icmp_seq=3 ttl=64 time=102 ms

--- 10.10.178.254 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 10.372/77.438/119.908/47.980 ms
```

Can you please take a look at the switch config?

The issue reproduces once the switch config is restored. However, when I simply remove the VLAN on the VF (vf 9) from the host and create a VLAN interface on the guest with the same VLAN, the ping works. Also, the guest's dmesg shows the following whenever the VLAN tag is added to the VF:

```
Apr 25 07:14:54 testpmd-sriov-vf-dut kernel: iavf 0000:05:00.0: Reset warning received from the PF
Apr 25 07:14:54 testpmd-sriov-vf-dut kernel: iavf 0000:05:00.0: Scheduling reset task
Apr 25 07:14:54 testpmd-sriov-vf-dut kernel: iavf 0000:05:00.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
Apr 25 07:14:55 testpmd-sriov-vf-dut NetworkManager[988]: <info>  [1682421295.0143] device (eth1): carrier: link connected
Apr 25 07:14:55 testpmd-sriov-vf-dut NetworkManager[988]: <info>  [1682421295.0146] device (eth1.178): carrier: link connected
Apr 25 07:14:55 testpmd-sriov-vf-dut kernel: iavf 0000:05:00.0 eth1: NIC Link is Up Speed is 10 Gbps Full Duplex
```

This indicates that the E810 driver path is not handling the VLAN set on the VFs. It appears to be a known issue in older kernel versions for E810. Can you please use a recent RHEL guest image and confirm? At this point the guest uses a 4.18 kernel.
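(For reference, a minimal sketch of the host-plus-guest workaround described above, assuming the PF is ens1f0, the affected VF index is 9, and the guest sees the VF as eth1 on VLAN 178; interface names and addresses are taken from this environment and are only illustrative.)

```shell
# Host (compute node): stop tagging the VLAN on the VF itself.
ip link set dev ens1f0 vf 9 vlan 0 spoofchk off trust off

# Guest: carry the tag on a VLAN sub-interface instead.
ip link add link eth1 name eth1.178 type vlan id 178
ip addr add 10.10.178.166/24 dev eth1.178
ip link set dev eth1.178 up
ping -c 3 10.10.178.254
```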
```
[cloud-user@testpmd-sriov-vf-dut ~]$ uname -a
Linux testpmd-sriov-vf-dut 4.18.0-305.16.1.el8_4.x86_64 #1 SMP Mon Aug 23 13:15:56 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
```

I have tried with the RHEL 8.4 and RHEL 9.1 guest images and neither of them worked. Now I have tried with RHEL 9.2 and this one is working, so for using E810 VFs only a RHEL 9.2 guest works:

- RHEL 8.4: Linux testpmd-sriov-vf-dut 4.18.0-305.16.1.el8_4.x86_64 #1 SMP Mon Aug 23 13:15:56 EDT 2021 x86_64 x86_64 x86_64 GNU/Linux
- RHEL 9.1: Red Hat Enterprise Linux release 9.1 (Plow), Linux testpmd-sriov-vf-dut 5.14.0-162.6.1.el9_1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Sep 30 07:36:03 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
- RHEL 9.2: Red Hat Enterprise Linux release 9.2 Beta (Plow), Linux trex 5.14.0-283.el9.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Feb 23 19:04:04 EST 2023 x86_64 x86_64 x86_64 GNU/Linux

RHEL 9.2 is needed in the guest image; older RHEL versions will fail.

It looks like the latest images, generated in April and later, are working fine:

RHEL 8:
- http://download.eng.tlv.redhat.com/rhel-8/rel-eng/RHEL-8/latest-RHEL-8.7.0/compose/BaseOS/x86_64/images/rhel-guest-image-8.7-1660.x86_64.qcow2 (Red Hat Enterprise Linux release 8.7 (Ootpa), Linux testpmd-sriov-vf-dut 4.18.0-425.3.1.el8.x86_64 #1 SMP Fri Sep 30 11:45:06 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux): fail
- http://download.eng.tlv.redhat.com/rhel-8/rel-eng/RHEL-8/latest-RHEL-8.8.0/compose/BaseOS/x86_64/images/rhel-guest-image-8.8-1482.x86_64.qcow2 (Red Hat Enterprise Linux release 8.8 (Ootpa), Linux trex 4.18.0-477.10.1.el8_8.x86_64 #1 SMP Wed Apr 5 13:35:01 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux): working

RHEL 9:
- http://download.eng.tlv.redhat.com/rhel-9/rel-eng/RHEL-9/RHEL-9.2.0-20230306.4/compose/BaseOS/x86_64/images/rhel-guest-image-9.2-20230306.4.x86_64.qcow2 (Red Hat Enterprise Linux release 9.2 (Plow)): there is connectivity, but ping latency is not stable; there is huge variability
- http://download.eng.tlv.redhat.com/rhel-9/rel-eng/RHEL-9/latest-RHEL-9.2.0/compose/BaseOS/x86_64/images/rhel-guest-image-9.2-20230414.17.x86_64.qcow2 (Red Hat Enterprise Linux release 9.2 (Plow), Linux testpmd-sriov-vf-dut 5.14.0-284.11.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Apr 12 10:45:03 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux): working
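(A quick way to confirm from inside the guest whether a given image carries a new enough kernel and iavf driver for E810 VF VLAN handling; the interface name eth1 below is an assumption for illustration.)

```shell
# Guest kernel version (the working images above ship 4.18.0-477 / 5.14.0-28x or later).
uname -r

# Driver and version backing the SR-IOV interface inside the guest.
ethtool -i eth1
```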
Description of problem:

I have configured a VM using an E810 VF and there is no connectivity from inside the VM. However, if I use another VF from the hypervisor, connectivity is OK. I have tried with two different guest images, RHEL 8.4 and RHEL 9.1, and ping did not work with either of them.

Servers:

```
(overcloud) [stack@undercloud-0 ~]$ openstack server list --all-projects
| ID | Name | Status | Networks | Image | Flavor |
| a13a96a6-824e-4cee-8ad7-20621bb2f1d1 | testpmd-sriov-vf-dut | ACTIVE | management_net_management=10.10.107.165, 10.46.228.18; sriov_net_178=10.10.178.166; sriov_net_179=10.10.179.151 | trex_testpmd | perf_numa_0_sriov_dut |
| 43d60f26-e3ac-47e3-abba-4422a8c93922 | trex | ACTIVE | management_net_management=10.10.107.179, 10.46.228.21; sriov_net_178=10.10.178.181; sriov_net_179=10.10.179.155 | trex_testpmd | perf_numa_0_trex |
```

E810 VF port of one of the VMs:

```
(overcloud) [stack@undercloud-0 ~]$ openstack port show 6f541ba2-7d7b-4da7-abb0-4db785d5f64d
| Field | Value |
| admin_state_up | UP |
| allowed_address_pairs | |
| binding_host_id | computeovsdpdksriov-r740.localdomain |
| binding_profile | pci_slot='0000:3b:11.1', pci_vendor_info='8086:1889', physical_network='sriov-2' |
| binding_vif_details | connectivity='l2', port_filter='False', vlan='179' |
| binding_vif_type | hw_veb |
| binding_vnic_type | direct |
| created_at | 2023-04-20T10:11:15Z |
| data_plane_status | None |
| description | |
| device_id | a13a96a6-824e-4cee-8ad7-20621bb2f1d1 |
| device_owner | compute:nova |
| device_profile | None |
| dns_assignment | None |
| dns_domain | None |
| dns_name | None |
| extra_dhcp_opts | |
| fixed_ips | ip_address='10.10.179.151', subnet_id='1e7ea9b7-5755-4aca-9dee-8cadbbf39390' |
| id | 6f541ba2-7d7b-4da7-abb0-4db785d5f64d |
| ip_allocation | None |
| mac_address | fa:16:3e:f1:f1:35 |
| name | sriov_net_nic1_179_dpdk_dut_port-1 |
| network_id | 3c7fadce-0e01-4b12-a264-91be11f0696d |
| numa_affinity_policy | None |
| port_security_enabled | False |
| project_id | cd61188ce36c40aea44675b1b4d306e0 |
| propagate_uplink_status | None |
| qos_network_policy_id | None |
| qos_policy_id | None |
| resource_request | None |
| revision_number | 5 |
| security_group_ids | |
| status | ACTIVE |
| tags | |
| trunk_details | None |
| updated_at | 2023-04-20T10:15:55Z |
```

```
[root@computeovsdpdksriov-r740 tripleo-admin]# lspci |grep 3b.11.1
3b:11.1 Ethernet controller: Intel Corporation Ethernet Adaptive Virtual Function (rev 02)

[root@computeovsdpdksriov-r740 tripleo-admin]# ip link | grep -B 3 fa:16:3e:f1:f1:35
10: ens1f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 40:a6:b7:18:e0:61 brd ff:ff:ff:ff:ff:ff
vf 0 link/ether 06:64:cf:a8:0e:4a brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 1 link/ether fa:16:3e:f1:f1:35 brd ff:ff:ff:ff:ff:ff, vlan 179, spoof checking off, link-state enable, trust off
```

VM interface:

```
[cloud-user@testpmd-sriov-vf-dut ~]$ ip a s eth2
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
link/ether fa:16:3e:f1:f1:35 brd ff:ff:ff:ff:ff:ff
altname enp6s0
inet 10.10.179.151/24 brd 10.10.179.255 scope global noprefixroute eth2
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fef1:f135/64 scope link
valid_lft forever preferred_lft forever

[cloud-user@testpmd-sriov-vf-dut ~]$ ping 10.10.179.254
PING 10.10.179.254 (10.10.179.254) 56(84) bytes of data.
From 10.10.179.151 icmp_seq=1 Destination Host Unreachable
From 10.10.179.151 icmp_seq=2 Destination Host Unreachable
From 10.10.179.151 icmp_seq=3 Destination Host Unreachable
^C
--- 10.10.179.254 ping statistics ---
4 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3067ms
pipe 3

[cloud-user@testpmd-sriov-vf-dut ~]$ ping -c 3 -w 1 10.10.179.254
PING 10.10.179.254 (10.10.179.254) 56(84) bytes of data.

--- 10.10.179.254 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
```

If I use another VF (vf 2) from the hypervisor, connectivity works:

```
10: ens1f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 40:a6:b7:18:e0:61 brd ff:ff:ff:ff:ff:ff
vf 0 link/ether 06:64:cf:a8:0e:4a brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
vf 1 link/ether fa:16:3e:f1:f1:35 brd ff:ff:ff:ff:ff:ff, vlan 179, spoof checking off, link-state enable, trust off
vf 2 link/ether 5e:f6:cb:1a:de:4b brd ff:ff:ff:ff:ff:ff, vlan 179, spoof checking off, link-state auto, trust off

[root@computeovsdpdksriov-r740 tripleo-admin]# ip a s ens1f1v2
33: ens1f1v2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 5e:f6:cb:1a:de:4b brd ff:ff:ff:ff:ff:ff
altname enp59s0f1v2
inet 10.10.179.20/24 scope global ens1f1v2
valid_lft forever preferred_lft forever
inet6 fe80::4a0e:456:c94c:90a2/64 scope link noprefixroute
valid_lft forever preferred_lft forever

[root@computeovsdpdksriov-r740 tripleo-admin]# ping 10.10.179.254
PING 10.10.179.254 (10.10.179.254) 56(84) bytes of data.
64 bytes from 10.10.179.254: icmp_seq=1 ttl=64 time=19.4 ms
64 bytes from 10.10.179.254: icmp_seq=2 ttl=64 time=17.3 ms
```

Version-Release number of selected component (if applicable):
RHOS-17.1-RHEL-9-20230404.n.1

How reproducible:
1. Deploy VMs with E810 VFs. I used ospd-17.1-geneve-ovn-dpdk-sriov-ctlplane-dataplane-bonding-hybrid-e810-performance_sriov.
2. Configure an L3 interface in the switch with IP 10.10.179.254 for VLAN 179.
3. Ping from the VM to the interface in the switch.

Actual results:
There is no connectivity from inside the VM using an E810 VF.

Expected results:
There should be connectivity.

Additional info:
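(For completeness, a minimal sketch of the host-side cross-check shown above, i.e. tagging a spare VF with the same VLAN and pinging the switch gateway directly from the hypervisor; device names, VF index, and addresses are the ones from this setup and may differ elsewhere.)

```shell
# On the compute node: put the tenant VLAN on a spare VF and disable spoof checking.
ip link set dev ens1f1 vf 2 vlan 179 spoofchk off

# Bring up the matching VF netdev on the host and give it an address in the tenant subnet.
ip link set dev ens1f1v2 up
ip addr add 10.10.179.20/24 dev ens1f1v2

# If this ping works while the guest's VF does not, the switch side is likely fine.
ping -c 3 10.10.179.254
```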