Bug 2175802 - [17.1][OVN][OFFLOAD][Connect X5] High variability in ping latency
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Haresh Khandelwal
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-03-06 15:09 UTC by Miguel Angel Nieto
Modified: 2023-05-11 07:27 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-04-11 09:52:59 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker NFV-2782 0 None None None 2023-03-13 14:21:45 UTC
Red Hat Issue Tracker OSP-22926 0 None None None 2023-03-06 15:09:51 UTC

Description Miguel Angel Nieto 2023-03-06 15:09:03 UTC
Description of problem:

I deployed the OVN hwoffload templates in 17.1 and spawned several VMs.

(overcloud) [stack@undercloud-0 ~]$ openstack server list --all-projects
+--------------------------------------+------------------------------------------+--------+----------------------------------------------------+---------------------------------------+--------------------+
| ID                                   | Name                                     | Status | Networks                                           | Image                                 | Flavor             |
+--------------------------------------+------------------------------------------+--------+----------------------------------------------------+---------------------------------------+--------------------+
| e8a92dc1-a911-409f-b036-f4ff8d922a54 | tempest-TestNfvOffload-server-1950151452 | ACTIVE | mellanox-vlan-provider=10.46.228.38, 30.30.220.170 | rhel-guest-image-7-6-210-x86-64-qcow2 | nfv_qe_base_flavor |
| 7b8f9415-9d8f-4a37-8aa7-35badf4c3aa3 | tempest-TestNfvOffload-server-829199820  | ACTIVE | mellanox-vlan-provider=10.46.228.36, 30.30.220.171 | rhel-guest-image-7-6-210-x86-64-qcow2 | nfv_qe_base_flavor |
| a9bda131-616c-47b2-8a1f-ea513cdef1bd | tempest-TestNfvOffload-server-880070818  | ACTIVE | mellanox-vlan-provider=10.46.228.41, 30.30.220.151 | rhel-guest-image-7-6-210-x86-64-qcow2 | nfv_qe_base_flavor |
| fc1528f2-fdd8-48a2-834f-6d44b7434845 | tempest-TestNfvOffload-server-1738747327 | ACTIVE | mellanox-vlan-provider=10.46.228.39, 30.30.220.141 | rhel-guest-image-7-6-210-x86-64-qcow2 | nfv_qe_base_flavor |
+--------------------------------------+------------------------------------------+--------+----------------------------------------------------+---------------------------------------+--------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack port list | grep tempest
| 9bd78ced-2d86-4275-a77b-55ea0f19ea74 | tempest-port-smoke-141608843  | fa:16:3e:1e:cd:b7 | ip_address='30.30.220.171', subnet_id='c42a58a2-518e-4194-a1cc-bbef7c28ccc6' | ACTIVE |
| ea03a1c6-6c81-472e-a7cf-ce13490937de | tempest-port-smoke-115596582  | fa:16:3e:d6:a2:4a | ip_address='30.30.220.141', subnet_id='c42a58a2-518e-4194-a1cc-bbef7c28ccc6' | ACTIVE |
| f885e50b-af98-456a-8f0f-403fbb6ab910 | tempest-port-smoke-1825543414 | fa:16:3e:08:ef:1f | ip_address='30.30.220.151', subnet_id='c42a58a2-518e-4194-a1cc-bbef7c28ccc6' | ACTIVE |
| ff133e14-8192-4595-8d98-dc4e921f2af0 | tempest-port-smoke-1819262289 | fa:16:3e:5b:95:b8 | ip_address='30.30.220.170', subnet_id='c42a58a2-518e-4194-a1cc-bbef7c28ccc6' | ACTIVE |


(overcloud) [stack@undercloud-0 ~]$ openstack port show ff133e14-8192-4595-8d98-dc4e921f2af0 

+-------------------------+-------------------------------------------------------------------------------------------------------------------+
| Field                   | Value                                                                                                             |
+-------------------------+-------------------------------------------------------------------------------------------------------------------+
| admin_state_up          | UP                                                                                                                |
| allowed_address_pairs   |                                                                                                                   |
| binding_host_id         | computehwoffload-r730.localdomain                                                                                 |
| binding_profile         | capabilities='['switchdev']', pci_slot='0000:04:03.3', pci_vendor_info='15b3:1018', physical_network='mx-network' |
| binding_vif_details     | connectivity='l2', port_filter='True'                                                                             |
| binding_vif_type        | ovs                                                                                                               |
| binding_vnic_type       | direct                                                                                                            |
| created_at              | 2023-03-06T14:43:56Z                                                                                              |
| data_plane_status       | None                                                                                                              |
| description             |                                                                                                                   |
| device_id               | e8a92dc1-a911-409f-b036-f4ff8d922a54                                                                              |
| device_owner            | compute:nova                                                                                                      |
| device_profile          | None                                                                                                              |
| dns_assignment          | None                                                                                                              |
| dns_domain              | None                                                                                                              |
| dns_name                | None                                                                                                              |
| extra_dhcp_opts         |                                                                                                                   |
| fixed_ips               | ip_address='30.30.220.170', subnet_id='c42a58a2-518e-4194-a1cc-bbef7c28ccc6'                                      |
| id                      | ff133e14-8192-4595-8d98-dc4e921f2af0                                                                              |
| ip_allocation           | None                                                                                                              |
| mac_address             | fa:16:3e:5b:95:b8                                                                                                 |
| name                    | tempest-port-smoke-1819262289                                                                                     |
| network_id              | 314717b6-a7ee-4055-a800-a16cc44f6841                                                                              |
| numa_affinity_policy    | None                                                                                                              |
| port_security_enabled   | False                                                                                                             |
| project_id              | 64ec7e3d990a4ffd89d6b8207b5663be                                                                                  |
| propagate_uplink_status | None                                                                                                              |
| qos_network_policy_id   | None                                                                                                              |
| qos_policy_id           | None                                                                                                              |
| resource_request        | None                                                                                                              |
| revision_number         | 4                                                                                                                 |
| security_group_ids      |                                                                                                                   |
| status                  | ACTIVE                                                                                                            |
| tags                    |                                                                                                                   |
| trunk_details           | None                                                                                                              |
| updated_at              | 2023-03-06T14:44:09Z                                                                                              |
+-------------------------+-------------------------------------------------------------------------------------------------------------------+


When I ping a VM through its floating IP, I see high latency variability:
[stack@undercloud-0 ~]$ ping 10.46.228.38
PING 10.46.228.38 (10.46.228.38) 56(84) bytes of data.
64 bytes from 10.46.228.38: icmp_seq=1 ttl=61 time=255 ms
64 bytes from 10.46.228.38: icmp_seq=2 ttl=61 time=1303 ms
64 bytes from 10.46.228.38: icmp_seq=3 ttl=61 time=273 ms
64 bytes from 10.46.228.38: icmp_seq=4 ttl=61 time=1321 ms
64 bytes from 10.46.228.38: icmp_seq=5 ttl=61 time=274 ms
64 bytes from 10.46.228.38: icmp_seq=6 ttl=61 time=1066 ms
64 bytes from 10.46.228.38: icmp_seq=7 ttl=61 time=19.1 ms
64 bytes from 10.46.228.38: icmp_seq=8 ttl=61 time=42.1 ms
64 bytes from 10.46.228.38: icmp_seq=9 ttl=61 time=321 ms
64 bytes from 10.46.228.38: icmp_seq=10 ttl=61 time=1388 ms
64 bytes from 10.46.228.38: icmp_seq=11 ttl=61 time=359 ms
64 bytes from 10.46.228.38: icmp_seq=12 ttl=61 time=986 ms
64 bytes from 10.46.228.38: icmp_seq=13 ttl=61 time=387 ms
64 bytes from 10.46.228.38: icmp_seq=14 ttl=61 time=279 ms
64 bytes from 10.46.228.38: icmp_seq=15 ttl=61 time=398 ms
64 bytes from 10.46.228.38: icmp_seq=16 ttl=61 time=1480 ms
64 bytes from 10.46.228.38: icmp_seq=17 ttl=61 time=468 ms

If I ping from the compute node, I also see variability:
[tripleo-admin@computehwoffload-r730 ~]$ sudo ip netns exec ovnmeta-314717b6-a7ee-4055-a800-a16cc44f6841 ping 30.30.220.170
PING 30.30.220.170 (30.30.220.170) 56(84) bytes of data.
64 bytes from 30.30.220.170: icmp_seq=1 ttl=64 time=1364 ms
64 bytes from 30.30.220.170: icmp_seq=2 ttl=64 time=347 ms
64 bytes from 30.30.220.170: icmp_seq=3 ttl=64 time=231 ms
64 bytes from 30.30.220.170: icmp_seq=4 ttl=64 time=429 ms
64 bytes from 30.30.220.170: icmp_seq=5 ttl=64 time=233 ms
64 bytes from 30.30.220.170: icmp_seq=6 ttl=64 time=343 ms
64 bytes from 30.30.220.170: icmp_seq=7 ttl=64 time=227 ms
64 bytes from 30.30.220.170: icmp_seq=8 ttl=64 time=497 ms
64 bytes from 30.30.220.170: icmp_seq=9 ttl=64 time=230 ms
64 bytes from 30.30.220.170: icmp_seq=10 ttl=64 time=339 ms
64 bytes from 30.30.220.170: icmp_seq=11 ttl=64 time=224 ms
64 bytes from 30.30.220.170: icmp_seq=12 ttl=64 time=237 ms
64 bytes from 30.30.220.170: icmp_seq=13 ttl=64 time=225 ms
64 bytes from 30.30.220.170: icmp_seq=14 ttl=64 time=335 ms
64 bytes from 30.30.220.170: icmp_seq=15 ttl=64 time=219 ms
64 bytes from 30.30.220.170: icmp_seq=16 ttl=64 time=681 ms
64 bytes from 30.30.220.170: icmp_seq=17 ttl=64 time=1334 ms
64 bytes from 30.30.220.170: icmp_seq=18 ttl=64 time=283 ms
64 bytes from 30.30.220.170: icmp_seq=19 ttl=64 time=167 ms
64 bytes from 30.30.220.170: icmp_seq=20 ttl=64 time=725 ms
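
To put a number on the variability seen in the ping outputs above, the time= values can be summarised with a few lines of Python. This is only a sketch: the function name ping_stats and the choice of population standard deviation as the spread measure are assumptions for illustration, and the sample is the first four replies from the compute-node ping.

```python
import re
import statistics

# Matches the "time=<value> ms" field printed by ping for each reply.
TIME_RE = re.compile(r"time=([\d.]+) ms")


def ping_stats(output: str) -> dict:
    """Summarise the time= values (in ms) found in ping output.

    Hypothetical helper, not part of any OpenStack or iputils tooling.
    """
    times = [float(m.group(1)) for m in TIME_RE.finditer(output)]
    if not times:
        raise ValueError("no 'time=' samples found")
    return {
        "count": len(times),
        "min": min(times),
        "max": max(times),
        "mean": statistics.mean(times),
        # Population standard deviation, used here as a simple jitter measure.
        "stdev": statistics.pstdev(times),
    }


sample = """\
64 bytes from 30.30.220.170: icmp_seq=1 ttl=64 time=1364 ms
64 bytes from 30.30.220.170: icmp_seq=2 ttl=64 time=347 ms
64 bytes from 30.30.220.170: icmp_seq=3 ttl=64 time=231 ms
64 bytes from 30.30.220.170: icmp_seq=4 ttl=64 time=429 ms
"""
stats = ping_stats(sample)
print(stats)
```

Feeding the full output of a run before and after a change gives directly comparable min, max, mean and spread figures.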


I have seen this with CX5 (ConnectX-5); I have not tested with CX6 yet.
04:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
04:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]



Version-Release number of selected component (if applicable):
RHOS-17.1-RHEL-9-20230301.n.1

How reproducible:
1. Deploy the OVN hwoffload templates: ospd-17.1-geneve-ovn-hw-offload-ctlplane-dataplane-bonding-hybrid
2. Spawn VMs using OVS hwoffload VLAN VFs (or geneve; same issue)
3. Ping either from the undercloud to the FIP or from the compute namespace ovnmeta-xxxx to the instance IP address
4. Check the latency variability
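
Step 3 can be sketched as a small Python helper that builds the two ping invocations. The function names and the fixed reply count passed with ping's -c option are assumptions for illustration; the addresses and the network ID are the ones shown in this report. The second command has to be run on the compute node itself.

```python
import subprocess


def ping_commands(floating_ip, network_id, instance_ip, count=20):
    """Build the two ping invocations from step 3 as argv lists.

    Hypothetical helper; count=20 is an arbitrary choice, not from the report.
    """
    from_undercloud = ["ping", "-c", str(count), floating_ip]
    from_compute = [
        "sudo", "ip", "netns", "exec", f"ovnmeta-{network_id}",
        "ping", "-c", str(count), instance_ip,
    ]
    return from_undercloud, from_compute


def run(argv):
    """Run one command and return its stdout; needs access to the real hosts."""
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


undercloud_cmd, compute_cmd = ping_commands(
    "10.46.228.38",
    "314717b6-a7ee-4055-a800-a16cc44f6841",
    "30.30.220.170",
)
print(" ".join(undercloud_cmd))
print(" ".join(compute_cmd))
```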


Actual results:
Ping latency is high and varies widely, from tens of milliseconds to over a second.


Expected results:
Ping Latency should be lower and more stable


Additional info:

Comment 3 Miguel Angel Nieto 2023-03-16 13:40:13 UTC
I opened this issue for an OVN scenario, but the same issue occurs with ml2-ovs.

64 bytes from 10.46.228.38: icmp_seq=434 ttl=61 time=92.3 ms
64 bytes from 10.46.228.38: icmp_seq=435 ttl=61 time=97.6 ms
64 bytes from 10.46.228.38: icmp_seq=436 ttl=61 time=650 ms
64 bytes from 10.46.228.38: icmp_seq=437 ttl=61 time=650 ms
64 bytes from 10.46.228.38: icmp_seq=438 ttl=61 time=650 ms
64 bytes from 10.46.228.38: icmp_seq=439 ttl=61 time=177 ms
64 bytes from 10.46.228.38: icmp_seq=440 ttl=61 time=650 ms
64 bytes from 10.46.228.38: icmp_seq=441 ttl=61 time=649 ms
64 bytes from 10.46.228.38: icmp_seq=442 ttl=61 time=649 ms
64 bytes from 10.46.228.38: icmp_seq=443 ttl=61 time=650 ms
64 bytes from 10.46.228.38: icmp_seq=444 ttl=61 time=650 ms
64 bytes from 10.46.228.38: icmp_seq=445 ttl=61 time=90.9 ms
64 bytes from 10.46.228.38: icmp_seq=446 ttl=61 time=648 ms

Comment 4 Miguel Angel Nieto 2023-03-16 13:43:42 UTC
Ping from VM to VM in an ml2-ovs scenario:

[cloud-user@tempest-testnfvoffload-server-995490205 ~]$ ping 30.30.220.199
PING 30.30.220.199 (30.30.220.199) 56(84) bytes of data.
64 bytes from 30.30.220.199: icmp_seq=1 ttl=64 time=3595 ms
64 bytes from 30.30.220.199: icmp_seq=2 ttl=64 time=2595 ms
64 bytes from 30.30.220.199: icmp_seq=3 ttl=64 time=1595 ms
64 bytes from 30.30.220.199: icmp_seq=4 ttl=64 time=595 ms
64 bytes from 30.30.220.199: icmp_seq=5 ttl=64 time=2424 ms
64 bytes from 30.30.220.199: icmp_seq=6 ttl=64 time=1425 ms
64 bytes from 30.30.220.199: icmp_seq=7 ttl=64 time=425 ms
64 bytes from 30.30.220.199: icmp_seq=8 ttl=64 time=1893 ms
64 bytes from 30.30.220.199: icmp_seq=9 ttl=64 time=894 ms
64 bytes from 30.30.220.199: icmp_seq=10 ttl=64 time=824 ms
64 bytes from 30.30.220.199: icmp_seq=11 ttl=64 time=894 ms
64 bytes from 30.30.220.199: icmp_seq=12 ttl=64 time=435 ms
64 bytes from 30.30.220.199: icmp_seq=13 ttl=64 time=873 ms
64 bytes from 30.30.220.199: icmp_seq=14 ttl=64 time=1895 ms
64 bytes from 30.30.220.199: icmp_seq=15 ttl=64 time=896 ms
64 bytes from 30.30.220.199: icmp_seq=16 ttl=64 time=2896 ms
64 bytes from 30.30.220.199: icmp_seq=17 ttl=64 time=1896 ms
64 bytes from 30.30.220.199: icmp_seq=18 ttl=64 time=896 ms
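
In the VM-to-VM output above, several consecutive times differ by almost exactly 1000 ms (3595, 2595, 1595, 595). Assuming ping's default one-second send interval, that pattern would mean those replies arrived at about the same moment. The sketch below estimates each reply's arrival time and groups replies that land together. The function names and the 5 ms tolerance are assumptions for illustration, and this is only one reading of the numbers, not a confirmed diagnosis.

```python
import re

# Captures the sequence number and round-trip time of each reply line.
LINE_RE = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")


def reply_arrivals(output: str, interval_ms: float = 1000.0):
    """Return (seq, estimated arrival time in ms) for each reply.

    Assumes request N was sent (N - 1) * interval_ms after the first one.
    """
    arrivals = []
    for m in LINE_RE.finditer(output):
        seq, rtt = int(m.group(1)), float(m.group(2))
        arrivals.append((seq, (seq - 1) * interval_ms + rtt))
    return arrivals


def group_bursts(arrivals, tolerance_ms: float = 5.0):
    """Group consecutive replies whose estimated arrivals are within tolerance."""
    bursts = []
    for seq, t in arrivals:
        if bursts and abs(t - bursts[-1][-1][1]) <= tolerance_ms:
            bursts[-1].append((seq, t))
        else:
            bursts.append([(seq, t)])
    return bursts


sample = """\
64 bytes from 30.30.220.199: icmp_seq=1 ttl=64 time=3595 ms
64 bytes from 30.30.220.199: icmp_seq=2 ttl=64 time=2595 ms
64 bytes from 30.30.220.199: icmp_seq=3 ttl=64 time=1595 ms
64 bytes from 30.30.220.199: icmp_seq=4 ttl=64 time=595 ms
64 bytes from 30.30.220.199: icmp_seq=5 ttl=64 time=2424 ms
64 bytes from 30.30.220.199: icmp_seq=6 ttl=64 time=1425 ms
64 bytes from 30.30.220.199: icmp_seq=7 ttl=64 time=425 ms
"""
bursts = group_bursts(reply_arrivals(sample))
burst_seqs = [[seq for seq, _ in burst] for burst in bursts]
print(burst_seqs)
```

On this sample the replies fall into two groups, sequence numbers 1 to 4 and 5 to 7, which is what delivery in bursts would look like.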

Comment 12 Miguel Angel Nieto 2023-04-11 09:52:59 UTC
The problem was caused by the guest VM image, which is fairly old (RHEL 7.6).

After updating the guest VM to RHEL 9.2, I no longer see the ping issue.

