Bug 2221922
| Summary: | [17.1][ML2-OVS][OFFLOAD] vrrp packets sent by ha router cause performance regression | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Miguel Angel Nieto <mnietoji> |
| Component: | openvswitch | Assignee: | RHOSP:NFV_Eng <rhosp-nfv-int> |
| Status: | NEW --- | QA Contact: | Eran Kuris <ekuris> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 17.1 (Wallaby) | CC: | apevec, bcafarel, chrisw, eshulman, hakhande, mleitner, rhosp-nfv-int, rjarry, scohen |
| Target Milestone: | --- | Keywords: | Triaged |
| Target Release: | --- | Flags: | ifrangs: needinfo? (rhosp-nfv-int) |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 2172622 | ||
Any update on this bz? The packets that arrive at the compute and are dropped are causing performance degradation. So, if we want to improve performance, we need to disable HA on the router so that no VRRP packets are generated. Maybe those packets should not be sent at all, since they are dropped.
This is probably a duplicate of bz 2217867.
@mnietoji can you try switching to dmfs and see if it fixes the issue?
[root@computehwoffload-r740 ~]# devlink dev param set pci/0000:18:00.0 name flow_steering_mode value dmfs cmode runtime
[root@computehwoffload-r740 ~]# devlink dev param set pci/0000:18:00.1 name flow_steering_mode value dmfs cmode runtime
[root@computehwoffload-r740 ~]# systemctl restart openvswitch
[root@computehwoffload-r740 ~]# devlink dev param show pci/0000:18:00.0 name flow_steering_mode
pci/0000:18:00.0:
name flow_steering_mode type driver-specific
values:
cmode runtime value dmfs
[root@computehwoffload-r740 ~]# devlink dev param show pci/0000:18:00.1 name flow_steering_mode
pci/0000:18:00.1:
name flow_steering_mode type driver-specific
values:
cmode runtime value dmfs
(In reply to Robin Jarry from comment #3)
When switching to dmfs there is an improvement in performance, but it is not as high as when there are no VRRP packets:
smfs and VRRP packets: 22.5 Mpps
dmfs and VRRP packets: 27.3 Mpps
no VRRP packets (smfs or dmfs): 29.7 Mpps
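To put the three throughput figures from this comment on a common scale, the relative loss against the no-VRRP baseline can be worked out with a small helper. This is only a sketch of the arithmetic: `rel_loss` is a hypothetical helper name, and the only inputs are the Mpps values reported in this comment.

```shell
#!/bin/sh
# Relative throughput loss, in percent, of a measured rate against a baseline.
# Usage: rel_loss BASELINE_MPPS MEASURED_MPPS
# (rel_loss is a hypothetical helper, not part of any tool named in this bug.)
rel_loss() {
    awk -v base="$1" -v meas="$2" \
        'BEGIN { printf "%.1f\n", (base - meas) / base * 100 }'
}

# Figures reported in this comment; 29.7 Mpps is the no-VRRP baseline.
echo "smfs + VRRP: $(rel_loss 29.7 22.5)% below baseline"
echo "dmfs + VRRP: $(rel_loss 29.7 27.3)% below baseline"
```

With these numbers the helper reports about 24.2% for smfs and about 8.1% for dmfs, so dmfs narrows the gap but does not close it.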
Description of problem:
When HA is enabled on a router, the controller sends VRRP packets to the computes through the VXLAN tunnel, but the computes drop these packets when they arrive.

(overcloud) [stack@undercloud-0 ~]$ openstack router set --disable router
(overcloud) [stack@undercloud-0 ~]$ openstack router set --ha router
(overcloud) [stack@undercloud-0 ~]$ openstack router set --enable router
(overcloud) [stack@undercloud-0 ~]$ openstack router show router --fit-width
+-------------------------+----------------------------------------+
| Field                   | Value |
+-------------------------+----------------------------------------+
| admin_state_up          | UP |
| availability_zone_hints | |
| availability_zones      | nova |
| created_at              | 2023-07-10T14:05:13Z |
| description             | |
| external_gateway_info   | {"network_id": "18d67487-db66-414f-a709-7e7185516b1c", "external_fixed_ips": [{"subnet_id": "7961405c-6675-4791-85ec-b180cf1b0fe6", "ip_address": "10.46.228.39"}], "enable_snat": |
|                         | true} |
| flavor_id               | None |
| ha                      | True |
| id                      | f2a30e44-aa20-4c6a-9127-585422eb1c59 |
| interfaces_info         | [{"port_id": "1606473c-8b99-4c4e-8d21-a70334f02ff5", "ip_address": "169.254.192.15", "subnet_id": "6cd0e05a-bf05-4a6f-acc1-8518d511c66d"}, {"port_id": |
|                         | "17b9de50-b364-472a-876e-77f6a10b9b5e", "ip_address": "169.254.192.131", "subnet_id": "6cd0e05a-bf05-4a6f-acc1-8518d511c66d"}, {"port_id": "2aef78d7-3417-4074-ac98-dfe9c68d5f27", |
|                         | "ip_address": "10.10.144.254", "subnet_id": "2531cfd7-c366-45b5-ae80-3a352e1c295d"}, {"port_id": "f8843691-1b2f-4a8f-9771-7a6254d722c2", "ip_address": "169.254.194.210", |
|                         | "subnet_id": "6cd0e05a-bf05-4a6f-acc1-8518d511c66d"}] |
| name                    | router |
| project_id              | 5e533057d91741bcb54359c9a77c8a4d |
| revision_number         | 23 |
| routes                  | |
| status                  | ACTIVE |
| tags                    | |
| updated_at              | 2023-07-11T09:33:15Z |
+-------------------------+----------------------------------------+

I see this flow on the compute:
[root@computehwoffload-r740 os-net-config]# ovs-appctl dpctl/dump-flows -m type=offloaded
ufid:39c661b9-2bf6-428e-9bdc-1974e90acbce, skb_priority(0/0),tunnel(tun_id=0xb70e,src=10.10.141.157,dst=10.10.141.136,ttl=0/0,tp_dst=4789,flags(+key)),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(vxlan_sys_4789),packet_type(ns=0/0,id=0/0),eth(src=00:00:00:00:00:00/00:00:00:00:00:00,dst=00:00:00:00:00:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=0.0.0.0/0.0.0.0,proto=0/0,tos=0/0,ttl=0/0,frag=no), packets:148, bytes:7992, used:0.680s, offloaded:yes, dp:tc, actions:drop

On the controller I see those packets:
[root@controller-1 tripleo-admin]# sudo tcpdump -pni vlan141 "port 4789"
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on vlan141, link-type EN10MB (Ethernet), snapshot length 262144 bytes
09:40:55.834665 IP 10.10.141.157.56032 > 10.10.141.136.vxlan: VXLAN, flags [I] (0x08), vni 46862
IP 169.254.194.210 > 224.0.0.18: VRRPv2, Advertisement, vrid 90, prio 50, authtype none, intvl 2s, length 20
09:40:55.834687 IP 10.10.141.157.56032 > 10.10.141.147.vxlan: VXLAN, flags [I] (0x08), vni 46862
IP 169.254.194.210 > 224.0.0.18: VRRPv2, Advertisement, vrid 90, prio 50, authtype none, intvl 2s, length 20
09:40:55.834691 IP 10.10.141.157.56032 > 10.10.141.191.vxlan: VXLAN, flags [I] (0x08), vni 46862
IP 169.254.194.210 > 224.0.0.18: VRRPv2, Advertisement, vrid 90, prio 50, authtype none, intvl 2s, length 20
09:40:55.834695 IP 10.10.141.157.56032 > 10.10.141.196.vxlan: VXLAN, flags [I] (0x08), vni 46862
IP 169.254.194.210 > 224.0.0.18: VRRPv2, Advertisement, vrid 90, prio 50, authtype none, intvl 2s, length 20
09:40:57.834797 IP 10.10.141.157.56032 > 10.10.141.136.vxlan: VXLAN, flags [I] (0x08), vni 46862
IP 169.254.194.210 > 224.0.0.18: VRRPv2, Advertisement, vrid 90, prio 50, authtype none, intvl 2s, length 20
09:40:57.834816 IP 10.10.141.157.56032 > 10.10.141.147.vxlan: VXLAN, flags [I] (0x08), vni 46862
IP 169.254.194.210 > 224.0.0.18: VRRPv2, Advertisement, vrid 90, prio 50, authtype none, intvl 2s, length 20
09:40:57.834821 IP 10.10.141.157.56032 > 10.10.141.191.vxlan: VXLAN, flags [I] (0x08), vni 46862
IP 169.254.194.210 > 224.0.0.18: VRRPv2, Advertisement, vrid 90, prio 50, authtype none, intvl 2s, length 20
09:40:57.834825 IP 10.10.141.157.56032 > 10.10.141.196.vxlan: VXLAN, flags [I] (0x08), vni 46862
IP 169.254.194.210 > 224.0.0.18: VRRPv2, Advertisement, vrid 90, prio 50, authtype none, intvl 2s, length 20
09:40:59.834939 IP 10.10.141.157.56032 > 10.10.141.136.vxlan: VXLAN, flags [I] (0x08), vni 46862
IP 169.254.194.210 > 224.0.0.18: VRRPv2, Advertisement, vrid 90, prio 50, authtype none, intvl 2s, length 20
09:40:59.834964 IP 10.10.141.157.56032 > 10.10.141.147.vxlan: VXLAN, flags [I] (0x08), vni 46862
IP 169.254.194.210 > 224.0.0.18: VRRPv2, Advertisement, vrid 90, prio 50, authtype none, intvl 2s, length 20
09:40:59.834972 IP 10.10.141.157.56032 > 10.10.141.191.vxlan: VXLAN, flags [I] (0x08), vni 46862
IP 169.254.194.210 > 224.0.0.18: VRRPv2, Advertisement, vrid 90, prio 50, authtype none, intvl 2s, length 20
09:40:59.834978 IP 10.10.141.157.56032 > 10.10.141.196.vxlan: VXLAN, flags [I] (0x08), vni 46862
IP 169.254.194.210 > 224.0.0.18: VRRPv2, Advertisement, vrid 90, prio 50, authtype none, intvl 2s, length 20

These drops are causing a 13% performance loss in offload (the reason is being checked outside of this bz).
If I disable HA, those packets are not sent and I do not see drop flows on the compute:
(overcloud) [stack@undercloud-0 ~]$ openstack router set --disable router
(overcloud) [stack@undercloud-0 ~]$ openstack router set --no-ha router
(overcloud) [stack@undercloud-0 ~]$ openstack router set --enable router

Why send packets to the computes if they will be dropped?

Version-Release number of selected component (if applicable):
RHOS-17.1-RHEL-9-20230628.n.2

How reproducible:
1. Deploy an ml2-ovs hwoffload scenario
2. Create networks and a router
3. Disable/Enable HA in the router and check flows in the compute

Actual results:
Controllers send packets to the compute that are dropped

Expected results:
I am not sure whether these packets should be sent to the compute at all, but if they are sent, I think they should not be dropped. Maybe they should not be sent.

Additional info:
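When repeating the check from step 3, the offloaded flow dump can be narrowed to the drop flow described in this report. The following is a minimal sketch, assuming the output format of `ovs-appctl dpctl/dump-flows -m type=offloaded` shown in the description; `drop_flows` is a hypothetical helper name, and `vxlan_sys_4789` is the tunnel port name seen on this compute.

```shell
#!/bin/sh
# Keep only datapath flows that arrive on the VXLAN tunnel port and whose
# whole action list is "drop". Reads a flow dump on stdin, one flow per line.
# (drop_flows is a hypothetical helper; the port name is taken from this report.)
drop_flows() {
    grep 'in_port(vxlan_sys_4789)' | grep -E 'actions:drop[[:space:]]*$'
}

# Intended use on a compute node (not executed here):
#   ovs-appctl dpctl/dump-flows -m type=offloaded | drop_flows
```

An empty result after disabling HA would match the behaviour reported above, where no drop flows are seen on the compute.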