Bug 2187651 - too long connectivity downtime during VM live-migration with BGP
Summary: too long connectivity downtime during VM live-migration with BGP
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: ovn-bgp-agent
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ga
Target Release: 17.1
Assignee: Luis Tomas Bolivar
QA Contact: Eduardo Olivares
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-04-18 10:30 UTC by Eduardo Olivares
Modified: 2023-08-16 01:15 UTC
CC: 3 users

Fixed In Version: ovn-bgp-agent-0.4.1-1.20230512001004.e697e35.el9ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-16 01:14:48 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 880525 0 None MERGED Split creation of base ovs flows from removal of extra leftovers 2023-04-26 09:55:16 UTC
OpenStack gerrit 880526 0 None MERGED Ensure the needed mac tweak flows are created as part of the wiring 2023-04-26 09:55:14 UTC
Red Hat Issue Tracker OSP-24284 0 None None None 2023-04-18 10:34:29 UTC
Red Hat Product Errata RHEA-2023:4577 0 None None None 2023-08-16 01:15:18 UTC

Description Eduardo Olivares 2023-04-18 10:30:45 UTC
Description of problem:
This issue can be reproduced with the upstream tempest test test_server_connectivity_live_migration, but the test needs to be updated with this change first (otherwise it wrongly passes): https://review.opendev.org/c/openstack/tempest/+/880719
The test only fails on BGP setups, which is why the component is initially set to ovn-bgp-agent, although the fix may be implemented in neutron or elsewhere.
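
For reference, once the tempest patch above is applied, the test can be invoked with something like the following (a sketch; the exact invocation depends on how the tempest workspace is set up):

tempest run --regex test_server_connectivity_live_migration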

The manual reproduction is simple (a scripted sketch of these steps follows the list):
- create a VM connected to a provider network with external connectivity
- start a ping from the VM to an external IP (8.8.8.8) - by default one ping is sent per second
- run the following command: openstack server migrate --live-migration vm0
- stop the ping command and check how many pings went unanswered
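
A scripted sketch of these steps (the VM, image, flavor and network names below are examples, not taken from this setup):

# create a VM connected to a provider network with external connectivity
openstack server create --flavor m1.small --image cirros --network provider-ext --wait vm0
# inside the VM: start a ping to an external IP (one echo request per second by default)
ping 8.8.8.8
# from the client node: live-migrate the VM
openstack server migrate --live-migration vm0
# stop the ping (Ctrl+C) and check the packet-loss summary; each lost ping is roughly 1 second of downtime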

In a non-BGP setup, only one ping is lost (~1 second of connectivity downtime). In a BGP setup, the downtime is between 15 and 20 seconds.
The reason is that the default GW's MAC address changes when the VM is migrated to a different compute, because on BGP setups it corresponds to the compute's br-ex interface. The VM's ARP table is not updated immediately; it is only updated when the VM sends an ARP request for the MAC of that default GW.
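
This can be confirmed from inside the VM by inspecting its neighbor table around the migration; a sketch (eth0 and the gateway 172.24.100.1 match the captures below):

# the gateway entry still points at the old compute's br-ex MAC right after the migration
ip neigh show | grep 172.24.100.1
# flushing the stale entry forces an immediate ARP request and restores connectivity (a manual workaround, not the fix)
ip neigh flush dev eth0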


Packets captured at the VM eth0 interface before the migration from comp-0 to comp-1 (the MAC a6:e1:df:19:b3:45 corresponds to the comp-0 br-ex interface) show that the pings are successfully answered:
09:41:20.683835 fa:16:3e:27:ba:5b > a6:e1:df:19:b3:45, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 2493, offset 0, flags [DF], proto ICMP (1), length 84)
    172.24.100.16 > 8.8.8.8: ICMP echo request, id 1, seq 22, length 64
09:41:20.756554 a6:e1:df:19:b3:45 > fa:16:3e:27:ba:5b, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 51, id 0, offset 0, flags [none], proto ICMP (1), length 84)
    8.8.8.8 > 172.24.100.16: ICMP echo reply, id 1, seq 22, length 64
09:41:21.685383 fa:16:3e:27:ba:5b > a6:e1:df:19:b3:45, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 3418, offset 0, flags [DF], proto ICMP (1), length 84)
    172.24.100.16 > 8.8.8.8: ICMP echo request, id 1, seq 23, length 64
09:41:21.757542 a6:e1:df:19:b3:45 > fa:16:3e:27:ba:5b, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 51, id 0, offset 0, flags [none], proto ICMP (1), length 84)
    8.8.8.8 > 172.24.100.16: ICMP echo reply, id 1, seq 23, length 64


When the VM is migrated to comp-1, the following ARP is captured (46:fd:fb:5d:e1:41 is comp-1 br-ex MAC):
09:41:22.794658 fa:16:3e:27:ba:5b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 172.24.100.16 tell 172.24.100.16, length 28
09:41:23.028863 46:fd:fb:5d:e1:41 > fa:16:3e:27:ba:5b, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Reply 172.24.100.16 is-at 46:fd:fb:5d:e1:41, length 28

After that, pings go unanswered for ~17 seconds because they are sent to the wrong destination MAC (a6:e1:df:19:b3:45 belongs to comp-0 br-ex, but the VM is now running on comp-1):
09:41:23.734689 fa:16:3e:27:ba:5b > a6:e1:df:19:b3:45, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 4027, offset 0, flags [DF], proto ICMP (1), length 84)
    172.24.100.16 > 8.8.8.8: ICMP echo request, id 1, seq 25, length 64
09:41:24.758616 fa:16:3e:27:ba:5b > a6:e1:df:19:b3:45, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 4042, offset 0, flags [DF], proto ICMP (1), length 84)
    172.24.100.16 > 8.8.8.8: ICMP echo request, id 1, seq 26, length 64
...
09:41:40.118865 fa:16:3e:27:ba:5b > a6:e1:df:19:b3:45, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 12990, offset 0, flags [DF], proto ICMP (1), length 84)
    172.24.100.16 > 8.8.8.8: ICMP echo request, id 1, seq 41, length 64


Then, the following ARP fixes the problem with the destination MAC (46:fd:fb:5d:e1:41 is from comp-1 br-ex):
09:41:41.143189 fa:16:3e:27:ba:5b > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 172.24.100.1 tell 172.24.100.16, length 28
09:41:41.917087 46:fd:fb:5d:e1:41 > fa:16:3e:27:ba:5b, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Reply 172.24.100.1 is-at 46:fd:fb:5d:e1:41, length 28
09:41:41.917114 fa:16:3e:27:ba:5b > 46:fd:fb:5d:e1:41, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 13055, offset 0, flags [DF], proto ICMP (1), length 84)
    172.24.100.16 > 8.8.8.8: ICMP echo request, id 1, seq 42, length 64
09:41:41.990308 46:fd:fb:5d:e1:41 > fa:16:3e:27:ba:5b, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 51, id 0, offset 0, flags [none], proto ICMP (1), length 84)
    8.8.8.8 > 172.24.100.16: ICMP echo reply, id 1, seq 42, length 64




The tempest test test_server_connectivity_live_migration covers the scenario of a VM with a port on a tenant network and with a FIP. It fails too.

I will add a comment when I test the scenario with a tenant network and no FIP.



Version-Release number of selected component (if applicable):
RHOS-17.1-RHEL-9-20230404.n.1

How reproducible:
100%


Actual results:
Connectivity downtime of 15+ seconds

Expected results:
Lower connectivity downtime during/after live-migration

Comment 1 Eduardo Olivares 2023-04-18 11:17:42 UTC
The downtime without FIP is 3 seconds or less. Even if the VM is migrated to a compute in a different rack (connected to different leaf switches), the downtime is low because the destination MAC corresponds to the router gateway (typically the IP X.X.X.1), which doesn't change during the migration.

Comment 4 Eduardo Olivares 2023-04-18 14:35:23 UTC
This bug only occurs when no other VM is running on the destination compute.
If another VM was already running on that compute before the VM under test was migrated, the flows from [1] already existed and the measured downtime is 2 seconds or less.

If no VM was previously running on that compute, these flows do not exist until they are created by the sync process.


[1] 
[root@cmp-1-0 ~]# ovs-ofctl dump-flows br-ex
 cookie=0x3e7, duration=75.948s, table=0, n_packets=1, n_bytes=90, priority=900,ip,in_port="patch-provnet-4" actions=mod_dl_dst:46:fd:fb:5d:e1:41,NORMAL                                                                                     
 cookie=0x3e7, duration=75.938s, table=0, n_packets=0, n_bytes=0, priority=900,ipv6,in_port="patch-provnet-4" actions=mod_dl_dst:46:fd:fb:5d:e1:41,NORMAL                                                                                    
 cookie=0x0, duration=592819.333s, table=0, n_packets=6151, n_bytes=823797, priority=0 actions=NORMAL
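
As an illustration only (not the actual ovn-bgp-agent code path), a flow equivalent to the first entry above could be pre-created manually on the destination compute; the cookie, port and MAC values below are taken from the dump and would differ per compute:

[root@cmp-1-0 ~]# ovs-ofctl add-flow br-ex "cookie=0x3e7,table=0,priority=900,ip,in_port=patch-provnet-4,actions=mod_dl_dst:46:fd:fb:5d:e1:41,NORMAL"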

Comment 17 errata-xmlrpc 2023-08-16 01:14:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:4577

