Created attachment 1795103 [details]
migration_vma_new.yaml

Description of problem:
After a VM is migrated, its ARP table is stale, which prevents the VM from having outside connectivity. Connectivity recovers only once the VM receives a ping from the outside, or after the ~5-minute ARP cache refresh. We should make sure that connectivity becomes available immediately after the migration.

Version-Release number of selected component (if applicable):
CNV - v4.8.0
OCP - v4.8.0-fc.5
Kubernetes Version: v1.21.0-rc.0+88a3e8c

How reproducible:
Every time the machine is migrated.

Steps to Reproduce:
1. Create a dedicated namespace for the resources that will be created in the next steps. Name it 'anat-test-migration-masquerade' to match the namespace defined in the files attached.
2. Create two VMs (vma and vmb) as single-interface VMs (masquerade), using the attached 'migration_vma_new.yaml' and 'migration_vmb_new.yaml' files (a sketch of such a definition appears under Additional info below).
3. Start both VMs:
$ virtctl start vma
$ virtctl start vmb
4. Migrate vmb, using the attached 'migration_virtualmachineinstancemigration.yaml' file (a sketch of such a manifest follows the attachment notices below).
5. Find the exact moment when the migration finishes. You can find that moment by watching for the VMI to be assigned a new IP address:
$ oc get vmi -w
6. As soon as the migration finishes, connect to the migrated VM (vmb). It is important to connect via the console and not SSH, because an incoming SSH connection can itself refresh the ARP entry and hide the bug:
$ virtctl console vmb
7. Ping from vmb to vma over the main (masquerade) interface:
$ ping 10.0.2.1

Actual results:
[fedora@vmb-1624797047-2293534 ~]$ ping 10.0.2.1
PING 10.0.2.1 (10.0.2.1) 56(84) bytes of data.
64 bytes from 10.0.2.1: icmp_seq=9 ttl=64 time=0.635 ms
64 bytes from 10.0.2.1: icmp_seq=10 ttl=64 time=0.289 ms
64 bytes from 10.0.2.1: icmp_seq=11 ttl=64 time=0.332 ms
64 bytes from 10.0.2.1: icmp_seq=12 ttl=64 time=0.765 ms

--- 10.0.2.1 ping statistics ---
12 packets transmitted, 4 received, 66.6667% packet loss, time 11183ms
rtt min/avg/max/mdev = 0.289/0.505/0.765/0.200 ms

Expected results:
No packet loss.

Additional info:
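For reference, a minimal sketch of what a single-interface masquerade VM definition along the lines of 'migration_vma_new.yaml' could look like. The actual attachment may differ; the labels and container disk image here are illustrative assumptions:

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: vma
  namespace: anat-test-migration-masquerade
spec:
  running: false              # started later with 'virtctl start vma'
  template:
    spec:
      domain:
        devices:
          disks:
          - name: containerdisk
            disk:
              bus: virtio
          interfaces:
          - name: default
            masquerade: {}    # single masquerade interface, as in the report
        resources:
          requests:
            memory: 1Gi
      networks:
      - name: default
        pod: {}               # masquerade binds to the pod network
      volumes:
      - name: containerdisk
        containerDisk:
          image: quay.io/kubevirt/fedora-cloud-container-disk-demo  # illustrative image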
Created attachment 1795104 [details]
migration_vmb_new.yaml
Created attachment 1795105 [details]
migration_virtualmachineinstancemigration.yaml
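For reference, a hedged sketch of a VirtualMachineInstanceMigration manifest along the lines of the attachment above; the metadata name is an illustrative assumption:

apiVersion: kubevirt.io/v1
kind: VirtualMachineInstanceMigration
metadata:
  name: migration-vmb      # hypothetical name; the attached file may use another
  namespace: anat-test-migration-masquerade
spec:
  vmiName: vmb             # the running VMI to live-migrate

Applying the attached file triggers the live migration of vmb:
$ oc apply -f migration_virtualmachineinstancemigration.yaml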
Should be fixed U/S (upstream). We are waiting for the 4.9 D/S (downstream) build to pick up the fix.
Petr, when should this fix show up in the downstream version?
IIUIC, we switched D/S for 4.9 to follow the main branch only two days ago, and it seems unstable. The work on getting the fix in is ongoing, but I can't tell when it will be available, as stabilizing a new D/S version is usually pretty difficult. I will let you know once it is available.
D/S builds of 4.9 are now available.
(In reply to Petr Horáček from comment #6)
> D/S builds of 4.9 are now available.

@phoracek The fix will be available only once KubeVirt v0.44 is in the D/S build. The current status of 'ON_QA' is problematic because we cannot test the fix yet.
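One way to check which KubeVirt version a given D/S build ships — a quick sketch, assuming the default openshift-cnv namespace and that the KubeVirt CR reports observedKubeVirtVersion in its status, as in recent releases:

$ virtctl version
$ oc get kubevirt -n openshift-cnv -o jsonpath='{.items[0].status.observedKubeVirtVersion}'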
The fix should already be on D/S - D/S follows HEAD of the main branch up until the feature freeze, when it switches to the stable branch. With this, the fix should be available as part of the current D/S build.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.9.0 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4104