Bug 1976604 - [CNV-5786] IP connectivity is lost after migration (masquerade)
Summary: [CNV-5786] IP connectivity is lost after migration (masquerade)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Networking
Version: 4.8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.9.0
Assignee: Miguel Duarte Barroso
QA Contact: awax
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-06-27 13:48 UTC by awax
Modified: 2021-11-02 15:59 UTC
CC List: 2 users

Fixed In Version: virt-handler-container-v4.9.0-29
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-02 15:59:33 UTC
Target Upstream Version:
Embargoed:


Attachments:
  migration_vma_new.yaml (attachment 1795103)
  migration_vmb_new.yaml (attachment 1795104)
  migration_virtualmachineinstancemigration.yaml (attachment 1795105)


Links:
  Github kubevirt/kubevirt pull 5000 (closed): masquerade, migration: hardcode the bridge MAC addr (last updated 2021-06-28 07:46:23 UTC)
  Red Hat Product Errata RHSA-2021:4104 (last updated 2021-11-02 15:59:53 UTC)

Description awax 2021-06-27 13:48:30 UTC
Created attachment 1795103 [details]
migration_vma_new.yaml

Description of problem:
After a VM is migrated, its ARP table is stale, which blocks the VM's connectivity to the outside. Connectivity recovers only once the VM receives a ping from the outside, or after roughly 5 minutes when the stale ARP entry is refreshed.

We should make sure that connectivity is available immediately after the migration.
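
For illustration, one way to observe the stale entry from the guest console (a sketch only; it assumes the Fedora guest's primary interface is eth0 and that iproute2 is installed, neither of which is stated in the attached manifests):

$ ip neigh show dev eth0     # inspect the guest's ARP/neighbor table; 10.0.2.1 is the masquerade gateway
$ ip neigh flush dev eth0    # flushing the entry forces a fresh ARP resolution, which should restore connectivity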


Version-Release number of selected component (if applicable):
CNV - v.4.8.0
OCP - v.4.8.0-fc.5
Kubernetes Version: v1.21.0-rc.0+88a3e8c


How reproducible:
Every time the VM is migrated.


Steps to Reproduce:
1. Create a dedicated namespace for the resources created in the next steps. Name it 'anat-test-migration-masquerade' to match the namespace defined in the attached files.
2. Create 2 VMs (vma and vmb), each with a single masquerade interface (use the attached 'migration_vma_new.yaml' and 'migration_vmb_new.yaml' files; a minimal sketch of such a VM is shown after these steps).
3. Start both VMs:
$ virtctl start vma
$ virtctl start vmb
4. Migrate vmb (use the attached 'migration_virtualmachineinstancemigration.yaml' file).
5. Find the exact moment the migration finishes; you can detect it by watching for the VMI to be assigned a new IP address:
$ oc get vmi -w
6. As soon as the migration finishes, connect to the migrated VM (vmb). It is important to connect using the console and not SSH, because connecting through SSH can itself clear the problem and hide the bug:
$ virtctl console vmb
7. Ping from vmb to vma over the main interface (masquerade):
$ ping 10.0.2.1
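
The attached manifests are not reproduced here; the following is only a minimal sketch of the relevant pieces, assuming the standard KubeVirt masquerade binding and the names used in the steps above (anything else, such as 'migrate-vmb', is illustrative):

# Relevant interface/network section of the VirtualMachine spec (masquerade binding)
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: vmb
  namespace: anat-test-migration-masquerade
spec:
  template:
    spec:
      domain:
        devices:
          interfaces:
          - name: default
            masquerade: {}      # single masquerade interface, as in step 2
      networks:
      - name: default
        pod: {}                 # masquerade binding sits on the pod network
---
# Migration object used in step 4
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstanceMigration
metadata:
  name: migrate-vmb             # illustrative name
  namespace: anat-test-migration-masquerade
spec:
  vmiName: vmb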


Actual results:
[fedora@vmb-1624797047-2293534 ~]$ ping 10.0.2.1
PING 10.0.2.1 (10.0.2.1) 56(84) bytes of data.
64 bytes from 10.0.2.1: icmp_seq=9 ttl=64 time=0.635 ms
64 bytes from 10.0.2.1: icmp_seq=10 ttl=64 time=0.289 ms
64 bytes from 10.0.2.1: icmp_seq=11 ttl=64 time=0.332 ms
64 bytes from 10.0.2.1: icmp_seq=12 ttl=64 time=0.765 ms

--- 10.0.2.1 ping statistics ---
12 packets transmitted, 4 received, 66.6667% packet loss, time 11183ms
rtt min/avg/max/mdev = 0.289/0.505/0.765/0.200 ms



Expected results:
No packet loss.


Additional info:

Comment 1 awax 2021-06-27 13:49:03 UTC
Created attachment 1795104 [details]
migration_vmb_new.yaml

Comment 2 awax 2021-06-27 13:49:36 UTC
Created attachment 1795105 [details]
migration_virtualmachineinstancemigration.yaml

Comment 3 Petr Horáček 2021-06-28 07:46:25 UTC
Should be fixed U/S. We are waiting for 4.9 D/S to pick up the fix.

Comment 4 awax 2021-07-15 10:47:10 UTC
Petr, when should this fix show up in the downstream version?

Comment 5 Petr Horáček 2021-07-15 10:55:40 UTC
IIUC, we switched D/S for 4.9 to follow the main branch only 2 days ago and it seems unstable. Work on getting the fix in is ongoing, but I can't tell when it will be available, as stabilizing a new D/S version is usually pretty difficult. I will let you know once we have it available.

Comment 6 Petr Horáček 2021-07-26 07:28:57 UTC
D/S builds of 4.9 are now available.

Comment 7 awax 2021-08-02 14:56:39 UTC
(In reply to Petr Horáček from comment #6)
> D/S builds of 4.9 are now available.

@phoracek 
The fix will be available only when KubeVirt v0.44 is in the D/S build. The current 'ON_QA' status is problematic because we cannot test it yet.

Comment 8 Petr Horáček 2021-08-02 15:00:47 UTC
The fix should already be in D/S: D/S follows HEAD of the main branch up until the feature freeze, when it switches to the stable branch. With this, the fix should be available as part of the current D/S build.

Comment 11 errata-xmlrpc 2021-11-02 15:59:33 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.9.0 Images security and bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:4104

