Description of problem:
Both the source pod's and the target pod's IPs are unreachable after the migration succeeded.

Version-Release number of selected component (if applicable):
v0.15.0

How reproducible:
Always

Steps to Reproduce:
1. Create a VM
2. Start the VM

# oc get pod -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE                                    NOMINATED NODE
virt-launcher-vm-cirros-l4fxp   2/2     Running   0          2m    10.129.0.25   cnv-executor-yaduus-node2.example.com   <none>

# ping 10.129.0.25
PING 10.129.0.25 (10.129.0.25) 56(84) bytes of data.
64 bytes from 10.129.0.25: icmp_seq=1 ttl=64 time=3.49 ms
64 bytes from 10.129.0.25: icmp_seq=2 ttl=64 time=1.42 ms

3. Do the migration

# oc describe vmim
Name:         job1
Namespace:    d2
Labels:       <none>
Annotations:  <none>
API Version:  kubevirt.io/v1alpha3
Kind:         VirtualMachineInstanceMigration
Metadata:
  Creation Timestamp:  2019-03-28T06:00:54Z
  Generation:          1
  Resource Version:    46666
  Self Link:           /apis/kubevirt.io/v1alpha3/namespaces/d2/virtualmachineinstancemigrations/job1
  UID:                 d8f93813-511e-11e9-9d65-fa163e90cc1a
Spec:
  Vmi Name:  vm-cirros
Status:
  Phase:  Succeeded
Events:
  Type    Reason               Age   From                       Message
  ----    ------               ----  ----                       -------
  Normal  SuccessfulCreate     4m    virtualmachine-controller  Created migration target pod virt-launcher-vm-cirros-9745n
  Normal  SuccessfulHandOver   4m    virtualmachine-controller  Migration target pod is ready for preparation by virt-handler.
  Normal  SuccessfulMigration  4m    virtualmachine-controller  Source node reported migration succeeded

4. Check the IPs of the source pod and the target pod

Actual results:
Step 4:

# oc get pod -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE                                    NOMINATED NODE
virt-launcher-vm-cirros-9745n   2/2     Running   0          18m   10.130.0.15   cnv-executor-yaduus-node1.example.com   <none>
virt-launcher-vm-cirros-l4fxp   1/2     Running   0          20m   10.129.0.25   cnv-executor-yaduus-node2.example.com   <none>

# ping -c 3 10.129.0.25
PING 10.129.0.25 (10.129.0.25) 56(84) bytes of data.
From 10.128.0.1 icmp_seq=1 Destination Host Unreachable
From 10.128.0.1 icmp_seq=2 Destination Host Unreachable
From 10.128.0.1 icmp_seq=3 Destination Host Unreachable

--- 10.129.0.25 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 1999ms

# ping -c 3 10.130.0.15
PING 10.130.0.15 (10.130.0.15) 56(84) bytes of data.
From 10.128.0.1 icmp_seq=1 Destination Host Unreachable
From 10.128.0.1 icmp_seq=2 Destination Host Unreachable
From 10.128.0.1 icmp_seq=3 Destination Host Unreachable

--- 10.130.0.15 ping statistics ---
3 packets transmitted, 0 received, +3 errors, 100% packet loss, time 1999ms

Expected results:
Migration should not make the pod IP unreachable.

Additional info:
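For reference, a minimal sketch of the migration object used in step 3, reconstructed from the `oc describe vmim` output above; any field naming not visible in that output is an assumption about the v1alpha3 API, not taken from the original mig_job.yaml:

apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachineInstanceMigration
metadata:
  name: job1
  namespace: d2
spec:
  vmiName: vm-cirros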
Yan, in such cases please include the spec of the VMI. Did you define a VM with the pod network attached to it?
Created attachment 1550150 [details] vm spec
(In reply to Dan Kenigsberg from comment #1) > Did you define a VM with the pod network attached to it?
For this bug, I only defined the default network interface in the VM. Please refer to the YAML file I attached to the bug. For the L2 network, I have added a comment to the JIRA card; please note it. Thanks
The spec you've attached does not mention any network/interface. Migration cannot really work if you pass the default network to the VM. Please modify this bug to ask the virt team to block migration in this case.
If I understand correctly, when we start the VM, the VMI uses the default network and interface shown below. (The VM spec is from https://github.com/kubevirt/kubevirt/blob/master/cluster/examples/vm-cirros.yaml.) It should be a correct VM spec, since I did the migration according to the virt team's suggestion.

interfaces:
- bridge: {}
  name: default
features:
  acpi:
    enabled: true
firmware:
  uuid: 0d2a2043-41c0-59c3-9b17-025022203668
machine:
  type: q35
resources:
  requests:
    memory: 128M
networks:
- name: default
  pod: {}

If I misunderstand, please feel free to comment on which kind of network/interface you want me to provide.
Thanks, Yan. You have now specified the network and interface, which I had requested back in comment 1.

interfaces:
- bridge: {}
  name: default

should not be used in conjunction with migration. It cannot really work, since for most CNIs the IP on the destination pod is going to change and surprise the guest. IMHO, migration should be blocked if default+bridge exists in the VMI.
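For context, a minimal sketch (my assumption, not something stated in this bug) of the pod-network interface definition that avoids the bridge binding. With masquerade, the guest keeps a fixed internal address and traffic is NATed through the pod IP, so a changing destination pod IP does not surprise the guest:

interfaces:
- masquerade: {}
  name: default
networks:
- name: default
  pod: {}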
Tested on the latest CNV 2.1: VM migration with the pod network is denied, while VM migration with a multus network works well.

$ oc create -f mig_job.yaml
Error from server: error when creating "job": admission webhook "migration-create-validator.kubevirt.io" denied the request: Cannot migrate VMI, Reason: InterfaceNotLiveMigratable, Message: cannot migrate VMI with a bridge interface connected to a pod network
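For illustration only, a hedged sketch of the kind of multus-backed secondary interface that the migration webhook does accept; the network name "mynet" and the NetworkAttachmentDefinition name "my-net-attach-def" are hypothetical placeholders, not names taken from this test:

interfaces:
- bridge: {}
  name: mynet
networks:
- name: mynet
  multus:
    networkName: my-net-attach-def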