Bug 2109516 - [16.2][OVN migration] iptables hybrid OVS-specific leftovers (qbr/qvb/qvo) still exist after VM migration
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ovn
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Arnau Verdaguer
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On: 2106370
Blocks: 2075038 2075039
 
Reported: 2022-07-21 13:15 UTC by Roman Safronov
Modified: 2022-08-08 12:30 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2106370
Environment:
Last Closed: 2022-08-08 12:30:21 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-17776 0 None None None 2022-07-21 13:18:03 UTC

Description Roman Safronov 2022-07-21 13:15:39 UTC
+++ This bug was initially created as a clone of Bug #2106370 +++

Description of problem:
The issue was found after OVN migration of an environment using the OVS Neutron backend with the iptables_hybrid firewall driver.
After live migrating an existing VM it is expected that the OVS-specific leftovers of the iptables_hybrid firewall (i.e. interfaces like qbr/qvo/qvb) disappear and that the VM's IP remains accessible.
However, the OVS leftovers did not disappear.

Version-Release number of selected component (if applicable):
RHOS-16.2-RHEL-8-20220610.n.1


How reproducible:
100%

Steps to Reproduce:
1. Deploy an HA environment (3 controllers + 2 compute nodes) with the OVS Neutron backend and the iptables_hybrid firewall driver. In my case the environment had DVR enabled.
2. Create a workload. In my case I created an internal network and a router connecting it to the external network, then connected 2 VMs with normal ports to the internal network and added FIPs to the VMs.
3. Perform the migration from OVS to OVN using the official procedure: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html/migrating_the_networking_service_to_the_ml2ovn_mechanism_driver/index
4. Live migrate any of the workload VMs.


Actual results:
The VM migrated successfully but is still connected through the so-called "hybrid connection", i.e. via an intermediate Linux bridge (a leftover of the ml2/ovs + iptables_hybrid firewall driver).

Expected results:
After live migration, a VM that used the hybrid connection is reconnected with a native OVN connection.
The OVS-specific leftovers of the hybrid connection no longer exist after the VM migration.

Additional info:
I migrated a VM with id 5be775c2-2660-4b90-9957-e45db751742e that now runs on compute-1.

Let's look at its port ID:

 [stack@undercloud-0 ~]$ openstack port list --device-id=5be775c2-2660-4b90-9957-e45db751742e
+--------------------------------------+----------------------------------------+-------------------+----------------------------------------------------------------------------------------------------+--------+
| ID                                   | Name                                   | MAC Address       | Fixed IP Addresses                                                                                 | Status |
+--------------------------------------+----------------------------------------+-------------------+----------------------------------------------------------------------------------------------------+--------+
| 87431e6a-e137-4bca-84f2-26fa47a6c9f8 | ovn-migration-port-normal-int-pinger-1 | fa:16:3e:a7:83:8e | ip_address='192.168.168.225', subnet_id='6e126d51-782d-4e30-a9a4-4bda4573050a'                     | ACTIVE |
|                                      |                                        |                   | ip_address='2001:db8:cafe:1:f816:3eff:fea7:838e', subnet_id='711f54c7-d738-442f-b5c7-89d0c7b3a6df' |        |
+--------------------------------------+----------------------------------------+-------------------+----------------------------------------------------------------------------------------------------+--------+

The port ID starts with 87431e6a.

Let's log in to compute-1 and check for hybrid connection leftovers:

[heat-admin@compute-1 ~]$ ip a | grep 87431e6a
44: qbr87431e6a-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc noqueue state UP group default qlen 1000
45: qvo87431e6a-e1@qvb87431e6a-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc noqueue master ovs-system state UP group default qlen 1000
46: qvb87431e6a-e1@qvo87431e6a-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc noqueue master qbr87431e6a-e1 state UP group default qlen 1000
47: tap87431e6a-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc noqueue master qbr87431e6a-e1 state UNKNOWN group default qlen 1000
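
As an aside, the device names above are deterministic: Neutron's hybrid plug builds each one from a fixed three-letter prefix (qbr/qvb/qvo/tap) plus the first 11 characters of the port UUID, which is why grepping `ip a` for the short port ID finds all four. A minimal sketch of that naming, using the port ID from this report (the loop is purely illustrative):

```shell
# Derive the hybrid-plug device names from a Neutron port UUID.
# Each name is prefix + first 11 chars of the UUID (14 chars total,
# which fits within the kernel's IFNAMSIZ limit).
PORT_ID="87431e6a-e137-4bca-84f2-26fa47a6c9f8"
SHORT_ID=$(printf '%.11s' "$PORT_ID")
for prefix in qbr qvb qvo tap; do
    printf '%s%s\n' "$prefix" "$SHORT_ID"
done
```

This prints qbr87431e6a-e1, qvb87431e6a-e1, qvo87431e6a-e1 and tap87431e6a-e1, matching the interfaces listed above.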

Comment 1 Roman Safronov 2022-07-28 14:39:14 UTC
According to Slawek (skaplons), a VM is not expected to be reconnected with a native OVN connection (without the qbr bridge) after a reboot.
However, for the VM migration scenario this BZ is still relevant.

Comment 2 Arnau Verdaguer 2022-08-03 15:29:11 UTC
Hello Roman,

I've reproduced this environment and created a workload using this script:
"""
openstack flavor create --disk 1 --ram 128 m1.tiny
openstack image create cirros --file cirros.img --disk-format qcow2 --container-format bare --public
openstack network create net1
openstack subnet create --subnet-range 192.168.100.0/24 --network net1 subnet1
openstack router create router1
openstack router add subnet router1 subnet1
openstack router set --external-gateway nova router1
openstack security group create secgroup1
openstack security group rule create --protocol tcp --dst-port 22 secgroup1
openstack security group rule create --protocol icmp secgroup1
openstack server create --nic net-id=net1 --flavor m1.tiny --image cirros --security-group secgroup1 server0
openstack floating ip create --port $(openstack port list --server server0 -c id -f value) nova
openstack server create --nic net-id=net1 --flavor m1.tiny --image cirros --security-group secgroup1 server1
openstack floating ip create --port $(openstack port list --server server1 -c id -f value) nova
openstack server create --nic net-id=net1 --flavor m1.tiny --image cirros --security-group secgroup1 server2
"""

Then I migrated to OVN; after the migration server1 is on compute-0:
server1                                 | ACTIVE | None       | Running     | net1=192.168.100.245, 10.0.0.208         | cirros                      | 70cb9af6-72d3-4058-95f0-86304559ae0a | m1.tiny              | nova              | compute-0.redhat.local

And has the port:
(overcloud) [stack@undercloud-0 ovn_migration]$ nova interface-list server1
+------------+--------------------------------------+--------------------------------------+-----------------+-------------------+-----+
| Port State | Port ID                              | Net ID                               | IP addresses    | MAC Addr          | Tag |
+------------+--------------------------------------+--------------------------------------+-----------------+-------------------+-----+
| ACTIVE     | e540baf8-ff02-490f-906e-772a364bff53 | b96fd417-1d22-40e2-8021-36e10a5f847f | 192.168.100.245 | fa:16:3e:f0:ea:c8 | -   |
+------------+--------------------------------------+--------------------------------------+-----------------+-------------------+-----+

Which can be found at compute-0:
[root@compute-0 heat-admin]# ip -br -c  a s | grep e540baf8
qbre540baf8-ff   UP             fe80::74e9:6eff:fe7a:9f80/64
qvoe540baf8-ff@qvbe540baf8-ff UP             fe80::a0fd:a5ff:fe21:9571/64
qvbe540baf8-ff@qvoe540baf8-ff UP             fe80::74e9:6eff:fe7a:9f80/64
tape540baf8-ff   UNKNOWN        fe80::fc16:3eff:fef0:eac8/64

Once server1 has been live migrated:
[root@compute-0 heat-admin]# ip -br -c a s | grep e540baf8
[root@compute-0 heat-admin]# ovs-ofctl show br-int | grep e540baf8
[root@compute-0 heat-admin]#

[root@compute-1 heat-admin]# ip -br -c a s | grep e540baf8
tape540baf8-ff   UNKNOWN        fe80::fc16:3eff:fef0:eac8/64
[root@compute-1 heat-admin]# ovs-ofctl show br-int | grep e540baf8
        Port tape540baf8-ff
            Interface tape540baf8-ff

I cannot find the leftovers on either compute-0 or compute-1.
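
The check performed in the two shell sessions above can be captured as a small helper. This is a hypothetical sketch, not part of the migration tooling: `find_hybrid_leftovers` greps a saved `ip -br a` dump for qbr/qvb/qvo devices derived from a port ID; an empty result, as on compute-1 here, means the hybrid plug was cleaned up and only the tap device remains.

```shell
# Hypothetical helper (not part of the migration tooling): given a port ID
# and a captured `ip -br a` dump, print any qbr/qvb/qvo devices that still
# reference the port. Empty output means no hybrid-plug leftovers.
find_hybrid_leftovers() {
    port_id=$1
    dump_file=$2
    short_id=$(printf '%.11s' "$port_id")
    grep -E "^(qbr|qvb|qvo)${short_id}" "$dump_file" || true
}

# Sample dump modelled on the compute-1 output above (tap only = clean).
cat > /tmp/ip_br_dump.txt <<'EOF'
tape540baf8-ff   UNKNOWN        fe80::fc16:3eff:fef0:eac8/64
EOF

find_hybrid_leftovers "e540baf8-ff02-490f-906e-772a364bff53" /tmp/ip_br_dump.txt
```

On this sample the helper prints nothing, mirroring the empty grep on compute-0 and the tap-only result on compute-1.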

Comment 4 Arnau Verdaguer 2022-08-05 10:58:17 UTC
Hello Roman,

I redid the test (this time using both a normal VM, as in the last test, and a trunk VM).

Both migrated fine and all associated resources were deleted.

I've gone through some of the QE 16.2 CI runs, and the problems related to stale resources are:
- FAIL: There are stale ip6tables rules related to id 9df22a10 of vm 8000c5ca-f463-40b0-9925-1a7fa41ea927 on node compute-0 [0]
  It's true that the ip6tables rules are not deleted after the migration; I will investigate that further.
- FAIL: OVS-specific NIC qbrec0d4bf4-d1: related to vm 3d1a2b72-b4aa-4c69-a8c5-d7c67ecaa420 found on compute-1 [1]
  This test is test_reboot_vm_with_trunk, and as Slawek (skaplons) said this behavior is expected, so the test should be changed.


[0] https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/QE/view/OSP16.2/job/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp-ipv4-vxlan-ml2ovs-to-ovn-migration_nodvr-to-nodvr_iptables_fw/10/testReport/ovn_migration_validations/validate-workload-operations/OVN_migration___test_live_migration_vm_with_trunk/
[1] https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/QE/view/OSP16.2/job/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp-ipv4-vxlan-ml2ovs-to-ovn-migration_nodvr-to-nodvr_iptables_fw/10/testReport/ovn_migration_validations/validate-workload-operations/OVN_migration___test_reboot_vm_with_trunk/
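
The first failure above concerns stale ip6tables rules rather than stale devices. A hedged sketch of that kind of check (the helper name and the sample rule are illustrative; the `neutron-openvswi-*` chain and `--physdev-out tap...` match follow the style the iptables_hybrid driver generates): scan a captured `ip6tables-save` dump for rules still mentioning the port's short ID, where a non-zero count indicates stale rules.

```shell
# Hypothetical helper, not the CI's actual implementation: count rules in a
# captured `ip6tables-save` dump that still mention a given port short ID.
stale_ip6tables_rules() {
    short_id=$1
    dump_file=$2
    grep -c -- "$short_id" "$dump_file"
}

# Illustrative sample rule in the style the iptables_hybrid driver generates;
# the tap-device suffix here is made up.
cat > /tmp/ip6tables_dump.txt <<'EOF'
-A neutron-openvswi-FORWARD -m physdev --physdev-out tap9df22a10-aa --physdev-is-bridged -j neutron-openvswi-sg-chain
EOF

stale_ip6tables_rules "9df22a10" /tmp/ip6tables_dump.txt   # prints 1
```

A count of 0 after migration would mean the rules for that port were cleaned up; anything higher matches the CI FAIL above.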

Comment 6 Roman Safronov 2022-08-08 12:30:21 UTC
Verified on RHOS-16.2-RHEL-8-20220804.n.1.
Verified that after VM live migration the qbr/qvb/qvo leftovers are deleted.

