Bug 2106370

Summary: [OSP17.0][OVN migration] iptables hybrid OVS-specific leftovers (qbr/qvb/qvo) still exist after VM migration
Product: Red Hat OpenStack
Component: openstack-neutron
Version: 17.0 (Wallaby)
Status: CLOSED WORKSFORME
Severity: high
Priority: unspecified
Reporter: Roman Safronov <rsafrono>
Assignee: Arnau Verdaguer <averdagu>
QA Contact: Eran Kuris <ekuris>
CC: averdagu, chrisw, mlavalle, scohen
Keywords: AutomationBlocker
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Cloned to: 2109516
Last Closed: 2022-08-08 12:36:50 UTC
Type: Bug
Bug Blocks: 2075038, 2109516    

Description Roman Safronov 2022-07-12 13:24:25 UTC
Description of problem:
The issue was found after an OVN migration from an environment with the ML2/OVS neutron backend and the iptables_hybrid firewall driver.
After live migrating an existing VM, it is expected that the OVS-specific leftovers of the iptables_hybrid firewall (i.e. interfaces like qbr/qvo/qvb) disappear and that the VM IP remains accessible.
However, the OVS leftovers did not disappear.
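
For context, the iptables_hybrid driver plugs each VM through an intermediate Linux bridge instead of attaching the tap device directly to br-int. The interface names are derived from the first 11 characters of the Neutron port ID; a rough sketch of the wiring (names illustrative):

 tapXXXXXXXXXXX  - tap device of the VM, enslaved to the qbr Linux bridge
 qbrXXXXXXXXXXX  - Linux bridge where the iptables rules are applied
 qvbXXXXXXXXXXX <-> qvoXXXXXXXXXXX  - veth pair; the qvo end is a port on the OVS integration bridge br-int

After migration to OVN the tap device is expected to be plugged directly into br-int, which is why all three qbr/qvb/qvo interfaces should disappear.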

Version-Release number of selected component (if applicable):
RHOS-17.0-RHEL-9-20220711.n.1
openstack-neutron-ovn-migration-tool-18.4.1-0.20220705190433.5258354.el9ost.noarch
openstack-neutron-18.4.1-0.20220705190433.5258354.el9ost.noarch
ovn22.03-22.03.0-62.el9fdp.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Deploy an HA environment (3 controllers + 2 compute nodes) with the ML2/OVS neutron backend and the iptables_hybrid firewall driver. In my case it was an environment with DVR enabled.
2. Create a workload. In my case I created an internal network and a router connecting the internal network to the external one, then connected 2 VMs with normal ports to the internal network and added FIPs to the VMs (a command sketch follows this list).
3. Perform the migration from OVS to OVN using the official procedure: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html/migrating_the_networking_service_to_the_ml2ovn_mechanism_driver/index
4. Live migrate a workload VM connected to the internal network
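
A minimal sketch of steps 2 and 4 (resource names such as "public", "cirros", "m1.micro" and "pinger-1" are illustrative assumptions, not taken from this environment):

 openstack network create internal
 openstack subnet create internal-subnet --network internal --subnet-range 192.168.168.0/24
 openstack router create router1
 openstack router set router1 --external-gateway public
 openstack router add subnet router1 internal-subnet
 openstack server create --image cirros --flavor m1.micro --network internal pinger-1
 openstack floating ip create public
 openstack server add floating ip pinger-1 <FIP>
 # after the OVN migration in step 3, live migrate the VM:
 openstack server migrate --live-migration pinger-1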

Actual results:
The VM migrated successfully but is still connected using the so-called "hybrid connection", i.e. through an intermediate Linux bridge (a leftover of the ml2/ovs + iptables_hybrid firewall driver).

Expected results:
After live migration, a VM that used the hybrid connection is reconnected with a native OVN connection.
OVS-specific leftovers of the hybrid connection no longer exist after the VM migration.
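
A quick way to verify the expected state on the compute node (a sketch, using the port ID prefix 87431e6a from the example below):

 sudo ovs-vsctl list-ports br-int | grep tap87431e6a   # the tap device should be plugged directly into br-int
 ip a | grep 87431e6a                                  # should match only the tap device, no qbr/qvb/qvo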

Additional info:
I live migrated a VM with id 5be775c2-2660-4b90-9957-e45db751742e that currently runs on compute-1

Let's look at its port ID:

 [stack@undercloud-0 ~]$ openstack port list --device-id=5be775c2-2660-4b90-9957-e45db751742e
+--------------------------------------+----------------------------------------+-------------------+----------------------------------------------------------------------------------------------------+--------+
| ID                                   | Name                                   | MAC Address       | Fixed IP Addresses                                                                                 | Status |
+--------------------------------------+----------------------------------------+-------------------+----------------------------------------------------------------------------------------------------+--------+
| 87431e6a-e137-4bca-84f2-26fa47a6c9f8 | ovn-migration-port-normal-int-pinger-1 | fa:16:3e:a7:83:8e | ip_address='192.168.168.225', subnet_id='6e126d51-782d-4e30-a9a4-4bda4573050a'                     | ACTIVE |
|                                      |                                        |                   | ip_address='2001:db8:cafe:1:f816:3eff:fea7:838e', subnet_id='711f54c7-d738-442f-b5c7-89d0c7b3a6df' |        |
+--------------------------------------+----------------------------------------+-------------------+----------------------------------------------------------------------------------------------------+--------+

The port ID starts with 87431e6a.

Let's log in to compute-1 and check for hybrid connection leftovers:

[heat-admin@compute-1 ~]$ ip a | grep 87431e6a
44: qbr87431e6a-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc noqueue state UP group default qlen 1000
45: qvo87431e6a-e1@qvb87431e6a-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc noqueue master ovs-system state UP group default qlen 1000
46: qvb87431e6a-e1@qvo87431e6a-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc noqueue master qbr87431e6a-e1 state UP group default qlen 1000
47: tap87431e6a-e1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1442 qdisc noqueue master qbr87431e6a-e1 state UNKNOWN group default qlen 1000
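
To double-check that this is really the hybrid wiring rather than stray interfaces, OVS and the bridge can be queried directly (a sketch using standard ovs-vsctl and iproute2 commands):

 sudo ovs-vsctl iface-to-br qvo87431e6a-e1   # prints br-int when the qvo end is a port on the integration bridge
 sudo ip link show master qbr87431e6a-e1     # lists the tap and qvb devices enslaved to the qbr bridge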

Comment 1 Roman Safronov 2022-07-28 14:42:40 UTC
According to Slawek (skaplons), a VM is not expected to be reconnected with a native OVN connection (without the qbr bridge) after a reboot.
But for the VM migration scenario this BZ is still relevant.

Comment 2 Roman Safronov 2022-07-28 14:45:32 UTC
Note: currently it is not possible to reproduce this BZ on OSP17 with the live migration scenario due to Bug 2077964 - "Live migration fails due to Cannot load 'vifs' in the base class: NotImplementedError: Cannot load 'vifs' in the base class".

Comment 3 Roman Safronov 2022-08-08 12:36:50 UTC
This BZ was opened for the VM reboot scenario, but as it turned out, the qbr/qvo/qvb leftovers are not expected to disappear after a VM reboot.
A similar BZ for 16.2 was tested with the live migration scenario and was eventually closed as WORKSFORME, see https://bugzilla.redhat.com/show_bug.cgi?id=2109516.
Closing this BZ as well, since it is not possible to reproduce the issue with the live migration scenario on OSP17.