Bug 2106376

Summary: [OVN migration] VMs with trunk ports are inaccessible after stop/start
Product: Red Hat OpenStack Reporter: Roman Safronov <rsafrono>
Component: openstack-neutron Assignee: Arnau Verdaguer <averdagu>
Status: ASSIGNED --- QA Contact: Eran Kuris <ekuris>
Severity: medium Docs Contact:
Priority: high    
Version: 17.0 (Wallaby) CC: averdagu, chrisw, ekuris, mburns, mlavalle, scohen
Target Milestone: --- Keywords: AutomationBlocker, Regression, Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2217504 (view as bug list) Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2217504    

Description Roman Safronov 2022-07-12 13:33:46 UTC
Description of problem:
A VM with trunk ports (it was connected to the external network) became inaccessible after it was rebooted.
This happened after migration from OVS with iptables_hybrid firewall driver to OVN.

Note: the VMs that were connected to the internal network and used normal ports have no issues after reboot.

Version-Release number of selected component (if applicable):
RHOS-17.0-RHEL-9-20220711.n.1
openstack-neutron-ovn-migration-tool-18.4.1-0.20220705190433.5258354.el9ost.noarch
openstack-neutron-18.4.1-0.20220705190433.5258354.el9ost.noarch
ovn22.03-22.03.0-62.el9fdp.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Deploy an HA environment (3 controllers + 2 compute nodes) with the OVS neutron backend and the iptables_hybrid firewall driver. In my case it was an environment with DVR enabled.
2. Create a workload. In my case I created 2 VMs with trunk ports, connected to the external network.
3. Perform migration from ovs to ovn using the official procedure https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html/migrating_the_networking_service_to_the_ml2ovn_mechanism_driver/index
4. Reboot a VM with a trunk port and try to ping its IP address.

Actual results:
IP address does not respond to ping

Expected results:
IP address of the rebooted VM with trunk port responds to ping

Additional info:

Comment 1 Arnau Verdaguer 2022-07-28 10:41:23 UTC
After some testing: if the reboot is done as a soft reboot (openstack server reboot server_name), the OVN SB DB will release the port and the subport, but it will not claim them back once the reboot is completed.

This happens because a soft reboot (soft shutdown) does not destroy and recreate the interfaces, so they remain attached to the tbr-xxx bridge (the intermediate bridge used by OVS) instead of br-int (used by OVN).
If the reboot is done as a hard reboot (openstack server reboot --hard server_name, or openstack server stop server_name followed by openstack server start server_name), the VM should be accessible after the reboot.
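
A minimal sketch of how the bridge attachment described above could be checked on a compute node. It assumes the interface name is known (the name passed to bridge_of_port is hypothetical) and that ovs-vsctl is available. The result labels are illustrative only and not part of any Neutron or OVN tooling.

```python
import subprocess

# Assumptions taken from the comment above: OVN expects the interface on br-int,
# while ML2/OVS trunk ports sit on an intermediate bridge named tbr-xxx.
INTEGRATION_BRIDGE = "br-int"
TRUNK_BRIDGE_PREFIX = "tbr-"


def classify_bridge(bridge_name: str) -> str:
    """Classify the bridge an interface is attached to (labels are illustrative)."""
    name = bridge_name.strip()
    if name == INTEGRATION_BRIDGE:
        return "ok"
    if name.startswith(TRUNK_BRIDGE_PREFIX):
        # Interface was not recreated, so it is still on the OVS trunk bridge.
        return "stale-trunk-bridge"
    return "unknown"


def bridge_of_port(port_name: str) -> str:
    """Return the bridge that contains port_name, using `ovs-vsctl port-to-br`."""
    result = subprocess.run(
        ["ovs-vsctl", "port-to-br", port_name],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

For example, classify_bridge(bridge_of_port("tap-example")) could be run after a soft reboot and again after a hard reboot to compare the two cases (tap-example is a placeholder, not a real interface name).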

In the QE validation script the reboot is done by stopping and starting the VM (hard shutdown), but I have not been able to replicate the issue (on the latest gates there is an error with the ovn-workload-validation script).

I'll investigate the issue further.