Bug 2139674

Summary: [OVN][16.2] Migration from OVS to OVN hangs on "Sync neutron db with OVN db" task
Product: Red Hat OpenStack Reporter: Roman Safronov <rsafrono>
Component: python-networking-ovn Assignee: Arnau Verdaguer <averdagu>
Status: CLOSED ERRATA QA Contact: Roman Safronov <rsafrono>
Severity: high Docs Contact:
Priority: urgent    
Version: 16.2 (Train) CC: apevec, averdagu, bcafarel, lhh, majopela, mtomaska, scohen, skaplons, stchen, tvignaud, ykarel
Target Milestone: z4 Keywords: AutomationBlocker, Regression, Triaged
Target Release: 16.2 (Train on RHEL 8.4)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: python-networking-ovn-7.4.2-2.20220409154865.el8ost Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-12-07 19:25:51 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2143574    
Bug Blocks: 2087721    

Description Roman Safronov 2022-11-03 08:38:00 UTC
Description of problem:
OVN migration hangs on "Sync neutron db with OVN db" task.
The downstream CI job is unable to complete the OVN migration within 10 hours and then times out.
From the ovn migration tool log:
TASK [migration : Sync neutron db with OVN db (container) - Run 1] *************
task path: /home/stack/ovn_migration/playbooks/roles/migration/tasks/sync-dbs.yml:7
Wednesday 02 November 2022  19:15:38 +0000 (0:00:01.182)       0:45:48.529 **** 
META: noop
META: noop
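
A hang like this can be spotted from the migration log itself. The following is a minimal sketch, assuming the log uses the task header and timestamp line formats shown in the excerpt above. The function names and the one-hour threshold are hypothetical and not part of the migration tool.

```python
import re
from datetime import datetime, timedelta, timezone

# Task header line, e.g. "TASK [migration : Sync neutron db ...] ****"
TASK_RE = re.compile(r"^TASK \[(?P<name>.+?)\] \*+\s*$")
# Timestamp line that follows the header in the excerpt, e.g.
# "Wednesday 02 November 2022  19:15:38 +0000 (0:00:01.182)  0:45:48.529 ****"
STAMP_RE = re.compile(r"^\w+ (\d{2} \w+ \d{4})\s+(\d{2}:\d{2}:\d{2}) ([+-]\d{4}) ")


def last_started_task(log_lines):
    """Return (task_name, start_time) for the last task that has a timestamp."""
    name, started, pending = None, None, None
    for line in log_lines:
        header = TASK_RE.match(line)
        if header:
            pending = header.group("name")
            continue
        stamp = STAMP_RE.match(line)
        if stamp and pending is not None:
            # Assumes English month names, as in the excerpt.
            started = datetime.strptime(
                " ".join(stamp.groups()), "%d %B %Y %H:%M:%S %z"
            )
            name, pending = pending, None
    return name, started


def running_longer_than(started, now, limit):
    """True if the task that began at `started` has exceeded `limit`."""
    return now - started > limit


excerpt = [
    "TASK [migration : Sync neutron db with OVN db (container) - Run 1] *************",
    "task path: /home/stack/ovn_migration/playbooks/roles/migration/tasks/sync-dbs.yml:7",
    "Wednesday 02 November 2022  19:15:38 +0000 (0:00:01.182)       0:45:48.529 **** ",
    "META: noop",
    "META: noop",
]
task_name, task_start = last_started_task(excerpt)
```

A CI wrapper could call `running_longer_than(task_start, datetime.now(timezone.utc), timedelta(hours=1))` periodically and abort early instead of waiting for the 10-hour job timeout.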

We hit a similar situation with OSP 17.0. The cause was a regression introduced by https://review.opendev.org/c/openstack/neutron/+/781555
that was fixed upstream by https://review.opendev.org/c/openstack/neutron/+/817637 and/or https://review.opendev.org/c/openstack/neutron/+/805768


Version-Release number of selected component (if applicable):
RHOS-16.2-RHEL-8-20221026.n.1
python3-networking-ovn-migration-tool-7.4.2-2.20220409154863.el8ost.noarch
openstack-tripleo-heat-templates-11.6.1-2.20221010235131.e0d438c.el8ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy an OSP 16.2 HA environment (3 controllers + 2 computes) with the ML2OVS neutron backend
2. Create a workload: a router, a network, and a VM connected to the network
3. Try to run the migration to ML2OVN according to the official procedure:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html-single/migrating_the_networking_service_to_the_ml2ovn_mechanism_driver/index


Actual results:
The OVN migration script gets stuck on the "Sync neutron db with OVN db" task.

Expected results:
OVN migration does not get stuck and finishes successfully.

Additional info:

Comment 12 Miro Tomaska 2022-11-17 17:49:54 UTC
*** Bug 2143740 has been marked as a duplicate of this bug. ***

Comment 20 Roman Safronov 2022-11-27 09:18:13 UTC
Verified on puddle RHOS-16.2-RHEL-8-20221124.n.1 which uses python3-networking-ovn-7.4.2-2.20220409154865.el8ost.noarch.rpm
Verified that the issue does not occur in the downstream ovs2ovn CI jobs.

Comment 26 errata-xmlrpc 2022-12-07 19:25:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8794