Bug 1785364

Summary: After engine restore, ovn networks are not restored and new OVN networks are not working properly on 4.4
Product: [oVirt] ovirt-engine Reporter: amashah
Component: Backup-Restore.Engine Assignee: Dominik Holler <dholler>
Status: CLOSED CURRENTRELEASE QA Contact: msheena
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.4.0 CC: bugs, dholler, emesika, lleistne, mburman, michal.skrivanek, mperina, pmatyas, sgoodman
Target Milestone: ovirt-4.4.0 Flags: emesika: needinfo-
pm-rhel: ovirt-4.4+
pm-rhel: devel_ack+
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: rhv-4.4.0-29 ovirt-engine-4.4.0_beta3 rhv-4.4.0-30 Doc Type: Bug Fix
Doc Text:
Previously, when restoring a backup, engine-setup did not restart ovn-northd, so the SSL/TLS configuration was outdated. With this update, ovn-northd reloads the restored SSL/TLS configuration.
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-05-20 20:04:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Network RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1812906, 1816648, 1820995    
Bug Blocks:    
Attachments:
Description Flags
terminal log of upstream upgrade
none
backup file and backup/restore log files none

Description amashah 2019-12-19 18:50:02 UTC
Description of problem:
This may be related to the backup/restore procedure, for which there was a bug 
that was supposed to be resolved in 4.3.5 - 

https://bugzilla.redhat.com/show_bug.cgi?id=1630824

However, after a backup on 4.3.6 and restore, the ovn networks were not restored. 

Furthermore, the configuration was broken: ovn-controller.log is spewing SSL errors, and VMs attached to newly created OVN networks cannot ping each other.

Version-Release number of selected component (if applicable):
4.3.6

How reproducible:
I was unable to reproduce this due to inadequate resources available.

Steps to Reproduce:
1. Take a backup on 4.3.6.7
2. Restore backup in new deployment of HE w/ 4.3.7
3. OVN networks not restored, and newly created OVN networks do not work

Actual results:
OVN networks not restored, and newly created OVN networks do not work due to SSL issues between manager/host.


Expected results:
OVN networks should be restored and there should not be SSL errors between hosts/manager. Also, it should be possible to create new networks and apply them to VMs, and network connectivity should work.

Additional info:
To fix this setup, the following procedure was required:


~~~
On RHV-M reconfigure OVN:

1. From RHV-M go to Administration -> Providers -> ovirt-provider-ovn -> Edit -> Rename this to something else, because reconfiguring OVN on RHV-M adds a new 'ovirt-provider-ovn', and the existing provider would conflict with it.

2. Take a backup of /etc/ovirt-provider-ovn/ 

3. Take a backup of /etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf

4. Edit the file /etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf

    - Comment out the line starting with 'OVESETUP_OVN/ovirtProviderOvnId=str:'
    - Change the value on the line starting with 'OVESETUP_OVN/ovirtProviderOvn=' to none:None

It should look similar to this (note: Id will be different than below):

~~~
$ grep OVN /etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf

OVESETUP_OVN/ovirtProviderOvn=none:None
#OVESETUP_OVN/ovirtProviderOvnId=str:23e832e4-ddac-4da4-9353-b5be34b5da6d
~~~


5. # engine-setup --reconfigure-optional-components --offline


6. Reconfigure the hosts, from 'rhev1' (ensure any VM's are down and host is in maintenance first):

# vdsm-tool ovn-unconfigure
# ovs-ofctl del-flows br-int
# vdsm-tool ovn-config <RHV-M_IP> ovirtmgmt


7. Check for any SSL errors in /var/log/openvswitch/ovn-controller.log

8. If there are no more SSL errors, then create some new networks in RHV-M on the new 'ovirt-provider-ovn' provider, apply them to some VMs, and confirm pings are now working properly.
~~~
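
The file edits in step 4 and the log check in step 7 can be sketched as two small shell functions. This is only an illustration, not part of the original procedure: the function names `patch_ovn_answers` and `count_ssl_errors` are hypothetical, `sed -i` assumes GNU sed, and the grep pattern is a guess at what an SSL error line in ovn-controller.log looks like, so adjust it to the messages actually seen.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from oVirt).

# Step 4 sketch: back up the answer file, comment out the provider Id
# line, and reset the provider choice to none:None.
patch_ovn_answers() {
    conf="$1"
    # Keep a copy next to the original before editing (compare step 3).
    cp -p "$conf" "$conf.bak" || return 1
    # '&' re-inserts the matched prefix, so the Id line is only commented out.
    sed -i \
        -e 's|^OVESETUP_OVN/ovirtProviderOvnId=str:|#&|' \
        -e 's|^OVESETUP_OVN/ovirtProviderOvn=.*|OVESETUP_OVN/ovirtProviderOvn=none:None|' \
        "$conf"
}

# Step 7 sketch: print the number of log lines that mention both
# "ssl" and "error" (case-insensitive). The pattern is an assumption.
count_ssl_errors() {
    grep -ciE 'ssl.*error|error.*ssl' "$1"
}
```

Assuming the paths from the steps above, usage would be `patch_ovn_answers /etc/ovirt-engine-setup.conf.d/20-setup-ovirt-post.conf` before step 5, and `count_ssl_errors /var/log/openvswitch/ovn-controller.log` on the host after step 6. Checking the result with the `grep OVN` command shown in step 4 is still worthwhile.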

Comment 4 Petr Matyáš 2020-03-23 14:10:05 UTC
After engine-backup --mode=restore, engine-setup fails with "Failed to open OVN NORTH DB SSL connection".

I have:
ovirt-engine-setup-4.4.0-0.26.master.el8ev.noarch
ovirt-engine-tools-backup-4.4.0-0.26.master.el8ev.noarch
ovirt-provider-ovn-1.2.29-1.el8ev.noarch

Comment 9 Dominik Holler 2020-04-07 10:54:24 UTC
Created attachment 1676880 [details]
terminal log of upstream upgrade

Even though engine-setup failed, openvswitch and OVN look healthy.

Comment 11 Sandro Bonazzola 2020-04-09 07:43:43 UTC
Please add gerrit patches to be tracked if not merged yet. All tracked patches are merged and included in ovirt-engine-4.4.0_beta3; if nothing else is missing, please move to QE.

Comment 12 Eli Mesika 2020-04-12 10:44:53 UTC
Created attachment 1678221 [details]
backup file and backup/restore log files

Comment 14 msheena 2020-04-20 11:21:20 UTC
Verified on
===========
ovirt-engine-4.3.9.4 (backup)
ovirt-engine-4.4.0-0.33.master.el8ev.noarch (restore)
ovirt-engine-tools-backup-4.4.0-0.33.master.el8ev.noarch
ovirt-provider-ovn-1.2.30-1.el8ev.noarch

Comment 15 Sandro Bonazzola 2020-05-20 20:04:12 UTC
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be
resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.