Description of problem:
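When deploying an OSP 17 overcloud with OVN DVR and RAFT, the logs show "Overcloud Deployed successfully", but Octavia does not work: "openstack loadbalancer list" returns HTTP 503, and octavia.log shows that the connection to the OVN northbound database on port 6641 is refused.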
(overcloud) [stack@undercloud ~]$ openstack loadbalancer list
/usr/lib/python3.9/site-packages/ansible/_vendor/__init__.py:42: UserWarning: One or more Python packages bundled by this ansible-core distribution were already loaded (pyparsing). This may result in undefined behavior.
warnings.warn('One or more Python packages bundled by this ansible-core distribution were already '
Service Unavailable (HTTP 503) (Request-ID: None)
[heat-admin@controller-0 ~]$ sudo tail -1 /var/log/containers/octavia/octavia.log
2022-05-23 05:05:56.151 19 ERROR ovsdbapp.backend.ovs_idl.idlutils [-] Unable to open stream to tcp:192.170.1.63:6641 to retrieve schema: Connection refused
[heat-admin@controller-0 ~]$
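The "Connection refused" error above can be checked independently of Octavia with a plain TCP probe. The following is a minimal sketch, not part of the original report: can_connect is a hypothetical helper name, and it only tests whether something accepts TCP connections on the given address and port, not whether the OVSDB schema can actually be retrieved.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it does not speak the OVSDB
    protocol, it only checks that the port accepts connections.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


# Example call, using the endpoint from the octavia.log line above:
#   can_connect("192.170.1.63", 6641)
```

Running the same probe against a controller's own IP on port 6641 would show whether the northbound database is listening there rather than on the address Octavia was configured with.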
This issue appears to be related to [1].
On each controller host, I changed the ovn_nb_connection variable in octavia.conf to point to port 6641 on the controller's IP instead of port 6641 on the internal API VIP, which was the previous value. Running "podman restart octavia_api" after that seemed to fix the issue.
(overcloud) [stack@undercloud ~]$ openstack loadbalancer list
/usr/lib/python3.9/site-packages/ansible/_vendor/__init__.py:42: UserWarning: One or more Python packages bundled by this ansible-core distribution were already loaded (pyparsing). This may result in undefined behavior.
warnings.warn('One or more Python packages bundled by this ansible-core distribution were already '
(overcloud) [stack@undercloud ~]$
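The manual edit described above could be scripted roughly as follows. This is a sketch under assumptions: the option is assumed to live in an [ovn] section of octavia.conf, set_ovn_nb_connection is a hypothetical helper name, and the 192.0.2.x addresses in the usage comment are placeholders. Note that configparser drops comments when it rewrites a file, so this is only suitable for illustration or for a throwaway copy of the config.

```python
import configparser
import io


def set_ovn_nb_connection(conf_text: str, controller_ip: str, port: int = 6641) -> str:
    """Return conf_text with ovn_nb_connection pointing at controller_ip.

    Assumption: the option belongs to the [ovn] section. Comments in the
    original text are not preserved by configparser.
    """
    # strict=False tolerates repeated keys, interpolation=None leaves '%' alone.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    if not parser.has_section("ovn"):
        parser.add_section("ovn")
    parser.set("ovn", "ovn_nb_connection", f"tcp:{controller_ip}:{port}")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Example call with placeholder addresses:
#   new_text = set_ovn_nb_connection(old_text, "192.0.2.21")
```

As in the report, the octavia_api container still has to be restarted for the new value to take effect.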
I noticed that patch [2] adds a deployment parameter, ovn_db_host, which defaults to hiera('ovn_dbs_vip'). For deployments with RAFT, I think the deployment should detect that RAFT is enabled and change the value of this parameter automatically, so that Octavia is not pointed at the internal API VIP.
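The selection logic being requested could look something like the sketch below. It is not taken from puppet-tripleo: ovn_nb_connection_for, raft_enabled, and the addresses in the usage comment are hypothetical. It also assumes that, with RAFT, listing every controller as a comma-separated set of remotes is acceptable; the workaround in this report used a single controller IP per host, so which form the deployment should generate is an open question.

```python
from typing import Sequence


def ovn_nb_connection_for(
    raft_enabled: bool,
    vip: str,
    controller_ips: Sequence[str],
    port: int = 6641,
) -> str:
    """Pick the northbound DB connection string for Octavia.

    Hypothetical sketch: without RAFT, use the VIP (the current default);
    with RAFT, use the controllers' own addresses instead.
    """
    if raft_enabled:
        if not controller_ips:
            raise ValueError("RAFT is enabled but no controller IPs were given")
        # Assumption: a comma-separated list of remotes is accepted.
        return ",".join(f"tcp:{ip}:{port}" for ip in controller_ips)
    return f"tcp:{vip}:{port}"


# Example call with placeholder addresses:
#   ovn_nb_connection_for(True, "192.0.2.10", ["192.0.2.21", "192.0.2.22"])
```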
[3] has sosreports from the controllers when ovn_nb_connection was set to port 6641 on the internal API VIP.
[4] has sosreports from the controllers after ovn_nb_connection was set to port 6641 on the controller IPs.
[1] https://bugs.launchpad.net/tripleo/+bug/1825146
[2] https://review.opendev.org/c/openstack/puppet-tripleo/+/655813/
[3] http://perfscale.perf.lab.eng.bos.redhat.com/pub/schari/osp17_250_node_scale_octavia_ovn_nb_issue/before/
[4] http://perfscale.perf.lab.eng.bos.redhat.com/pub/schari/osp17_250_node_scale_octavia_ovn_nb_issue/after/
Version-Release number of selected component (if applicable):
RHOS-17.0-RHEL-9-20220401.n.1
Steps to Reproduce:
Deploy an OSP 17 overcloud with OVN DVR, RAFT, and Octavia.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2022:6543