Bug 2075118 - Octavia fails when enabling TLS-e in existing setup
Summary: Octavia fails when enabling TLS-e in existing setup
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.2 (Train)
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: z4
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Jakub Libosvar
QA Contact: Fiorella Yanac
URL:
Whiteboard:
Depends On: 2100906
Blocks: 2100907
 
Reported: 2022-04-13 16:12 UTC by Shailesh Chhabdiya
Modified: 2022-12-07 19:22 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-11.6.1-2.20220821010130.b1e9bfe.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 2100906 2100907 (view as bug list)
Environment:
Last Closed: 2022-12-07 19:22:17 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 839759 | 0 | None | MERGED | Use EnableInternalTLS to set pssl in nb and sb | 2022-07-13 20:33:38 UTC
Red Hat Issue Tracker | OSP-14667 | 0 | None | None | None | 2022-04-13 16:31:10 UTC
Red Hat Product Errata | RHBA-2022:8794 | 0 | None | None | None | 2022-12-07 19:22:54 UTC

Description Shailesh Chhabdiya 2022-04-13 16:12:15 UTC
Description of problem:
Enabling TLS-e on an existing setup always fails at the Octavia "network show" task.

Version-Release number of selected component (if applicable):
RHOSP16.2

How reproducible:
Always

Steps to Reproduce:
1. Deploy a setup with Octavia and SSL on public endpoints
2. Enable TLS-e 
3. Deployment fails

Actual results:
Deployment fails

Expected results:
Deployment should always succeed

Additional info:
This looks similar to these older bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1812744
https://bugzilla.redhat.com/show_bug.cgi?id=1797670

Comment 1 Shailesh Chhabdiya 2022-04-13 16:16:40 UTC
Hello Team,

The details are in the description.

The issue was reproduced twice on the same setup.

Here are the detailed steps we performed:


1. Deploying the stack from scratch with SSL on public endpoints only

2. Running an update with TLS-e


The initial deployment (1) finished successfully. The TLS-e update (2) failed at exactly the same step as previously, that is, at Octavia. The overcloud was not functional:

(overcloud) [stack@dtr01 ~]$ openstack network list                                                                   
HttpException: 503: Server Error for url: https://xxxx:13696/v2.0/networks, 503 Service Unavailable: No server is available to handle this request.

Restarting the ovn and octavia containers on all controllers with sudo podman restart $(sudo podman ps -f name=ovn -f name=octavia --format="{{.Names}}") did not bring the overcloud back to a functional state, and the above error when listing networks persisted. The same ovn errors as previously were also present:

 ----- ctl01 -----
2022-04-12T15:19:36Z|00001|stream_ssl|ERR|Private key must be configured to use SSL
2022-04-12T15:19:36Z|00002|stream_ssl|ERR|Certificate must be configured to use SSL
2022-04-12T15:19:36Z|00003|stream_ssl|ERR|CA certificate must be configured to use SSL
ovn-nbctl: ssl:192.168.211.240:6641: database connection failed (Protocol not available)
 ----- ctl02 -----
2022-04-12T15:19:37Z|00001|stream_ssl|ERR|Private key must be configured to use SSL
2022-04-12T15:19:37Z|00002|stream_ssl|ERR|Certificate must be configured to use SSL
2022-04-12T15:19:37Z|00003|stream_ssl|ERR|CA certificate must be configured to use SSL
ovn-nbctl: ssl:192.168.211.240:6641: database connection failed (Protocol not available)
 ----- ctl03 -----
2022-04-12T15:19:38Z|00001|stream_ssl|ERR|Private key must be configured to use SSL
2022-04-12T15:19:38Z|00002|stream_ssl|ERR|Certificate must be configured to use SSL
2022-04-12T15:19:38Z|00003|stream_ssl|ERR|CA certificate must be configured to use SSL
ovn-nbctl: ssl:192.168.211.240:6641: database connection failed (Protocol not available)
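
To make it easier to compare runs, the output above can be condensed to one entry per controller. The following is a minimal Python sketch and not part of the original report. The function name missing_ssl_settings and the SAMPLE text are illustrative, and the sketch assumes the separator and log line formats shown above.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches the controller separators used above, e.g. " ----- ctl01 -----".
HOST_RE = re.compile(r"^\s*-{5}\s*(\S+)\s*-{5}\s*$")
# Matches stream_ssl errors and captures the name of the unconfigured setting.
SSL_ERR_RE = re.compile(r"\|stream_ssl\|ERR\|(.+?) must be configured to use SSL")

# Illustrative excerpt copied from the output above.
SAMPLE = """\
 ----- ctl01 -----
2022-04-12T15:19:36Z|00001|stream_ssl|ERR|Private key must be configured to use SSL
2022-04-12T15:19:36Z|00002|stream_ssl|ERR|Certificate must be configured to use SSL
2022-04-12T15:19:36Z|00003|stream_ssl|ERR|CA certificate must be configured to use SSL
ovn-nbctl: ssl:192.168.211.240:6641: database connection failed (Protocol not available)
"""


def missing_ssl_settings(output: str) -> Dict[str, List[str]]:
    """Group the SSL settings reported as unconfigured by controller name."""
    missing = defaultdict(list)
    host = "unknown"  # used if an error appears before any separator line
    for line in output.splitlines():
        host_match = HOST_RE.match(line)
        if host_match:
            host = host_match.group(1)
            continue
        err_match = SSL_ERR_RE.search(line)
        if err_match:
            missing[host].append(err_match.group(1))
    return dict(missing)


if __name__ == "__main__":
    for controller, settings in missing_ssl_settings(SAMPLE).items():
        print(controller + ": " + ", ".join(settings))
```

A controller that reports all three settings as missing never had SSL material supplied to the client at all, which is different from a certificate that is present but rejected.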

Having restarted the containers, I ran the update again, and it finished successfully:

2022-04-12 17:06:34.076894 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ End Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~     
Ansible passed.Overcloud configuration completed

The overcloud looks operational. I verified that creating and deleting a network, subnet and port works. However, the ovn errors still persist:

 ----- ctl01 -----                                                                                                   
2022-04-13T07:26:41Z|00001|stream_ssl|ERR|Private key must be configured to use SSL                                        
2022-04-13T07:26:41Z|00002|stream_ssl|ERR|Certificate must be configured to use SSL                                         
2022-04-13T07:26:41Z|00003|stream_ssl|ERR|CA certificate must be configured to use SSL                                     
ovn-nbctl: ssl:192.168.211.240:6641: database connection failed (Protocol not available)                                   
 ----- ctl02 -----                                                                                                   
2022-04-13T07:26:42Z|00001|stream_ssl|ERR|Private key must be configured to use SSL
2022-04-13T07:26:42Z|00002|stream_ssl|ERR|Certificate must be configured to use SSL                                        
2022-04-13T07:26:42Z|00003|stream_ssl|ERR|CA certificate must be configured to use SSL                                     
ovn-nbctl: ssl:192.168.211.240:6641: database connection failed (Protocol not available)                                   
 ----- ctl03 -----                                                                                                   
2022-04-13T07:26:43Z|00001|stream_ssl|ERR|Private key must be configured to use SSL                                        
2022-04-13T07:26:43Z|00002|stream_ssl|ERR|Certificate must be configured to use SSL
2022-04-13T07:26:43Z|00003|stream_ssl|ERR|CA certificate must be configured to use SSL                                     
ovn-nbctl: ssl:192.168.211.240:6641: database connection failed (Protocol not available)
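
The stream_ssl errors indicate that ovn-nbctl was pointed at an ssl: target without a private key, certificate and CA certificate configured. As a hedged illustration that is not taken from this report, the Python sketch below assembles an ovn-nbctl invocation that passes all three through the --private-key, --certificate and --ca-cert options. The helper name ovn_nbctl_ssl_args and the file paths are placeholders; the real paths depend on the deployment.

```python
from typing import List


def ovn_nbctl_ssl_args(db: str, private_key: str, certificate: str, ca_cert: str) -> List[str]:
    """Build an ovn-nbctl command line that supplies SSL material for an ssl: target."""
    if not db.startswith("ssl:"):
        raise ValueError("expected an ssl: connection target, got %r" % db)
    return [
        "ovn-nbctl",
        "--db=" + db,
        "--private-key=" + private_key,
        "--certificate=" + certificate,
        "--ca-cert=" + ca_cert,
        "show",
    ]


if __name__ == "__main__":
    # Placeholder paths: substitute the key, certificate and CA bundle
    # that the deployment actually uses.
    args = ovn_nbctl_ssl_args(
        "ssl:192.168.211.240:6641",
        "/path/to/ovn.key",
        "/path/to/ovn.crt",
        "/path/to/ca.crt",
    )
    # Only print the command; running it is left to the operator.
    print(" ".join(args))
```

The sketch only builds the argument list and does not run anything, so it can be reviewed before being executed on a controller.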

Apparently, there is a reproducible issue in the TLS-e update process related to certificates (possibly a race condition?).

So we have a workaround, but a permanent fix is needed.

Thank You

Comment 3 Gregory Thiemonge 2022-04-14 06:27:55 UTC
Well, the update fails when the Octavia playbook executes an OpenStack command: "openstack network show lb-mgmt-net"

It doesn't seem related to Octavia because the customer also reports:

(overcloud) [stack@dtr01 ~]$ openstack network list                                                                   
HttpException: 503: Server Error for url: https://xxxx:13696/v2.0/networks, 503 Service Unavailable: No server is available to handle this request.


Adding Squad:Neutron

Comment 37 errata-xmlrpc 2022-12-07 19:22:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8794

