Bug 2131594

Summary: [BDI] - RHOSP 16.2 haproxy-external-cert certificate renewal got stuck in NEED_CSR and undercloud self-signed Root CA expired
Product: Red Hat OpenStack
Reporter: Riccardo Bruzzone <rbruzzon>
Component: puppet-tripleo
Assignee: Ade Lee <alee>
Status: CLOSED ERRATA
QA Contact: David Rosenfeld <drosenfe>
Severity: high
Docs Contact:
Priority: high
Version: 16.2 (Train)
CC: alee, cjeanner, dsedgmen, dwilde, egallen, enothen, hrybacki, jagee, jjoyce, joboyer, jschluet, mburns, morazi, pgodwin, ramishra, rhayakaw, rhos-maint, slinaber, stchen, tvignaud
Target Milestone: z4
Keywords: Triaged
Target Release: 16.2 (Train on RHEL 8.4)
Flags: dsedgmen: needinfo+
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: puppet-tripleo-11.7.0-2.20220826014803.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-12-07 19:25:51 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Riccardo Bruzzone 2022-10-02 18:19:57 UTC
Problem Description:
The haproxy-external-cert certificate expired and its automatic renewal failed, leaving the request in the NEED_CSR state.
As a consequence, all openstack commands failed, as in the example below (the IP address is obfuscated as xx.xx.xx.xx and a fake hostname is used):

# [stack@director ~]$ source stackrc
# (undercloud) [stack@director ~]$ openstack server list
# Failed to discover available identity versions when contacting https://xx.xx.xx.xx:13000. 
# Attempting to parse version from URL.
# Could not find versioned identity endpoints when attempting to authenticate. 
# Please check that your auth_url is correct. SSL exception connecting 
# to https://xx.xx.xx.xx:13000: HTTPSConnectionPool(host='xx.xx.xx.xx', port=13000): 
# Max retries exceeded with url: / (Caused by  SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED]
#  certificate verify failed (_ssl.c:897)'),))

As a result of this issue, it was not possible to manage the Red Hat OpenStack production environment.
The analysis carried out by both Red Hat and the customer identified the following issues:

- The undercloud self-signed Root CA was renewed internally by certmonger months ago, but the new one wasn't exported to the undercloud trust store (we're hitting BZ#1975505 and the workaround in the first comment works)

- The RHOSP 16.2 haproxy certificate renewal was stuck in NEED_CSR as well. Restarting certmonger renewed the certificate, and the undercloud APIs are now working again.

- The overcloud nodes haven't been updated with the new undercloud self-signed Root CA.

- The official Red Hat OpenStack product documentation does not describe the steps required to correctly update the undercloud self-signed Root CA after certmonger renews it automatically.

Customer environment: 
RHOSP 16.2.0 deployed in the Production Environment

Type of Deployment:   
Stretched scenario with SBD

Customer Case:        
03323027

Customer Expectation:
- The new undercloud self-signed Root CA (Local Signing Authority), which certmonger renews automatically and which signs the haproxy certificate, should be automatically added to the trust anchors of all OSP components (undercloud and overcloud nodes) where this CA is required.

- The Red Hat OpenStack product documentation should be updated with all the steps required to recover from a situation where the fully automatic renewal process is not yet implemented or is not working.

- During the RHOSP installation or a minor update, it should be possible to change the validity period of the undercloud self-signed Root CA (currently 1 year) and align it with the customer's policy and needs (e.g. a Local Signing Authority with a validity period of 5 years or more).


#------------------ NEED_CSR and self-signed Root CA not updated ------------------#

[(GLD) root@director ~]# getcert list
Number of certificates and requests being tracked: 1.
Request ID 'haproxy-external-cert':
        status: NEED_CSR
        stuck: no
        key pair storage: type=FILE,location='/etc/pki/tls/private/haproxy/overcloud-haproxy-external.key'
        certificate: type=FILE,location='/etc/pki/tls/certs/haproxy/overcloud-haproxy-external.crt'
        CA: local
        issuer: CN=709a5be9-35eb43dc-806638a9-536b1aaf,CN=Local Signing Authority
        subject: CN=xx.xx.xx.xx
        expires: 2022-09-24 12:26:03 CEST
        eku: id-kp-clientAuth,id-kp-serverAuth
        pre-save command:
        post-save command: /usr/bin/certmonger-haproxy-refresh.sh reload external
        track: yes
        auto-renew: yes
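
A request stuck like the one above could be caught early by a periodic check. The following is only a minimal sketch, not part of the fix shipped for this bug: it parses text in the `getcert list` layout shown above and reports every request whose status is not MONITORING, the state certmonger reports for a healthy tracked certificate. The function names and the abbreviated SAMPLE text are illustrative assumptions.

```python
import re


def parse_getcert_list(text):
    """Parse `getcert list` style output into one dict per tracked request."""
    requests = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        header = re.match(r"Request ID '([^']+)':", stripped)
        if header:
            # A new request block starts here.
            current = {"id": header.group(1)}
            requests.append(current)
        elif current is not None and ":" in stripped:
            # Split on the first colon only, so values such as timestamps stay intact.
            key, _, value = stripped.partition(":")
            current[key.strip()] = value.strip()
    return requests


def needs_attention(requests):
    """Return the IDs of requests whose status is anything other than MONITORING."""
    return [r["id"] for r in requests if r.get("status") != "MONITORING"]


# Abbreviated copy of the output in this report (assumption: same field layout).
SAMPLE = """\
Number of certificates and requests being tracked: 1.
Request ID 'haproxy-external-cert':
        status: NEED_CSR
        stuck: no
        CA: local
        expires: 2022-09-24 12:26:03 CEST
        auto-renew: yes
"""

if __name__ == "__main__":
    print(needs_attention(parse_getcert_list(SAMPLE)))
```

On a live system the text would come from running `getcert list` as root; here it is fed in as a string so the parsing can be checked without certmonger installed.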

[(GLD) root@director ~]# openssl x509 -noout -subject -issuer -dates -in /etc/pki/tls/private/overcloud_endpoint.pem
subject=CN = xx.xx.xx.xx
issuer=CN = Local Signing Authority, CN = 709a5be9-35eb43dc-806638a9-536b1aaf
notBefore=Oct 21 07:54:28 2021 GMT
notAfter=Sep 24 10:26:03 2022 GMT
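
The notAfter value above (10:26:03 GMT) is the same instant as the `expires` field reported by getcert (12:26:03 CEST), and it lies before the date this bug was filed. As a rough sketch of how that comparison could be automated, the snippet below parses the `-dates` lines of the openssl output and tests them against a reference time. The helper names are illustrative assumptions, and the date format assumes the English month abbreviations that openssl prints.

```python
from datetime import datetime, timezone

# Layout of the notBefore/notAfter values printed by `openssl x509 -dates`.
OPENSSL_DATE_FORMAT = "%b %d %H:%M:%S %Y GMT"


def parse_openssl_dates(text):
    """Extract notBefore/notAfter from openssl output as timezone-aware UTC datetimes."""
    dates = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key in ("notBefore", "notAfter"):
            parsed = datetime.strptime(value.strip(), OPENSSL_DATE_FORMAT)
            dates[key] = parsed.replace(tzinfo=timezone.utc)
    return dates


def is_expired(dates, now):
    """Return True when the reference time `now` is past the certificate's notAfter."""
    return now > dates["notAfter"]


# Output copied from this report.
SAMPLE = """\
subject=CN = xx.xx.xx.xx
issuer=CN = Local Signing Authority, CN = 709a5be9-35eb43dc-806638a9-536b1aaf
notBefore=Oct 21 07:54:28 2021 GMT
notAfter=Sep 24 10:26:03 2022 GMT
"""

if __name__ == "__main__":
    # Reference time: the timestamp of the bug description.
    filed = datetime(2022, 10, 2, 18, 19, 57, tzinfo=timezone.utc)
    print(is_expired(parse_openssl_dates(SAMPLE), filed))
```

Passing the reference time explicitly, rather than reading the clock inside the function, keeps the check reproducible for a fixed sample like this one.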

Comment 32 errata-xmlrpc 2022-12-07 19:25:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8794

Comment 33 David Sedgmen 2023-02-16 15:12:20 UTC
*** Bug 2104546 has been marked as a duplicate of this bug. ***

Comment 34 Red Hat Bugzilla 2023-09-19 04:27:31 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days