Bug 2099855
| Summary: | [OSP17][TLS-E] haproxy check fails for ceph-grafana service | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Marian Krcmarik <mkrcmari> |
| Component: | puppet-tripleo | Assignee: | Francesco Pantano <fpantano> |
| Status: | CLOSED ERRATA | QA Contact: | Alfredo <alfrgarc> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 17.0 (Wallaby) | CC: | adking, epuertat, fpantano, jjoyce, jschluet, mburns, ramishra, slinaber, tvignaud |
| Target Milestone: | ga | Keywords: | Triaged |
| Target Release: | 17.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | puppet-tripleo-14.2.3-0.20220705151704.bc62cd8.el9ost | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-09-21 12:23:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:6543

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.
Description of problem:
If OSP is deployed with ceph-dashboard, multiple ceph-dashboard services are deployed and placed behind haproxy; one of these services is grafana. The following haproxy configuration is generated for grafana on OSP17:

```
listen ceph_grafana
  bind 192.168.24.71:3100 transparent ssl crt /etc/pki/tls/certs/haproxy/overcloud-haproxy-storage.pem
  mode http
  balance source
  http-request set-header X-Forwarded-Proto https if { ssl_fc }
  http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
  http-request set-header X-Forwarded-Port %[dst_port]
  option httpchk HEAD /
  option httplog
  option forwardfor
  server central-controller-0.storage.redhat.local 172.23.1.55:3100 ca-file /etc/ipa/ca.crt check fall 5 inter 2000 rise 2 ssl verify required verifyhost central-controller-0.storage.redhat.local
  server central-controller-1.storage.redhat.local 172.23.1.124:3100 ca-file /etc/ipa/ca.crt check fall 5 inter 2000 rise 2 ssl verify required verifyhost central-controller-1.storage.redhat.local
  server central-controller-2.storage.redhat.local 172.23.1.243:3100 ca-file /etc/ipa/ca.crt check fall 5 inter 2000 rise 2 ssl verify required verifyhost central-controller-2.storage.redhat.local
```

The haproxy configuration for the grafana service appears to be correct, and haproxy runs its backend checks regularly. The problem is that the check fails; the grafana service complains every 2 seconds:

```
2022/06/21 12:36:00 http: TLS handshake error from 172.23.1.243:56364: remote error: tls: internal error
2022/06/21 12:36:00 http: TLS handshake error from 172.23.1.55:52296: remote error: tls: internal error
2022/06/21 12:36:01 http: TLS handshake error from 172.23.1.124:52898: remote error: tls: internal error
```

I think the reason is that the grafana server containers on all the controller nodes (in my case grafana is deployed on the controllers) have the same SSL certificate and key deployed in /etc/grafana/certs/cert_file|key; in my case it is the SSL certificate generated for the grafana service on controller-0. The haproxy check therefore succeeds against grafana on controller-0 but fails against the other grafana backends, because those containers serve the certificate generated for controller-0.

The container's /etc/grafana/certs/cert_file is bind-mounted from /var/lib/ceph/d5c621ae-ec54-5b9d-910d-b8dba8e6b5ba/grafana.central-controller-*/etc/grafana/certs/cert_file on the hosts, and it is the same file on all the hosts, whereas the certificates in /etc/pki/tls/certs/ceph_grafana.crt are different and correctly generated for each host. If I copy /etc/pki/tls/certs/ceph_grafana.crt to /var/lib/ceph/d5c621ae-ec54-5b9d-910d-b8dba8e6b5ba/grafana.central-controller-*/etc/grafana/certs/cert_file and restart the grafana containers on all hosts, the haproxy check starts to succeed (see the sketch below).

I am not sure about the right component, so I am assigning initially to THT.
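The manual workaround above can be expressed as a short shell sketch, to be run on each controller host. This is only an illustration of the steps described in this report, not a supported fix: the cephadm FSID (d5c621ae-ec54-5b9d-910d-b8dba8e6b5ba) and the grafana.central-controller-* directory names come from this environment, and the `podman ps` name filter used to find the grafana container is an assumption that may need adjusting per deployment.

```bash
# Run on each controller host. Assumes the per-host certificate in
# /etc/pki/tls/certs/ceph_grafana.crt is correct (as observed in this bug)
# and only the copy inside the cephadm-managed grafana data dir is stale.
FSID=d5c621ae-ec54-5b9d-910d-b8dba8e6b5ba   # cephadm cluster FSID from this report

for dir in /var/lib/ceph/${FSID}/grafana.central-controller-*/etc/grafana/certs; do
    # Replace the shared (controller-0) certificate with this host's own cert.
    cp /etc/pki/tls/certs/ceph_grafana.crt "${dir}/cert_file"
done

# Restart the grafana container so it reloads the certificate; the name
# filter is an assumption, adjust to match your `podman ps` output.
podman ps --filter name=grafana -q | xargs podman restart
```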
Version-Release number of selected component (if applicable):

```
puppet-tripleo-14.2.3-0.20220607163018.bc63c9e.el9ost.noarch
ansible-tripleo-ipsec-11.0.1-0.20210910011424.b5559c8.el9ost.noarch
ansible-tripleo-ipa-0.2.3-0.20220301190449.6b0ed82.el9ost.noarch
ansible-role-tripleo-modify-image-1.3.1-0.20220216001439.30d23d5.el9ost.noarch
python3-tripleo-common-15.4.1-0.20220608140403.caa0c1f.el9ost.noarch
tripleo-ansible-3.3.1-0.20220607162207.ae139c3.el9ost.noarch
openstack-tripleo-validations-14.2.2-0.20220514020831.d2a1172.el9ost.noarch
openstack-tripleo-common-containers-15.4.1-0.20220608140403.caa0c1f.el9ost.noarch
openstack-tripleo-common-15.4.1-0.20220608140403.caa0c1f.el9ost.noarch
openstack-tripleo-heat-templates-14.3.1-0.20220607161058.ced328c.el9ost.noarch
python3-tripleoclient-16.4.1-0.20220607160517.4d2a5db.el9ost.noarch
```

How reproducible: Always

Steps to Reproduce:
1. Deploy OSP17 with ceph-dashboard.
2. Check the grafana server log and the haproxy status (see the sketch after this section).

Actual results:

```
2022/06/21 12:35:58 http: TLS handshake error from 172.23.1.243:56362: remote error: tls: internal error
2022/06/21 12:35:58 http: TLS handshake error from 172.23.1.55:52294: remote error: tls: internal error
2022/06/21 12:35:59 http: TLS handshake error from 172.23.1.124:52896: remote error: tls: internal error
2022/06/21 12:36:00 http: TLS handshake error from 172.23.1.243:56364: remote error: tls: internal error
2022/06/21 12:36:00 http: TLS handshake error from 172.23.1.55:52296: remote error: tls: internal error
2022/06/21 12:36:01 http: TLS handshake error from 172.23.1.124:52898: remote error: tls: internal error
```

and haproxy reports two of the three grafana backends as DOWN.

Additional info: Feel free to request any logs
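For step 2, one way to see haproxy's view of the grafana backends is the haproxy stats socket. A minimal sketch, assuming the socket lives at /var/lib/haproxy/stats (verify against the `stats socket` line in your /etc/haproxy/haproxy.cfg); the grafana container name filter in the second command is likewise an assumption:

```bash
# Per-server status for the ceph_grafana listen section. In haproxy's
# "show stat" CSV output, field 1 is the proxy name, field 2 the server
# name, and field 18 the status (UP/DOWN).
echo "show stat" | socat stdio /var/lib/haproxy/stats \
    | awk -F, '$1 == "ceph_grafana" { print $2, $18 }'

# Follow the grafana container log for the TLS handshake errors shown above.
podman logs -f "$(podman ps --filter name=grafana -q)" 2>&1 \
    | grep 'TLS handshake error'
```

On an affected deployment this should show the two non-local backends as DOWN while the grafana log keeps printing the handshake errors at the 2-second check interval.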