Bug 1856999

Summary: Ceph dashboard is not accessible while using TLS everywhere. Haproxy complains of not having enough backend servers
Product: Red Hat OpenStack Reporter: Mohammed Salih <mputhenp>
Component: openstack-tripleo-heat-templates Assignee: Francesco Pantano <fpantano>
Status: CLOSED CURRENTRELEASE QA Contact: Alfredo <alfrgarc>
Severity: high Docs Contact:
Priority: high    
Version: 16.0 (Train) CC: alfrgarc, amcleod, apevec, asalvati, bperkins, fpantano, jamsmith, jhajyahy, lhh, lmarsh, marjones, mburns, moddi, ndeevy, pkundal, yrabl
Target Milestone: z3 Keywords: TestOnly, Triaged
Target Release: 16.1 (Train on RHEL 8.2)   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-11.3.2-1.20200914170164.el8ost, puppet-tripleo-11.5.0-1.20200914161843.el8ost, tripleo-ansible-0.5.1-1.20200914163923.el8ost Doc Type: Known Issue
Doc Text:
The Ceph Dashboard currently does not work with the TLS Everywhere framework because the `dashboard_protocol` parameter was incorrectly omitted from the heat template. As a result, back ends fail to appear when HAProxy is started.

As a temporary solution, create a new environment file that contains the `dashboard_protocol` parameter and include the environment file in your overcloud deployment with the `-e` option:

```
parameter_defaults:
  CephAnsibleExtraConfig:
    dashboard_protocol: 'https'
```

This solution introduces a ceph-ansible bug. For more information, see https://bugzilla.redhat.com/show_bug.cgi?id=1860815.
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-11-18 11:54:42 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1860815, 1866006, 1884543    
Bug Blocks:    

Description Mohammed Salih 2020-07-14 22:22:33 UTC
Description of problem:
Ceph dashboard is not accessible while using TLS everywhere. HAProxy complains of not having enough backend servers. Although the backend services are up and running, they all serve "http" instead of the expected "https".

Currently, a possible workaround is to use the heat parameter "CephAnsibleExtraConfig" to pass "dashboard_protocol: https". This sets the extra_vars for the ceph-ansible playbook.
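
A minimal sketch of that workaround as a shell helper. The YAML keys are the ones named in this report; the function name `write_dashboard_env` and the example file path are hypothetical. The deploy command is shown only as a comment, because its full argument list depends on the deployment. Note that Comment 1 reports this workaround is not sufficient on its own.

```shell
# Write an environment file that passes dashboard_protocol to ceph-ansible.
# $1 = path of the environment file to create (hypothetical, pick your own)
write_dashboard_env() {
  cat > "$1" <<'EOF'
parameter_defaults:
  CephAnsibleExtraConfig:
    dashboard_protocol: 'https'
EOF
}

# Example usage (not executed here):
#   write_dashboard_env "$HOME/ceph-dashboard-protocol.yaml"
#   openstack overcloud deploy --templates \
#     ...your other -e environment files... \
#     -e "$HOME/ceph-dashboard-protocol.yaml"
```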
 
Version-Release number of selected component (if applicable):
Red Hat OpenStack 16.0
openstack-tripleo-heat-templates-11.3.2-0.20200405044625.ec9970c.el8ost.noarch
ceph-ansible-4.0.23-1.el8cp.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy RHOSP 16.0 with the following environment files along with your other environment files
>  -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/enable-internal-tls.yaml \
>  -e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-everywhere-endpoints-dns.yaml \
>  -e /usr/share/openstack-tripleo-heat-templates/environments/services/haproxy-public-tls-certmonger.yaml \
>  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
>  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-rgw.yaml \
>  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-dashboard.yaml \
2. The deployment succeeds, but if you try to access the dashboard using the control plane VIP, you get a "service unavailable" error.
3. Check /var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg. You can see that both "ceph_dashboard" and "ceph_grafana" are configured correctly with SSL, and the enabled backends are assumed to use SSL.
4. Try to directly access any of the listed backends using curl with "https"; it returns an error. Now try to access the same URLs with "http" and you get a successful response.
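
Steps 3 and 4 can be sketched as two small shell helpers. This is only an illustration, assuming the generated haproxy.cfg uses `listen <name>` sections containing `server <name> <address:port> ...` lines; the function names `list_backend_servers` and `probe_backend` are hypothetical and not part of any TripleO tooling.

```shell
# Print the address:port of every "server" line in one haproxy section.
# $1 = path to haproxy.cfg, $2 = section name (e.g. ceph_dashboard)
list_backend_servers() {
  awk -v section="$2" '
    # A new section header starts or ends the section we are interested in.
    $1 ~ /^(listen|frontend|backend|defaults|global)$/ {
      in_section = ($1 == "listen" && $2 == section)
      next
    }
    in_section && $1 == "server" { print $3 }
  ' "$1"
}

# Probe one backend with both schemes and print the HTTP status code.
# curl reports 000 when the request fails (for example, a TLS handshake error).
# $1 = address:port of the backend
probe_backend() {
  for scheme in http https; do
    code=$(curl -k -s -o /dev/null --max-time 5 -w '%{http_code}' "$scheme://$1/") || true
    printf '%s://%s -> %s\n' "$scheme" "$1" "$code"
  done
}

# Example usage (not executed here):
#   cfg=/var/lib/config-data/puppet-generated/haproxy/etc/haproxy/haproxy.cfg
#   for backend in $(list_backend_servers "$cfg" ceph_dashboard); do
#     probe_backend "$backend"
#   done
```

With the behavior described in this report, the "http" probe would be expected to succeed and the "https" probe to fail for each backend.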

Actual results:
Service unavailable while accessing Ceph dashboard

Expected results:
A nice Ceph dashboard

Additional info:

Comment 1 Mohammed Salih 2020-07-15 05:07:59 UTC
Even the workaround suggested above does not work, because the grafana container service is configured to expect the cert and key at /etc/grafana/ceph_grafana.{key,crt}, but the heat templates point it to /etc/pki/tls/certs/ceph_grafana.crt and /etc/pki/tls/private/ceph_grafana.key instead. As a result, the deployment fails.

Comment 2 Mohammed Salih 2020-07-16 05:56:34 UTC
I've tried with custom heat templates for 

> /usr/share/openstack-tripleo-heat-templates/deployment/ceph-ansible/ceph-grafana.yaml and
> /usr/share/openstack-tripleo-heat-templates/deployment/ceph-ansible/ceph-mgr.yaml

with modified dashboard_crt, dashboard_key, grafana_crt and grafana_key file locations, but the heat deployment fails because the /etc/grafana and /etc/ceph directories do not exist.

I reverted the file locations to /etc/pki/tls/{private,certs}/, but then ceph-ansible complained that it couldn't copy the files from the Ansible controller to the remote server.

So I tried empty strings for dashboard_crt, dashboard_key, grafana_crt and grafana_key in the custom heat templates, to trick ceph-ansible into not attempting to copy the files, and added a custom `postsave_cmd` to move the certificates created by puppet to the location expected by ceph-ansible. Nevertheless, it didn't work. Although the certificates were created and moved to the correct location, the task "inject grafana dashboard layouts" in the ceph-dashboard role complained that the certificate doesn't have a SAN with the IP set by `grafana_server_addr`, and the ceph-ansible deployment failed.

Finally, a not-so-nice workaround was applied with a first-boot script, as shown below. I also reverted to the stock ceph-dashboard template and removed all custom heat templates and scripts.

> sed -i -e '/$ceph_grafana_network,/{N;s/\n.*//;}' -e '/$ceph_dashboard_network,/{N;s/\n.*//;}' /usr/share/openstack-puppet/modules/tripleo/manifests/haproxy.pp

This removes SSL checking for the haproxy backend servers, as SSL is not enabled with the stock ceph-grafana.yaml and ceph-mgr.yaml heat templates.
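
To see what the sed expression does before applying it in place: each `-e` script matches a line containing the literal text `$ceph_grafana_network,` or `$ceph_dashboard_network,`, appends the following line with `N`, and then deletes everything from the embedded newline onward, which drops that following line. The sketch below runs the same expressions without `-i`, so the file is not modified. The function name `preview_haproxy_pp_edit` is hypothetical, and the sample lines in the usage comment are an assumed stand-in, not the actual contents of haproxy.pp.

```shell
# Print the result of the sed workaround without modifying the file.
# $1 = path to a copy of haproxy.pp (or any file to preview the edit on)
preview_haproxy_pp_edit() {
  sed -e '/$ceph_grafana_network,/{N;s/\n.*//;}' \
      -e '/$ceph_dashboard_network,/{N;s/\n.*//;}' "$1"
}

# Example usage (not executed here), on an assumed stand-in snippet:
#   cat > /tmp/sample.pp <<'EOF'
#   service_network => $ceph_grafana_network,
#   member_options  => union($haproxy_member_options, $internal_tls_member_options),
#   }
#   EOF
#   preview_haproxy_pp_edit /tmp/sample.pp
# The line that follows the matched service_network line is removed.
```

Comparing the preview against the original with `diff` is a cheap way to confirm that only the intended lines disappear.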

Comment 11 Lon Hohberger 2020-10-29 10:51:39 UTC
According to our records, this should be resolved by tripleo-ansible-0.5.1-1.20200914163925.el8ost.  This build is available now.

Comment 12 Yogev Rabl 2020-11-18 01:02:35 UTC
Verified on tripleo-ansible-0.5.1-1.20200914163929.el8ost.noarch puppet-tripleo-11.5.0-1.20200914161844.el8ost.noarch openstack-tripleo-heat-templates-11.3.2-1.20200914170172.el8ost.noarch

Comment 13 Francesco Pantano 2020-12-18 16:57:27 UTC
*** Bug 1907198 has been marked as a duplicate of this bug. ***