Bug 1570089 - HAproxy unable to load SSL private key from PEM file
Description Stan Toporek 2018-04-20 15:35:49 UTC
Description of problem:
This error is seen after redeploying the overcloud to change the parameter:
  CeilometerStoreEvents: true

See deploy_outputs.txt for output error message.

On the control node the error is "unable to load SSL private key from PEM file '/etc/pki/tls/private/overcloud_endpoint.pem'" (line 501 in os-collect-config-snippet.log).

HAproxy is unable to start because of wrong file permissions or wrong process owner.
On the control node the SSL certificate used by HAproxy belongs to group haproxy (gid: 188), whereas in the container the haproxy user is uid=42454(haproxy) gid=42454(haproxy) groups=42454(haproxy),42400(kolla):
[root@ctl-prod-0 config-data]# pwd
[root@ctl-prod-0 config-data]# find . -name '*overcloud_endpoint.pem'
[root@ctl-prod-0 config-data]# ls -l ./haproxy/etc/pki/tls/private/overcloud_endpoint.pem
-r--r-----. 1 root haproxy 3777 Apr  9 11:50 ./haproxy/etc/pki/tls/private/overcloud_endpoint.pem
[root@ctl-prod-0 config-data]# grep haproxy /etc/group
But within the HAproxy docker container on the control node, the haproxy user has a different ID (uid and gid):
[root@ctl-prod-0 config-data]# docker ps | grep hapr
2c0ebbbb2eab                        "/bin/bash /usr/lo..."   5 days ago          Up 5 days                                 haproxy-bundle-docker-2
[root@ctl-prod-0 config-data]# docker exec -it haproxy-bundle-docker-2 bash
()[root@ctl-prod-0 /]# ps -ef | grep haproxy
root           1       0  0 Apr11 ?        00:00:00 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg
haproxy       10       1  0 Apr11 ?        00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ds
haproxy       11      10  3 Apr11 ?        04:14:41 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ds
root       58867   58844  0 14:55 ?        00:00:00 grep --color=auto haproxy
()[root@ctl-prod-0 /]# id haproxy
uid=42454(haproxy) gid=42454(haproxy) groups=42454(haproxy),42400(kolla)

()[root@ctl-prod-0 /]# ls -l /etc/pki/tls/private/overcloud_endpoint.pem 
-r--r-----. 1 root 188 3777 Apr  9 11:50 /etc/pki/tls/private/overcloud_endpoint.pem
The process within the docker container runs with a different ID (uid=42454(haproxy) gid=42454(haproxy)), while the certificate file is readable only by root and by the group with id 188, which is not defined in the container:
()[root@ctl-prod-0 /]# grep 188 /etc/group
()[root@ctl-prod-0 /]# 
As a result, HAproxy is unable to load the SSL certificate.
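
To make the permission reasoning above concrete, here is a minimal sketch of the classic owner/group/other read check, fed with the numeric IDs from this report. The function name can_read is hypothetical and exists only for illustration. The sketch ignores the root bypass, ACLs and SELinux, so it is an approximation of what the kernel does, not a definitive reproduction.

```shell
#!/bin/sh
# Sketch only: classic UNIX read-permission check on numeric IDs.
# can_read is a hypothetical helper, not part of any tool in this report.
# Usage: can_read MODE FILE_UID FILE_GID PROC_UID PROC_GID...
# MODE is the octal permission string, e.g. 440 for -r--r-----.
# Assumption: root bypass, ACLs and SELinux are deliberately ignored.
can_read() {
    mode=$1; file_uid=$2; file_gid=$3; proc_uid=$4
    shift 4
    perm=$((0$mode))                  # interpret the mode string as octal

    if [ "$proc_uid" -eq "$file_uid" ]; then
        bit=$((perm & 0400))          # process owns the file: owner read bit
    else
        bit=$((perm & 0004))          # default: "other" read bit
        for gid in "$@"; do
            if [ "$gid" -eq "$file_gid" ]; then
                bit=$((perm & 0040))  # process is in the file's group: group read bit
                break
            fi
        done
    fi

    if [ "$bit" -ne 0 ]; then echo yes; else echo no; fi
}

# File from the report: mode 440, owner root (0), group 188.
# Container haproxy user: uid 42454, groups 42454 and 42400.
can_read 440 0 188 42454 42454 42400        # prints "no"

# Hypothetical: the same user, if it were also a member of gid 188.
can_read 440 0 188 42454 42454 42400 188    # prints "yes"
```

With the IDs seen in the container the check falls through to the "other" bits, which are empty for mode 440, which matches the observed failure to load the PEM file.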

For some reason the haproxy user within the container has an unexpected ID (uid=42454(haproxy) gid=42454(haproxy) groups=42454(haproxy),42400(kolla)) in this case.

How reproducible:
Every time 

Steps to Reproduce:
1. Change a parameter of a feature that uses HAproxy (for example, CeilometerStoreEvents: true)
2. Redeploy the overcloud

Actual results:
"unable to load SSL private key from PEM file '/etc/pki/tls/private/overcloud_endpoint.pem'" 

Expected results:

Feature added without error
Additional info:

