Description of problem:

This error is seen after redeploying the overcloud to change this parameter:

    parameter_defaults:
      CeilometerStoreEvents: true

See deploy_outputs.txt for the output error message. On the controller node the error is "unable to load SSL private key from PEM file '/etc/pki/tls/private/overcloud_endpoint.pem'" (line 501 in os-collect-config-snippet.log).

HAProxy is unable to start because of wrong file permissions or a wrong process owner. On the controller node the SSL certificate used by HAProxy belongs to group haproxy (gid 188), whereas in the container the haproxy user is uid=42454(haproxy) gid=42454(haproxy) groups=42454(haproxy),42400(kolla).

```
[root@ctl-prod-0 config-data]# pwd
/var/lib/config-data
[root@ctl-prod-0 config-data]# find . -name '*overcloud_endpoint.pem'
./haproxy/etc/pki/tls/private/overcloud_endpoint.pem
[root@ctl-prod-0 config-data]# ls -l ./haproxy/etc/pki/tls/private/overcloud_endpoint.pem
-r--r-----. 1 root haproxy 3777 Apr 9 11:50 ./haproxy/etc/pki/tls/private/overcloud_endpoint.pem
[root@ctl-prod-0 config-data]# grep haproxy /etc/group
haproxy:x:188:
```

But within the HAProxy docker container on the controller node, the haproxy user has different IDs (uid and gid):

```
[root@ctl-prod-0 config-data]# docker ps | grep hapr
2c0ebbbb2eab 10.172.96.7:8787/rhosp12/openstack-haproxy:pcmklatest "/bin/bash /usr/lo..." 5 days ago Up 5 days haproxy-bundle-docker-2
[root@ctl-prod-0 config-data]# docker exec -it haproxy-bundle-docker-2 bash
()[root@ctl-prod-0 /]# ps -ef | grep haproxy
root 1 0 0 Apr11 ? 00:00:00 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg
haproxy 10 1 0 Apr11 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ds
haproxy 11 10 3 Apr11 ? 04:14:41 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ds
root 58867 58844 0 14:55 ? 00:00:00 grep --color=auto haproxy
()[root@ctl-prod-0 /]# id haproxy
uid=42454(haproxy) gid=42454(haproxy) groups=42454(haproxy),42400(kolla)
()[root@ctl-prod-0 /]# ls -l /etc/pk
pkcs11/ pki/
()[root@ctl-prod-0 /]# ls -l /etc/pki/tls/
cert.pem certs/ misc/ openssl.cnf private/
()[root@ctl-prod-0 /]# ls -l /etc/pki/tls/private/overcloud_endpoint.pem
-r--r-----. 1 root 188 3777 Apr 9 11:50 /etc/pki/tls/private/overcloud_endpoint.pem
```

The process inside the docker container runs as uid=42454(haproxy) gid=42454(haproxy), while the certificate file is readable only by root and by the group with gid 188, which is not defined in the container:

```
()[root@ctl-prod-0 /]# grep 188 /etc/group
()[root@ctl-prod-0 /]#
```

As a result, HAProxy is unable to load the SSL certificate. It is not clear why the haproxy user inside the container has these IDs (uid=42454(haproxy) gid=42454(haproxy) groups=42454(haproxy),42400(kolla)) in this case.

How reproducible:
Every time

Steps to Reproduce:
1. Change a feature that uses HAProxy
2. Redeploy

Actual results:
"unable to load SSL private key from PEM file '/etc/pki/tls/private/overcloud_endpoint.pem'"

Expected results:
The feature is added without error.

Additional info:
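The mismatch described above follows from the usual owner/group/other permission rule. The snippet below is a minimal sketch of that rule and is not part of the original logs: `can_read` is a hypothetical helper that takes the octal file mode and the numeric IDs as arguments. It ignores ACLs, SELinux and capabilities, and assumes uid 0 is always allowed. The values in the example call are the ones shown in the listings above.

```shell
#!/bin/bash
# can_read MODE FILE_UID FILE_GID PROC_UID PROC_GID...
# Hypothetical helper: returns 0 if a process with the given uid and group
# list may read a file with the given octal mode, owner uid and group gid.
# Simplification: only the classic permission bits are considered.
can_read() {
    local mode=$1 file_uid=$2 file_gid=$3 proc_uid=$4
    shift 4

    # Assumption: uid 0 bypasses permission bits (capabilities are ignored).
    if [ "$proc_uid" -eq 0 ]; then
        return 0
    fi

    local bit=004   # default: the "other" read bit
    local gid
    if [ "$proc_uid" -eq "$file_uid" ]; then
        bit=400     # owner class applies, even if a group also matches
    else
        for gid in "$@"; do
            if [ "$gid" -eq "$file_gid" ]; then
                bit=040   # group class applies
                break
            fi
        done
    fi

    # The leading 0 makes the shell parse both values as octal.
    [ $(( 0$mode & 0$bit )) -ne 0 ]
}

# File: mode 440, owner root (0), group 188.
# Process: uid 42454, groups 42454 and 42400 (as reported by "id haproxy").
if can_read 440 0 188 42454 42454 42400; then
    echo "readable"
else
    echo "not readable"
fi
```

With the IDs from this report the call prints "not readable": the container's haproxy user is neither the owner nor a member of gid 188, and the mode grants nothing to "other".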
Can we get a sosreport of ctl-prod-0 and the undercloud, plus the full deploy command line and the env files used? Thanks, Michele
I have the same issue with OSP13.