Created attachment 1666175 [details]
ansible.log

Description of problem:
OC deployment fails. Mistral reports an issue with Certmonger, and the Pacemaker daemon (pcsd) is also unable to start (likely related):

UC $ cat /home/stack/overcloud_install.log | grep -i puppet-user | grep -i "error"   # or /var/log/mistral/overcloud/ansible.log
"<13>Feb 27 03:17:36 puppet-user:except OSError:",
"<13>Feb 27 03:17:36 puppet-user: error: Could not connect to cluster (is it running )",
"<13>Feb 27 03:17:44 puppet-user: Notice: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: 140482529113920:error:02001002:system library:fopen:No such file or directory:crypto/bio/bss_file.c:72:fopen('/var/lib/certmonger/local/creds','rb')",
"<13>Feb 27 03:17:44 puppet-user: Notice: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: 140482529113920:error:2006D080:BIO routines:BIO_new_file:no such file:crypto/bio/bss_file.c:79:",
"<13>Feb 27 03:17:44 puppet-user: Error: 'openssl pkcs12 -in /var/lib/certmonger/local/creds -out /etc/pki/ca-trust/source/anchors/cm-local-ca.pem -nokeys -nodes -passin pass:'' && update-ca-trust extract' returned 1 instead of one of [0]",
"<13>Feb 27 03:17:44 puppet-user: Error: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: change from 'notrun' to ['0'] failed: 'openssl pkcs12 -in /var/lib/certmonger/local/creds -out /etc/pki/ca-trust/source/anchors/cm-local-ca.pem -nokeys -nodes -passin pass:'' && update-ca-trust extract' returned 1 instead of one of [0]",
"<13>Feb 27 03:17:49 puppet-user: Notice: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: 140529897686848:error:02001002:system library:fopen:No such file or directory:crypto/bio/bss_file.c:72:fopen('/var/lib/certmonger/local/creds','rb')",
"<13>Feb 27 03:17:49 puppet-user: Notice: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: 140529897686848:error:2006D080:BIO routines:BIO_new_file:no such file:crypto/bio/bss_file.c:79:",
"<13>Feb 27 03:17:49 puppet-user: Error: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]: Failed to call refresh: 'openssl pkcs12 -in /var/lib/certmonger/local/creds -out /etc/pki/ca-trust/source/anchors/cm-local-ca.pem -nokeys -nodes -passin pass:'' && update-ca-trust extract' returned 1 instead of one of [0]",
"<13>Feb 27 03:17:49 puppet-user: Error: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]: 'openssl pkcs12 -in /var/lib/certmonger/local/creds -out /etc/pki/ca-trust/source/anchors/cm-local-ca.pem -nokeys -nodes -passin pass:'' && update-ca-trust extract' returned 1 instead of one of [0]",
"<13>Feb 27 03:17:51 puppet-user: Error: Systemd start for pcsd failed!",
"<13>Feb 27 03:17:51 puppet-user: Error: /Stage[main]/Pacemaker::Service/Service[pcsd]/ensure: change from 'stopped' to 'running' failed: Systemd start for pcsd failed!",
"<13>Feb 27 03:17:31 puppet-user:except OSError:",
"<13>Feb 27 03:17:38 puppet-user: Notice: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: 140329221809984:error:02001002:system library:fopen:No such file or directory:crypto/bio/bss_file.c:72:fopen('/var/lib/certmonger/local/creds','rb')",
"<13>Feb 27 03:17:38 puppet-user: Notice: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: 140329221809984:error:2006D080:BIO routines:BIO_new_file:no such file:crypto/bio/bss_file.c:79:",
"<13>Feb 27 03:17:38 puppet-user: Error: 'openssl pkcs12 -in /var/lib/certmonger/local/creds -out /etc/pki/ca-trust/source/anchors/cm-local-ca.pem -nokeys -nodes -passin pass:'' && update-ca-trust extract' returned 1 instead of one of [0]",
"<13>Feb 27 03:17:38 puppet-user: Error: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: change from 'notrun' to ['0'] failed: 'openssl pkcs12 -in /var/lib/certmonger/local/creds -out /etc/pki/ca-trust/source/anchors/cm-local-ca.pem -nokeys -nodes -passin pass:'' && update-ca-trust extract' returned 1 instead of one of [0]",
"<13>Feb 27 03:17:43 puppet-user: Notice: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: 140183343900480:error:02001002:system library:fopen:No such file or directory:crypto/bio/bss_file.c:72:fopen('/var/lib/certmonger/local/creds','rb')",
"<13>Feb 27 03:17:43 puppet-user: Notice: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: 140183343900480:error:2006D080:BIO routines:BIO_new_file:no such file:crypto/bio/bss_file.c:79:",
"<13>Feb 27 03:17:43 puppet-user: Error: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]: Failed to call refresh: 'openssl pkcs12 -in /var/lib/certmonger/local/creds -out /etc/pki/ca-trust/source/anchors/cm-local-ca.pem -nokeys -nodes -passin pass:'' && update-ca-trust extract' returned 1 instead of one of [0]",
"<13>Feb 27 03:17:43 puppet-user: Error: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]: 'openssl pkcs12 -in /var/lib/certmonger/local/creds -out /etc/pki/ca-trust/source/anchors/cm-local-ca.pem -nokeys -nodes -passin pass:'' && update-ca-trust extract' returned 1 instead of one of [0]",

controller $ cat /var/log/messages
Feb 27 03:17:43 controller-0 puppet-user[25475]: Debug: Executing: 'openssl pkcs12 -in /var/lib/certmonger/local/creds -out /etc/pki/ca-trust/source/anchors/cm-local-ca.pem -nokeys -nodes -passin pass:'' && update-ca-trust extract'
Feb 27 03:17:43 controller-0 puppet-user[25475]: Debug: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: Sleeping for 1 seconds between tries
Feb 27 03:17:44 controller-0 puppet-user[25475]: Notice: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: Can't open /var/lib/certmonger/local/creds for reading, No such file or directory
Feb 27 03:17:44 controller-0 puppet-user[25475]: Notice: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: 140482529113920:error:02001002:system library:fopen:No such file or directory:crypto/bio/bss_file.c:72:fopen('/var/lib/certmonger/local/creds','rb')
Feb 27 03:17:44 controller-0 puppet-user[25475]: Notice: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: 140482529113920:error:2006D080:BIO routines:BIO_new_file:no such file:crypto/bio/bss_file.c:79:
Feb 27 03:17:44 controller-0 puppet-user[25475]: Error: 'openssl pkcs12 -in /var/lib/certmonger/local/creds -out /etc/pki/ca-trust/source/anchors/cm-local-ca.pem -nokeys -nodes -passin pass:'' && update-ca-trust extract' returned 1 instead of one of [0]
Feb 27 03:17:44 controller-0 puppet-user[25475]: Error: /Stage[main]/Tripleo::Certmonger::Ca::Local/Exec[extract-and-trust-ca]/returns: change from 'notrun' to ['0'] failed: 'openssl pkcs12 -in /var/lib/certmonger/local/creds -out /etc/pki/ca-trust/source/anchors/cm-local-ca.pem -nokeys -nodes -passin pass:'' && update-ca-trust extract' returned 1 instead of one of [0]
Feb 27 03:17:44 controller-0 puppet-user[25475]: Debug: Exec[extract-and-trust-ca](provider=posix): Executing check 'test -e /etc/pki/ca-trust/source/anchors/cm-local-ca.pem && openssl x509 -checkend 0 -noout -in /etc/pki/ca-trust/source/anchors/cm-local-ca.pem'

controller $ cat /var/log/pcsd/pcsd.log
E, [2020-02-27T03:17:51.572 #00000] ERROR -- : Unable to start pcsd daemon, exiting: [Errno 2] No such file or directory: '/var/lib/pcsd/pcsd.crt'

Version-Release number of selected component (if applicable):
OSP15, since RHOS_TRUNK-15.0-RHEL-8-20200226.n.1

Additional info:

Controller:
puppet-tripleo.noarch                          10.5.3-0.20200113175340.45a4a61.el8ost  @rhos-15.0

Undercloud:
ansible-role-tripleo-modify-image.noarch       1.1.1-0.20200123020437.1e10b22.el8ost   @rhelosp-15.0-trunk
ansible-tripleo-ipsec.noarch                   9.1.1-0.20190513190404.ffe104c.el8ost   @rhelosp-15.0-trunk
openstack-tripleo-common.noarch                10.8.3-0.20200113210450.0e559fc.el8ost  @rhelosp-15.0-trunk
openstack-tripleo-common-containers.noarch     10.8.3-0.20200113210450.0e559fc.el8ost  @rhelosp-15.0-trunk
openstack-tripleo-heat-templates.noarch        10.6.3-0.20200113185560.cf467ea.el8ost  @rhelosp-15.0-trunk
openstack-tripleo-image-elements.noarch        10.4.2-0.20190912000426.1ebd7af.el8ost  @rhelosp-15.0-trunk
openstack-tripleo-puppet-elements.noarch       10.3.3-0.20200123202112.78f7e7f.el8ost  @rhelosp-15.0-trunk
openstack-tripleo-validations.noarch           10.5.3-0.20200107180445.ffd651f.el8ost  @rhelosp-15.0-trunk
puppet-tripleo.noarch                          10.5.3-0.20200113175340.45a4a61.el8ost  @rhelosp-15.0-trunk
python3-tripleo-common.noarch                  10.8.3-0.20200113210450.0e559fc.el8ost  @rhelosp-15.0-trunk
python3-tripleoclient.noarch                   11.5.3-0.20200114200459.d09212f.el8ost  @rhelosp-15.0-trunk
python3-tripleoclient-heat-installer.noarch    11.5.3-0.20200114200459.d09212f.el8ost  @rhelosp-15.0-trunk
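Since the same puppet-user errors repeat once per retry and per controller, it may help to collapse them before reading. The following is only a rough sketch: the function name summarize_puppet_errors is made up for illustration, and it assumes every relevant line carries a "puppet-user:" (or "puppet-user[PID]:") tag as in the excerpts above.

```shell
#!/bin/sh
# Sketch: collapse repeated puppet-user errors into unique messages with counts.
# summarize_puppet_errors is a hypothetical helper name, not part of any tool.
# $1 is the log to scan, e.g. /home/stack/overcloud_install.log.
summarize_puppet_errors() {
    grep -i 'puppet-user' "$1" \
        | grep -i 'error' \
        | sed -E 's/^.*puppet-user(\[[0-9]+\])?: ?//' \
        | sort | uniq -c | sort -rn
}
```

Usage would be something like: summarize_puppet_errors /home/stack/overcloud_install.log. Only the text before the puppet-user tag (timestamp, host, PID) is stripped, so messages that differ in other ways, for example the openssl thread IDs, still show up as separate entries.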
Created attachment 1666176 [details] messages
Created attachment 1666195 [details] Difference between 07.2 (passed_phase2) and 26.2 (failing) OC image rpm manifests
This feels like a race condition.
Tried with RHOS_TRUNK-15.0-RHEL-8-20200226.n.1. It seems these two directories are missing in the overcloud-full image:

[root@controller-0 ~]# rpm -qf /var/lib/certmonger/local
certmonger-0.79.7-3.el8.x86_64
[root@controller-0 ~]# rpm -V certmonger
missing     /var/lib/certmonger/local
missing     /var/lib/certmonger/requests

There is no trace of the removal in dnf.log or journalctl, so it looks like something during the image build process removed them.
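For anyone who wants to check another node or image for the same symptom, here is a minimal sketch of the directory check. The function name check_certmonger_dirs and the optional root-prefix argument are assumptions for illustration (the prefix would only be useful if the image is mounted somewhere); on a running controller, rpm -V certmonger as shown above gives the same information.

```shell
#!/bin/sh
# Sketch: report which certmonger state directories are missing.
# check_certmonger_dirs is a hypothetical helper name.
# $1 is an optional root prefix (assumption: e.g. a mounted image);
# with no argument the live filesystem is checked.
check_certmonger_dirs() {
    root="${1:-}"
    missing=0
    for dir in /var/lib/certmonger/local /var/lib/certmonger/requests; do
        if [ ! -d "${root}${dir}" ]; then
            echo "missing ${dir}"
            missing=1
        fi
    done
    # Non-zero exit status when at least one directory is absent.
    return "$missing"
}
```

Running check_certmonger_dirs with no arguments on an affected controller should list both directories; an exit status of 0 means both are present.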
According to our records, this should be resolved by rhosp-director-images-15.0-20200227.1.el8ost. This build is available now.