Bug 2058532
Summary: | certs-regenerate breaks qpidd certificates, resulting in qpidd start-up failures: Couldn't find any network address to listen to | |||
---|---|---|---|---|
Product: | Red Hat Satellite | Reporter: | Lukas Pramuk <lpramuk> | |
Component: | Installation | Assignee: | Eric Helms <ehelms> | |
Status: | CLOSED ERRATA | QA Contact: | Lukas Pramuk <lpramuk> | |
Severity: | high | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 6.11.0 | CC: | ahumbe, egolov, ehelms, gtalreja | |
Target Milestone: | 6.11.0 | Keywords: | Triaged | |
Target Release: | Unused | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | foreman-installer-3.1.2.3 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 2081814 (view as bug list) | Environment: | ||
Last Closed: | 2022-07-05 14:33:57 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: |
Description
Lukas Pramuk
2022-02-25 09:16:44 UTC
As there is no reproducer and no sosreport for this, the issue is a tad hard to diagnose. Additionally, Lukas reports that this same DB passes fine in automation :/ My original thought was IPv4-only vs IPv4/6 setups, as qpidd by default listens on both v4 and v6 localhost (via interface=lo in qpidd.conf). But I couldn't find any hints for irregularities in the existing customer backup. Thus I would like to request QE to try to reproduce this again and if possible leave the VM running so that we can inspect it. It occurred also in upgrade automation: 1) Upgrade QE template to 6.11.0 2) Enable cockpit feature by running installer # satellite-installer --enable-foreman-plugin-remote-execution-cockpit Hah, with a reproducer machine in place, this gets much better. The issue is not so much that it can't find a network address, but that it seems it can't load the certs and then can't create an SSL listening socket. I still don't know *why*, but well, progress nevertheless. With log-enable=debug in the config, we see: 2022-04-26 06:43:34 [Network] debug Using interface: 127.0.0.1 2022-04-26 06:43:34 [System] debug Exception constructed: Failed to retrieve private key from certificate (/builddir/build/BUILD/qpid-cpp-1.36.0/src/qpid/sys/ssl/SslSocket.cpp:234) 2022-04-26 06:43:34 [Network] debug Using interface: ::1 2022-04-26 06:43:34 [System] debug Exception constructed: Failed to retrieve private key from certificate (/builddir/build/BUILD/qpid-cpp-1.36.0/src/qpid/sys/ssl/SslSocket.cpp:234) 2022-04-26 06:43:34 [System] debug Exception constructed: Couldn't find any network address to listen to 2022-04-26 06:43:34 [Broker] critical Broker (pid=96725) start-up failed: Couldn't find any network address to listen to 2022-04-26 06:43:34 [Broker] critical Unexpected error: Couldn't find any network address to listen to This is not Upgrades related as I have reproducer with fresh 6.11:
1) Have a Satellite 6.11.0
2) Enable katello-agent by installer run
# satellite-installer --foreman-proxy-content-enable-katello-agent
3) Run installer again
# satellite-installer
>>> after enabling katello-agent all sucessive installer runs are failing
it seems (to me), that it even doesn't try to load the password file: on a working system: # strace -f /usr/sbin/qpidd --config /etc/qpid/qpidd.conf 2>&1 |grep nss_db [pid 51190] openat(AT_FDCWD, "/etc/pki/katello/nss_db_password-file", O_RDONLY) = 26 on a broken one: # strace -f /usr/sbin/qpidd --config /etc/qpid/qpidd.conf 2>&1 |grep nss_db <empty> Coming closer. Looking at a puppet run on the reproducer, the certs changed for some reason: 2022-04-26 08:04:25 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/Cert[dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker]: Starting to evaluate the resource (365 of 1762) 2022-04-26 08:04:25 [DEBUG ] [configure] Executing: '/bin/katello-ssl-tool --gen-server --dir /root/ssl-build --set-hostname dhcp-3-82.vms.sat.rdu2.redhat.com --server-cert dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker.crt --server -cert-req dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker.crt.req --server-key dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker.key --server-rpm dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker -p file:/etc/pki/katello/private/katello-defa ult-ca.pwd --set-hostname dhcp-3-82.vms.sat.rdu2.redhat.com --set-common-name dhcp-3-82.vms.sat.rdu2.redhat.com --ca-cert /etc/pki/katello-certs-tools/certs/katello-default-ca.crt --ca-key /etc/pki/katello-certs-tools/private/katel lo-default-ca.key --set-country US --set-state North Carolina --set-city Raleigh --set-org pulp --set-org-unit SomeOrgUnit --set-email --cert-expiration 7300 --set-cname localhost' 2022-04-26 08:04:26 [INFO ] [configure] /Stage[main]/Certs::Qpid/Cert[dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker]/ensure: created 2022-04-26 08:04:26 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/Cert[dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker]: The container Class[Certs::Qpid] will propagate my refresh event 2022-04-26 08:04:26 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/Cert[dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker]: Evaluated in 1.57 seconds 2022-04-26 08:04:26 [DEBUG ] [configure] /Stage[main]/Certs::Ssltools::Nssdb/File[/etc/pki/katello/nss_db_password-file]: Starting to evaluate the resource (366 of 1762) 2022-04-26 08:04:26 [DEBUG ] [configure] /Stage[main]/Certs::Ssltools::Nssdb/File[/etc/pki/katello/nss_db_password-file]: Evaluated in 0.00 seconds 2022-04-26 08:04:26 [DEBUG ] [configure] /Stage[main]/Certs::Ssltools::Nssdb/Nssdb[/etc/pki/katello/nssdb]: Starting to evaluate the resource (367 of 1762) 2022-04-26 08:04:26 [DEBUG ] [configure] Executing: '/bin/certutil -K -d /etc/pki/katello/nssdb -f /etc/pki/katello/nss_db_password-file' 2022-04-26 08:04:26 [DEBUG ] [configure] /Stage[main]/Certs::Ssltools::Nssdb/Nssdb[/etc/pki/katello/nssdb]: Evaluated in 0.05 seconds 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Ssltools::Nssdb/File[/etc/pki/katello/nssdb]: Starting to evaluate the resource (368 of 1766) 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Ssltools::Nssdb/File[/etc/pki/katello/nssdb]: Evaluated in 0.00 seconds 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Ssltools::Nssdb/File[/etc/pki/katello/nssdb/cert8.db]: Starting to evaluate the resource (369 of 1766) 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Ssltools::Nssdb/File[/etc/pki/katello/nssdb/cert8.db]: Evaluated in 0.00 seconds 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Ssltools::Nssdb/File[/etc/pki/katello/nssdb/key3.db]: Starting to evaluate the resource (370 of 1766) 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Ssltools::Nssdb/File[/etc/pki/katello/nssdb/key3.db]: Evaluated in 0.00 seconds 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Ssltools::Nssdb/File[/etc/pki/katello/nssdb/secmod.db]: Starting to evaluate the resource (371 of 1766) 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Ssltools::Nssdb/File[/etc/pki/katello/nssdb/secmod.db]: Evaluated in 0.00 seconds 2022-04-26 08:04:27 [DEBUG ] [configure] /etc/pki/katello/nssdb: Starting to evaluate the resource (372 of 1766) 2022-04-26 08:04:27 [DEBUG ] [configure] /etc/pki/katello/nssdb: Evaluated in 0.00 seconds 2022-04-26 08:04:27 [DEBUG ] [configure] Class[Certs::Ssltools::Nssdb]: Starting to evaluate the resource (373 of 1766) 2022-04-26 08:04:27 [DEBUG ] [configure] Class[Certs::Ssltools::Nssdb]: Evaluated in 0.00 seconds 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/File[/etc/pki/katello/private/dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker.key]: Starting to evaluate the resource (374 of 1766) 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/File[/etc/pki/katello/private/dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker.key]: Nothing to manage: no ensure and the resource doesn't exist 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/File[/etc/pki/katello/private/dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker.key]: Evaluated in 0.00 seconds 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/File[/etc/pki/katello/certs/dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker.crt]: Starting to evaluate the resource (375 of 1766) 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/File[/etc/pki/katello/certs/dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker.crt]: Nothing to manage: no ensure and the resource doesn't exist 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/File[/etc/pki/katello/certs/dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker.crt]: Evaluated in 0.00 seconds 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/Nssdb_certificate[/etc/pki/katello/nssdb:ca]: Starting to evaluate the resource (376 of 1766) 2022-04-26 08:04:27 [DEBUG ] [configure] Executing: '/bin/certutil -L -a -d /etc/pki/katello/nssdb -n ca' 2022-04-26 08:04:27 [DEBUG ] [configure] Executing: '/bin/certutil -L -a -d /etc/pki/katello/nssdb -n ca' 2022-04-26 08:04:27 [DEBUG ] [configure] Executing: '/bin/openssl x509 -sha256 -noout -fingerprint -in /tmp/cert20220426-69708-5x8erg' 2022-04-26 08:04:27 [DEBUG ] [configure] Executing: '/bin/openssl x509 -sha256 -noout -fingerprint -in /etc/pki/katello/certs/katello-default-ca.crt' 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/Nssdb_certificate[/etc/pki/katello/nssdb:ca]: Evaluated in 0.14 seconds 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/Nssdb_certificate[/etc/pki/katello/nssdb:broker]: Starting to evaluate the resource (377 of 1766) 2022-04-26 08:04:27 [DEBUG ] [configure] Executing: '/bin/certutil -L -a -d /etc/pki/katello/nssdb -n broker' 2022-04-26 08:04:27 [DEBUG ] [configure] Executing: '/bin/certutil -L -a -d /etc/pki/katello/nssdb -n broker' 2022-04-26 08:04:27 [DEBUG ] [configure] Executing: '/bin/openssl x509 -sha256 -noout -fingerprint -in /tmp/cert20220426-69708-1h3z7ov' 2022-04-26 08:04:27 [DEBUG ] [configure] Executing: '/bin/openssl x509 -sha256 -noout -fingerprint -in /root/ssl-build/dhcp-3-82.vms.sat.rdu2.redhat.com/dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker.crt' 2022-04-26 08:04:27 [DEBUG ] [configure] Executing: '/bin/certutil -L -a -d /etc/pki/katello/nssdb -n broker' 2022-04-26 08:04:27 [DEBUG ] [configure] Executing: '/bin/certutil -D -d /etc/pki/katello/nssdb -n broker' 2022-04-26 08:04:27 [DEBUG ] [configure] Executing: '/bin/certutil -A -a -d /etc/pki/katello/nssdb -n broker -t ,, -i /root/ssl-build/dhcp-3-82.vms.sat.rdu2.redhat.com/dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker.crt -f /etc/pki/k atello/nss_db_password-file' 2022-04-26 08:04:27 [DEBUG ] [configure] Executing: '/bin/openssl x509 -sha256 -noout -fingerprint -in /root/ssl-build/dhcp-3-82.vms.sat.rdu2.redhat.com/dhcp-3-82.vms.sat.rdu2.redhat.com-qpid-broker.crt' 2022-04-26 08:04:27 [INFO ] [configure] /Stage[main]/Certs::Qpid/Nssdb_certificate[/etc/pki/katello/nssdb:broker]/certificate: certificate changed 'C0:B5:FF:BB:4E:63:37:E8:34:E7:25:00:AE:F5:32:9F:A9:F4:23:7F:36:F2:2D:11:19:68:3C:A 6:7A:F6:B0:70' to 'FB:68:D5:F5:D0:BB:22:6A:73:DD:39:45:51:B9:2C:68:E3:3F:36:D6:FE:D6:58:C9:06:41:7F:FB:FB:66:83:4A' 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/Nssdb_certificate[/etc/pki/katello/nssdb:broker]: The container Class[Certs::Qpid] will propagate my refresh event 2022-04-26 08:04:27 [DEBUG ] [configure] /Stage[main]/Certs::Qpid/Nssdb_certificate[/etc/pki/katello/nssdb:broker]: Evaluated in 0.31 seconds Now someone smarter than me needs to tell me *why* it decided to re-do the certs?! DING DING DING! 2022-04-26 08:04:13 [DEBUG ] [configure] Found key: "certs::regenerate" value: true And it's true, /etc/foreman-installer/scenarios.d/satellite-answers.yaml has: certs: regenerate: true Which means it regenerates the certs on every run. So I'd argue there are two bugs here: - it should not regenerate the certs on every run - if it is regenerating, it should make qpidd happy afterwards Right. And the real reproducer is: - deploy 6.11 - enable agent support: satellite-installer --foreman-proxy-content-enable-katello-agent true - regen certs: satellite-installer --certs-regenerate true I am guessing for Lukas the origin of the problem is the fact that this is a QE-template machine, that got deployed as sat-template.example.com and then used katello-change-hostname to be renamed to satellite.example.com. This left the regen=true in the answers. The first run to enable agent succeeded (there were no old certs), and the second regenerated the certs again and boom. Workaround: rm -rf /etc/pki/katello/nss* Actually, just purging /etc/pki/katello/nssdb/key3.db is enough. Yeah, so (I am talking to myself, right?) The problem is that the private key is not updated in the nssdb: from a run that properly generated the db: [root@sat-6-11-qa-rhel7 ~]# grep -E '(pkcs12|pk12).*nss' /var/log/foreman-installer/satellite.log 2022-04-27 10:57:39 [DEBUG ] [configure] Executing: '/bin/openssl pkcs12 -export -in /root/ssl-build/sat-6-11-qa-rhel7.tanso.example.com/sat-6-11-qa-rhel7.tanso.example.com-qpid-broker.crt -inkey /root/ssl-build/sat-6-11-qa-rhel7.tanso.example.com/sat-6-11-qa-rhel7.tanso.example.com-qpid-broker.key -out /tmp/pkcs1220220427-30609-jpxpdb -password file:/etc/pki/katello/nss_db_password-file -name broker' 2022-04-27 10:57:39 [DEBUG ] [configure] Executing: '/bin/pk12util -i /tmp/pkcs1220220427-30609-jpxpdb -d /etc/pki/katello/nssdb -w /etc/pki/katello/nss_db_password-file -k /etc/pki/katello/nss_db_password-file' from a failing one: [root@sat-6-11-qa-rhel7 ~]# grep -E '(pkcs12|pk12).*nss' /var/log/foreman-installer/satellite.log <empty> But at that point my understanding of Puppet providers ends and I'm handing over to @ehelms VERIFIED. @Satellite 6.11.0 Snap20 foreman-installer-3.1.2.5-1.el7sat.noarch by the manual reproducer described in comment#15: 1) Install Satellite 6.11.0 2) Enable optional katello-agent support # satellite-installer --foreman-proxy-content-enable-katello-agent true 3) Run the installer again to regenerate certs # satellite-installer --certs-regenerate true 2022-05-19 05:57:19 [NOTICE] [root] Loading installer configuration. This will take some time. 2022-05-19 05:57:25 [NOTICE] [root] Running installer with log based terminal output at level NOTICE. 2022-05-19 05:57:25 [NOTICE] [root] Use -l to set the terminal output log level to ERROR, WARN, NOTICE, INFO, or DEBUG. See --full-help for definitions. Package versions are locked. Continuing with unlock. 2022-05-19 05:57:43 [NOTICE] [configure] Starting system configuration. 2022-05-19 05:57:57 [NOTICE] [configure] 250 configuration steps out of 1762 steps complete. 2022-05-19 05:58:09 [NOTICE] [configure] 500 configuration steps out of 2570 steps complete. 2022-05-19 05:58:10 [NOTICE] [configure] 750 configuration steps out of 2570 steps complete. 2022-05-19 05:58:10 [NOTICE] [configure] 1000 configuration steps out of 2570 steps complete. 2022-05-19 05:58:11 [NOTICE] [configure] 1250 configuration steps out of 2570 steps complete. 2022-05-19 05:58:22 [NOTICE] [configure] 1500 configuration steps out of 2572 steps complete. 2022-05-19 05:58:29 [NOTICE] [configure] 1750 configuration steps out of 2575 steps complete. 2022-05-19 05:58:30 [NOTICE] [configure] 2000 configuration steps out of 2580 steps complete. 2022-05-19 05:58:32 [NOTICE] [configure] 2250 configuration steps out of 2581 steps complete. 2022-05-19 05:59:47 [NOTICE] [configure] 2500 configuration steps out of 2581 steps complete. 2022-05-19 05:59:59 [NOTICE] [configure] System configuration has finished. Success! * Satellite is running at https://sat.example.com ... >>> there are no qpidd failures since not only cert but also private key is now being regenerated Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5498 The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days |