Description of problem:
candlepin_events service fails to start after satellite installation. Restarting the service with foreman-maintain health check fails with:
Couldn't connect to the server: undefined method `to_sym' for nil:NilClass

ausearch shows denials for tomcat:
```
----
time->Thu Nov 12 15:13:55 2020
type=PROCTITLE msg=audit(1605212035.779:855): proctitle=2F7573722F6C69622F6A766D2F6A72652F62696E2F6A617661002D586D73313032346D002D586D78343039366D002D446A6176612E73656375726974792E617574682E6C6F67696E2E636F6E6669673D2F7573722F73686172652F746F6D6361742F636F6E662F6C6F67696E2E636F6E666967002D636C61737370617468002F
type=SYSCALL msg=audit(1605212035.779:855): arch=c000003e syscall=42 success=no exit=-13 a0=35 a1=7f95c034a120 a2=1c a3=c5a items=0 ppid=1 pid=33157 auid=4294967295 uid=53 gid=53 euid=53 suid=53 fsuid=53 egid=53 sgid=53 fsgid=53 tty=(none) ses=4294967295 comm="Thread-11" exe="/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.272.b10-1.el7_9.x86_64/jre/bin/java" subj=system_u:system_r:tomcat_t:s0 key=(null)
type=AVC msg=audit(1605212035.779:855): avc: denied { name_connect } for pid=33157 comm="Thread-11" dest=23443 scontext=system_u:system_r:tomcat_t:s0 tcontext=system_u:object_r:katello_candlepin_port_t:s0 tclass=tcp_socket permissive=0
```

Version-Release number of selected component (if applicable):
6.9.0-1.0
* candlepin-3.1.22-1.el7sat.noarch
* candlepin-selinux-3.1.22-1.el7sat.noarch
* foreman-2.3.0-0.7.rc1.el7sat.noarch
* foreman-bootloaders-redhat-202005201200-1.el7sat.noarch
* foreman-bootloaders-redhat-tftpboot-202005201200-1.el7sat.noarch
* foreman-cli-2.3.0-0.7.rc1.el7sat.noarch
* foreman-debug-2.3.0-0.7.rc1.el7sat.noarch
* foreman-dynflow-sidekiq-2.3.0-0.7.rc1.el7sat.noarch
* foreman-ec2-2.3.0-0.7.rc1.el7sat.noarch
* foreman-gce-2.3.0-0.7.rc1.el7sat.noarch
* foreman-installer-2.3.0-0.3.rc1.el7sat.noarch
* foreman-installer-katello-2.3.0-0.3.rc1.el7sat.noarch
* foreman-libvirt-2.3.0-0.7.rc1.el7sat.noarch
* foreman-openstack-2.3.0-0.7.rc1.el7sat.noarch
* foreman-ovirt-2.3.0-0.7.rc1.el7sat.noarch
* foreman-postgresql-2.3.0-0.7.rc1.el7sat.noarch
* foreman-proxy-2.3.0-0.4.rc1.el7sat.noarch
* foreman-selinux-2.3.0-0.1.rc1.el7sat.noarch
* foreman-service-2.3.0-0.7.rc1.el7sat.noarch
* foreman-vmware-2.3.0-0.7.rc1.el7sat.noarch
* katello-3.18.0-0.1.rc1.el7sat.noarch
* katello-ca-consumer-dhcp-8-29-228.lab.eng.rdu2.redhat.com-1.0-6.noarch
* katello-certs-tools-2.7.3-1.el7sat.noarch
* katello-client-bootstrap-1.7.5-1.el7sat.noarch
* katello-common-3.18.0-0.1.rc1.el7sat.noarch
* katello-debug-3.18.0-0.1.rc1.el7sat.noarch
* katello-default-ca-1.0-1.noarch
* katello-selinux-3.5.0-1.el7sat.noarch
* katello-server-ca-1.0-1.noarch
* openldap-2.4.44-21.el7_6.x86_64
* pulp-client-1.0-2.noarch
* pulp-docker-plugins-3.2.8-1.el7sat.noarch
* pulp-katello-1.0.3-1.el7sat.noarch
* pulp-maintenance-2.21.4-1.el7sat.noarch
* pulp-ostree-plugins-1.3.1-2.el7sat.noarch
* pulp-puppet-plugins-2.21.4-1.el7sat.noarch
* pulp-puppet-tools-2.21.4-1.el7sat.noarch
* pulp-rpm-plugins-2.21.4-1.el7sat.noarch
* pulp-selinux-2.21.4-1.el7sat.noarch
* pulp-server-2.21.4-1.el7sat.noarch
* python-ldap-2.4.15-2.el7.x86_64
* tfm-rubygem-ldap_fluff-0.4.7-5.el7sat.noarch
* tfm-rubygem-net-ldap-0.16.1-1.el7sat.noarch

How reproducible:
100%
Using QE systems, this is present on satellite instances both created in libvirt and in RHV under SatLab. restorecon workaround from BZ1873319 is applied within SatLab: https://bugzilla.redhat.com/show_bug.cgi?id=1873319

Additional info:
This is similar in behavior to BZ 1851952, resolved in katello-selinux 3.2.0 https://bugzilla.redhat.com/show_bug.cgi?id=1851952
Is the installer currently following the instructions to enable the SELinux port? Specifically, what does "getsebool -a | grep candlepin" return? If it returns "candlepin_can_bind_activemq_port --> off", toggle the boolean as desired with "sudo setsebool candlepin_can_bind_activemq_port on" or "sudo setsebool candlepin_can_bind_activemq_port off".
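For anyone scripting this check, here is a minimal sketch that reads the boolean's state from `getsebool -a` style output. The helper name `sebool_state` is hypothetical (it is not part of any Satellite or SELinux tooling), and the live query only runs when `getsebool` is installed.

```shell
#!/bin/bash
# Hypothetical helper: print the state ("on" or "off") of one SELinux boolean,
# reading `getsebool -a` style lines ("name --> state") from stdin.
sebool_state() {
    awk -v name="$1" '$1 == name && $2 == "-->" { print $3 }'
}

# Only query the live system when the SELinux tools are present.
if command -v getsebool >/dev/null 2>&1; then
    state=$(getsebool -a 2>/dev/null | sebool_state candlepin_can_bind_activemq_port)
    echo "candlepin_can_bind_activemq_port: ${state:-not defined}"
    # To turn it on (add -P to make the change persist across reboots):
    #   sudo setsebool candlepin_can_bind_activemq_port on
fi
```

Matching on the exact boolean name avoids the false positives that a plain `grep candlepin` could pick up from other booleans.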
Since this is working now according to Mike's findings (and my own) I'll close this out. We can re-open once we have a clear reproducer.
*** Bug 1901983 has been marked as a duplicate of this bug. ***
Reassigning this bug to the Installer component.

On the reproducer I found that the Candlepin truststore doesn't get updated by the installer with the new client cert which is used by the Foreman rails app to connect to the Artemis message broker embedded in Candlepin. That's why candlepin_events shows up as FAIL via hammer ping.

Here's what I found:

The java-client cert and key in /etc/pki/katello are correctly updated, and are a valid pair =>
[root@dhcp-2-190 certs]# openssl x509 -noout -modulus -in java-client.crt | openssl md5
(stdin)= d74483a4ae79b6b2a6ea09afe1b21095
[root@dhcp-2-190 certs]# openssl rsa -noout -modulus -in ../private/java-client.key | openssl md5
(stdin)= d74483a4ae79b6b2a6ea09afe1b21095

However, candlepin's truststore doesn't know about the new java-client.crt (called 'artemis-client' in the store) =>
[root@dhcp-2-190 certs]# keytool -list -keystore truststore
Enter keystore password:
Keystore type: PKCS12
Keystore provider: SUN
Your keystore contains 2 entries
artemis-client, Dec 10, 2020, trustedCertEntry,
Certificate fingerprint (SHA1): 17:91:F0:47:4C:18:8B:19:57:49:D3:4C:1E:05:38:D9:59:66:82:3B

Compare that fingerprint to /etc/pki/katello/certs/java-client.crt =>
[root@dhcp-2-190 certs]# openssl x509 -noout -fingerprint -sha1 -inform pem -in java-client.crt
SHA1 Fingerprint=2C:E3:3C:D1:B3:A5:01:EF:B7:5E:00:5D:6B:87:DF:6B:CA:28:A3:56

They should match, but don't.
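The fingerprint comparison above can be scripted. The following is a rough sketch only: the helper names `extract_sha1_fp` and `same_cert` are hypothetical, and it assumes both tools print SHA1 fingerprints as they do in the output above (newer keytool versions may print SHA-256 instead). The two sample strings are copied from this comment.

```shell
#!/bin/bash
# Hypothetical helper: pull the first colon-separated SHA-1 fingerprint
# (20 hex bytes) out of stdin and normalise it to upper case.
extract_sha1_fp() {
    grep -oiE '([0-9a-f]{2}:){19}[0-9a-f]{2}' | head -n 1 | tr 'a-f' 'A-F'
}

# Hypothetical helper: succeed only when both fingerprint-bearing strings
# contain the same, non-empty SHA-1 fingerprint.
same_cert() {
    local a b
    a=$(printf '%s\n' "$1" | extract_sha1_fp)
    b=$(printf '%s\n' "$2" | extract_sha1_fp)
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Sample values copied from the keytool and openssl output in this comment.
# On a live system they would come from something like:
#   keytool -list -keystore truststore -alias artemis-client
#   openssl x509 -noout -fingerprint -sha1 -in java-client.crt
store_fp='Certificate fingerprint (SHA1): 17:91:F0:47:4C:18:8B:19:57:49:D3:4C:1E:05:38:D9:59:66:82:3B'
cert_fp='SHA1 Fingerprint=2C:E3:3C:D1:B3:A5:01:EF:B7:5E:00:5D:6B:87:DF:6B:CA:28:A3:56'

if same_cert "$store_fp" "$cert_fp"; then
    echo "truststore entry matches java-client.crt"
else
    echo "truststore entry is stale"
fi
```

With the values from the reproducer this reports the truststore entry as stale, which is the mismatch described here.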
*** Bug 1906747 has been marked as a duplicate of this bug. ***
Created redmine issue https://projects.theforeman.org/issues/31574 from this bug
*** Bug 1914122 has been marked as a duplicate of this bug. ***
In my testing, this issue seems to come up only when running satellite-change-hostname, because /usr/share/katello/hostname-change.rb deletes the tomcat and candlepin certs and keystore file, but not the truststore, before running foreman-installer / satellite-installer. I think deleting the truststore here is the resolution, instead of adding extra logic to the puppet classes to check if the stores need updating.
This patch resolves the issue in my testing:

# diff -pruN /usr/share/katello/hostname-change.rb.bak /usr/share/katello/hostname-change.rb
--- /usr/share/katello/hostname-change.rb.bak 2021-01-25 10:25:33.325936156 -0500
+++ /usr/share/katello/hostname-change.rb 2021-01-25 10:26:18.088624628 -0500
@@ -476,9 +476,12 @@ If not done, all hosts will lose connect
     self.run_cmd("rm -rf /etc/candlepin/certs/amqp{,.bak}")
     self.run_cmd("rm -f /etc/candlepin/certs/candlepin-ca.crt /etc/candlepin/certs/candlepin-ca.key")
     self.run_cmd("rm -f /etc/candlepin/certs/keystore")
+    self.run_cmd("rm -f /etc/candlepin/certs/truststore")
     self.run_cmd("rm -f /etc/tomcat/keystore")
+    self.run_cmd("rm -f /etc/tomcat/truststore")
     self.run_cmd("rm -rf /etc/foreman/old-certs")
     self.run_cmd("rm -f /etc/pki/katello/keystore")
+    self.run_cmd("rm -f /etc/pki/katello/truststore")
     self.run_cmd("rm -rf #{@scenario_answers["foreman"]["client_ssl_cert"]}")
     self.run_cmd("rm -rf #{@scenario_answers["foreman"]["client_ssl_key"]}")
   end

After running satellite-change-hostname with this patch, katello's candlepin events listener successfully connects to candlepin:

# hammer ping
[...]
candlepin:
    Status: ok
    Server Response: Duration: 391ms
candlepin_events:
    Status: ok
    message: 0 Processed, 0 Failed
    Server Response: Duration: 0ms
[...]
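To see which of the three truststore files named in the patch are currently present on a system, a small check along these lines may help. It is only a sketch: the function name `leftover_truststores` and its optional root-prefix argument are hypothetical, the latter added so the check can be tried against a scratch directory.

```shell
#!/bin/bash
# Hypothetical helper: list the truststore files from the patch that exist.
# The optional first argument is a root prefix (hypothetical, for dry runs
# against a scratch tree); without it the real paths are checked.
leftover_truststores() {
    local root="${1:-}" path
    for path in /etc/candlepin/certs/truststore \
                /etc/tomcat/truststore \
                /etc/pki/katello/truststore; do
        [ -e "${root}${path}" ] && echo "${root}${path}"
    done
    return 0
}

# Prints nothing when none of the three files exist.
leftover_truststores
```

The function only reports files and never deletes anything, so it can be run at any point to inspect the state of the stores.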
Automation is still seeing the candlepin issue on snap 12 when attempting to run satellite-change-hostname
Verified on 6.9 Snap 12.

Verification points:
1- Installed 6.9 Snap 12 on the Satlab machine.
2- Checked hammer ping and the satellite services status; I didn't see any candlepin issue.
3- The changes made to the /usr/share/katello/hostname-change.rb file are present.
4- Checked the fix package.
rpm -qa|grep katello-3.18.1-2
katello-3.18.1-2.el7sat.noarch

Can we mark this bug as verified and start a separate discussion, in a new bug, on the problem pondrejk reported?
There are two grouped but entirely separate issues in this bug: 1) a candlepin SELinux denial that can sometimes appear during a hostname change, and 2) the truststore not getting updated with new certificates when they change. There is a fix for the SELinux issue, https://github.com/Katello/katello-selinux/pull/31, but I don't think it has made it into downstream yet. Then there are the truststore fixes, which have been the primary target of this BZ. I would be happy if we split them, as they are different concerns. Both are triggered by a hostname change, but they are different issues.
pondrejk @ehelms I have created a separate issue, BZ#1925616, for the "candlepin SELinux denial" that can appear when doing a satellite hostname change. As per the @ehelms comment, this bug is more related to the truststore certificate update, so I am marking this bug as verified based on comment BZ#1897360#c20.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Satellite 6.9 Release), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:1313