Note: This bug is displayed in read-only format because
the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Description of problem:
this is about an outdated version of certmonger-0.61-3.el6.x86_64 on RHEL 6.4
for a request to review a stack trace and explain what is happening.
in this case, certmonger shuts down for a request that seem correct,
Request ID '20160110091219':
status: NEED_TO_SAVE_CERT
see some gdb details below.
in particular, could this be the same as the following fix on RHEL 7, either:
* Mon Jan 27 2014 Nalin Dahyabhai <nalin> 0.71-1
...snip...
- only handle one watch at a time, which should avoid abort() during
attempts to reconnect to the message bus after losing our connection
to it (#1055521)
or
* Mon Dec 09 2013 Nalin Dahyabhai <nalin> 0.69-1
...snip,,,
- don't abort() due to incorrect locking when an attempt to save an issued
certificate to the designated location fails (part of #1032760/#1033333,
ticket #22)
( but bz 1032760 - Certmonger crashes when trying to decode certificate with invalid data in header
says fixed in version certmonger-0.75.2-1.el6 while upstream ticket 22 indicates fix is in 0.69 and later, which version has the fix? )
https://fedorahosted.org/certmonger/ticket/22
Certmonger aborts after unsuccessful attempt to save certificate
"Certmonger acquires writing lock in the NEED_TO_SAVE_CERT state but doesn't release it after unsucessful attempt to save the certificate before it goes to NEED_TO_SAVE_CERT again, causing it to abort itself:
...snip...
Fixed upstream in 0.69 and later. The RHEL6-specific resolution of this is being tracked at https://bugzilla.redhat.com/show_bug.cgi?id=1032760, and since we no longer need this ticket to track it, I'm marking this one as fixed.
"
?
In which case the recommendation would be an update to a recent version of certmonger.
I can provide access to a VM to read a core file collected from the customer's case number 01556025 (sev1 for multiple morphing issues)
Thanks,
M.
Version-Release number of selected component (if applicable):
RHEL 6.4
certmonger-0.61-3.el6.x86_64
How reproducible:
N/A
Steps to Reproduce:
1. N/A
2.
3.
Actual results:
certmonger SIGABRT
Expected results:
Additional info:
information collected from customer's attachment
880-sosreport-puncmnldap01.01556025-20160111094416-ed89.tar.xz
certmonger-0.61-3.el6.x86_64
redhat-release-server-6Server-6.4.0.4.el6.x86_64
880-sosreport-puncmnldap01.01556025-20160111094416-ed89.tar.xz/puncmnldap01-2016011109411452485482/var/log/messages
...
Jan 10 14:42:18 puncmnldap01 certmonger: Certificate named "auditSigningCert cert-pki-ca" in token "NSS Certificate DB" in database "/var/lib/pki-ca/alias" is no longer valid.
Jan 10 14:42:19 puncmnldap01 certmonger: Certificate named "ocspSigningCert cert-pki-ca" in token "NSS Certificate DB" in database "/var/lib/pki-ca/alias" is no longer valid.
Jan 10 14:42:20 puncmnldap01 certmonger: Certificate named "subsystemCert cert-pki-ca" in token "NSS Certificate DB" in database "/var/lib/pki-ca/alias" is no longer valid.
Jan 10 14:42:21 puncmnldap01 python: Exception trying to retrieve auditSigningCert cert-pki-ca: Error initializing principal host/puncmnldap01.accelyakale.com in /etc/krb5.keytab: (-1765328230, 'Cannot find KDC for requested realm')
Jan 10 14:42:21 puncmnldap01 certmonger: #033[?1034h
Jan 10 14:42:22 puncmnldap01 python: Exception trying to retrieve ocspSigningCert cert-pki-ca: Error initializing principal host/puncmnldap01.accelyakale.com in /etc/krb5.keytab: (-1765328230, 'Cannot find KDC for requested realm')
Jan 10 14:42:22 puncmnldap01 certmonger: #033[?1034h
Jan 10 14:42:22 puncmnldap01 python: Exception trying to retrieve subsystemCert cert-pki-ca: Error initializing principal host/puncmnldap01.accelyakale.com in /etc/krb5.keytab: (-1765328230, 'Cannot find KDC for requested realm')
Jan 10 14:42:23 puncmnldap01 certmonger: #033[?1034h
Jan 10 14:43:37 puncmnldap01 ntpd[25973]: synchronized to LOCAL(0), stratum 10
Jan 10 14:43:39 puncmnldap01 python: Exception trying to retrieve ocspSigningCert cert-pki-ca: Error initializing principal host/puncmnldap01.accelyakale.com in /etc/krb5.keytab: (-1765328230, 'Cannot find KDC for requested realm')
Jan 10 14:43:39 puncmnldap01 certmonger: #033[?1034h
Jan 10 14:43:39 puncmnldap01 python: Exception trying to retrieve auditSigningCert cert-pki-ca: Error initializing principal host/puncmnldap01.accelyakale.com in /etc/krb5.keytab: (-1765328230, 'Cannot find KDC for requested realm')
Jan 10 14:43:39 puncmnldap01 certmonger: #033[?1034h
Jan 10 14:43:40 puncmnldap01 python: Exception trying to retrieve subsystemCert cert-pki-ca: Error initializing principal host/puncmnldap01.accelyakale.com in /etc/krb5.keytab: (-1765328230, 'Cannot find KDC for requested realm')
Jan 10 14:43:40 puncmnldap01 certmonger: #033[?1034h
Jan 10 14:44:51 puncmnldap01 ns-slapd: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Credentials cache file '/tmp/krb5cc_495' not found)
...snip...
Jan 11 09:29:11 puncmnldap01 python: certmonger started pki-cad, nickname 'ocspSigningCert cert-pki-ca'
Jan 11 09:29:11 puncmnldap01 certmonger: Certificate named "ocspSigningCert cert-pki-ca" in token "NSS Certificate DB" in database "/var/lib/pki-ca/alias" issued by CA and saved.
Jan 11 09:29:12 puncmnldap01 named[9357]: zone accelyakale.com/IN: sending notifies (serial 1452484748)
Jan 11 09:29:15 puncmnldap01 python: certmonger stopping pki-cad
Jan 11 09:29:16 puncmnldap01 python: Cannot stop pki-cad: Command '/sbin/service pki-cad stop pki-ca' returned non-zero exit status 4
Jan 11 09:29:16 puncmnldap01 certmonger: Certificate named "ocspSigningCert cert-pki-ca" in token "NSS Certificate DB" in database "/var/lib/pki-ca/alias" is no longer valid.
Jan 11 09:29:18 puncmnldap01 python: Updating certificate for ocspSigningCert cert-pki-ca
...snip...
Jan 11 09:29:46 puncmnldap01 certmonger: Certificate named "auditSigningCert cert-pki-ca" in token "NSS Certificate DB" in database "/var/lib/pki-ca/alias" issued by CA but not saved.
Jan 11 09:30:13 puncmnldap01 named[9357]: client 10.10.9.64#12977: received notify for zone 'accelyakale.com'
Jan 11 09:30:15 puncmnldap01 named[9357]: zone accelyakale.com/IN: sending notifies (serial 1452484815)
Jan 11 09:30:16 puncmnldap01 abrtd: Directory 'ccpp-2016-01-11-09:30:16-12680' creation detected
Jan 11 09:30:16 puncmnldap01 abrt[14239]: Saved core dump of pid 12680 (/usr/sbin/certmonger) to /var/spool/abrt/ccpp-2016-01-11-09:30:16-12680 (1064960 bytes)
...snip...
Jan 11 10:39:59 puncmnldap01 abrtd: Directory 'ccpp-2016-01-11-10:39:59-21333' creation detected
Jan 11 10:39:59 puncmnldap01 abrt[23407]: Saved core dump of pid 21333 (/usr/sbin/certmonger) to /var/spool/abrt/ccpp-2016-01-11-10:39:59-21333 (1073152 bytes)
I do not seem to see a relevant fix for this kind of crash per the certmonger rpm changelog from RHEL 6.7:
on rhel67
rpm -q certmonger
certmonger-0.77.5-1.el6.x86_64
rpm -q certmonger --changelog | less
* Thu May 28 2015 Nalin Dahyabhai <nalin> 0.77.5-1
...
* Fri Feb 27 2015 Nalin Dahyabhai <nalin> 0.77.1-1
- update to 0.77
- ipa-getcert: fix a crash when using DNS discovery to locate servers (#39)
...
* Tue Jan 08 2013 Nalin Dahyabhai <nalin> 0.65-1
- fix a crash in the self-tests
...
* Thu Nov 29 2012 Nalin Dahyabhai <nalin> 0.62-1
- don't crash when OpenSSL can't build a template certificate from a request
when we're in FIPS mode
...
* Tue Nov 27 2012 Nalin Dahyabhai <nalin> 0.61-3
- backport change from git to not choke if X509_REQ_to_X509() fails when we're
self-signing using OpenSSL
- backport another change from git to represent this as a CA-rejected error
and reading from attachment
10-core.tar.gz
Core was generated by `/usr/sbin/certmonger -S -p /var/run/certmonger.pid'.
Program terminated with signal 6, Aborted.
#0 0x00000031d02328a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
Thread 1 (Thread 0x7f4d38f16800 (LWP 21333)):
#0 0x00000031d02328a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
resultvar = 0
pid = <value optimized out>
selftid = 21333
#1 0x00000031d0234085 in abort () at abort.c:92
save_stage = 2
act = {__sigaction_handler = {sa_handler = 0x436, sa_sigaction = 0x436}, sa_mask = {__val = {1079, 1168, 213945652101, 0, 1159, 213948886760, 4305167, 6, 2, 0, 160, 0, 0, 0, 214748364808, 532575944795}}, sa_flags = 119, sa_restorer = 0xb}
sigs = {__val = {32, 0 <repeats 15 times>}}
#2 0x00000000004096a0 in cm_writing_lock (entry=0x225a490) at iterate.c:256
No locals.
#3 0x000000000040a2e0 in cm_iterate (entry=0x225a490, ca=0x2265f50, context=0x2255590, get_ca_by_index=0x407b50 <cm_get_ca_by_index>, get_n_cas=0x407b70 <cm_get_n_cas>, emit_entry_saved_cert=0x41bbd0 <cm_tdbush_property_emit_entry_saved_cert>, emit_entry_changes=0x41bc90 <cm_tdbush_property_emit_entry_changes>, cm_iterate_state=0x22576f0, when=0x7fff17b3c204, delay=0x7fff17b3c20c, readfd=0x7fff17b3c208) at iterate.c:822
i = <value optimized out>
j = <value optimized out>
remaining = <value optimized out>
state = 0x22576f0
tmp_ca = <value optimized out>
old_entry = 0x226af50
serial = <value optimized out>
tmp_ca_name = <value optimized out>
#4 0x000000000040895f in cm_service_one (context=0x2255590, i=2, current_time=0x0) at cm.c:322
ret = <value optimized out>
delay = <value optimized out>
fd = -1
now = {tv_sec = 1452488999, tv_usec = 753394}
then = {tv_sec = 140733591044608, tv_usec = 140733591845052}
when = <value optimized out>
t = <value optimized out>
#5 0x0000000000408d69 in cm_timer_h (ec=<value optimized out>, te=<value optimized out>, current_time=..., pvt=0x2255590) at cm.c:146
context = 0x2255590
i = <value optimized out>
#6 0x00000031d8207bd9 in tevent_common_loop_timer_delay (ev=0x2255450) at ../tevent_timed.c:254
current_time = {tv_sec = 1452488999, tv_usec = 753389}
te = 0x2273760
#7 0x00000031d8207490 in epoll_event_loop (ev=<value optimized out>, location=<value optimized out>) at ../tevent_standard.c:298
ret = 0
i = <value optimized out>
events = {{events = 16, data = {ptr = 0x22738c0, fd = 36124864, u32 = 36124864, u64 = 36124864}}}
timeout = <value optimized out>
#8 std_event_loop_once (ev=<value optimized out>, location=<value optimized out>) at ../tevent_standard.c:567
std_ev = 0x2255520
tval = {tv_sec = 2, tv_usec = 15464}
#9 0x00000031d82038f0 in _tevent_loop_once (ev=0x2255450, location=0x42ab1c "main.c:287") at ../tevent.c:507
ret = <value optimized out>
nesting_stack_ptr = 0x0
#10 0x00000000004075bd in main (argc=<value optimized out>, argv=0x7fff17b3c4d8) at main.c:287
...snip...
(gdb) where
#0 0x00000031d02328a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00000031d0234085 in abort () at abort.c:92
#2 0x00000000004096a0 in cm_writing_lock (entry=0x225a490) at iterate.c:256
#3 0x000000000040a2e0 in cm_iterate (entry=0x225a490, ca=0x2265f50, context=0x2255590, get_ca_by_index=0x407b50 <cm_get_ca_by_index>, get_n_cas=0x407b70 <cm_get_n_cas>,
emit_entry_saved_cert=0x41bbd0 <cm_tdbush_property_emit_entry_saved_cert>, emit_entry_changes=0x41bc90 <cm_tdbush_property_emit_entry_changes>, cm_iterate_state=0x22576f0, when=0x7fff17b3c204, delay=0x7fff17b3c20c,
readfd=0x7fff17b3c208) at iterate.c:822
#4 0x000000000040895f in cm_service_one (context=0x2255590, i=2, current_time=0x0) at cm.c:322
#5 0x0000000000408d69 in cm_timer_h (ec=<value optimized out>, te=<value optimized out>, current_time=..., pvt=0x2255590) at cm.c:146
#6 0x00000031d8207bd9 in tevent_common_loop_timer_delay (ev=0x2255450) at ../tevent_timed.c:254
#7 0x00000031d8207490 in epoll_event_loop (ev=<value optimized out>, location=<value optimized out>) at ../tevent_standard.c:298
#8 std_event_loop_once (ev=<value optimized out>, location=<value optimized out>) at ../tevent_standard.c:567
#9 0x00000031d82038f0 in _tevent_loop_once (ev=0x2255450, location=0x42ab1c "main.c:287") at ../tevent.c:507
#10 0x00000000004075bd in main (argc=<value optimized out>, argv=0x7fff17b3c4d8) at main.c:287
(gdb)
(gdb) up
#1 0x00000031d0234085 in abort () at abort.c:92
92 raise (SIGABRT);
(gdb) up
#2 0x00000000004096a0 in cm_writing_lock (entry=0x225a490) at iterate.c:256
256 abort();
(gdb) list
251 if (writing_lock == NULL) {
252 cm_log(3, "%s('%s') taking writing lock\n",
253 entry->cm_busname, entry->cm_nickname);
254 writing_lock = entry;
255 } else {
256 abort();
257 }
258 return TRUE;
259 } else {
260 return FALSE;
(gdb)
│247 static dbus_bool_t │
│248 cm_writing_lock(struct cm_store_entry *entry) │
│249 { │
│250 if ((writing_lock == entry) || (writing_lock == NULL)) { │
│251 if (writing_lock == NULL) { │
│252 cm_log(3, "%s('%s') taking writing lock\n", │
│253 entry->cm_busname, entry->cm_nickname); │
│254 writing_lock = entry; │
│255 } else { │
>│256 abort(); │
│257 } │
│258 return TRUE; │
│259 } else { │
│260 return FALSE; │
│261 } │
│262 } │
(gdb) p writing_lock
$1 = (struct cm_store_entry *) 0x225a490
(gdb)
(gdb) p entry
$2 = (struct cm_store_entry *) 0x225a490
(gdb)
(gdb) p * writing_lock
$3 = {cm_busname = 0x225c4f0 "Request3", cm_store_private = 0x225c560, cm_nickname = 0x2258ef0 "20160110091219", cm_key_type = {cm_key_algorithm = cm_key_rsa, cm_key_gen_algorithm = cm_key_rsa, cm_key_size = 2048,
cm_key_gen_size = 2048}, cm_key_storage_type = cm_key_storage_nssdb, cm_key_storage_location = 0x225a700 "/var/lib/pki-ca/alias", cm_key_token = 0x2256ee0 "NSS Certificate DB",
cm_key_nickname = 0x225a130 "ocspSigningCert cert-pki-ca", cm_key_pin = 0x2258320 "501310396051", cm_key_pin_file = 0x0,
cm_key_pubkey = 0x225a8b0 "3082010A0282010100E004C44FD3C01B34C5137DA7D0C555695B2323935594251DC1758E53183B77CFDC19081395AA9738DB39B30349111A85EC15DB883F6A82110573340776CB4917797795D0A7B4476702F5CDC715989094E0D915FEE66BE34E7A7C6B"...,
cm_cert_storage_type = cm_cert_storage_nssdb, cm_cert_storage_location = 0x225a690 "/var/lib/pki-ca/alias", cm_cert_token = 0x2257860 "NSS Certificate DB", cm_cert_nickname = 0x2258390 "ocspSigningCert cert-pki-ca",
cm_cert_issuer = 0x2259390 "CN=Certificate Authority,O=ACCELYAKALE.COM", cm_cert_serial = 0x225abd0 "02", cm_cert_subject = 0x225ac30 "CN=OCSP Subsystem,O=ACCELYAKALE.COM",
cm_cert_spki = 0x225ae10 "30820122300d06092a864886f70d01010105000382010f003082010a0282010100e004c44fd3c01b34c5137da7d0c555695b2323935594251dc1758e53183b77cfdc19081395aa9738db39b30349111a85ec15db883f6a82110573340776cb4917797795"...,
cm_cert_not_before = 1378736462, cm_cert_not_after = 1440944462, cm_cert_hostname = 0x225acb0, cm_cert_email = 0x225ab40, cm_cert_principal = 0x2257790, cm_cert_ku = 0x2257800 "1100011", cm_cert_eku = 0x225ad20 "1.3.6.1.5.5.7.3.9",
cm_cert_profile = 0x0, cm_last_need_notify_check = 1452488898, cm_last_need_enroll_check = 1452488898, cm_notification_method = cm_notification_unspecified, cm_notification_destination = 0x0,
cm_template_subject = 0x225b1f0 "CN=OCSP Subsystem,O=ACCELYAKALE.COM", cm_template_hostname = 0x0, cm_template_email = 0x0, cm_template_principal = 0x0, cm_template_ku = 0x225b280 "1100011",
cm_template_eku = 0x225ad90 "1.3.6.1.5.5.7.3.9", cm_challenge_password = 0x0,
cm_csr = 0x2271960 "-----BEGIN NEW CERTIFICATE REQUEST-----\nMIIC3TCCAcUCAQAwMzEYMBYGA1UEChMPQUNDRUxZQUtBTEUuQ09NMRcwFQYDVQQD\nEw5PQ1NQIFN1YnN5c3RlbTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEB\nAOAExE/TwBs0xRN9p9DFVWlbIyOTVZ"...,
cm_spkac = 0x225b430 "MIICQDCCASgwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDgBMRP08AbNMUTfafQxVVpWyMjk1WUJR3BdY5TGDt3z9wZCBOVqpc42zmzA0kRGoXsFduIP2qCEQVzNAd2y0kXeXeV0Ke0R2cC9c3HFZiQlODZFf7ma+NOenxrLt6L5ygBvWqmFD+6qyMO6Tgj"...,
cm_state = CM_NEED_TO_SAVE_CERT, cm_autorenew = 1, cm_monitor = 1, cm_ca_nickname = 0x225b360 "dogtag-ipa-retrieve-agent-submit", cm_ca_profile = 0x225b8d0 "ocspSigningCert cert-pki-ca", cm_submitted = 1452488901, cm_ca_cookie = 0x0,
cm_ca_error = 0x0,
cm_cert = 0x2269df0 "\033[?1034h-----BEGIN CERTIFICATE-----\nMIIDgDCCAmigAwIBAgICAOIwDQYJKoZIhvcNAQELBQAwOjEYMBYGA1UEChMPQUND\nRUxZQUtBTEUuQ09NMR4wHAYDVQQDExVDZXJ0aWZpY2F0ZSBBdXRob3JpdHkwHhcN\nMTUwODI5MDMwOTU4WhcNMTcwODE4MDMwOT"...,
cm_pre_certsave_command = 0x225be10 "/usr/lib64/ipa/certmonger/stop_pkicad", cm_pre_certsave_uid = 0x225beb0 "0", cm_post_certsave_command = 0x225c690 "/usr/lib64/ipa/certmonger/restart_pkicad \"ocspSigningCert cert-pki-ca\"",
cm_post_certsave_uid = 0x225bf20 "0"}
(gdb)
this matches the third record from output of
getcert list
...
Request ID '20160110091219':
status: NEED_TO_SAVE_CERT
stuck: no
key pair storage: type=NSSDB,location='/var/lib/pki-ca/alias',nickname='ocspSigningCert cert-pki-ca',token='NSS Certificate DB',pin='501310396051'
certificate: type=NSSDB,location='/var/lib/pki-ca/alias',nickname='ocspSigningCert cert-pki-ca',token='NSS Certificate DB'
CA: dogtag-ipa-retrieve-agent-submit
issuer: CN=Certificate Authority,O=ACCELYAKALE.COM
subject: CN=OCSP Subsystem,O=ACCELYAKALE.COM
expires: 2015-08-30 14:21:02 UTC
eku: id-kp-OCSPSigning
pre-save command: /usr/lib64/ipa/certmonger/stop_pkicad
post-save command: /usr/lib64/ipa/certmonger/restart_pkicad "ocspSigningCert cert-pki-ca"
track: yes
auto-renew: yes
gdb) up
#3 0x000000000040a2e0 in cm_iterate (entry=0x225a490, ca=0x2265f50, context=0x2255590, get_ca_by_index=0x407b50 <cm_get_ca_by_index>, get_n_cas=0x407b70 <cm_get_n_cas>,
emit_entry_saved_cert=0x41bbd0 <cm_tdbush_property_emit_entry_saved_cert>, emit_entry_changes=0x41bc90 <cm_tdbush_property_emit_entry_changes>, cm_iterate_state=0x22576f0, when=0x7fff17b3c204, delay=0x7fff17b3c20c,
readfd=0x7fff17b3c208) at iterate.c:822
(gdb)
│821 case CM_NEED_TO_SAVE_CERT:
>│822 if (!cm_writing_lock(entry)) {
│823 /* Just hang out in this state while we're messing
│824 * around with the outside world for another entry. */
│825 cm_log(3, "%s('%s') waiting for saving lock\n",
│826 entry->cm_busname, entry->cm_nickname);
│827 *when = cm_time_soon;
│828 break;
│829 }
Description of problem: this is about an outdated version of certmonger-0.61-3.el6.x86_64 on RHEL 6.4 for a request to review a stack trace and explain what is happening. in this case, certmonger shuts down for a request that seem correct, Request ID '20160110091219': status: NEED_TO_SAVE_CERT see some gdb details below. in particular, could this be the same as the following fix on RHEL 7, either: * Mon Jan 27 2014 Nalin Dahyabhai <nalin> 0.71-1 ...snip... - only handle one watch at a time, which should avoid abort() during attempts to reconnect to the message bus after losing our connection to it (#1055521) or * Mon Dec 09 2013 Nalin Dahyabhai <nalin> 0.69-1 ...snip,,, - don't abort() due to incorrect locking when an attempt to save an issued certificate to the designated location fails (part of #1032760/#1033333, ticket #22) ( but bz 1032760 - Certmonger crashes when trying to decode certificate with invalid data in header says fixed in version certmonger-0.75.2-1.el6 while upstream ticket 22 indicates fix is in 0.69 and later, which version has the fix? ) https://fedorahosted.org/certmonger/ticket/22 Certmonger aborts after unsuccessful attempt to save certificate "Certmonger acquires writing lock in the NEED_TO_SAVE_CERT state but doesn't release it after unsucessful attempt to save the certificate before it goes to NEED_TO_SAVE_CERT again, causing it to abort itself: ...snip... Fixed upstream in 0.69 and later. The RHEL6-specific resolution of this is being tracked at https://bugzilla.redhat.com/show_bug.cgi?id=1032760, and since we no longer need this ticket to track it, I'm marking this one as fixed. " ? In which case the recommendation would be an update to a recent version of certmonger. I can provide access to a VM to read a core file collected from the customer's case number 01556025 (sev1 for multiple morphing issues) Thanks, M. Version-Release number of selected component (if applicable): RHEL 6.4 certmonger-0.61-3.el6.x86_64 How reproducible: N/A Steps to Reproduce: 1. N/A 2. 3. Actual results: certmonger SIGABRT Expected results: Additional info: information collected from customer's attachment 880-sosreport-puncmnldap01.01556025-20160111094416-ed89.tar.xz certmonger-0.61-3.el6.x86_64 redhat-release-server-6Server-6.4.0.4.el6.x86_64 880-sosreport-puncmnldap01.01556025-20160111094416-ed89.tar.xz/puncmnldap01-2016011109411452485482/var/log/messages ... Jan 10 14:42:18 puncmnldap01 certmonger: Certificate named "auditSigningCert cert-pki-ca" in token "NSS Certificate DB" in database "/var/lib/pki-ca/alias" is no longer valid. Jan 10 14:42:19 puncmnldap01 certmonger: Certificate named "ocspSigningCert cert-pki-ca" in token "NSS Certificate DB" in database "/var/lib/pki-ca/alias" is no longer valid. Jan 10 14:42:20 puncmnldap01 certmonger: Certificate named "subsystemCert cert-pki-ca" in token "NSS Certificate DB" in database "/var/lib/pki-ca/alias" is no longer valid. Jan 10 14:42:21 puncmnldap01 python: Exception trying to retrieve auditSigningCert cert-pki-ca: Error initializing principal host/puncmnldap01.accelyakale.com in /etc/krb5.keytab: (-1765328230, 'Cannot find KDC for requested realm') Jan 10 14:42:21 puncmnldap01 certmonger: #033[?1034h Jan 10 14:42:22 puncmnldap01 python: Exception trying to retrieve ocspSigningCert cert-pki-ca: Error initializing principal host/puncmnldap01.accelyakale.com in /etc/krb5.keytab: (-1765328230, 'Cannot find KDC for requested realm') Jan 10 14:42:22 puncmnldap01 certmonger: #033[?1034h Jan 10 14:42:22 puncmnldap01 python: Exception trying to retrieve subsystemCert cert-pki-ca: Error initializing principal host/puncmnldap01.accelyakale.com in /etc/krb5.keytab: (-1765328230, 'Cannot find KDC for requested realm') Jan 10 14:42:23 puncmnldap01 certmonger: #033[?1034h Jan 10 14:43:37 puncmnldap01 ntpd[25973]: synchronized to LOCAL(0), stratum 10 Jan 10 14:43:39 puncmnldap01 python: Exception trying to retrieve ocspSigningCert cert-pki-ca: Error initializing principal host/puncmnldap01.accelyakale.com in /etc/krb5.keytab: (-1765328230, 'Cannot find KDC for requested realm') Jan 10 14:43:39 puncmnldap01 certmonger: #033[?1034h Jan 10 14:43:39 puncmnldap01 python: Exception trying to retrieve auditSigningCert cert-pki-ca: Error initializing principal host/puncmnldap01.accelyakale.com in /etc/krb5.keytab: (-1765328230, 'Cannot find KDC for requested realm') Jan 10 14:43:39 puncmnldap01 certmonger: #033[?1034h Jan 10 14:43:40 puncmnldap01 python: Exception trying to retrieve subsystemCert cert-pki-ca: Error initializing principal host/puncmnldap01.accelyakale.com in /etc/krb5.keytab: (-1765328230, 'Cannot find KDC for requested realm') Jan 10 14:43:40 puncmnldap01 certmonger: #033[?1034h Jan 10 14:44:51 puncmnldap01 ns-slapd: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Credentials cache file '/tmp/krb5cc_495' not found) ...snip... Jan 11 09:29:11 puncmnldap01 python: certmonger started pki-cad, nickname 'ocspSigningCert cert-pki-ca' Jan 11 09:29:11 puncmnldap01 certmonger: Certificate named "ocspSigningCert cert-pki-ca" in token "NSS Certificate DB" in database "/var/lib/pki-ca/alias" issued by CA and saved. Jan 11 09:29:12 puncmnldap01 named[9357]: zone accelyakale.com/IN: sending notifies (serial 1452484748) Jan 11 09:29:15 puncmnldap01 python: certmonger stopping pki-cad Jan 11 09:29:16 puncmnldap01 python: Cannot stop pki-cad: Command '/sbin/service pki-cad stop pki-ca' returned non-zero exit status 4 Jan 11 09:29:16 puncmnldap01 certmonger: Certificate named "ocspSigningCert cert-pki-ca" in token "NSS Certificate DB" in database "/var/lib/pki-ca/alias" is no longer valid. Jan 11 09:29:18 puncmnldap01 python: Updating certificate for ocspSigningCert cert-pki-ca ...snip... Jan 11 09:29:46 puncmnldap01 certmonger: Certificate named "auditSigningCert cert-pki-ca" in token "NSS Certificate DB" in database "/var/lib/pki-ca/alias" issued by CA but not saved. Jan 11 09:30:13 puncmnldap01 named[9357]: client 10.10.9.64#12977: received notify for zone 'accelyakale.com' Jan 11 09:30:15 puncmnldap01 named[9357]: zone accelyakale.com/IN: sending notifies (serial 1452484815) Jan 11 09:30:16 puncmnldap01 abrtd: Directory 'ccpp-2016-01-11-09:30:16-12680' creation detected Jan 11 09:30:16 puncmnldap01 abrt[14239]: Saved core dump of pid 12680 (/usr/sbin/certmonger) to /var/spool/abrt/ccpp-2016-01-11-09:30:16-12680 (1064960 bytes) ...snip... Jan 11 10:39:59 puncmnldap01 abrtd: Directory 'ccpp-2016-01-11-10:39:59-21333' creation detected Jan 11 10:39:59 puncmnldap01 abrt[23407]: Saved core dump of pid 21333 (/usr/sbin/certmonger) to /var/spool/abrt/ccpp-2016-01-11-10:39:59-21333 (1073152 bytes) I do not seem to see a relevant fix for this kind of crash per the certmonger rpm changelog from RHEL 6.7: on rhel67 rpm -q certmonger certmonger-0.77.5-1.el6.x86_64 rpm -q certmonger --changelog | less * Thu May 28 2015 Nalin Dahyabhai <nalin> 0.77.5-1 ... * Fri Feb 27 2015 Nalin Dahyabhai <nalin> 0.77.1-1 - update to 0.77 - ipa-getcert: fix a crash when using DNS discovery to locate servers (#39) ... * Tue Jan 08 2013 Nalin Dahyabhai <nalin> 0.65-1 - fix a crash in the self-tests ... * Thu Nov 29 2012 Nalin Dahyabhai <nalin> 0.62-1 - don't crash when OpenSSL can't build a template certificate from a request when we're in FIPS mode ... * Tue Nov 27 2012 Nalin Dahyabhai <nalin> 0.61-3 - backport change from git to not choke if X509_REQ_to_X509() fails when we're self-signing using OpenSSL - backport another change from git to represent this as a CA-rejected error and reading from attachment 10-core.tar.gz Core was generated by `/usr/sbin/certmonger -S -p /var/run/certmonger.pid'. Program terminated with signal 6, Aborted. #0 0x00000031d02328a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig); Thread 1 (Thread 0x7f4d38f16800 (LWP 21333)): #0 0x00000031d02328a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 resultvar = 0 pid = <value optimized out> selftid = 21333 #1 0x00000031d0234085 in abort () at abort.c:92 save_stage = 2 act = {__sigaction_handler = {sa_handler = 0x436, sa_sigaction = 0x436}, sa_mask = {__val = {1079, 1168, 213945652101, 0, 1159, 213948886760, 4305167, 6, 2, 0, 160, 0, 0, 0, 214748364808, 532575944795}}, sa_flags = 119, sa_restorer = 0xb} sigs = {__val = {32, 0 <repeats 15 times>}} #2 0x00000000004096a0 in cm_writing_lock (entry=0x225a490) at iterate.c:256 No locals. #3 0x000000000040a2e0 in cm_iterate (entry=0x225a490, ca=0x2265f50, context=0x2255590, get_ca_by_index=0x407b50 <cm_get_ca_by_index>, get_n_cas=0x407b70 <cm_get_n_cas>, emit_entry_saved_cert=0x41bbd0 <cm_tdbush_property_emit_entry_saved_cert>, emit_entry_changes=0x41bc90 <cm_tdbush_property_emit_entry_changes>, cm_iterate_state=0x22576f0, when=0x7fff17b3c204, delay=0x7fff17b3c20c, readfd=0x7fff17b3c208) at iterate.c:822 i = <value optimized out> j = <value optimized out> remaining = <value optimized out> state = 0x22576f0 tmp_ca = <value optimized out> old_entry = 0x226af50 serial = <value optimized out> tmp_ca_name = <value optimized out> #4 0x000000000040895f in cm_service_one (context=0x2255590, i=2, current_time=0x0) at cm.c:322 ret = <value optimized out> delay = <value optimized out> fd = -1 now = {tv_sec = 1452488999, tv_usec = 753394} then = {tv_sec = 140733591044608, tv_usec = 140733591845052} when = <value optimized out> t = <value optimized out> #5 0x0000000000408d69 in cm_timer_h (ec=<value optimized out>, te=<value optimized out>, current_time=..., pvt=0x2255590) at cm.c:146 context = 0x2255590 i = <value optimized out> #6 0x00000031d8207bd9 in tevent_common_loop_timer_delay (ev=0x2255450) at ../tevent_timed.c:254 current_time = {tv_sec = 1452488999, tv_usec = 753389} te = 0x2273760 #7 0x00000031d8207490 in epoll_event_loop (ev=<value optimized out>, location=<value optimized out>) at ../tevent_standard.c:298 ret = 0 i = <value optimized out> events = {{events = 16, data = {ptr = 0x22738c0, fd = 36124864, u32 = 36124864, u64 = 36124864}}} timeout = <value optimized out> #8 std_event_loop_once (ev=<value optimized out>, location=<value optimized out>) at ../tevent_standard.c:567 std_ev = 0x2255520 tval = {tv_sec = 2, tv_usec = 15464} #9 0x00000031d82038f0 in _tevent_loop_once (ev=0x2255450, location=0x42ab1c "main.c:287") at ../tevent.c:507 ret = <value optimized out> nesting_stack_ptr = 0x0 #10 0x00000000004075bd in main (argc=<value optimized out>, argv=0x7fff17b3c4d8) at main.c:287 ...snip... (gdb) where #0 0x00000031d02328a5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #1 0x00000031d0234085 in abort () at abort.c:92 #2 0x00000000004096a0 in cm_writing_lock (entry=0x225a490) at iterate.c:256 #3 0x000000000040a2e0 in cm_iterate (entry=0x225a490, ca=0x2265f50, context=0x2255590, get_ca_by_index=0x407b50 <cm_get_ca_by_index>, get_n_cas=0x407b70 <cm_get_n_cas>, emit_entry_saved_cert=0x41bbd0 <cm_tdbush_property_emit_entry_saved_cert>, emit_entry_changes=0x41bc90 <cm_tdbush_property_emit_entry_changes>, cm_iterate_state=0x22576f0, when=0x7fff17b3c204, delay=0x7fff17b3c20c, readfd=0x7fff17b3c208) at iterate.c:822 #4 0x000000000040895f in cm_service_one (context=0x2255590, i=2, current_time=0x0) at cm.c:322 #5 0x0000000000408d69 in cm_timer_h (ec=<value optimized out>, te=<value optimized out>, current_time=..., pvt=0x2255590) at cm.c:146 #6 0x00000031d8207bd9 in tevent_common_loop_timer_delay (ev=0x2255450) at ../tevent_timed.c:254 #7 0x00000031d8207490 in epoll_event_loop (ev=<value optimized out>, location=<value optimized out>) at ../tevent_standard.c:298 #8 std_event_loop_once (ev=<value optimized out>, location=<value optimized out>) at ../tevent_standard.c:567 #9 0x00000031d82038f0 in _tevent_loop_once (ev=0x2255450, location=0x42ab1c "main.c:287") at ../tevent.c:507 #10 0x00000000004075bd in main (argc=<value optimized out>, argv=0x7fff17b3c4d8) at main.c:287 (gdb) (gdb) up #1 0x00000031d0234085 in abort () at abort.c:92 92 raise (SIGABRT); (gdb) up #2 0x00000000004096a0 in cm_writing_lock (entry=0x225a490) at iterate.c:256 256 abort(); (gdb) list 251 if (writing_lock == NULL) { 252 cm_log(3, "%s('%s') taking writing lock\n", 253 entry->cm_busname, entry->cm_nickname); 254 writing_lock = entry; 255 } else { 256 abort(); 257 } 258 return TRUE; 259 } else { 260 return FALSE; (gdb) │247 static dbus_bool_t │ │248 cm_writing_lock(struct cm_store_entry *entry) │ │249 { │ │250 if ((writing_lock == entry) || (writing_lock == NULL)) { │ │251 if (writing_lock == NULL) { │ │252 cm_log(3, "%s('%s') taking writing lock\n", │ │253 entry->cm_busname, entry->cm_nickname); │ │254 writing_lock = entry; │ │255 } else { │ >│256 abort(); │ │257 } │ │258 return TRUE; │ │259 } else { │ │260 return FALSE; │ │261 } │ │262 } │ (gdb) p writing_lock $1 = (struct cm_store_entry *) 0x225a490 (gdb) (gdb) p entry $2 = (struct cm_store_entry *) 0x225a490 (gdb) (gdb) p * writing_lock $3 = {cm_busname = 0x225c4f0 "Request3", cm_store_private = 0x225c560, cm_nickname = 0x2258ef0 "20160110091219", cm_key_type = {cm_key_algorithm = cm_key_rsa, cm_key_gen_algorithm = cm_key_rsa, cm_key_size = 2048, cm_key_gen_size = 2048}, cm_key_storage_type = cm_key_storage_nssdb, cm_key_storage_location = 0x225a700 "/var/lib/pki-ca/alias", cm_key_token = 0x2256ee0 "NSS Certificate DB", cm_key_nickname = 0x225a130 "ocspSigningCert cert-pki-ca", cm_key_pin = 0x2258320 "501310396051", cm_key_pin_file = 0x0, cm_key_pubkey = 0x225a8b0 "3082010A0282010100E004C44FD3C01B34C5137DA7D0C555695B2323935594251DC1758E53183B77CFDC19081395AA9738DB39B30349111A85EC15DB883F6A82110573340776CB4917797795D0A7B4476702F5CDC715989094E0D915FEE66BE34E7A7C6B"..., cm_cert_storage_type = cm_cert_storage_nssdb, cm_cert_storage_location = 0x225a690 "/var/lib/pki-ca/alias", cm_cert_token = 0x2257860 "NSS Certificate DB", cm_cert_nickname = 0x2258390 "ocspSigningCert cert-pki-ca", cm_cert_issuer = 0x2259390 "CN=Certificate Authority,O=ACCELYAKALE.COM", cm_cert_serial = 0x225abd0 "02", cm_cert_subject = 0x225ac30 "CN=OCSP Subsystem,O=ACCELYAKALE.COM", cm_cert_spki = 0x225ae10 "30820122300d06092a864886f70d01010105000382010f003082010a0282010100e004c44fd3c01b34c5137da7d0c555695b2323935594251dc1758e53183b77cfdc19081395aa9738db39b30349111a85ec15db883f6a82110573340776cb4917797795"..., cm_cert_not_before = 1378736462, cm_cert_not_after = 1440944462, cm_cert_hostname = 0x225acb0, cm_cert_email = 0x225ab40, cm_cert_principal = 0x2257790, cm_cert_ku = 0x2257800 "1100011", cm_cert_eku = 0x225ad20 "1.3.6.1.5.5.7.3.9", cm_cert_profile = 0x0, cm_last_need_notify_check = 1452488898, cm_last_need_enroll_check = 1452488898, cm_notification_method = cm_notification_unspecified, cm_notification_destination = 0x0, cm_template_subject = 0x225b1f0 "CN=OCSP Subsystem,O=ACCELYAKALE.COM", cm_template_hostname = 0x0, cm_template_email = 0x0, cm_template_principal = 0x0, cm_template_ku = 0x225b280 "1100011", cm_template_eku = 0x225ad90 "1.3.6.1.5.5.7.3.9", cm_challenge_password = 0x0, cm_csr = 0x2271960 "-----BEGIN NEW CERTIFICATE REQUEST-----\nMIIC3TCCAcUCAQAwMzEYMBYGA1UEChMPQUNDRUxZQUtBTEUuQ09NMRcwFQYDVQQD\nEw5PQ1NQIFN1YnN5c3RlbTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEB\nAOAExE/TwBs0xRN9p9DFVWlbIyOTVZ"..., cm_spkac = 0x225b430 "MIICQDCCASgwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDgBMRP08AbNMUTfafQxVVpWyMjk1WUJR3BdY5TGDt3z9wZCBOVqpc42zmzA0kRGoXsFduIP2qCEQVzNAd2y0kXeXeV0Ke0R2cC9c3HFZiQlODZFf7ma+NOenxrLt6L5ygBvWqmFD+6qyMO6Tgj"..., cm_state = CM_NEED_TO_SAVE_CERT, cm_autorenew = 1, cm_monitor = 1, cm_ca_nickname = 0x225b360 "dogtag-ipa-retrieve-agent-submit", cm_ca_profile = 0x225b8d0 "ocspSigningCert cert-pki-ca", cm_submitted = 1452488901, cm_ca_cookie = 0x0, cm_ca_error = 0x0, cm_cert = 0x2269df0 "\033[?1034h-----BEGIN CERTIFICATE-----\nMIIDgDCCAmigAwIBAgICAOIwDQYJKoZIhvcNAQELBQAwOjEYMBYGA1UEChMPQUND\nRUxZQUtBTEUuQ09NMR4wHAYDVQQDExVDZXJ0aWZpY2F0ZSBBdXRob3JpdHkwHhcN\nMTUwODI5MDMwOTU4WhcNMTcwODE4MDMwOT"..., cm_pre_certsave_command = 0x225be10 "/usr/lib64/ipa/certmonger/stop_pkicad", cm_pre_certsave_uid = 0x225beb0 "0", cm_post_certsave_command = 0x225c690 "/usr/lib64/ipa/certmonger/restart_pkicad \"ocspSigningCert cert-pki-ca\"", cm_post_certsave_uid = 0x225bf20 "0"} (gdb) this matches the third record from output of getcert list ... Request ID '20160110091219': status: NEED_TO_SAVE_CERT stuck: no key pair storage: type=NSSDB,location='/var/lib/pki-ca/alias',nickname='ocspSigningCert cert-pki-ca',token='NSS Certificate DB',pin='501310396051' certificate: type=NSSDB,location='/var/lib/pki-ca/alias',nickname='ocspSigningCert cert-pki-ca',token='NSS Certificate DB' CA: dogtag-ipa-retrieve-agent-submit issuer: CN=Certificate Authority,O=ACCELYAKALE.COM subject: CN=OCSP Subsystem,O=ACCELYAKALE.COM expires: 2015-08-30 14:21:02 UTC eku: id-kp-OCSPSigning pre-save command: /usr/lib64/ipa/certmonger/stop_pkicad post-save command: /usr/lib64/ipa/certmonger/restart_pkicad "ocspSigningCert cert-pki-ca" track: yes auto-renew: yes gdb) up #3 0x000000000040a2e0 in cm_iterate (entry=0x225a490, ca=0x2265f50, context=0x2255590, get_ca_by_index=0x407b50 <cm_get_ca_by_index>, get_n_cas=0x407b70 <cm_get_n_cas>, emit_entry_saved_cert=0x41bbd0 <cm_tdbush_property_emit_entry_saved_cert>, emit_entry_changes=0x41bc90 <cm_tdbush_property_emit_entry_changes>, cm_iterate_state=0x22576f0, when=0x7fff17b3c204, delay=0x7fff17b3c20c, readfd=0x7fff17b3c208) at iterate.c:822 (gdb) │821 case CM_NEED_TO_SAVE_CERT: >│822 if (!cm_writing_lock(entry)) { │823 /* Just hang out in this state while we're messing │824 * around with the outside world for another entry. */ │825 cm_log(3, "%s('%s') waiting for saving lock\n", │826 entry->cm_busname, entry->cm_nickname); │827 *when = cm_time_soon; │828 break; │829 }