Description of problem: Several times a day slapd crashes and has to be restarted. This ldap server is used by about 50 Fedora and FreeBSD servers for passwd lookups and pam authentication. The clients show no issues. Syslog shows this message: kernel: [68789.935862] slapd[28586] general protection ip:7f7738fa5340 sp:7f77089437b0 error:0 kernel: [68789.960255] slapd[28582] general protection ip:7f7738fa5419 sp:7f7709e150b0 error:0 in libssl3.so[7f7738f89000+32000] From F9 up to F13 everything was stable. This is happening since we upgraded to F14. I think it might be related to the switch to Mozilla's nss library in F14. Version-Release number of selected component (if applicable): kernel-2.6.35.12-90.fc14.x86_64 openldap-servers-2.4.23-4.fc14.x86_64 nss-3.12.9-10.fc14.x86_64 How reproducible: Steps to Reproduce: 1. Setup a LDAP server with TLS 2. Setup a few clients 3. Wait.. Actual results: Expected results: Additional info: SElinux is disabled. Latest patches have been installed.
Dennis, please try latest OpenLDAP package from updates-testing. https://admin.fedoraproject.org/updates/openldap-2.4.23-10.fc14 # yum --enablerepo updates-testing update "openldap\*"
Sorry, a typo: # yum --enablerepo updates-testing update "openldap*"
Hi Jan/Dennis I've also experienced this error. Updating to openldap-servers-2.4.23-10.fc14 did not resolve the issue. Some additional info: I use nss-pam-ldapd(-0.7.7-1) instead of nss_ldap. Disabling TLS results in a stable slapd.
A stack trace would be very helpful. 1) allow setuid programs to create core files: sysctl -w fs.suid_dumpable=1 2) install the openldap-debuginfo package 3) not sure where openldap leaves core files - Jan? 4) reproduce the crash 5) run gdb with the core file: gdb /usr/sbin/slapd /path/to/core.PID (gdb) set logging on (gdb) set logging redirect on (gdb) set logging file /tmp/openldap.crash.txt (gdb) thread apply all bt full (gdb) set logging redirect off (gdb) quit Then attach /tmp/openldap.crash.txt to this bug as an attachment. Note that you will probably want to scrub/obscure any sensitive information from /tmp/openldap.crash.txt
Hello Jan, I'm now running: openldap-2.4.23-10.fc14.x86_64 openldap-clients-2.4.23-10.fc14.x86_64 openldap-servers-2.4.23-10.fc14.x86_64 The "general protection" kernel messages seem to have disappeared, but slapd still crashes. The crashes do appear to be less frequent. I've installed the debuginfo pkg and enabled the sysctl. When the crash returns I will post the gdb output. Thanks
Sorry Dennis for complicating this a little bit more. (In reply to comment #4) > A stack trace would be very helpful. > 1) allow setuid programs to create core files: > sysctl -w fs.suid_dumpable=1 > 2) install the openldap-debuginfo package > 3) not sure where openldap leaves core files - Jan? Unfortunately, this will not work. Forcing OpenLDAP to drop the core was always problematic. There are two problems, 1) core file size is set to zero by initscripts, 2) slapd will change working directory to / and drops root privileges. In this situation, no core file can be created. But there is one (ugly) solution: # service slapd stop # ulimit -c unlimited # sysctl -w fs.suid_dumpable=1 # mkdir -p /var/run/cores # chmod a+w /var/run/cores # sysctl -w "kernel.core_pattern=/var/run/cores/%e.%p" # slapd -u ldap -g ldap -F /etc/openldap/slapd.d -h ldapi:/// ldap://localhost ldaps:/// This will cause all core files to be created in /var/run/cores directory, where all users are allowed to write. And the server is started manually with appropriate ulimit settings. When the server crashes, you should find slapd.pid file in /var/run/cores. I'm not aware of any other (better) solution... (kill -6 $(pidof slapd) can verify, if the core dumping works)
Hi, I'm a colleague of Dennis. Here's the backtrace you asked for: Thread 1 (Thread 0x7f6b3885f700 (LWP 8827)): #0 0x00007f6b58b040c5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #1 0x00007f6b58b05a76 in abort () at abort.c:92 #2 0x00007f6b58b40cfb in __libc_message (do_abort=2, fmt=0x7f6b58c2fa98 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:186 #3 0x00007f6b58b47d63 in malloc_printerr (av=<value optimized out>, p=0x7f6b30213010, have_lock=0) at malloc.c:6283 #4 _int_free (av=<value optimized out>, p=0x7f6b30213010, have_lock=0) at malloc.c:4795 #5 0x00007f6b3b5e04a2 in nss_ZFreeIf (pointer=0x7f6b30213030) at arena.c:975 #6 0x00007f6b3b5d1574 in pem_getPrivateKey (arena=0x7f6b24123230, rawkey=<value optimized out>, pError=0x7f6b3885e0d8, modulus=0x7f6b3417f7a8) at prsa.c:202 #7 0x00007f6b3b5d16cd in pem_mdCryptoOperationRSAPriv_Create (proto=0x7f6b3b7f9140, mdMechanism=0x7f6b3b7f9200, pError=0x7f6b3885e0d8, mdKey=<value optimized out>) at prsa.c:361 #8 0x00007f6b3b5de72f in nssCKFWMechanism_DecryptInit (fwMechanism=0x7f6b24127f10, pMechanism=0x7f6b3885e3f0, fwSession=0x7f6b1c16af30, fwObject=0x7f6b34156298) at mechanism.c:538 #9 0x00007f6b3b5db5d1 in NSSCKFWC_DecryptInit (fwInstance=<value optimized out>, hSession=<value optimized out>, pMechanism=0x7f6b3885e3f0, hKey=<value optimized out>) at wrap.c:3105 #10 0x00007f6b5993b091 in pk11_HandUnwrap (slot=0x7f6b34154cc0, wrappingKey=12, mech=0x7f6b3885e3f0, target=881, keyTemplate=0x7f6b3885e250, templateCount=3, key_size=0, wincx=0x0, crvp=0x7f6b3885e408, isPerm=0, inKey=<value optimized out>) at pk11skey.c:1960 #11 0x00007f6b5993b687 in pk11_AnyUnwrapKey (slot=0x7f6b34154cc0, wrappingKey=12, wrapType=1, param=<value optimized out>, wrappedKey=0x7f6b3885e5e0, target=881, operation=268, keySize=0, wincx=0x0, userAttr=<value optimized out>, numAttrs=0, isPerm=0) at pk11skey.c:2104 #12 0x00007f6b5993e6ec in PK11_PubUnwrapSymKey (wrappingKey=0x7f6b2c1124d0, wrappedKey=0x7f6b3885e5e0, target=881, operation=268, keySize=0) at pk11skey.c:2232 #13 0x00007f6b59e63f54 in ssl3_HandleRSAClientKeyExchange (ss=0x7f6b1c164bb0, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7360 #14 ssl3_HandleClientKeyExchange (ss=0x7f6b1c164bb0, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7476 #15 ssl3_HandleHandshakeMessage (ss=0x7f6b1c164bb0, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:8648 #16 0x00007f6b59e6694c in ssl3_HandleHandshake (ss=0x7f6b1c164bb0, cText=<value optimized out>, databuf=0x7f6b1c164f18) at ssl3con.c:8727 #17 ssl3_HandleRecord (ss=0x7f6b1c164bb0, cText=<value optimized out>, databuf=0x7f6b1c164f18) at ssl3con.c:9066 #18 0x00007f6b59e67a06 in ssl3_GatherCompleteHandshake (ss=0x7f6b1c164bb0, flags=0) at ssl3gthr.c:209 #19 0x00007f6b59e7024f in SSL_ForceHandshake (fd=<value optimized out>) at sslsecur.c:397 #20 0x00007f6b5b00a025 in tlsm_session_accept_or_connect (session=<value optimized out>, is_accept=1) at tls_m.c:2350 #21 0x00007f6b5b008087 in ldap_pvt_tls_accept (sb=0x7f6b24151500, ctx_arg=0x7f6b5c62fb10) at tls2.c:429 #22 0x00007f6b5aee5dc3 in connection_read (ctx=0x7f6b3885eb80, argv=0x2a) at ../../../servers/slapd/connection.c:1326 #23 connection_read_thread (ctx=0x7f6b3885eb80, argv=0x2a) at ../../../servers/slapd/connection.c:1238 #24 0x00007f6b5afe5278 in ldap_int_thread_pool_wrapper (xpool=0x7f6b5c5c3760) at ../../../libraries/libldap_r/tpool.c:685 #25 0x00007f6b58e73ccb in start_thread (arg=0x7f6b3885f700) at pthread_create.c:301 Thanks, Ruben
Can you attach your tls configuration (olcTls* parameters from cn=config)? Can you attach the output of openssl x509 -in /path/to/your/servercert.pem -text? Be sure to obscure any sensitive information. Can you paste the header of your private key file? e.g. is it -----BEGIN PRIVATE KEY----- or something else?
Hello Rich, We've not yet switched to cn=config, but here's the relevant parts of our slapd.conf: TLSCACertificateFile /etc/openldap/cacerts/ca.pem TLSCertificateFile /etc/pki/tls/certs/slapd.pem TLSCertificateKeyFile /etc/pki/tls/private/slapd.key TLSCipherSuite TLSv1+RSA:!EXPORT:!NULL:!MD5 security ssf=128 Our private key: # head -2 /etc/pki/tls/private/slapd.key -----BEGIN RSA PRIVATE KEY----- MIICXAIBAAKBgQCngGIkmOR94SzNXsULQn9EmzUE1Cy+6QrQE/2d1iwDOCKJ5KH4 The certificate: # openssl x509 -in /etc/pki/tls/certs/slapd.pem -text Certificate: Data: Version: 3 (0x2) Serial Number: 117 (0x75) Signature Algorithm: sha1WithRSAEncryption Issuer: CN=puppet.ourdomain Validity Not Before: Dec 26 12:14:35 2010 GMT Not After : Dec 25 12:14:35 2015 GMT Subject: CN=fqdn.of.ldap.host Subject Public Key Info: Public Key Algorithm: rsaEncryption Public-Key: (1024 bit) Modulus: 00:a7:80:62:24:98:e4:7d:e1:2c:cd:5e:c5:0b:42: 7f:44:9b:35:04:d4:2c:be:e9:0a:d0:13:fd:9d:d6: 2c:03:38:22:89:e4:a1:f8:9a:19:45:0e:6f:a6:e2: 5b:77:4a:aa:df:a8:7f:39:f3:8e:44:6f:8e:89:38: 24:2c:a0:b3:83:ae:d9:89:89:42:58:7a:3b:f5:a3: a0:84:39:1e:5e:02:bc:a2:c1:69:74:3c:a9:1c:51: c1:d8:6c:13:49:27:18:f9:c6:40:48:11:62:f7:60: 1a:c2:40:0a:ba:93:b5:ab:74:01:4f:20:eb:2b:a8: e2:eb:a7:88:57:48:9e:38:c5 Exponent: 65537 (0x10001) X509v3 extensions: Netscape Comment: Puppet Ruby/OpenSSL Generated Certificate X509v3 Basic Constraints: critical CA:FALSE X509v3 Subject Key Identifier: 01:2F:91:13:B7:07:94:7E:65:82:0C:EC:72:E2:BE:F2:51:8E:95:AF X509v3 Key Usage: Digital Signature, Key Encipherment X509v3 Extended Key Usage: TLS Web Server Authentication, TLS Web Client Authentication, E-mail Protection X509v3 Subject Alternative Name: DNS:fqdn.of.ldap.host, DNS:fqdn.of.vip, DNS:another.alias Signature Algorithm: sha1WithRSAEncryption Thanks Dennis
Created attachment 498225 [details] slapd stackstrace Full stacktrace for another slapd crash
The same crash happens with openldap-servers-2.4.24-2.fc15.x86_64: Thread 1 (Thread 0x7f52ca7fd700 (LWP 11854)): #0 0x00007f52f943b0c5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #1 0x00007f52f943ca76 in abort () at abort.c:92 #2 0x00007f52f9477cfb in __libc_message (do_abort=2, fmt=0x7f52f9566a98 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:186 #3 0x00007f52f947ed63 in malloc_printerr (av=<value optimized out>, p=0x7f52d42dab50, have_lock=0) at malloc.c:6283 #4 _int_free (av=<value optimized out>, p=0x7f52d42dab50, have_lock=0) at malloc.c:4795 #5 0x00007f52e01fa4a2 in nss_ZFreeIf (pointer=0x7f52d42dab70) at arena.c:975 #6 0x00007f52e01eb574 in pem_getPrivateKey (arena=0x7f52bc11f750, rawkey=<value optimized out>, pError=0x7f52ca7fc0f8, modulus=0x7f52d417f868) at prsa.c:202 #7 0x00007f52e01eb6cd in pem_mdCryptoOperationRSAPriv_Create (proto=0x7f52e0413140, mdMechanism=0x7f52e0413200, pError=0x7f52ca7fc0f8, mdKey=<value optimized out>) at prsa.c:361 #8 0x00007f52e01f872f in nssCKFWMechanism_DecryptInit (fwMechanism=0x7f52bc147cc0, pMechanism=0x7f52ca7fc410, fwSession=0x7f52c412a0e0, fwObject=0x7f52d4156358) at mechanism.c:538 #9 0x00007f52e01f55d1 in NSSCKFWC_DecryptInit (fwInstance=<value optimized out>, hSession=<value optimized out>, pMechanism=0x7f52ca7fc410, hKey=<value optimized out>) at wrap.c:3105 #10 0x00007f52fa272091 in pk11_HandUnwrap (slot=0x7f52d4154d80, wrappingKey=12, mech=0x7f52ca7fc410, target=881, keyTemplate=0x7f52ca7fc270, templateCount=3, key_size=0, wincx=0x0, crvp=0x7f52ca7fc428, isPerm=0, inKey=<value optimized out>) at pk11skey.c:1960 #11 0x00007f52fa272687 in pk11_AnyUnwrapKey (slot=0x7f52d4154d80, wrappingKey=12, wrapType=1, param=<value optimized out>, wrappedKey=0x7f52ca7fc600, target=881, operation=268, keySize=0, wincx=0x0, userAttr=<value optimized out>, numAttrs=0, isPerm=0) at pk11skey.c:2104 #12 0x00007f52fa2756ec in PK11_PubUnwrapSymKey (wrappingKey=0x7f52d41c3020, wrappedKey=0x7f52ca7fc600, target=881, operation=268, keySize=0) at pk11skey.c:2232 #13 0x00007f52fa79af54 in ssl3_HandleRSAClientKeyExchange (ss=0x7f52b410aa30, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7360 #14 ssl3_HandleClientKeyExchange (ss=0x7f52b410aa30, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7476 #15 ssl3_HandleHandshakeMessage (ss=0x7f52b410aa30, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:8648 #16 0x00007f52fa79d94c in ssl3_HandleHandshake (ss=0x7f52b410aa30, cText=<value optimized out>, databuf=0x7f52b410ad98) at ssl3con.c:8727 #17 ssl3_HandleRecord (ss=0x7f52b410aa30, cText=<value optimized out>, databuf=0x7f52b410ad98) at ssl3con.c:9066 #18 0x00007f52fa79ea06 in ssl3_GatherCompleteHandshake (ss=0x7f52b410aa30, flags=0) at ssl3gthr.c:209 #19 0x00007f52fa7a724f in SSL_ForceHandshake (fd=<value optimized out>) at sslsecur.c:397 #20 0x00007f52fb943f8a in tlsm_session_accept_or_connect (session=0x7f52b41246f0, is_accept=1) at tls_m.c:2338 #21 0x00007f52fb9420d3 in ldap_pvt_tls_accept (sb=0x7f52b011d700, ctx_arg=0x7f52fc5c50b0) at tls2.c:421 #22 0x00007f52fb81e143 in connection_read (ctx=0x7f52ca7fcb80, argv=0x2b) at ../../../servers/slapd/connection.c:1330 #23 connection_read_thread (ctx=0x7f52ca7fcb80, argv=0x2b) at ../../../servers/slapd/connection.c:1242 #24 0x00007f52fb91eeea in ldap_int_thread_pool_wrapper (xpool=0x7f52fc55b810) at ../../../libraries/libldap_r/tpool.c:685 #25 0x00007f52f97aaccb in start_thread (arg=0x7f52ca7fd700) at pthread_create.c:301 #26 0x00007f52f94e8c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
I tried to reproduce on the latest RHEL6.1 but no luck I used /etc/pki/tls/misc/CA to create a CA cert and an SSL server cert which matches the one from Comment 9 as closely as possible. One hitch was that it creates a private key like this: -----BEGIN PRIVATE KEY----- I used openssl rsa to convert it to a key like this: -----BEGIN RSA PRIVATE KEY----- I'm not sure if that's valid. How was the key in comment 9 created? are you able to reproduce the problem with ldapsearch? on f14 or on another platform?
(In reply to comment #12) > I'm not sure if that's valid. How was the key in comment 9 created? > are you able to reproduce the problem with ldapsearch? on f14 or on another > platform? It seems that it was created by Puppet. # yum install puppet-server # puppetca --generate ldaphostname CA certificate: /var/lib/puppet/ssl/ca/ca_crt.pem Certificate: /var/lib/puppet/ssl/ca/signed/ldaphostname.pem Key: /var/lib/puppet/ssl/private_keys/ldaphostname.pem But I'm still not experiencing the crash...
Tried to reproduce on F14, F15 and RHEL6... unsuccessfully.
Yes, the CA and the client certs have been generated by puppetca. In the meantime we've downgraded one server to openldap-servers-2.4.21-11.fc13.x86_64 and haven't seen any more issues there. The rest of the stack on this server is still F14 and is still using the Mozilla nss package (same version). So the problem might in fact not be caused by Mozilla nss, but be a regression in openldap itself. It's unfortunate that you can't reproduce this issue.. Have you put the ldap server under load? Are you using the same security settings we have? TLSCipherSuite TLSv1+RSA:!EXPORT:!NULL:!MD5 security ssf=128
(In reply to comment #15) > In the meantime we've downgraded one server to > openldap-servers-2.4.21-11.fc13.x86_64 and haven't seen any more issues there. > The rest of the stack on this server is still F14 and is still using the > Mozilla nss package (same version). > > So the problem might in fact not be caused by Mozilla nss, but be a regression > in openldap itself. > openldap-servers-2.4.21-11.fc13.x86_64 is linked against OpenSSL. And according to backtrace you posted, this is a crash "somewhere inside" SSL_ForceHandshake. > It's unfortunate that you can't reproduce this issue.. Have you put the ldap > server under load? Are you using the same security settings we have? > > TLSCipherSuite TLSv1+RSA:!EXPORT:!NULL:!MD5 > security ssf=128 I have the same settings. Can you be more specific about the load? I tried to create some fake load by simple multi-thread script querying the server. But maybe it was not enough. How long does it take, before the server crashes? Please, can you provide your configuration file with obfuscated sensitive data? Do you have some ulimit settings applied?
Oh I assumed slapd was dynamically linked to libnss, but apparently it's been hard linked. The server doesn't crash very often anymore and we don't see the general protection messages in the kernel log anymore either. The server now crashes perhaps one or two times a week. So perhaps if you keep the load script running for a week or so you might be able to reproduce it. Our obfuscated slapd conf. Nothing special in there: loglevel sync stats include /etc/openldap/schema/core.schema include /etc/openldap/schema/cosine.schema include /etc/openldap/schema/inetorgperson.schema include /etc/openldap/schema/misc.schema include /etc/openldap/schema/openldap.schema include /etc/openldap/schema/rfc2307bis.schema pidfile /var/run/openldap/slapd.pid argsfile /var/run/openldap/slapd.args TLSCACertificateFile /etc/openldap/cacerts/ca.pem TLSCertificateFile /etc/pki/tls/certs/slapd.pem TLSCertificateKeyFile /etc/pki/tls/private/slapd.key TLSCipherSuite TLSv1+RSA:!EXPORT:!NULL:!MD5 security ssf=128 ####################################################################### # ldbm and/or bdb database definitions ####################################################################### database hdb suffix "dc=foo,dc=bar" checkpoint 1024 15 rootdn "cn=admin,dc=foo,dc=bar" rootpw sekrit:) directory /var/lib/ldap # Ldap slave syncrepl rid=001 provider=ldap://ldap1a.foo.bar binddn="uid=replicator,dc=foo,dc=bar" starttls=critical bindmethod=simple credentials=sekrit:) searchbase="dc=foo,dc=bar" type=refreshAndPersist retry="5 5 300 5" syncdata=accesslog logbase="cn=log" logfilter="(&(objectclass=auditWriteObject)(reqResult=0))" # refer writes to master updateref ldap://ldap1a.foo.bar index cn,uid pres,eq index objectClass eq index uidNumber eq index gidNumber eq index memberUid eq index entryCSN eq index entryUUID eq index ou eq index uniqueMember eq index member eq index default eq,sub # self can change password access to attrs=userPassword by anonymous auth by self write by dn="uid=replicator,dc=foo,dc=bar" read by * none # anyone can read access to dn.subtree="dc=foo,dc=bar" by self write by * read # enable monitoring database monitor access to dn.subtree="cn=monitor" by * read
Hi Jan, One thing that may differ from our setup to yours is that we let puppet create the certificate with an extra X509v3 Subject Alternative Name (via the puppet certdnsname setting). This is needed so TLS works behind a VIP. Maybe nss has some difficulties with that. Just a though... Kind regards, Ruben
(In reply to comment #15) > Yes, the CA and the client certs have been generated by puppetca. > > In the meantime we've downgraded one server to > openldap-servers-2.4.21-11.fc13.x86_64 and haven't seen any more issues there. > The rest of the stack on this server is still F14 and is still using the > Mozilla nss package (same version). > > So the problem might in fact not be caused by Mozilla nss, but be a regression > in openldap itself. > > It's unfortunate that you can't reproduce this issue.. Have you put the ldap > server under load? Are you using the same security settings we have? > > TLSCipherSuite TLSv1+RSA:!EXPORT:!NULL:!MD5 > security ssf=128 Yes. I'm using cn=config - here are my settings: olcTLSCACertificateFile: /path/to/CA/cacert.pem olcTLSCertificateFile: /path/to/CA/certs/localhost.localdomain.1024.cert.pem olcTLSCertificateKeyFile: /path/to/CA/certs/localhost.localdomain.1024.rsa.key.pem olcTLSCipherSuite: TLSv1+RSA:!EXPORT:!NULL:!MD5 olcSecurity: ssf=128 cert.pem and key.pem look almost identical to https://bugzilla.redhat.com/show_bug.cgi?id=701678#c9 including SubjectAltName (hostname, issuer, and subject values are different). Is it possible to reproduce the problem with ldapsearch or some other simple load client? It is impractical to set up an environment with many client machines using nss_ldap/pam_ldap.
(In reply to comment #15) > In the meantime we've downgraded one server to > openldap-servers-2.4.21-11.fc13.x86_64 and haven't seen any more issues there. (In reply to comment #17) > The server doesn't crash very often anymore and we don't see the general > protection messages in the kernel log anymore either. The server now crashes > perhaps one or two times a week. Can you explain what you mean here? The server doesn't crash very often anymore? So at some point it was crashing more and now it doesn't crash as often? What changed? Are you saying that it crashes with openldap-servers-2.4.21, but less often than it crashed with openldap-servers-2.4.24-2.fc15.x86_64? Does f15 crash more/less often than f14? I notice in your syncrepl configuration you have starttls=critical but do not provide a ca cert - does it fallback to TLSCACertificateFile /etc/openldap/cacerts/ca.pem?
Hello Rick, Sorry for the late response. I've been busy :) With openldap-servers-2.4.23-4.fc14.x86_64 we've seen crashes up to 4 times a day. With openldap-servers-2.4.23-10.fc14.x86_64 we see maybe 1 or 2 crashes a week. With openldap-servers-2.4.21-11.fc13.x86_64 we don't see any crashes at all. Why would you want to specify the same CA twice? :) Replication works, so I guess it does use the global CA cert setting. I'll see if we can set up a lab environment and try to reproduce the crashes as well as test out F15.
Hey Rich, Just upgraded last friday to openldap-servers-2.6.24 from Rawhide, which I've rebuild for F-14, and we're still seeing the same crashes unfortunately.
(In reply to comment #22) > Hey Rich, > > Just upgraded last friday to openldap-servers-2.6.24 from Rawhide, which I've > rebuild for F-14, and we're still seeing the same crashes unfortunately. We cannot reproduce this issue. We are not going to be able to fix this issue without a reproducer or a full stack trace from core file.
Hi Rich, I still haven't been able to reliably trigger this. In the meantime we switched to sssd on all our clients, except for a few which use nss_ldapd. So there are no clients using nss_ldap anymore. Here's a backtrace of openldap-2.4.26-1.fc14.x86_64: (gdb) thread all apply bt No symbol "all" in current context. (gdb) thread apply all bt Thread 6 (Thread 0x7fc90d05e700 (LWP 25667)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fc94632a167 in ldap_int_thread_pool_wrapper (xpool=0x7fc9481cb830) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007fc9441abccb in start_thread (arg=0x7fc90d05e700) at pthread_create.c:301 #3 0x00007fc943ee9c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 5 (Thread 0x7fc90fffd700 (LWP 22431)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fc94632a167 in ldap_int_thread_pool_wrapper (xpool=0x7fc9481cb830) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007fc9441abccb in start_thread (arg=0x7fc90fffd700) at pthread_create.c:301 #3 0x00007fc943ee9c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 4 (Thread 0x7fc9461c5720 (LWP 22245)): #0 0x00007fc9441acde5 in pthread_join (threadid=140502006011648, thread_return=0x0) at pthread_join.c:89 #1 0x00007fc9462212e2 in slapd_daemon () at ../../../servers/slapd/daemon.c:2922 #2 0x00007fc946208b01 in main (argc=6, argv=<value optimized out>) at ../../../servers/slapd/main.c:983 Thread 3 (Thread 0x7fc92c289700 (LWP 22428)): #0 0x00007fc943eea193 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82 #1 0x00007fc94621f535 in slapd_daemon_task (ptr=<value optimized out>) at ../../../servers/slapd/daemon.c:2528 #2 0x00007fc9441abccb in start_thread (arg=0x7fc92c289700) at pthread_create.c:301 #3 0x00007fc943ee9c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 2 (Thread 0x7fc928ea8700 (LWP 22430)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fc94632a167 in ldap_int_thread_pool_wrapper (xpool=0x7fc9481cb830) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007fc9441abccb in start_thread (arg=0x7fc928ea8700) at pthread_create.c:301 #3 0x00007fc943ee9c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 1 (Thread 0x7fc9296a9700 (LWP 22429)): #0 0x00007fc943e3c0c5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #1 0x00007fc943e3da76 in abort () at abort.c:92 #2 0x00007fc943e78cfb in __libc_message (do_abort=2, fmt=0x7fc943f67a98 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:186 #3 0x00007fc943e7fd63 in malloc_printerr (av=<value optimized out>, p=0x7fc900392b20, have_lock=0) at malloc.c:6283 #4 _int_free (av=<value optimized out>, p=0x7fc900392b20, have_lock=0) at malloc.c:4795 #5 0x00007fc90f5de4a2 in nss_ZFreeIf (pointer=0x7fc900392b40) at arena.c:975 #6 0x00007fc90f5cf574 in pem_getPrivateKey (arena=0x7fc9081f09d0, rawkey=<value optimized out>, pError=0x7fc9296a80d8, modulus=0x7fc900179938) at prsa.c:202 #7 0x00007fc90f5cf6cd in pem_mdCryptoOperationRSAPriv_Create (proto=0x7fc90f7f7140, mdMechanism=0x7fc90f7f7200, pError=0x7fc9296a80d8, mdKey=<value optimized out>) at prsa.c:361 #8 0x00007fc90f5dc72f in nssCKFWMechanism_DecryptInit (fwMechanism=0x7fc9081f3730, pMechanism=0x7fc9296a83f0, fwSession=0x7fc900381b40, fwObject=0x7fc900150408) at mechanism.c:538 #9 0x00007fc90f5d95d1 in NSSCKFWC_DecryptInit (fwInstance=<value optimized out>, hSession=<value optimized out>, pMechanism=0x7fc9296a83f0, hKey=<value optimized out>) at wrap.c:3105 #10 0x00007fc944c733a1 in pk11_HandUnwrap (slot=0x7fc90014ee30, wrappingKey=9, mech=0x7fc9296a83f0, target=881, keyTemplate=0x7fc9296a8250, templateCount=3, key_size=0, wincx=0x0, crvp=0x7fc9296a8408, isPerm=0, inKey=<value optimized out>) at pk11skey.c:1960 #11 0x00007fc944c73997 in pk11_AnyUnwrapKey (slot=0x7fc90014ee30, wrappingKey=9, wrapType=1, param=<value optimized out>, wrappedKey=0x7fc9296a85e0, target=881, operation=268, keySize=0, wincx=0x0, userAttr=<value optimized out>, numAttrs=0, isPerm=0) at pk11skey.c:2104 #12 0x00007fc944c769fc in PK11_PubUnwrapSymKey (wrappingKey=0x7fc900181ad0, wrappedKey=0x7fc9296a85e0, target=881, operation=268, keySize=0) at pk11skey.c:2232 #13 0x00007fc94519d05c in ssl3_HandleRSAClientKeyExchange (ss=0x7fc90038ab50, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7358 #14 ssl3_HandleClientKeyExchange (ss=0x7fc90038ab50, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7474 #15 ssl3_HandleHandshakeMessage (ss=0x7fc90038ab50, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:8646 #16 0x00007fc94519fa2c in ssl3_HandleHandshake (ss=0x7fc90038ab50, cText=<value optimized out>, databuf=0x7fc90038aeb8) at ssl3con.c:8725 #17 ssl3_HandleRecord (ss=0x7fc90038ab50, cText=<value optimized out>, databuf=0x7fc90038aeb8) at ssl3con.c:9064 #18 0x00007fc9451a0ae6 in ssl3_GatherCompleteHandshake (ss=0x7fc90038ab50, flags=0) at ssl3gthr.c:209 #19 0x00007fc9451a93af in SSL_ForceHandshake (fd=<value optimized out>) at sslsecur.c:397 #20 0x00007fc94634fa15 in tlsm_session_accept_or_connect (session=<value optimized out>, is_accept=1) at tls_m.c:2341 #21 0x00007fc94634d877 in ldap_pvt_tls_accept (sb=0x7fc8f4196490, ctx_arg=0x7fc948242e90) at tls2.c:421 #22 0x00007fc9462247b3 in connection_read (ctx=0x7fc9296a8b80, argv=0x38) at ../../../servers/slapd/connection.c:1355 #23 connection_read_thread (ctx=0x7fc9296a8b80, argv=0x38) at ../../../servers/slapd/connection.c:1267 #24 0x00007fc94632a108 in ldap_int_thread_pool_wrapper (xpool=0x7fc9481cb830) at ../../../libraries/libldap_r/tpool.c:685 #25 0x00007fc9441abccb in start_thread (arg=0x7fc9296a9700) at pthread_create.c:301 #26 0x00007fc943ee9c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
I have still not been able to reproduce this crash. But testing with syncrepl over TLS/SSL has revealed that NSS_Init* functions are not thread-safe, and I was able to crash the server quite easily. see https://bugzilla.redhat.com/show_bug.cgi?id=731112 I don't know if you are able to build from source and test, but if so, would it be possible for you to attempt to reproduce your crash with the patch posted in the ITS linked to the bug?
I will prepare scratch build with mentioned patch applied.
Scratch build is done: http://koji.fedoraproject.org/koji/taskinfo?taskID=3279353 In case the build got deleted: http://jvcelak.fedorapeople.org/openldap-701678/ If you need a build for later Fedora, let me know.
Hi Jan, thanks for the build. We've downgraded our ldap servers. Hopefully they keep running, but in any case we'll report in a few days.
I can reliably reproduce this every single time under RHEL 6.1. Do you want me to open a separate ticket under that product? And separately, do you want me to provide any additional information or does everyone feel the non-thread safe calls are the problem? For what it's worth, I'm using a wild card certificate from GoDaddy. The problem happens immediately every single time I start a consumer using syncrepl. The provider dies immediately as soon as the consumer tries to initiate a connection. I'm still using slapd.conf and my TLS options look like: --- TLSCACertificateFile /etc/pki/tls/certs/ca-bundle.crt TLSCertificateFile /etc/pki/tls/certs/wild_intermediate.crt TLSCertificateKeyFile /etc/pki/tls/private/wild_intermediate.key TLSVerifyClient never The main certificate file is the wild card certificate and the one intermediate certificate needed to verify trust back up to the GoDaddy CA certificate.
(In reply to comment #29) > I can reliably reproduce this every single time under RHEL 6.1. Do you want me > to open a separate ticket under that product? Yes, please. We need a reliable reproducer (or steps to reproduce). > And separately, do you want me to provide any additional information or does > everyone feel the non-thread safe calls are the problem? I don't know if I understand correctly. You were able to reproduce it on F14 as well? And did the crashes disappear after installing the scratch build from comment #27? In that case, it should be fixed in F15 (openldap-2.4.24-4.fc15) and F16 (openldap-2.4.26-2.fc16). And I will update F14 as well because this seems quite critical.
(In reply to comment #30) > (In reply to comment #29) > > I can reliably reproduce this every single time under RHEL 6.1. Do you want me > > to open a separate ticket under that product? > > Yes, please. We need a reliable reproducer (or steps to reproduce). > > > And separately, do you want me to provide any additional information or does > > everyone feel the non-thread safe calls are the problem? > > I don't know if I understand correctly. You were able to reproduce it on F14 as > well? And did the crashes disappear after installing the scratch build from > comment #27? In that case, it should be fixed in F15 (openldap-2.4.24-4.fc15) > and F16 (openldap-2.4.26-2.fc16). And I will update F14 as well because this > seems quite critical. Your comment is equally confusing to me. Let's take it one at a time. Did you want me to open a separate ticket on this bug for the version of OpenLDAP in Red Hat Enterprise Linux 6.x (2.4.23-15.el6_1.1)? I have not tried any of this in Fedora. But I found this bug report after having almost the exact same issue in RHEL, and I assume they're related or the exact same problem. I have not tried any other version of OpenLDAP. I simply turned off TLS for the time being between my provider and consumers to avoid the problem. I will attach my provider and consumer slapd.conf files. If it's the same problem, you shouldn't even need any data in your databases. Just starting the consumers is enough to crash the provider, even when all of them are empty. Hopefully it won't be specific to my certificates either since that is harder to reproduce on your end. Give me a second to sanitize and attach my configuration files.
Created attachment 520145 [details] slapd.conf for provider
Created attachment 520146 [details] slapd.conf for consumer
Derp. You need to uncomment the two starttls lines in the consumer slapd.conf obviously to test this (and provide your own SSL certificates obviously).
(In reply to comment #31) > Did you want me to open a separate ticket on this bug for the version of > OpenLDAP in Red Hat Enterprise Linux 6.x (2.4.23-15.el6_1.1)? > > I have not tried any of this in Fedora. But I found this bug report after > having almost the exact same issue in RHEL, and I assume they're related or the > exact same problem. I have not tried any other version of OpenLDAP. I simply > turned off TLS for the time being between my provider and consumers to avoid > the problem. > > I will attach my provider and consumer slapd.conf files. If it's the same > problem, you shouldn't even need any data in your databases. Just starting the > consumers is enough to crash the provider, even when all of them are empty. > Hopefully it won't be specific to my certificates either since that is harder > to reproduce on your end. Give me a second to sanitize and attach my > configuration files. The steps to reproduce are almost the same as for bug #701678, where the cause is clear. And we have already fixed that in RHEL (bug #731168). So I don't need a separate ticket. This bug is about random crashes. And we would like to know, if the cause is the same. Mark, thank you for your patience. :-) Let's give Ruben some time to test.
It took a while for this to reoccur, but attached is a backtrace. This time on a slave. Loaded symbols for /lib64/libgcc_s-4.5.1-20100924.so.1 Core was generated by `/usr/sbin/slapd -h ldap:/// ldaps:/// -u ldap -4'. Program terminated with signal 6, Aborted. #0 0x00007f1a23a720c5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig); (gdb) thread apply all bt Thread 5 (Thread 0x7f1a0b720700 (LWP 14467)): #0 0xffffffffff600046 in ?? () #1 0x00007f1a002292b0 in ?? () #2 0x00007f1a23ab3a03 in _IO_str_init_static_internal (sf=0x0, ptr=0x7f1a0b71f7c0 "\360\367q\v\032\177", size=<value optimized out>, pstart=0x15220490 <Address 0x15220490 out of bounds>) at strops.c:50 #3 0x0000000000034ecc in ?? () #4 0x00007f1a25d4cad0 in ?? () #5 0x00007f1a001d23a0 in ?? () #6 0x00007f1a0b71f8c8 in ?? () #7 0x00007f1a23ada12d in time () at ../sysdeps/unix/sysv/linux/x86_64/time.S:36 #8 0x00007f1a23b1b020 in __vsyslog_chk (pri=<value optimized out>, flag=-10484695, fmt=0x7f1a25f8c4f8 "conn=%lu fd=%d TLS established tls_ssf=%u ssf=%u\n", ap=0x7f1a0b71f910) at ../misc/syslog.c:196 #9 0x00007f1a23b1b5a3 in __syslog_chk (pri=<value optimized out>, flag=<value optimized out>, fmt=<value optimized out>) at ../misc/syslog.c:131 #10 0x00007f1a25e565ba in syslog (ctx=0x7f1a0b71fb80, argv=<value optimized out>) at /usr/include/bits/syslog.h:32 #11 connection_read (ctx=0x7f1a0b71fb80, argv=<value optimized out>) at ../../../servers/slapd/connection.c:1360 #12 connection_read_thread (ctx=0x7f1a0b71fb80, argv=<value optimized out>) at ../../../servers/slapd/connection.c:1238 #13 0x00007f1a25f55218 in ldap_int_thread_pool_wrapper (xpool=0x7f1a26aa1760) at ../../../libraries/libldap_r/tpool.c:685 #14 0x00007f1a23de1ccb in start_thread (arg=0x7f1a0b720700) at pthread_create.c:301 #15 0x00007f1a23b1fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 4 (Thread 0x7f1a0bf21700 (LWP 14466)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007f1a25f55277 in ldap_int_thread_pool_wrapper (xpool=0x7f1a26aa1760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007f1a23de1ccb in start_thread (arg=0x7f1a0bf21700) at pthread_create.c:301 #3 0x00007f1a23b1fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 3 (Thread 0x7f1a0c722700 (LWP 14465)): #0 0x00007f1a23b20193 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82 #1 0x00007f1a25e512b3 in slapd_daemon_task (ptr=<value optimized out>) at ../../../servers/slapd/daemon.c:2467 #2 0x00007f1a23de1ccb in start_thread (arg=0x7f1a0c722700) at pthread_create.c:301 #3 0x00007f1a23b1fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 2 (Thread 0x7f1a25df8720 (LWP 14462)): #0 0x00007f1a23de2de5 in pthread_join (threadid=139749854684928, thread_return=0x0) at pthread_join.c:89 #1 0x00007f1a25e529fc in slapd_daemon () at ../../../servers/slapd/daemon.c:2842 #2 0x00007f1a25e3b251 in main (argc=6, argv=<value optimized out>) at ../../../servers/slapd/main.c:961 Thread 1 (Thread 0x7f19fbfff700 (LWP 16053)): #0 0x00007f1a23a720c5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #1 0x00007f1a23a73a76 in abort () at abort.c:92 #2 0x00007f1a23aaecfb in __libc_message (do_abort=2, fmt=0x7f1a23b9da78 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:186 #3 0x00007f1a23ab5d63 in malloc_printerr (av=<value optimized out>, p=0x7f1a001fdb40, have_lock=0) at malloc.c:6283 #4 _int_free (av=<value optimized out>, p=0x7f1a001fdb40, have_lock=0) at malloc.c:4795 #5 0x00007f1a0a8314a2 in nss_ZFreeIf (pointer=0x7f1a001fdb60) at arena.c:975 #6 0x00007f1a0a822574 in pem_getPrivateKey (arena=0x7f19f4119e00, rawkey=<value optimized out>, pError=0x7f19fbffe0d8, modulus=0x7f19fc17f838) at prsa.c:202 #7 0x00007f1a0a8226cd in pem_mdCryptoOperationRSAPriv_Create (proto=0x7f1a0aa4a140, mdMechanism=0x7f1a0aa4a200, pError=0x7f19fbffe0d8, mdKey=<value optimized out>) at prsa.c:361 #8 0x00007f1a0a82f72f in nssCKFWMechanism_DecryptInit (fwMechanism=0x7f19f4119dd0, pMechanism=0x7f19fbffe3f0, fwSession=0x7f19f4119420, fwObject=0x7f19fc156308) at mechanism.c:538 #9 0x00007f1a0a82c5d1 in NSSCKFWC_DecryptInit (fwInstance=<value optimized out>, hSession=<value optimized out>, pMechanism=0x7f19fbffe3f0, hKey=<value optimized out>) at wrap.c:3105 ---Type <return> to continue, or q <return> to quit--- #10 0x00007f1a248a93a1 in pk11_HandUnwrap (slot=0x7f19fc154d30, wrappingKey=12, mech=0x7f19fbffe3f0, target=881, keyTemplate=0x7f19fbffe250, templateCount=3, key_size=0, wincx=0x0, crvp=0x7f19fbffe408, isPerm=0, inKey=<value optimized out>) at pk11skey.c:1960 #11 0x00007f1a248a9997 in pk11_AnyUnwrapKey (slot=0x7f19fc154d30, wrappingKey=12, wrapType=1, param=<value optimized out>, wrappedKey=0x7f19fbffe5e0, target=881, operation=268, keySize=0, wincx=0x0, userAttr=<value optimized out>, numAttrs=0, isPerm=0) at pk11skey.c:2104 #12 0x00007f1a248ac9fc in PK11_PubUnwrapSymKey (wrappingKey=0x7f19fc1c3090, wrappedKey=0x7f19fbffe5e0, target=881, operation=268, keySize=0) at pk11skey.c:2232 #13 0x00007f1a24dd305c in ssl3_HandleRSAClientKeyExchange (ss=0x7f19f4115cc0, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7358 #14 ssl3_HandleClientKeyExchange (ss=0x7f19f4115cc0, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7474 #15 ssl3_HandleHandshakeMessage (ss=0x7f19f4115cc0, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:8646 #16 0x00007f1a24dd5a2c in ssl3_HandleHandshake (ss=0x7f19f4115cc0, cText=<value optimized out>, databuf=0x7f19f4116028) at ssl3con.c:8725 #17 ssl3_HandleRecord (ss=0x7f19f4115cc0, cText=<value optimized out>, databuf=0x7f19f4116028) at ssl3con.c:9064 #18 0x00007f1a24dd6ae6 in ssl3_GatherCompleteHandshake (ss=0x7f19f4115cc0, flags=0) at ssl3gthr.c:209 #19 0x00007f1a24ddf3af in SSL_ForceHandshake (fd=<value optimized out>) at sslsecur.c:397 #20 0x00007f1a25f7a025 in tlsm_session_accept_or_connect (session=<value optimized out>, is_accept=1) at tls_m.c:2434 #21 0x00007f1a25f78027 in ldap_pvt_tls_accept (sb=0x7f19fc2d4200, ctx_arg=0x7f1a26b06e70) at tls2.c:429 #22 0x00007f1a25e55d63 in connection_read (ctx=0x7f19fbffeb80, argv=0x22) at ../../../servers/slapd/connection.c:1326 #23 connection_read_thread (ctx=0x7f19fbffeb80, argv=0x22) at ../../../servers/slapd/connection.c:1238 #24 0x00007f1a25f55218 in ldap_int_thread_pool_wrapper (xpool=0x7f1a26aa1760) at ../../../libraries/libldap_r/tpool.c:685 #25 0x00007f1a23de1ccb in start_thread (arg=0x7f19fbfff700) at pthread_create.c:301 #26 0x00007f1a23b1fc2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
Almost always in the same place - #5 0x00007f1a0a8314a2 in nss_ZFreeIf (pointer=0x7f1a001fdb60) at arena.c:975 #6 0x00007f1a0a822574 in pem_getPrivateKey (arena=0x7f19f4119e00, rawkey=<value optimized out>, pError=0x7f19fbffe0d8, modulus=0x7f19fc17f838) at prsa.c:202 #7 0x00007f1a0a8226cd in pem_mdCryptoOperationRSAPriv_Create (proto=0x7f1a0aa4a140, mdMechanism=0x7f1a0aa4a200, pError=0x7f19fbffe0d8, mdKey=<value optimized out>) at prsa.c:361 #8 0x00007f1a0a82f72f in nssCKFWMechanism_DecryptInit (fwMechanism=0x7f19f4119dd0, pMechanism=0x7f19fbffe3f0, fwSession=0x7f19f4119420, fwObject=0x7f19fc156308) at mechanism.c:538 I'm inclined to believe this is a problem with moznss pem - looks like a double free. The section of code around 202 is quite strange. Ruben, best case - how long does it take to reproduce this problem? 1 day? several days? I'm pretty sure the problem that Mark ran into is https://bugzilla.redhat.com/show_bug.cgi?id=731168
And after further investigation, I don't even thing my configuration was actually kosher. pbind (ldap backend) and syncprov don't play very well together. I was trying to create a configuration where a primary LDAP server was feeding two slaves, and all three would be using pbind to proxy bind requests to an external server. That's the configuration I have attached here in this report. Unfortunately, this meant that syncprov wasn't propagating to the slaves from the DIT on the master which was using pbind as the ldap backend which pbind uses also turns off lastmod which creates the information used by syncprov. I'm not sure whether this is anything that can be worked around. But basically, I think it's a "bad" configuration in OpenLDAP. Now, having said that, it still seems like a TLS connection initiated from an, as yet, unauthenticated consumer shouldn't be able to crash the provider. But once I stopped using pbind on the provider entirely and switched to using only one instance of syncprov/syncrepl on each end, TLS does not crash the consumer any longer. Blah. Ignore my contributions here. :)
(In reply to comment #37) > Almost always in the same place - > > #5 0x00007f1a0a8314a2 in nss_ZFreeIf (pointer=0x7f1a001fdb60) at arena.c:975 > #6 0x00007f1a0a822574 in pem_getPrivateKey (arena=0x7f19f4119e00, > rawkey=<value optimized out>, pError=0x7f19fbffe0d8, modulus=0x7f19fc17f838) at > prsa.c:202 > #7 0x00007f1a0a8226cd in pem_mdCryptoOperationRSAPriv_Create > (proto=0x7f1a0aa4a140, mdMechanism=0x7f1a0aa4a200, pError=0x7f19fbffe0d8, > mdKey=<value optimized out>) at prsa.c:361 > #8 0x00007f1a0a82f72f in nssCKFWMechanism_DecryptInit > (fwMechanism=0x7f19f4119dd0, pMechanism=0x7f19fbffe3f0, > fwSession=0x7f19f4119420, fwObject=0x7f19fc156308) at mechanism.c:538 > > I'm inclined to believe this is a problem with moznss pem - looks like a double > free. The section of code around 202 is quite strange. > > Ruben, best case - how long does it take to reproduce this problem? 1 day? > several days? It depends, my master has been up for several days now. This particular slave crashed twice today, but due to some missing sysctl settings I've only been able catch a core of the last crash. Our ldap tree is fairly static and only has about 10 objects in it, so there's almost no load and no replication traffic. Anyway, I'll keep supplying you with more stackstraces...
Another clue here Thread 5 has just accepted an incoming TLS connection Thread 1 which has the segv is in the process of accepting an incoming TLS connection Might be some sort of thread safety issue here - perhaps thread 5 has just freed the private key modulus out from under thread 1. Do you get the crashes on both the syncrepl provider and consumer?
Just on the consumer at the time of that particular stacktrace. There's nothing strange in the logs, I've increased the loglevel on all servers to capture more information.
Here's another one, from the same consumer: Core was generated by `/usr/sbin/slapd -h ldap:/// ldaps:/// -u ldap -4'. Program terminated with signal 6, Aborted. #0 0x00007ff5303cc0c5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig); (gdb) thread apply all bt Thread 7 (Thread 0x7ff502ffe700 (LWP 9589)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007ff5328af277 in ldap_int_thread_pool_wrapper (xpool=0x7ff5342fe740) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007ff53073bccb in start_thread (arg=0x7ff502ffe700) at pthread_create.c:301 #3 0x00007ff530479c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 6 (Thread 0x7ff51907c700 (LWP 24955)): #0 0x00007ff53047a193 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82 #1 0x00007ff5327ab2b3 in slapd_daemon_task (ptr=<value optimized out>) at ../../../servers/slapd/daemon.c:2467 #2 0x00007ff53073bccb in start_thread (arg=0x7ff51907c700) at pthread_create.c:301 #3 0x00007ff530479c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 5 (Thread 0x7ff513fff700 (LWP 24957)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007ff5328af277 in ldap_int_thread_pool_wrapper (xpool=0x7ff5342fe740) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007ff53073bccb in start_thread (arg=0x7ff513fff700) at pthread_create.c:301 #3 0x00007ff530479c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 4 (Thread 0x7ff510b90700 (LWP 27857)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007ff5328af277 in ldap_int_thread_pool_wrapper (xpool=0x7ff5342fe740) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007ff53073bccb in start_thread (arg=0x7ff510b90700) at pthread_create.c:301 #3 0x00007ff530479c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 3 (Thread 0x7ff5017fc700 (LWP 24325)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007ff5328af277 in ldap_int_thread_pool_wrapper (xpool=0x7ff5342fe740) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007ff53073bccb in start_thread (arg=0x7ff5017fc700) at pthread_create.c:301 #3 0x00007ff530479c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 2 (Thread 0x7ff532752720 (LWP 24951)): #0 0x00007ff53073cde5 in pthread_join (threadid=140690663655168, thread_return=0x0) at pthread_join.c:89 #1 0x00007ff5327ac9fc in slapd_daemon () at ../../../servers/slapd/daemon.c:2842 #2 0x00007ff532795251 in main (argc=6, argv=<value optimized out>) at ../../../servers/slapd/main.c:961 Thread 1 (Thread 0x7ff51887b700 (LWP 24956)): #0 0x00007ff5303cc0c5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #1 0x00007ff5303cda76 in abort () at abort.c:92 #2 0x00007ff530408cfb in __libc_message (do_abort=2, fmt=0x7ff5304f7a78 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:186 #3 0x00007ff53040fd63 in malloc_printerr (av=<value optimized out>, p=0x7ff504236300, have_lock=0) at malloc.c:6283 #4 _int_free (av=<value optimized out>, p=0x7ff504236300, have_lock=0) at malloc.c:4795 #5 0x00007ff5131104a2 in nss_ZFreeIf (pointer=0x7ff504236320) at arena.c:975 #6 0x00007ff513101574 in pem_getPrivateKey (arena=0x7ff50c386e20, rawkey=<value optimized out>, pError=0x7ff51887a0d8, modulus=0x7ff50c17f838) at prsa.c:202 #7 0x00007ff5131016cd in pem_mdCryptoOperationRSAPriv_Create (proto=0x7ff513329140, mdMechanism=0x7ff513329200, pError=0x7ff51887a0d8, mdKey=<value optimized out>) at prsa.c:361 #8 0x00007ff51310e72f in nssCKFWMechanism_DecryptInit (fwMechanism=0x7ff50c386df0, pMechanism=0x7ff51887a3f0, fwSession=0x7ff50c3865a0, fwObject=0x7ff50c156308) at mechanism.c:538 #9 0x00007ff51310b5d1 in NSSCKFWC_DecryptInit (fwInstance=<value optimized out>, hSession=<value optimized out>, pMechanism=0x7ff51887a3f0, hKey=<value optimized out>) at wrap.c:3105 #10 0x00007ff5312033a1 in pk11_HandUnwrap (slot=0x7ff50c154d30, wrappingKey=12, mech=0x7ff51887a3f0, target=881, keyTemplate=0x7ff51887a250, templateCount=3, key_size=0, wincx=0x0, ---Type <return> to continue, or q <return> to quit--- crvp=0x7ff51887a408, isPerm=0, inKey=<value optimized out>) at pk11skey.c:1960 #11 0x00007ff531203997 in pk11_AnyUnwrapKey (slot=0x7ff50c154d30, wrappingKey=12, wrapType=1, param=<value optimized out>, wrappedKey=0x7ff51887a5e0, target=881, operation=268, keySize=0, wincx=0x0, userAttr=<value optimized out>, numAttrs=0, isPerm=0) at pk11skey.c:2104 #12 0x00007ff5312069fc in PK11_PubUnwrapSymKey (wrappingKey=0x7ff50c1c30a0, wrappedKey=0x7ff51887a5e0, target=881, operation=268, keySize=0) at pk11skey.c:2232 #13 0x00007ff53172d05c in ssl3_HandleRSAClientKeyExchange (ss=0x7ff5042d10b0, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7358 #14 ssl3_HandleClientKeyExchange (ss=0x7ff5042d10b0, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7474 #15 ssl3_HandleHandshakeMessage (ss=0x7ff5042d10b0, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:8646 #16 0x00007ff53172fa2c in ssl3_HandleHandshake (ss=0x7ff5042d10b0, cText=<value optimized out>, databuf=0x7ff5042d1418) at ssl3con.c:8725 #17 ssl3_HandleRecord (ss=0x7ff5042d10b0, cText=<value optimized out>, databuf=0x7ff5042d1418) at ssl3con.c:9064 #18 0x00007ff531730ae6 in ssl3_GatherCompleteHandshake (ss=0x7ff5042d10b0, flags=0) at ssl3gthr.c:209 #19 0x00007ff5317393af in SSL_ForceHandshake (fd=<value optimized out>) at sslsecur.c:397 #20 0x00007ff5328d4025 in tlsm_session_accept_or_connect (session=<value optimized out>, is_accept=1) at tls_m.c:2434 #21 0x00007ff5328d2027 in ldap_pvt_tls_accept (sb=0x7ff5080017e0, ctx_arg=0x7ff534363cd0) at tls2.c:429 #22 0x00007ff5327afd63 in connection_read (ctx=0x7ff51887ab80, argv=0x21) at ../../../servers/slapd/connection.c:1326 #23 connection_read_thread (ctx=0x7ff51887ab80, argv=0x21) at ../../../servers/slapd/connection.c:1238 #24 0x00007ff5328af218 in ldap_int_thread_pool_wrapper (xpool=0x7ff5342fe740) at ../../../libraries/libldap_r/tpool.c:685 #25 0x00007ff53073bccb in start_thread (arg=0x7ff51887b700) at pthread_create.c:301 #26 0x00007ff530479c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
(In reply to comment #42) > Here's another one, from the same consumer: > snip This one doesn't have another thread which has just accepted a TLS connection. I suppose you could try increasing or decreasing the number of threads that openldap uses. If it is a race condition, decreasing the number of threads may make the problem harder to reproduce, and conversely for increasing the number of threads.
Ok, I'll try to do that. Meanwhile, here's another trace, from the master: Core was generated by `/usr/sbin/slapd -h ldap:/// ldaps:/// -u ldap -4'. Program terminated with signal 6, Aborted. #0 0x00007f19dc7210c5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig); (gdb) thread all apply bt No symbol "all" in current context. (gdb) thread apply all bt Thread 8 (Thread 0x7f19ae305700 (LWP 11681)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007f19dec04277 in ldap_int_thread_pool_wrapper (xpool=0x7f19df43d760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007f19dca90ccb in start_thread (arg=0x7f19ae305700) at pthread_create.c:301 #3 0x00007f19dc7cec2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 7 (Thread 0x7f19aeb06700 (LWP 11680)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007f19dec04277 in ldap_int_thread_pool_wrapper (xpool=0x7f19df43d760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007f19dca90ccb in start_thread (arg=0x7f19aeb06700) at pthread_create.c:301 #3 0x00007f19dc7cec2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 6 (Thread 0x7f19a4ffc700 (LWP 16613)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007f19dec04277 in ldap_int_thread_pool_wrapper (xpool=0x7f19df43d760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007f19dca90ccb in start_thread (arg=0x7f19a4ffc700) at pthread_create.c:301 #3 0x00007f19dc7cec2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 5 (Thread 0x7f19af307700 (LWP 11679)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007f19dec04277 in ldap_int_thread_pool_wrapper (xpool=0x7f19df43d760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007f19dca90ccb in start_thread (arg=0x7f19af307700) at pthread_create.c:301 #3 0x00007f19dc7cec2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 4 (Thread 0x7f19afb08700 (LWP 11678)): #0 0x00007f19dc7cf193 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82 #1 0x00007f19deb002b3 in slapd_daemon_task (ptr=<value optimized out>) at ../../../servers/slapd/daemon.c:2467 #2 0x00007f19dca90ccb in start_thread (arg=0x7f19afb08700) at pthread_create.c:301 #3 0x00007f19dc7cec2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 3 (Thread 0x7f19deaa6720 (LWP 11674)): #0 0x00007f19dca91de5 in pthread_join (threadid=139748298491648, thread_return=0x0) at pthread_join.c:89 #1 0x00007f19deb019fc in slapd_daemon () at ../../../servers/slapd/daemon.c:2842 #2 0x00007f19deaea251 in main (argc=6, argv=<value optimized out>) at ../../../servers/slapd/main.c:961 Thread 2 (Thread 0x7f1991ffd700 (LWP 26835)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007f19dec04277 in ldap_int_thread_pool_wrapper (xpool=0x7f19df43d760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007f19dca90ccb in start_thread (arg=0x7f1991ffd700) at pthread_create.c:301 #3 0x00007f19dc7cec2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 1 (Thread 0x7f19ace98700 (LWP 3007)): #0 0x00007f19dc7210c5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #1 0x00007f19dc722a76 in abort () at abort.c:92 #2 0x00007f19dc75dcfb in __libc_message (do_abort=2, fmt=0x7f19dc84ca78 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:186 #3 0x00007f19dc764d63 in malloc_printerr (av=<value optimized out>, p=0x7f1994379a80, have_lock=0) at malloc.c:6283 #4 _int_free (av=<value optimized out>, p=0x7f1994379a80, have_lock=0) at malloc.c:4795 #5 0x00007f19ad4164a2 in nss_ZFreeIf (pointer=0x7f1994379aa0) at arena.c:975 #6 0x00007f19ad407574 in pem_getPrivateKey (arena=0x7f198c201170, rawkey=<value optimized out>, pError=0x7f19ace970d8, modulus=0x7f19940770b8) at prsa.c:202 #7 0x00007f19ad4076cd in pem_mdCryptoOperationRSAPriv_Create (proto=0x7f19ad62f140, mdMechanism=0x7f19ad62f200, pError=0x7f19ace970d8, mdKey=<value optimized out>) at prsa.c:361 #8 0x00007f19ad41472f in nssCKFWMechanism_DecryptInit (fwMechanism=0x7f198c201140, pMechanism=0x7f19ace973f0, fwSession=0x7f1994402350, fwObject=0x7f199404db88) at mechanism.c:538 #9 0x00007f19ad4115d1 in NSSCKFWC_DecryptInit (fwInstance=<value optimized out>, hSession=<value optimized out>, pMechanism=0x7f19ace973f0, hKey=<value optimized out>) at wrap.c:3105 #10 0x00007f19dd5583a1 in pk11_HandUnwrap (slot=0x7f199404c5b0, wrappingKey=9, mech=0x7f19ace973f0, target=881, keyTemplate=0x7f19ace97250, templateCount=3, key_size=0, wincx=0x0, crvp=0x7f19ace97408, isPerm=0, inKey=<value optimized out>) at pk11skey.c:1960 #11 0x00007f19dd558997 in pk11_AnyUnwrapKey (slot=0x7f199404c5b0, wrappingKey=9, wrapType=1, param=<value optimized out>, wrappedKey=0x7f19ace975e0, target=881, operation=268, keySize=0, wincx=0x0, userAttr=<value optimized out>, numAttrs=0, isPerm=0) at pk11skey.c:2104 #12 0x00007f19dd55b9fc in PK11_PubUnwrapSymKey (wrappingKey=0x7f199407f250, wrappedKey=0x7f19ace975e0, target=881, operation=268, keySize=0) at pk11skey.c:2232 #13 0x00007f19dda8205c in ssl3_HandleRSAClientKeyExchange (ss=0x7f19a03d9470, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7358 #14 ssl3_HandleClientKeyExchange (ss=0x7f19a03d9470, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7474 #15 ssl3_HandleHandshakeMessage (ss=0x7f19a03d9470, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:8646 #16 0x00007f19dda84a2c in ssl3_HandleHandshake (ss=0x7f19a03d9470, cText=<value optimized out>, databuf=0x7f19a03d97d8) at ssl3con.c:8725 #17 ssl3_HandleRecord (ss=0x7f19a03d9470, cText=<value optimized out>, databuf=0x7f19a03d97d8) at ssl3con.c:9064 #18 0x00007f19dda85ae6 in ssl3_GatherCompleteHandshake (ss=0x7f19a03d9470, flags=0) at ssl3gthr.c:209 #19 0x00007f19dda8e3af in SSL_ForceHandshake (fd=<value optimized out>) at sslsecur.c:397 #20 0x00007f19dec29025 in tlsm_session_accept_or_connect (session=<value optimized out>, is_accept=1) at tls_m.c:2434 #21 0x00007f19dec27027 in ldap_pvt_tls_accept (sb=0x7f198c1c7de0, ctx_arg=0x7f19df4afa10) at tls2.c:429 ---Type <return> to continue, or q <return> to quit--- #22 0x00007f19deb04d63 in connection_read (ctx=0x7f19ace97b80, argv=0x3f) at ../../../servers/slapd/connection.c:1326 #23 connection_read_thread (ctx=0x7f19ace97b80, argv=0x3f) at ../../../servers/slapd/connection.c:1238 #24 0x00007f19dec04218 in ldap_int_thread_pool_wrapper (xpool=0x7f19df43d760) at ../../../libraries/libldap_r/tpool.c:685 #25 0x00007f19dca90ccb in start_thread (arg=0x7f19ace98700) at pthread_create.c:301 #26 0x00007f19dc7cec2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 (gdb)
Wow, that was easy. I got the consumer to crash in a few minutes by running: while true; do ldapsearch -xZZ -H ldap://localhost uid=ruben >/dev/null || exit 1; done & about ten times in parallel in the background. Here's the trace: Thread 20 (Thread 0x7fc886ffe700 (LWP 25932)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fc8b13fa277 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007fc8af286ccb in start_thread (arg=0x7fc886ffe700) at pthread_create.c:301 #3 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 19 (Thread 0x7fc871ffc700 (LWP 2377)): #0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:97 #1 0x00007fc8aefc07b4 in _L_lock_505 () from /lib64/libc-2.13.so #2 0x00007fc8aefc01aa in __vsyslog_chk (pri=<value optimized out>, flag=1, fmt=0x7fc8b14315a8 "ber_get_next on fd %d failed errno=%d (%s)\n", ap=0x7fc871ffb910) at ../misc/syslog.c:260 #3 0x00007fc8aefc05a3 in __syslog_chk (pri=<value optimized out>, flag=<value optimized out>, fmt=<value optimized out>) at ../misc/syslog.c:131 #4 0x00007fc8b12fa9e9 in syslog (ctx=0x7fc871ffbb80, argv=<value optimized out>) at /usr/include/bits/syslog.h:32 #5 connection_input (ctx=0x7fc871ffbb80, argv=<value optimized out>) at ../../../servers/slapd/connection.c:1493 #6 connection_read (ctx=0x7fc871ffbb80, argv=<value optimized out>) at ../../../servers/slapd/connection.c:1414 #7 connection_read_thread (ctx=0x7fc871ffbb80, argv=<value optimized out>) at ../../../servers/slapd/connection.c:1238 #8 0x00007fc8b13fa218 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:685 #9 0x00007fc8af286ccb in start_thread (arg=0x7fc871ffc700) at pthread_create.c:301 #10 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 18 (Thread 0x7fc8517fc700 (LWP 2531)): #0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:97 #1 0x00007fc8aefc07b4 in _L_lock_505 () from /lib64/libc-2.13.so #2 0x00007fc8aefc01aa in __vsyslog_chk (pri=<value optimized out>, flag=1, fmt=0x7fc8b1430d60 "connection_get(%d): got connid=%lu\n", ap=0x7fc8517fb8e0) at ../misc/syslog.c:260 #3 0x00007fc8aefc05a3 in __syslog_chk (pri=<value optimized out>, flag=<value optimized out>, fmt=<value optimized out>) at ../misc/syslog.c:131 #4 0x00007fc8b12f7e05 in syslog (s=37) at /usr/include/bits/syslog.h:32 #5 connection_get (s=37) at ../../../servers/slapd/connection.c:286 #6 0x00007fc8b12fa685 in connection_read (ctx=0x7fc8517fbb80, argv=0x25) at ../../../servers/slapd/connection.c:1292 #7 connection_read_thread (ctx=0x7fc8517fbb80, argv=0x25) at ../../../servers/slapd/connection.c:1238 #8 0x00007fc8b13fa218 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:685 #9 0x00007fc8af286ccb in start_thread (arg=0x7fc8517fc700) at pthread_create.c:301 #10 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 17 (Thread 0x7fc8427fd700 (LWP 2551)): #0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:97 #1 0x00007fc8aefc07b4 in _L_lock_505 () from /lib64/libc-2.13.so #2 0x00007fc8aefc01aa in __vsyslog_chk (pri=<value optimized out>, flag=1, fmt=0x7fc8b142fe07 ">>> slap_listener(%s)\n", ap=0x7fc8427fb4e0) at ../misc/syslog.c:260 #3 0x00007fc8aefc05a3 in __syslog_chk (pri=<value optimized out>, flag=<value optimized out>, fmt=<value optimized out>) at ../misc/syslog.c:131 #4 0x00007fc8b12f3cae in syslog (sl=0x7fc8b305b240) at /usr/include/bits/syslog.h:32 #5 slap_listener (sl=0x7fc8b305b240) at ../../../servers/slapd/daemon.c:1781 #6 0x00007fc8b12f44e9 in slap_listener_thread (ctx=<value optimized out>, ptr=0x7fc8b305b240) at ../../../servers/slapd/daemon.c:2040 #7 0x00007fc8b13fa218 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:685 #8 0x00007fc8af286ccb in start_thread (arg=0x7fc8427fd700) at pthread_create.c:301 #9 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 16 (Thread 0x7fc897bc7700 (LWP 24122)): #0 0x00007fc8aefc5193 in epoll_wait () at ../sysdeps/unix/syscall-template.S:82 #1 0x00007fc8b12f62b3 in slapd_daemon_task (ptr=<value optimized out>) at ../../../servers/slapd/daemon.c:2467 #2 0x00007fc8af286ccb in start_thread (arg=0x7fc897bc7700) at pthread_create.c:301 #3 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 15 (Thread 0x7fc87dffd700 (LWP 30558)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fc8b13fa277 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007fc8af286ccb in start_thread (arg=0x7fc87dffd700) at pthread_create.c:301 #3 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 14 (Thread 0x7fc896bc5700 (LWP 24124)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fc8b13fa277 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007fc8af286ccb in start_thread (arg=0x7fc896bc5700) at pthread_create.c:301 #3 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 13 (Thread 0x7fc8717fb700 (LWP 2378)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fc8b13fa277 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007fc8af286ccb in start_thread (arg=0x7fc8717fb700) at pthread_create.c:301 #3 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 12 (Thread 0x7fc870ffa700 (LWP 2384)): #0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:97 #1 0x00007fc8aefc07b4 in _L_lock_505 () from /lib64/libc-2.13.so #2 0x00007fc8aefc01aa in __vsyslog_chk (pri=<value optimized out>, flag=1, fmt=0x7fc8b14339a0 "send_ldap_extended: err=%d oid=%s len=%ld\n", ap=0x7fc870ff9770) at ../misc/syslog.c:260 #3 0x00007fc8aefc05a3 in __syslog_chk (pri=<value optimized out>, flag=<value optimized out>, fmt=<value optimized out>) at ../misc/syslog.c:131 #4 0x00007fc8b130b9ba in syslog (op=0x7fc8580181d0, rs=0x7fc870ff9930) at /usr/include/bits/syslog.h:32 #5 slap_send_ldap_extended (op=0x7fc8580181d0, rs=0x7fc870ff9930) at ../../../servers/slapd/result.c:793 #6 0x00007fc8b132c809 in fe_extended (op=0x7fc8580181d0, rs=0x7fc870ff9930) at ../../../servers/slapd/extended.c:240 #7 0x00007fc8b132c572 in do_extended (op=0x7fc8580181d0, rs=0x7fc870ff9930) at ../../../servers/slapd/extended.c:180 #8 0x00007fc8b12fa48f in connection_operation (ctx=0x7fc870ff9b80, arg_v=0x7fc8580181d0) at ../../../servers/slapd/connection.c:1109 #9 0x00007fc8b12fadff in connection_read_thread (ctx=0x7fc870ff9b80, argv=<value optimized out>) at ../../../servers/slapd/connection.c:1245 #10 0x00007fc8b13fa218 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:685 #11 0x00007fc8af286ccb in start_thread (arg=0x7fc870ffa700) at pthread_create.c:301 #12 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 11 (Thread 0x7fc842ffe700 (LWP 2550)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fc8b13fa277 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007fc8af286ccb in start_thread (arg=0x7fc842ffe700) at pthread_create.c:301 #3 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 10 (Thread 0x7fc8b129d720 (LWP 24118)): #0 0x00007fc8af287de5 in pthread_join (threadid=140499515897600, thread_return=0x0) at pthread_join.c:89 #1 0x00007fc8b12f79fc in slapd_daemon () at ../../../servers/slapd/daemon.c:2842 #2 0x00007fc8b12e0251 in main (argc=6, argv=<value optimized out>) at ../../../servers/slapd/main.c:961 Thread 9 (Thread 0x7fc850ffb700 (LWP 2549)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fc8b13fa277 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007fc8af286ccb in start_thread (arg=0x7fc850ffb700) at pthread_create.c:301 #3 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 8 (Thread 0x7fc851ffd700 (LWP 2530)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fc8b13fa277 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007fc8af286ccb in start_thread (arg=0x7fc851ffd700) at pthread_create.c:301 #3 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 7 (Thread 0x7fc83dff8700 (LWP 3777)): #0 0x00007fc8af28df1d in write () at ../sysdeps/unix/syscall-template.S:82 #1 0x00007fc8b142892c in sb_debug_write (sbiod=0x7fc854012f40, buf=0x7fc84c255fe0, len=59) at ../../../libraries/liblber/sockbuf.c:852 #2 0x00007fc8b141f1ed in tlsm_PR_Send (fd=<value optimized out>, buf=<value optimized out>, len=59, flags=<value optimized out>, timeout=<value optimized out>) at tls_m.c:2819 #3 0x00007fc8b02807fb in ssl_DefSend (ss=0x7fc8540008c0, buf=0x7fc84c255fe0 "\024\003\001", len=59, flags=0) at ssldef.c:127 #4 0x00007fc8b02845fe in ssl_SendSavedWriteData (ss=0x7fc8540008c0) at sslsecur.c:508 #5 0x00007fc8b027130a in ssl3_SendRecord (ss=0x7fc8540008c0, type=content_handshake, pIn=<value optimized out>, nIn=<value optimized out>, flags=0) at ssl3con.c:2268 #6 0x00007fc8b027141e in ssl3_FlushHandshake (ss=0x7fc8540008c0, flags=<value optimized out>) at ssl3con.c:2418 #7 0x00007fc8b02748e2 in ssl3_SendFinished (ss=0x7fc8540008c0, flags=0) at ssl3con.c:8226 #8 0x00007fc8b0276ef2 in ssl3_HandleFinished (ss=0x7fc8540008c0, b=<value optimized out>, length=<value optimized out>, hashes=<value optimized out>) at ssl3con.c:8422 #9 0x00007fc8b02772f1 in ssl3_HandleHandshakeMessage (ss=0x7fc8540008c0, b=0x7fc854131504 "\304d\316\177.'\362\240\066\275\330Y\327\b\217\252w\022\261闅\022\321\064Y\360\237\326 \344\002", '\v' <repeats 12 times>"\355, xI\357\071\237\377\303\322\324N\240\035\t\217#\225c\314\301\260\357\362>쳓\264\243\021da\250\324Bg\335\311\375\364c\036\306?K\305\064\246o\335zi\326\330\335;\027\247Q\255\336\270\002\360\230i<ug9\302M\257\204\030\325\377\257\033\371R¨ɓ", <incomplete sequence \346>, length=12) at ssl3con.c:8657 #10 0x00007fc8b027aa2c in ssl3_HandleHandshake (ss=0x7fc8540008c0, cText=<value optimized out>, databuf=0x7fc854000c28) at ssl3con.c:8725 #11 ssl3_HandleRecord (ss=0x7fc8540008c0, cText=<value optimized out>, databuf=0x7fc854000c28) at ssl3con.c:9064 #12 0x00007fc8b027bae6 in ssl3_GatherCompleteHandshake (ss=0x7fc8540008c0, flags=0) at ssl3gthr.c:209 #13 0x00007fc8b02843af in SSL_ForceHandshake (fd=<value optimized out>) at sslsecur.c:397 #14 0x00007fc8b141f025 in tlsm_session_accept_or_connect (session=<value optimized out>, is_accept=1) at tls_m.c:2434 #15 0x00007fc8b141d027 in ldap_pvt_tls_accept (sb=0x7fc88819b570, ctx_arg=0x7fc8b30f0d00) at tls2.c:429 #16 0x00007fc8b12fad63 in connection_read (ctx=0x7fc83dff7b80, argv=0x24) at ../../../servers/slapd/connection.c:1326 #17 connection_read_thread (ctx=0x7fc83dff7b80, argv=0x24) at ../../../servers/slapd/connection.c:1238 #18 0x00007fc8b13fa218 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:685 #19 0x00007fc8af286ccb in start_thread (arg=0x7fc83dff8700) at pthread_create.c:301 #20 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 6 (Thread 0x7fc8973c6700 (LWP 24123)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fc8b13fa277 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007fc8af286ccb in start_thread (arg=0x7fc8973c6700) at pthread_create.c:301 #3 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 5 (Thread 0x7fc85cffc700 (LWP 2491)): #0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:97 #1 0x00007fc8aefc07b4 in _L_lock_505 () from /lib64/libc-2.13.so #2 0x00007fc8aefc01aa in __vsyslog_chk (pri=<value optimized out>, flag=1, fmt=0x7fc8b1430d60 "connection_get(%d): got connid=%lu\n", ap=0x7fc85cffb8e0) at ../misc/syslog.c:260 #3 0x00007fc8aefc05a3 in __syslog_chk (pri=<value optimized out>, flag=<value optimized out>, fmt=<value optimized out>) at ../misc/syslog.c:131 #4 0x00007fc8b12f7e05 in syslog (s=30) at /usr/include/bits/syslog.h:32 #5 connection_get (s=30) at ../../../servers/slapd/connection.c:286 #6 0x00007fc8b12fa685 in connection_read (ctx=0x7fc85cffbb80, argv=0x1e) at ../../../servers/slapd/connection.c:1292 #7 connection_read_thread (ctx=0x7fc85cffbb80, argv=0x1e) at ../../../servers/slapd/connection.c:1238 #8 0x00007fc8b13fa218 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:685 #9 0x00007fc8af286ccb in start_thread (arg=0x7fc85cffc700) at pthread_create.c:301 #10 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 4 (Thread 0x7fc8737fe700 (LWP 2369)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fc8b13fa277 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007fc8af286ccb in start_thread (arg=0x7fc8737fe700) at pthread_create.c:301 #3 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 3 (Thread 0x7fc873fff700 (LWP 2368)): #0 __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:97 #1 0x00007fc8aefc07b4 in _L_lock_505 () from /lib64/libc-2.13.so #2 0x00007fc8aefc01aa in __vsyslog_chk (pri=<value optimized out>, flag=1, fmt=0x7fc8b1430d60 "connection_get(%d): got connid=%lu\n", ap=0x7fc873ffe8e0) at ../misc/syslog.c:260 #3 0x00007fc8aefc05a3 in __syslog_chk (pri=<value optimized out>, flag=<value optimized out>, fmt=<value optimized out>) at ../misc/syslog.c:131 #4 0x00007fc8b12f7e05 in syslog (s=35) at /usr/include/bits/syslog.h:32 #5 connection_get (s=35) at ../../../servers/slapd/connection.c:286 #6 0x00007fc8b12fa685 in connection_read (ctx=0x7fc873ffeb80, argv=0x23) at ../../../servers/slapd/connection.c:1292 #7 connection_read_thread (ctx=0x7fc873ffeb80, argv=0x23) at ../../../servers/slapd/connection.c:1238 #8 0x00007fc8b13fa218 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:685 #9 0x00007fc8af286ccb in start_thread (arg=0x7fc873fff700) at pthread_create.c:301 #10 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 2 (Thread 0x7fc884ffb700 (LWP 28529)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x00007fc8b13fa277 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:672 #2 0x00007fc8af286ccb in start_thread (arg=0x7fc884ffb700) at pthread_create.c:301 #3 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115 Thread 1 (Thread 0x7fc8857fc700 (LWP 27301)): #0 0x00007fc8aef170c5 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 #1 0x00007fc8aef18a76 in abort () at abort.c:92 #2 0x00007fc8aef53cfb in __libc_message (do_abort=2, fmt=0x7fc8af042a78 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:186 #3 0x00007fc8aef5ad63 in malloc_printerr (av=<value optimized out>, p=0x7fc868012030, have_lock=0) at malloc.c:6283 #4 _int_free (av=<value optimized out>, p=0x7fc868012030, have_lock=0) at malloc.c:4795 #5 0x00007fc895cd64a2 in nss_ZFreeIf (pointer=0x7fc868012050) at arena.c:975 #6 0x00007fc895cc7574 in pem_getPrivateKey (arena=0x7fc87814c0d0, rawkey=<value optimized out>, pError=0x7fc8857fb0d8, modulus=0x7fc88817f838) at prsa.c:202 #7 0x00007fc895cc76cd in pem_mdCryptoOperationRSAPriv_Create (proto=0x7fc895eef140, mdMechanism=0x7fc895eef200, pError=0x7fc8857fb0d8, mdKey=<value optimized out>) at prsa.c:361 #8 0x00007fc895cd472f in nssCKFWMechanism_DecryptInit (fwMechanism=0x7fc87814c0a0, pMechanism=0x7fc8857fb3f0, fwSession=0x7fc878134e70, fwObject=0x7fc888156308) at mechanism.c:538 #9 0x00007fc895cd15d1 in NSSCKFWC_DecryptInit (fwInstance=<value optimized out>, hSession=<value optimized out>, pMechanism=0x7fc8857fb3f0, hKey=<value optimized out>) at wrap.c:3105 #10 0x00007fc8afd4e3a1 in pk11_HandUnwrap (slot=0x7fc888154d30, wrappingKey=12, mech=0x7fc8857fb3f0, target=881, keyTemplate=0x7fc8857fb250, templateCount=3, key_size=0, wincx=0x0, crvp=0x7fc8857fb408, isPerm=0, inKey=<value optimized out>) at pk11skey.c:1960 #11 0x00007fc8afd4e997 in pk11_AnyUnwrapKey (slot=0x7fc888154d30, wrappingKey=12, wrapType=1, param=<value optimized out>, wrappedKey=0x7fc8857fb5e0, target=881, operation=268, keySize=0, wincx=0x0, userAttr=<value optimized out>, numAttrs=0, isPerm=0) at pk11skey.c:2104 #12 0x00007fc8afd519fc in PK11_PubUnwrapSymKey (wrappingKey=0x7fc8881c30a0, wrappedKey=0x7fc8857fb5e0, target=881, operation=268, keySize=0) at pk11skey.c:2232 #13 0x00007fc8b027805c in ssl3_HandleRSAClientKeyExchange (ss=0x7fc88c258b90, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7358 #14 ssl3_HandleClientKeyExchange (ss=0x7fc88c258b90, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:7474 #15 ssl3_HandleHandshakeMessage (ss=0x7fc88c258b90, b=<value optimized out>, length=<value optimized out>) at ssl3con.c:8646 #16 0x00007fc8b027aa2c in ssl3_HandleHandshake (ss=0x7fc88c258b90, cText=<value optimized out>, databuf=0x7fc88c258ef8) at ssl3con.c:8725 #17 ssl3_HandleRecord (ss=0x7fc88c258b90, cText=<value optimized out>, databuf=0x7fc88c258ef8) at ssl3con.c:9064 #18 0x00007fc8b027bae6 in ssl3_GatherCompleteHandshake (ss=0x7fc88c258b90, flags=0) at ssl3gthr.c:209 #19 0x00007fc8b02843af in SSL_ForceHandshake (fd=<value optimized out>) at sslsecur.c:397 #20 0x00007fc8b141f025 in tlsm_session_accept_or_connect (session=<value optimized out>, is_accept=1) at tls_m.c:2434 #21 0x00007fc8b141d027 in ldap_pvt_tls_accept (sb=0x7fc874125640, ctx_arg=0x7fc8b30f0d00) at tls2.c:429 #22 0x00007fc8b12fad63 in connection_read (ctx=0x7fc8857fbb80, argv=0x21) at ../../../servers/slapd/connection.c:1326 #23 connection_read_thread (ctx=0x7fc8857fbb80, argv=0x21) at ../../../servers/slapd/connection.c:1238 #24 0x00007fc8b13fa218 in ldap_int_thread_pool_wrapper (xpool=0x7fc8b308b760) at ../../../libraries/libldap_r/tpool.c:685 #25 0x00007fc8af286ccb in start_thread (arg=0x7fc8857fc700) at pthread_create.c:301 #26 0x00007fc8aefc4c2d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
Yes! Was able to reproduce the exact same crash 1) using openldap master branch (openldap requires patches to be against master branch) 2) using syncrepl with TLS/SSL 3) while syncrepl is occurring, do ii=0 ; while [ $ii -lt 100 ] ; do LDAPTLS_CACERT=~/save/CA/cacert.pem ldapsearch -xLLL -ZZ -H "ldap://localhost.localdomain:2389" -D "cn=manager,dc=example,dc=com" -w secret -b dc=example,dc=com uid=scarter > /dev/null 2>&1 & ii=`expr $ii + 1` ; done against the syncrepl consumer Thanks for the stack traces, Ruben - I don't think we will need any more - they have been most helpful Stay tuned
Created attachment 521722 [details] stack trace from crash
Created attachment 521723 [details] stack trace of just the two problematic threads
Created attachment 521742 [details] 0001-use-mutex-for-connection-handshake-when-using-PEM-ns.patch This patch is against the ol24 branch rebased on top of the earlier NSS_Init thread safety patch. Is anyone in a position to test this with a build from openldap 2.4.x from source?
Upstream ITS and patch submission http://www.openldap.org/its/index.cgi?findid=7034
That's great to hear Rich! If I can get a Koji build I can test your patch. Jan, would you be willing to do a scratch build for F-14?
Created attachment 521759 [details] 0001-use-mutex-for-connection-handshake-when-using-PEM-ns.patch new patch (also updated in openldap ITS)
(In reply to comment #50) > Upstream ITS and patch submission > http://www.openldap.org/its/index.cgi?findid=7034 Great work, Rich! (In reply to comment #51) > Jan, would you be willing to do a scratch build for F-14? Definitely. I will post the links after submitting the build.
I am able to verify the fix only partially. I still do not manage to reproduce the crash caused by SSL_ForceHandshake(). However I'm able to reproduce the crash caused by SSL_ConfigServerSessionIDCache(). Updated builds will follow soon.
*** Bug 720896 has been marked as a duplicate of this bug. ***
(In reply to comment #54) > I am able to verify the fix only partially. > > I still do not manage to reproduce the crash caused by SSL_ForceHandshake(). It is difficult to do. I've used something like this: 1) set up two openldap servers as SSL servers and set up syncrepl between them also using SSL 2) have a client add a lot of entries to the syncrepl provider and confirm that replication is working to the consumer 3) in another window, do something like this: ii=0 ; while [ $ii -lt 1000 ] ; do LDAPTLS_CACERT=/path/to/cacert.pem ldapsearch -xLLL -ZZ -h fqdn.of.consumer -D "cn=manager,cn=example,dc=com" -w secret -b dc=example,dc=com > /dev/null 2>&1 & ii=`expr $ii + 1` ; done > However I'm able to reproduce the crash caused by > SSL_ConfigServerSessionIDCache(). > > Updated builds will follow soon.
Resolved in: openldap-2.4.24-5.fc15 openldap-2.4.26-3.fc16 openldap-2.4.26-3.fc17 Scratch build with all NSS patches from F17 addressing memory leaks or thread safety is here: http://koji.fedoraproject.org/koji/taskinfo?taskID=3345870 Please verify, that the issue is gone. I can create an update for F14 as well.
(In reply to comment #57) > Scratch build ... ... for F14
openldap-2.4.26-3.fc16 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/openldap-2.4.26-3.fc16
openldap-2.4.24-5.fc15 has been submitted as an update for Fedora 15. https://admin.fedoraproject.org/updates/openldap-2.4.24-5.fc15
Package openldap-2.4.24-5.fc15: * should fix your issue, * was pushed to the Fedora 15 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing openldap-2.4.24-5.fc15' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/openldap-2.4.24-5.fc15 then log in and leave karma (feedback).
I've updated my servers to openldap-2.4.26-3.fc17, rebuild for F-14. I'll report back in a few days.
No crashes for the last 8 days, so it looks likes this is fixed. Thank you all for your help!
(In reply to comment #57) > Resolved in: > openldap-2.4.24-5.fc15 > openldap-2.4.26-3.fc16 > openldap-2.4.26-3.fc17 > Scratch build with all NSS patches from F17 addressing memory leaks or thread > safety is here: > http://koji.fedoraproject.org/koji/taskinfo?taskID=3345870 > Please verify, that the issue is gone. I can create an update for F14 as well. Hi Jan Would it be possible for you to create an update for F14 please? If not, could you point me to the location of the source RPM on koji please?
(In reply to comment #64) > Would it be possible for you to create an update for F14 please? > If not, could you point me to the location of the source RPM on koji please? I do not plan to update the package in F14 now. You can rebuild F15 package for F14, it should work: http://kojipkgs.fedoraproject.org/packages/openldap/2.4.24/5.fc15/src/openldap-2.4.24-5.fc15.src.rpm
openldap-2.4.26-5.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.
openldap-2.4.24-5.fc15 has been pushed to the Fedora 15 stable repository. If problems still persist, please make note of it in this bug report.