Bug 1094468
| Summary: | 389-ds-base server reported crash in stan_GetCERTCertificate (pki3hack.c) under the replication replay failure condition | ||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Sankar Ramalingam <sramling> | ||||||||||||||||
| Component: | nss | Assignee: | Elio Maldonado Batiz <emaldona> | ||||||||||||||||
| Status: | CLOSED ERRATA | QA Contact: | Alicja Kario <hkario> | ||||||||||||||||
| Severity: | high | Docs Contact: | |||||||||||||||||
| Priority: | high | ||||||||||||||||||
| Version: | 7.0 | CC: | amarecek, emaldona, ksrot, nhosoi, nkinder, rrelyea, sforsber, sramling, vanhoof | ||||||||||||||||
| Target Milestone: | rc | Keywords: | Regression, ZStream | ||||||||||||||||
| Target Release: | --- | ||||||||||||||||||
| Hardware: | x86_64 | ||||||||||||||||||
| OS: | Linux | ||||||||||||||||||
| Whiteboard: | |||||||||||||||||||
| Fixed In Version: | nss-3.16.2.3-4.el7 | Doc Type: | Bug Fix | ||||||||||||||||
| Doc Text: |
Cause: The NSS internal call stan_GetCERTCertificate did not properly ensure that objects do not go away until it has finished with the operation.
Consequence: A crash in the 389-ds-base server was reported in NSS's stan_GetCERTCertificate (pki3hack.c) under the replication replay failure condition.
Fix: The certificate code was fixed to properly manage object references.
Result: The crashes reported by the 389 directory server no longer occur.
|
Story Points: | --- | ||||||||||||||||
| Clone Of: | |||||||||||||||||||
| : | 1139349 (view as bug list) | Environment: | |||||||||||||||||
| Last Closed: | 2015-03-05 08:27:44 UTC | Type: | Bug | ||||||||||||||||
| Regression: | --- | Mount Type: | --- | ||||||||||||||||
| Documentation: | --- | CRM: | |||||||||||||||||
| Verified Versions: | Category: | --- | |||||||||||||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||||||||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||||||||||||
| Embargoed: | |||||||||||||||||||
| Bug Depends On: | |||||||||||||||||||
| Bug Blocks: | 1064025, 1113520, 1139349 | ||||||||||||||||||
| Attachments: |
|
||||||||||||||||||
|
Description
Sankar Ramalingam
2014-05-05 18:38:18 UTC
Additional information.
Output from valgrind
==9703== Invalid read of size 8
==9703== at 0x61E0880: stan_GetCERTCertificate (pki3hack.c:834)
==9703== by 0x61DC9BF: nssTrustDomain_RemoveTokenCertsFromCache (tdcache.c:448)
==9703== by 0x61E1C58: nssSlot_IsTokenPresent (devslot.c:172)
==9703== by 0x61C65D3: pk11_IsPresentCertLoad (pk11slot.c:1436)
==9703== by 0x61C98B0: SECMOD_CloseUserDB (pk11util.c:1522)
==9703== by 0x5B06C72: tlsm_ctx_free (tls_m.c:2148)
==9703== by 0x5B02C84: ldap_int_tls_destroy (tls2.c:105)
==9703== by 0x5AE76D6: ldap_ld_free (unbind.c:209)
==9703== by 0x10E04F98: close_connection_internal (repl5_connection.c:1203)
==9703== by 0x10E058C5: perform_operation (repl5_connection.c:738)
==9703== by 0x10E0601C: conn_send_add (repl5_connection.c:771)
==9703== by 0x10E08949: replay_update (repl5_inc_protocol.c:1362)
==9703== by 0x10E09E55: repl5_inc_run (repl5_inc_protocol.c:1719)
==9703== by 0x10E0EA5B: prot_thread_main (repl5_protocol.c:296)
==9703== by 0x68E773F: ??? (in /usr/lib64/libnspr4.so)
==9703== by 0x6F26DF2: start_thread (in /usr/lib64/libpthread-2.17.so)
==9703== by 0x72313DC: clone (in /usr/lib64/libc-2.17.so)
==9703== Address 0xe86ca80 is 240 bytes inside a block of size 2,048 free'd
==9703== at 0x4C29577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==9703== by 0x66BC61A: PL_FinishArenaPool (in /usr/lib64/libplds4.so)
==9703== by 0x61E712E: nssArena_Destroy (arena.c:509)
==9703== by 0x61D92B7: nssCertificate_Destroy (certificate.c:125)
==9703== by 0x5F3ED48: ssl3_CleanupPeerCerts.isra.20 (ssl3con.c:9635)
==9703== by 0x5F4AFE2: ssl3_DestroySSL3Info (ssl3con.c:11990)
==9703== by 0x5F5883A: ssl_DestroySocketContents (sslsock.c:335)
==9703== by 0x5F59BC1: ssl_FreeSocket (sslsock.c:398)
==9703== by 0x5F5006F: ssl_DefClose (ssldef.c:205)
==9703== by 0x5B05B49: tlsm_sb_remove (tls_m.c:3255)
==9703== by 0x5D2BD7C: ber_sockbuf_remove_io (sockbuf.c:230)
==9703== by 0x5D2C584: ber_int_sb_destroy (sockbuf.c:404)
==9703== by 0x5D2C5FB: ber_sockbuf_free (sockbuf.c:75)
==9703== by 0x5AE753D: ldap_ld_free (unbind.c:133)
==9703== by 0x10E04F98: close_connection_internal (repl5_connection.c:1203)
==9703== by 0x10E058C5: perform_operation (repl5_connection.c:738)
==9703== by 0x10E0601C: conn_send_add (repl5_connection.c:771)
==9703== by 0x10E08949: replay_update (repl5_inc_protocol.c:1362)
==9703== by 0x10E09E55: repl5_inc_run (repl5_inc_protocol.c:1719)
==9703== by 0x10E0EA5B: prot_thread_main (repl5_protocol.c:296)
==9703== by 0x68E773F: ??? (in /usr/lib64/libnspr4.so)
==9703== by 0x6F26DF2: start_thread (in /usr/lib64/libpthread-2.17.so)
==9703== by 0x72313DC: clone (in /usr/lib64/libc-2.17.so)
1) Not a race condition. "perform_operation" calls PR_Lock(conn->lock) at the head of the function and holds it when close_connection_internal is called.
==9703== by 0x10E04F98: close_connection_internal (repl5_connection.c:1203)
==9703== by 0x10E058C5: perform_operation (repl5_connection.c:738)
2) Not a double free. close_connection_internal sets the ldap handle conn->ld to NULL once slapi_ldap_unbind is done. Thus, in the valgrind output, the 2 stacktraces cannot come from 2 different calls: if the call for the lower stacktrace had finished successfully, there would be no handle conn->ld left to call slapi_ldap_unbind with for the upper stacktrace. Both stacktraces must therefore come from the same ldap_ld_free call (unbind.c:133, then unbind.c:209).
close_connection_internal(Repl_Connection *conn)
if (NULL != conn->ld)
{
/* Since we call slapi_ldap_init, we must call slapi_ldap_unbind */
slapi_ldap_unbind(conn->ld);
}
conn->ld = NULL;
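To make point 2 concrete, here is a minimal, self-contained sketch of the same guard-then-NULL pattern. It is not the 389-ds-base code: fake_ld, fake_conn, fake_unbind and the unbind_calls counter are hypothetical stand-ins for the real types and for slapi_ldap_unbind, added only so the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the LDAP handle type. */
typedef struct {
    int dummy;
} fake_ld;

/* Hypothetical stand-in for Repl_Connection. */
typedef struct {
    fake_ld *ld;      /* plays the role of conn->ld */
    int unbind_calls; /* how many times the handle was released */
} fake_conn;

/* Allocate a connection that owns one handle. */
fake_conn *fake_conn_new(void)
{
    fake_conn *conn = calloc(1, sizeof(*conn));
    assert(conn != NULL);
    conn->ld = calloc(1, sizeof(*conn->ld));
    assert(conn->ld != NULL);
    return conn;
}

/* Stand-in for slapi_ldap_unbind: releases the handle and counts the call. */
void fake_unbind(fake_conn *conn, fake_ld *ld)
{
    conn->unbind_calls++;
    free(ld);
}

/* Same shape as the excerpt above: unbind only if a handle is present,
 * then clear the pointer so a later call has nothing left to free. */
void fake_close_connection(fake_conn *conn)
{
    if (NULL != conn->ld) {
        fake_unbind(conn, conn->ld);
    }
    conn->ld = NULL;
}
```

Calling fake_close_connection twice on the same connection releases the handle only once, which is why a second close_connection_internal call cannot explain the free'd block in the valgrind output.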
NSS related packages installed.
nss-softokn-freebl-3.15.4-2.el7.x86_64
nss-sysinit-3.15.4-6.el7.x86_64
nss-util-devel-3.15.4-2.el7.x86_64
nss-softokn-devel-3.15.4-2.el7.x86_64
nss-debuginfo-3.15.4-6.el7.x86_64
libsss_nss_idmap-1.11.2-65.el7.x86_64
nss-util-3.15.4-2.el7.x86_64
nss-softokn-freebl-devel-3.15.4-2.el7.x86_64
nss-devel-3.15.4-6.el7.x86_64
mod_nss-1.0.8-32.el7.x86_64
nss-3.15.4-6.el7.x86_64
python-nss-0.14.0-5.el7.x86_64
libsss_nss_idmap-python-1.11.2-65.el7.x86_64
nss-softokn-3.15.4-2.el7.x86_64
nss-tools-3.15.4-6.el7.x86_64
Hi Elio, could there be any progress on this bug? We are getting this crash report from the automated beaker tests (sorry, debuginfo was not installed). Do you need any other information for your debugging?
[IPA CRASH ALERT] Test_Suite: Submitter: Beaker JobID: 655359
:Thread 1 (Thread 0x7f29fffff700 (LWP 21597)):
:#0 0x00007f2a1d0221eb in PORT_ArenaAlloc_Util () from /lib64/libnssutil3.so
:No symbol table info available.
:#1 0x00007f2a1eb46b7a in cert_trust_from_stan_trust () from /lib64/libnss3.so
:No symbol table info available.
:#2 0x00007f2a1eb47640 in nssTrust_GetCERTCertTrustForCert () from /lib64/libnss3.so
:No symbol table info available.
:#3 0x00007f2a1eb479a5 in stan_GetCERTCertificate () from /lib64/libnss3.so
:No symbol table info available.
:#4 0x00007f2a1eb439c0 in nssTrustDomain_RemoveTokenCertsFromCache () from /lib64/libnss3.so
:No symbol table info available.
:#5 0x00007f2a1eb48c59 in nssSlot_IsTokenPresent () from /lib64/libnss3.so
:No symbol table info available.
:#6 0x00007f2a1eb2d5d4 in pk11_IsPresentCertLoad () from /lib64/libnss3.so
:No symbol table info available.
:#7 0x00007f2a1eb308b1 in SECMOD_CloseUserDB () from /lib64/libnss3.so
:No symbol table info available.
:#8 0x00007f2a1f2a9c73 in tlsm_ctx_free () from /lib64/libldap_r-2.4.so.2
:No symbol table info available.
:#9 0x00007f2a1f2a5c85 in ldap_int_tls_destroy () from /lib64/libldap_r-2.4.so.2
:No symbol table info available.
:#10 0x00007f2a1f28a6d7 in ldap_ld_free () from /lib64/libldap_r-2.4.so.2
:No symbol table info available.
:#11 0x00007f2a14b23f99 in close_connection_internal (conn=conn@entry=0x7f2a2179c6d0) at ldap/servers/plugins/replication/repl5_connection.c:1203
[...]
Thanks,
--noriko
Created attachment 924708 [details]
Manage objects references to ensure object does not go away until we finish
This is Bob Relyea's fix to get rid of the crash.
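For readers unfamiliar with the technique named in the attachment title, the sketch below shows the general idea of holding a reference so an object cannot go away while it is still in use. It is not the attached patch and does not use NSS types: demo_cert, demo_cert_ref, demo_cert_unref, demo_destroyed and demo_read_after_peer_release are all hypothetical names. The counter is also a plain int for brevity; code shared between threads would need an atomic counter or a lock.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object. */
typedef struct {
    int refcount;
    int serial;
} demo_cert;

/* Counts destroyed objects so the lifetime can be observed from outside. */
int demo_destroyed = 0;

/* Create an object with one reference, owned by the caller. */
demo_cert *demo_cert_new(int serial)
{
    demo_cert *c = malloc(sizeof(*c));
    assert(c != NULL);
    c->refcount = 1;
    c->serial = serial;
    return c;
}

/* Take an additional reference. */
demo_cert *demo_cert_ref(demo_cert *c)
{
    c->refcount++;
    return c;
}

/* Drop one reference; the object is freed when the last one is dropped. */
void demo_cert_unref(demo_cert *c)
{
    assert(c->refcount > 0);
    if (--c->refcount == 0) {
        demo_destroyed++;
        free(c);
    }
}

/* Takes its own reference before touching the object, so the object
 * survives even though another owner drops its reference in between.
 * Consumes the caller's reference (the simulated "other owner"). */
int demo_read_after_peer_release(demo_cert *c)
{
    demo_cert *mine = demo_cert_ref(c); /* refcount: 2 */
    int serial;

    demo_cert_unref(c);                 /* other owner lets go: 1 */
    serial = mine->serial;              /* still valid memory */
    demo_cert_unref(mine);              /* last reference: freed here */
    return serial;
}
```

Without the demo_cert_ref call at the top of demo_read_after_peer_release, the read of serial would hit freed memory, which is the same class of error as the invalid read reported by valgrind in the description.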
Created attachment 924709 [details]
changes to nss.spec in patch format
Comment on attachment 924709 [details]
changes to nss.spec in patch format
r+ rrelyea
Once you have an official errata filed from the NSS team, please let the DS QE team know and we will do the necessary verification. Please contact sramling for any further queries related to verifying this issue.
Tests automated in TET RHEL7 branch for 389-ds-base as trac47606 tests.
This crash is observed with the other beaker run. Hence, re-opening the bug.
Beaker job - https://beaker.engineering.redhat.com/jobs/812004
(In reply to Sankar Ramalingam from comment #20)
> This crash is observed with the other beaker run. Hence, re-opening the bug.
> Beaker job - https://beaker.engineering.redhat.com/jobs/812004
Sankar, I cannot find the crash report (including the stacktraces) at the link you posted.
https://beaker.engineering.redhat.com/jobs/812004
Could you attach the file to this bug? If you cannot find it, could you run the test again? Also, every time you run it, could you give the output of "rpm -qa |egrep '389-ds-base|nss'"?
Thanks,
--noriko
Created attachment 970672 [details]
Fix latest race
Created attachment 970673 [details]
spec file with latest race fix.
Created attachment 977437 [details]
Fix latest race
Updated patch from Bob that gave good results in Noriko's testing.
Created attachment 977439 [details]
spec file with latest patch applied
Created attachment 978539 [details]
spec file with latest patch applied - actually used for build
No crashes observed with the latest nss/nspr and 389-ds-base packages. Hence, marking the bug as Verified.
Packages tested:
[root@intel-s3eb1-03 MMR_WINSYNC]# rpm -qa |egrep 'nss-|nspr-'
nss-softokn-freebl-3.16.2.3-7.el7.x86_64
nss-tools-3.16.2.3-5.el7.x86_64
nspr-4.10.6-3.el7.x86_64
nss-util-3.16.2.3-2.el7.x86_64
nss-softokn-3.16.2.3-7.el7.x86_64
nss-3.16.2.3-5.el7.x86_64
nspr-debuginfo-4.10.6-3.el7.x86_64
nss-sysinit-3.16.2.3-5.el7.x86_64
nss-debuginfo-3.16.2.3-5.el7.x86_64
[root@intel-s3eb1-03 MMR_WINSYNC]# rpm -qa |egrep 389-ds-base
389-ds-base-libs-1.3.3.1-13.el7.x86_64
389-ds-base-debuginfo-1.3.3.1-13.el7.x86_64
389-ds-base-1.3.3.1-13.el7.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2015-0364.html
|