Bug 783445

Summary: replication (syncrepl) with TLS causes segfault
Product: Red Hat Enterprise Linux 6 Reporter: Jan Vcelak <jvcelak>
Component: openldap Assignee: Jan Vcelak <jvcelak>
Status: CLOSED ERRATA QA Contact: BaseOS QE Security Team <qe-baseos-security>
Severity: high Docs Contact:
Priority: high    
Version: 6.2 CC: dspurek, jsynacek, jvcelak, omoris, rmeggins, tsmetana
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openldap-2.4.23-21.el6 Doc Type: Bug Fix
Doc Text:
- slapd master-master replication with TLS
- slapd crashes with segmentation fault when started due to accessing unallocated memory
- patch applied to copy and store the TLS initialization parameters, until the deferred TLS initialization takes place
- crash caused by accessing unallocated memory during TLS context initialization in master-master replication initialization is no longer present
Story Points: ---
Clone Of: 783431 Environment:
Last Closed: 2012-06-20 07:29:14 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 783431    
Bug Blocks: 707599, 795763    

Description Jan Vcelak 2012-01-20 13:04:37 UTC
+++ This bug was initially created as a clone of Bug #783431 +++

Description of problem:

In my test configuration, master-master replication with TLS enabled (ldaps://, or ldap:// with StartTLS) causes one of the servers to segfault.

#0  __strrchr_sse2 () at ../sysdeps/x86_64/strrchr.S:33
#1  0x000055555574874f in tlsm_get_certdb_prefix (certdir=0x34 <Address 0x34 out of bounds>, realcertdir=0x7fffeedd0c28, 
    prefix=0x7fffeedd0c20) at tls_m.c:1521
#2  0x0000555555748960 in tlsm_deferred_init (arg=0x7fffe8108eb0) at tls_m.c:1608
#3  0x00005555557497e8 in tlsm_deferred_ctx_init (arg=0x7fffe8108eb0) at tls_m.c:2068
#4  0x00007ffff63f2ed5 in PR_CallOnceWithArg (once=0x7fffe8108ee8, func=<optimized out>, arg=<optimized out>)
    at ../../../mozilla/nsprpub/pr/src/misc/prinit.c:836
#5  0x000055555574a4b1 in tlsm_session_new (ctx=0x7fffe8108eb0, is_server=0) at tls_m.c:2432
#6  0x0000555555743842 in alloc_handle (ctx_arg=0x7fffe8108eb0, is_server=0) at tls2.c:288
#7  0x0000555555743972 in ldap_int_tls_connect (ld=0x7fffe0100910, conn=0x7fffe0108ec0) at tls2.c:333
#8  0x0000555555744bd5 in ldap_int_tls_start (ld=0x7fffe0100910, conn=0x7fffe0108ec0, srv=0x7fffe0108de0) at tls2.c:834
#9  0x0000555555717920 in ldap_int_open_connection (ld=0x7fffe0100910, conn=0x7fffe0108ec0, srv=0x7fffe0108de0, async=0) at open.c:437
#10 0x000055555572ecb6 in ldap_new_connection (ld=0x7fffe0100910, srvlist=0x7fffe0100a30, use_ldsb=1, connect=1, bind=0x0, m_req=0, 
    m_res=0) at request.c:480
#11 0x0000555555716b43 in ldap_open_defconn (ld=0x7fffe0100910) at open.c:41
#12 0x000055555571ec83 in ldap_int_sasl_bind (ld=0x7fffe0100910, dn=0x555555b406a0 "cn=manager,dc=redhat,dc=bug", 
    mechs=0x555555b40080 "EXTERNAL", sctrls=0x0, cctrls=0x0, flags=2, interact=0x555555709a58 <lutil_sasl_interact>, 
    defaults=0x7fffe0108e80, result=0x0, rmech=0x7fffeedd13a0, msgid=0x7fffeedd1394) at cyrus.c:425
#13 0x0000555555721db7 in ldap_sasl_interactive_bind (ld=0x7fffe0100910, dn=0x555555b406a0 "cn=manager,dc=redhat,dc=bug", 
    mechs=0x555555b40080 "EXTERNAL", serverControls=0x0, clientControls=0x0, flags=2, interact=0x555555709a58 <lutil_sasl_interact>, 
    defaults=0x7fffe0108e80, result=0x0, rmech=0x7fffeedd13a0, msgid=0x7fffeedd1394) at sasl.c:474
#14 0x0000555555721e6a in ldap_sasl_interactive_bind_s (ld=0x7fffe0100910, dn=0x555555b406a0 "cn=manager,dc=redhat,dc=bug", 
    mechs=0x555555b40080 "EXTERNAL", serverControls=0x0, clientControls=0x0, flags=2, interact=0x555555709a58 <lutil_sasl_interact>, 
    defaults=0x7fffe0108e80) at sasl.c:511
#15 0x000055555559c5d6 in slap_client_connect (ldp=0x555555b40570, sb=0x555555b40350) at ../../../servers/slapd/config.c:2041
#16 0x0000555555628d19 in do_syncrep1 (op=0x7fffeedd14c0, si=0x555555b40320) at ../../../servers/slapd/syncrepl.c:611
#17 0x000055555562c5ab in do_syncrepl (ctx=0x7fffeedd1ba0, arg=0x555555b408a0) at ../../../servers/slapd/syncrepl.c:1510
#18 0x000055555571546d in ldap_int_thread_pool_wrapper (xpool=0x555555ae6b70) at ../../../libraries/libldap_r/tpool.c:685
#19 0x00007ffff73e7bd0 in start_thread (arg=0x7fffeedd2700) at pthread_create.c:309
#20 0x00007ffff5cfea0d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Version-Release number of selected component (if applicable):
possibly all openldap versions in Fedora and RHEL since the switch to NSS

openldap-2.4.28-1.fc17.x86_64
openldap-2.4.23-20.el6.x86_64

Steps to Reproduce:

Reproducer with instructions will follow

--- Additional comment from jvcelak on 2012-01-20 13:05:50 CET ---

Created attachment 556509
configs and scripts to reproduce

Steps to reproduce:

(work as root)

1. decompress the attachment in /root (you should get the /root/bz783431 dir)
2. append content of /root/bz783431/hosts.add to your /etc/hosts
3. run first server by running "make run" in /root/bz783431/server1
4. run second server by running "make run" in /root/bz783431/server2

the first server will crash in a few moments

--- Additional comment from jvcelak on 2012-01-20 13:39:52 CET ---

When the first server is started, it tries to create a new connection to the second server for replication. This is done in "do_syncrep1" by calling "slap_client_connect". sb_tls_do_init in the slap_bindconf *sb is unset and a new TLS context is created. This context is also stored in *sb and reused for subsequent connections.

Deferred initialization is used with the MozNSS backend. When the second server comes up, the real initialization takes place. At this point, the TLS parameters are read from the TLS context structure (the tc_config member of struct tlsm_ctx). It seems that this information is no longer valid by then, so invalid memory is accessed, which causes the segfault.
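
To illustrate the deferred initialization described above, here is a minimal sketch in plain C. It is not the tls_m.c or NSPR code: call_once_with_arg is a hypothetical, single-threaded stand-in for the call-once step seen in the backtrace (PR_CallOnceWithArg), and struct lazy_ctx and its fields are invented for illustration. The point is that the context only keeps a borrowed pointer to its parameters, and nothing reads them until the first session is created.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, single-threaded stand-in for a call-once primitive. */
struct once_state {
    int done;    /* nonzero after the first call */
    int status;  /* result of the first call, replayed on later calls */
};

typedef int (*once_fn)(void *arg);

int call_once_with_arg(struct once_state *once, once_fn fn, void *arg)
{
    if (!once->done) {
        once->status = fn(arg);
        once->done = 1;
    }
    return once->status;
}

/* Hypothetical context: creating it is cheap, the real setup is postponed. */
struct lazy_ctx {
    struct once_state once;
    const char *certdir;  /* borrowed pointer: must stay valid until init runs */
    int init_runs;        /* counts how often the real init was executed */
};

/* The real initialization; this is the first place the parameters are read. */
int lazy_ctx_init(void *arg)
{
    struct lazy_ctx *ctx = arg;
    ctx->init_runs++;
    return ctx->certdir != NULL ? 0 : -1;
}

/* Every new session funnels through the call-once step. */
int lazy_session_new(struct lazy_ctx *ctx)
{
    return call_once_with_arg(&ctx->once, lazy_ctx_init, ctx);
}
```

If the owner of certdir releases it between context creation and the first lazy_session_new call, the init function reads memory that is no longer valid, which matches the kind of failure suspected here.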

--- Additional comment from jvcelak on 2012-01-20 13:59:27 CET ---

The TLS parameters are really not available. During tlsm_ctx_new, the pointer to lo->ldo_tls_info is taken:

(gdb) p lo->ldo_tls_info 
$7 = {lt_certfile = 0x7fffe4100d00 "replicator", lt_keyfile = 0x0, lt_dhfile = 0x0, lt_cacertfile = 0x0, 
  lt_cacertdir = 0x7fffe4100ce0 "/root/bz783431/certdb", lt_ciphersuite = 0x0, lt_crlfile = 0x0, lt_randfile = 0x0, 
  lt_protocol_min = 0}
(gdb) p &lo->ldo_tls_info 
$8 = (struct ldaptls *) 0x7fffe41009d8

And when the first connection fails, the structure is freed in ldap_int_tls_destroy.


This can be solved by temporarily copying the TLS initialization data into the TLS context structure. The copy can be freed once the deferred initialization has finished.

I will write a patch.
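
The copy-then-free idea can be sketched as follows. This is only an illustration under stated assumptions, not the actual patch: struct tls_params is a hypothetical, reduced stand-in for struct ldaptls (two fields only), struct tls_ctx stands in for struct tlsm_ctx, and all function names are invented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced stand-in for struct ldaptls. */
struct tls_params {
    char *certfile;
    char *cacertdir;
};

/* Hypothetical stand-in for struct tlsm_ctx. */
struct tls_ctx {
    struct tls_params *config;  /* owned copy, kept until deferred init ends */
    char *certdb_dir;           /* result produced by the deferred init */
    int initialized;
};

char *dup_or_null(const char *s)
{
    char *copy;
    if (s == NULL) return NULL;
    copy = malloc(strlen(s) + 1);
    if (copy != NULL) strcpy(copy, s);
    return copy;
}

void tls_params_free(struct tls_params *p)
{
    if (p == NULL) return;
    free(p->certfile);
    free(p->cacertdir);
    free(p);
}

/* Deep copy, so the context no longer depends on the caller's lifetime. */
struct tls_params *tls_params_copy(const struct tls_params *src)
{
    struct tls_params *dst = calloc(1, sizeof(*dst));
    if (dst == NULL) return NULL;
    dst->certfile = dup_or_null(src->certfile);
    dst->cacertdir = dup_or_null(src->cacertdir);
    if ((src->certfile && !dst->certfile) || (src->cacertdir && !dst->cacertdir)) {
        tls_params_free(dst);
        return NULL;
    }
    return dst;
}

struct tls_ctx *tls_ctx_new(const struct tls_params *caller_params)
{
    struct tls_ctx *ctx = calloc(1, sizeof(*ctx));
    if (ctx == NULL) return NULL;
    ctx->config = tls_params_copy(caller_params);
    if (ctx->config == NULL) { free(ctx); return NULL; }
    return ctx;
}

/* Runs later, possibly after the caller's structure has been freed. */
int tls_ctx_deferred_init(struct tls_ctx *ctx)
{
    if (ctx->initialized) return 0;
    if (ctx->config == NULL) return -1;
    ctx->certdb_dir = dup_or_null(ctx->config->cacertdir);
    if (ctx->config->cacertdir != NULL && ctx->certdb_dir == NULL) return -1;
    tls_params_free(ctx->config);  /* the temporary copy is no longer needed */
    ctx->config = NULL;
    ctx->initialized = 1;
    return 0;
}

void tls_ctx_free(struct tls_ctx *ctx)
{
    if (ctx == NULL) return;
    tls_params_free(ctx->config);
    free(ctx->certdb_dir);
    free(ctx);
}
```

With this layout the caller may free its own parameters right after tls_ctx_new returns, for example when the first connection attempt fails, and a later tls_ctx_deferred_init still reads valid data.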

Comment 5 Jan Vcelak 2012-03-01 15:36:14 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
- slapd master-master replication with TLS
- slapd crashes with segmentation fault when started due to accessing unallocated memory
- patch applied to copy and store the TLS initialization parameters, until the deferred TLS initialization takes place
- crash caused by accessing unallocated memory during TLS context initialization in master-master replication initialization is no longer present

Comment 7 errata-xmlrpc 2012-06-20 07:29:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0899.html