Description of problem: The OpenLDAP client library fails with CONNECT_ERROR. The error always occurs with the third LDAP connection within the ldap_start_tls_s(). It's only reproducible when I set LDAP_OPT_X_TLS_CACERTFILE to a valid CA file and apply the settings with ldap_set_option(conn, LDAP_OPT_X_TLS_NEWCTX, 0). It does not occur on Travis CI, where libldap_r is compiled against OpenSSL. I was able to track down the issue to a failing call to SECMOD_RestartModules() in tls_m.c around line 1836. It looks like PR_CallOnceWithArg( &c->tc_callonce, tlsm_deferred_ctx_init, c ); is called more than once. https://github.com/python-ldap/python-ldap/issues/60#issuecomment-348134403 contains additional information. Version-Release number of selected component (if applicable): * nss-3.34.0-1.0.fc27.x86_64 * openldap-2.4.45-3.fc27.x86_64 How reproducible: Always Steps to Reproduce: I have created a reproducer script: 1. sudo dnf builddep python-ldap 2. git clone --branch tls_nss https://github.com/tiran/python-ldap.git 3. cd python-ldap 4. contrib/reproducer_issue60.py Actual results: Iteration 1 dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 2 dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 3 Traceback (most recent call last): File "contrib/reproducer_issue60.py", line 55, in <module> main() File "contrib/reproducer_issue60.py", line 49, in main reproducer(server) File "contrib/reproducer_issue60.py", line 40, in reproducer conn = connect(server) File "contrib/reproducer_issue60.py", line 32, in connect conn.start_tls_s() File "Lib/ldap/ldapobject.py", line 806, in start_tls_s return self._ldap_call(self._l.start_tls_s) File "Lib/ldap/ldapobject.py", line 298, in _ldap_call reraise(exc_type, exc_value, exc_traceback) File "Lib/ldap/ldapobject.py", line 282, in _ldap_call result = func(*args,**kwargs) ldap.CONNECT_ERROR: {u'info': 'Is a directory', 'errno': 21, 'desc': u'Connect error'} NOTE: "u'info': 'Is a directory'" is a red herring. python-ldap wrongly maps errno 21 to EISDIR Expected results: No exception Additional info: Upstream issue: https://github.com/python-ldap/python-ldap/issues/60 I don't know if this code ever worked. I just started to add tests for start_tls_s() to python-ldap.
IIUC we don't know yet if there is a bug in NSS, or if openldap is using NSS incorrectly, right? If SECMOD_RestartModules() fails, have you checked that openldap uses this call appropriately, in other words, it only calls it once in response to a fork? Since you were already in the debugger, I guess it would be easy for you to find out at which point inside of SECMOD_RestartModules we run into a failure? (If you do, please paste the failing line, as line numbers can vary between snapshots.) > I was able to track down the issue to a failing call to > SECMOD_RestartModules() in tls_m.c around line 1836. It looks like > PR_CallOnceWithArg( &c->tc_callonce, tlsm_deferred_ctx_init, c ); is called > more than once. Did you draw that conclusion, because of the following debugger output, which you pasted into the github issue? (gdb) n 2777 LDAP_MUTEX_LOCK( &tlsm_init_mutex ); (gdb) n 2778 status = PR_CallOnceWithArg( &c->tc_callonce, tlsm_deferred_ctx_init, c ); (gdb) n 2779 LDAP_MUTEX_UNLOCK( &tlsm_init_mutex ); (gdb) n 2778 status = PR_CallOnceWithArg( &c->tc_callonce, tlsm_deferred_ctx_init, c ); (gdb) n This doesn't look like a duplicate execution. There is no loop here. I'm guessing what you saw is a behavior of gdb, which I've frequently seen with optimized builds, which might seem confusing. gdb might go back multiple times to a single line, after it has executed only portions of the underlying assembler code. To verify your theory that tlsm_deferred_ctx_init gets called multiple times, you'd have to set a breakpoint on tlsm_deferred_ctx_init and check how often it's being reached. Also, you might prefer to use binaries compiled with debug information, to make it easier for gdb. You'll get less confusing experiences when using gdb. Also, at one point your debugger session printed warning: Source file is more recent than executable. so you should make sure there's nothing out of date in your debugging session.
I added a --no-cacertfile option to the script. With --no-cacertfile, it does not set OPT_X_TLS_CACERTFILE and disables cert validation with OPT_X_TLS_REQUIRE_CERT ldap.OPT_X_TLS_NEVER. $ contrib/reproducer_issue60.py --no-cacertfile running build_ext copying build/lib.linux-x86_64-2.7/_ldap.so -> Lib Testing with cacertfile=False... Iteration 1, cacacertfile=False dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 2, cacacertfile=False dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 3, cacacertfile=False dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 4, cacacertfile=False dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 5, cacacertfile=False dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 6, cacacertfile=False dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 7, cacacertfile=False dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 8, cacacertfile=False dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 9, cacacertfile=False dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 10, cacacertfile=False dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org $ contrib/reproducer_issue60.py running build_ext copying build/lib.linux-x86_64-2.7/_ldap.so -> Lib Testing with cacertfile=True... Iteration 1, cacacertfile=True dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 2, cacacertfile=True dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 3, cacacertfile=True Traceback (most recent call last): File "contrib/reproducer_issue60.py", line 76, in <module> main() File "contrib/reproducer_issue60.py", line 70, in main reproducer(server, cacertfile=args.cacertfile) File "contrib/reproducer_issue60.py", line 52, in reproducer conn = connect(server, cacertfile=cacertfile) File "contrib/reproducer_issue60.py", line 43, in connect conn.start_tls_s() File "Lib/ldap/ldapobject.py", line 806, in start_tls_s return self._ldap_call(self._l.start_tls_s) File "Lib/ldap/ldapobject.py", line 298, in _ldap_call reraise(exc_type, exc_value, exc_traceback) File "Lib/ldap/ldapobject.py", line 282, in _ldap_call result = func(*args,**kwargs) ldap.CONNECT_ERROR: {u'info': 'Is a directory', 'errno': 21, 'desc': u'Connect error'}
Please explain the meaning of comment 2 for this bug.
Please ignore "warning: Source file is more recent than executable.". openldap-debuginfo does not provide debug sources. I had to extract files from the SRPM and add the source directory to gdb I can confirm that tlsm_deferred_ctx_init() is called for every ldap_start_tls_s() call because the old context is destroyed by LDAP_OPT_X_TLS_NEWCTX. Please note that LDAP_OPT_X_TLS_NEWCTX is strictly required by libldap to apply all outstanding LDAP_OPT_X_TLS options to the underlying SSL implementation. Without LDAP_OPT_X_TLS_NEWCTX, new settings are ignored. $ gdb python (gdb) dir openldap-2.4.45-3.fc27.x86_64/libraries/libldap (gdb) b tlsm_deferred_ctx_ini (gdb) run contrib/reproducer_issue60.py ... Testing with cacertfile=True... Iteration 1, cacacertfile=True Breakpoint 1, tlsm_deferred_ctx_init (arg=0x555555973450) at tls_m.c:2329 warning: Source file is more recent than executable. 2329 { (gdb) c Continuing. dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 2, cacacertfile=True Breakpoint 1, tlsm_deferred_ctx_init (arg=0x555555ae5f40) at tls_m.c:2329 2329 { Missing separate debuginfos, use: dnf debuginfo-install sqlite-libs-3.20.1-1.fc27.x86_64 (gdb) c Continuing. dn:cn=Manager,dc=slapd-test,dc=python-ldap,dc=org Iteration 3, cacacertfile=True Breakpoint 1, tlsm_deferred_ctx_init (arg=0x555555b14060) at tls_m.c:2329 2329 { (gdb) c Continuing. Traceback (most recent call last): File "contrib/reproducer_issue60.py", line 76, in <module> main() File "contrib/reproducer_issue60.py", line 70, in main reproducer(server, cacertfile=args.cacertfile) File "contrib/reproducer_issue60.py", line 52, in reproducer conn = connect(server, cacertfile=cacertfile) File "contrib/reproducer_issue60.py", line 43, in connect conn.start_tls_s() File "Lib/ldap/ldapobject.py", line 806, in start_tls_s return self._ldap_call(self._l.start_tls_s) File "Lib/ldap/ldapobject.py", line 298, in _ldap_call reraise(exc_type, exc_value, exc_traceback) File "Lib/ldap/ldapobject.py", line 282, in _ldap_call result = func(*args,**kwargs) ldap.CONNECT_ERROR: {u'info': 'Is a directory', 'errno': 21, 'desc': u'Connect error'}
(In reply to Kai Engert (:kaie) from comment #3) > Please explain the meaning of comment 2 for this bug. The error does not occur unless I set a CA cert file with LDAP_OPT_X_TLS_CACERTFILE. Without a custom CA cert, start_tls_s() just works fine.