Bug 767832 - Intermittent SSL connection failures to Active Directory using php-ldap
Summary: Intermittent SSL connection failures to Active Directory using php-ldap
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: curl
Version: 16
Hardware: i386
OS: Linux
unspecified
medium
Target Milestone: ---
Assignee: Kamil Dudka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2011-12-14 23:39 UTC by Gregory Gleason
Modified: 2012-12-20 16:22 UTC (History)
9 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-12-20 16:22:49 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)
packet capture of failed SSL connection (2.89 KB, application/octet-stream)
2011-12-14 23:40 UTC, Gregory Gleason
no flags Details
packet capture of successful SSL connection (3.89 KB, application/octet-stream)
2011-12-14 23:40 UTC, Gregory Gleason
no flags Details

Description Gregory Gleason 2011-12-14 23:39:14 UTC
Description of problem:

This started after migrating from CentOS 5.x to Fedora 16.

User credentials are authenticated against a Windows 2008 active directory.
We use php with ldap support to authenticate users.  Here is a test php script I've set up to debug this issue:

<?php
$server = "ldaps://fqdn.of.activeDirectory/";
$basedn = "...";
$handle = "user";
$password = "password";
$dn = "domain\\" . $handle;

$ldap = ldap_connect($server);
ldap_set_option($ldap, LDAP_OPT_PROTOCOL_VERSION, 3);
ldap_set_option($ldap, LDAP_OPT_REFERRALS, 0);

if($bind = ldap_bind($ldap, $dn, $password)) {
    echo "bind success\n";
} else {
    echo "bind failed\n";
}
?>

If I run this script with php-cli, it works 100% of the time.  This is how I tested:
$ while true; do php /var/www/html/ldap_test.php;sleep 1;done
bind success
bind success
bind success
bind success
bind success
bind success
bind success
bind success
bind success
bind success
bind success
bind success

It never fails.

When I access the same script via HTTP through a browser, however, I get intermittent failures:
$ while true; do curl http://10.148.10.30/ldap_test.php; sleep 1;done
bind failed
bind success
bind failed
bind success
bind success
bind failed
bind success
bind failed
bind success
bind success
bind failed
bind failed
bind success

I set ldap debugging in my php script like this before connecting to the ldap server. 
ldap_set_option(NULL, LDAP_OPT_DEBUG_LEVEL, 255);

Here is the debug output when it fails.  10.148.10.115 is the ldap server.  10.250.10.215 is the http client:

ldap_create
ldap_url_parse_ext(ldaps://fqdn.of.activeDirectory/)
ldap_bind_s
ldap_simple_bind_s
ldap_sasl_bind_s
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP fqdn.of.activeDirectory:636
ldap_new_socket: 17
ldap_prepare_socket: 17
ldap_connect_to_host: Trying 10.148.10.115:636
ldap_pvt_connect: fd: 17 tm: -1 async: 0
TLS: certificate [(null)] is not valid - error -8179:Unknown code ___f 13.
TLS: error: connect - force handshake failure: errno 0 - moznss error -8179
TLS: can't connect: TLS error -8179:Unknown code ___f 13.
ldap_err2string
[Wed Dec 14 16:15:28 2011] [error] [client 10.250.10.215] PHP Warning:  ldap_bind(): Unable to bind to server: Can't contact LDAP server in /var/www/html/ldap_test.php on line 13

Here is the debug output when the connection succeeds:

ldap_create
ldap_url_parse_ext(ldaps://fqdn.of.activeDirectory/)
ldap_bind_s
ldap_simple_bind_s
ldap_sasl_bind_s
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP fqdn.of.activeDirectory:636
ldap_new_socket: 17
ldap_prepare_socket: 17
ldap_connect_to_host: Trying 10.148.10.115:636
ldap_pvt_connect: fd: 17 tm: -1 async: 0
TLS: certificate [(null)] is not valid - error -8179:Unknown code ___f 13.
TLS certificate verification: subject: -unknown-, issuer: CN=nardc1,DC=demoavaya,DC=com, cipher: AES-128, security level: high, secret key bits: 128, total key bits: 128, cache hits: 0, cache misses: 0, cache not reusable: 0
ldap_open_defconn: successful
ldap_send_server_request
ldap_result ld 0x21af9100 msgid 1
wait4msg ld 0x21af9100 msgid 1 (infinite timeout)
wait4msg continue ld 0x21af9100 msgid 1 all 1
** ld 0x21af9100 Connections:
* host: nardc1.demoavaya.com  port: 636  (default)
  refcnt: 2  status: Connected
  last used: Wed Dec 14 16:17:45 2011


I ran a packet capture on the http server to get the traffic between the SSL client (the web server) and the SSL server (Active directory).  Here is a good bind:

No.     Time        Source                Destination           Protocol Length Info
      4 0.000395    10.148.10.30          10.148.10.115         SSLv2    126    Client Hello
      5 0.000873    10.148.10.115         10.148.10.30          TLSv1    2264   Server Hello, Certificate, Certificate Request, Server Hello Done
      7 0.002021    10.148.10.30          10.148.10.115         TLSv1    271    Certificate, Client Key Exchange, Change Cipher Spec, Encrypted Handshake Message
      8 0.003760    10.148.10.115         10.148.10.30          TLSv1    125    Change Cipher Spec, Encrypted Handshake Message
      9 0.003931    10.148.10.30          10.148.10.115         TLSv1    151    Application Data
     10 0.005277    10.148.10.115         10.148.10.30          TLSv1    119    Application Data
     11 0.005468    10.148.10.30          10.148.10.115         TLSv1    103    Application Data
     12 0.005527    10.148.10.30          10.148.10.115         TLSv1    103    Encrypted Alert


Here is a bad bind.  Notice the Unknown CA error from the client:
No.     Time        Source                Destination           Protocol Length Info
      4 0.000316    10.148.10.30          10.148.10.115         SSLv2    126    Client Hello
      5 0.000762    10.148.10.115         10.148.10.30          TLSv1    2264   Server Hello, Certificate, Certificate Request, Server Hello Done
      7 0.001428    10.148.10.30          10.148.10.115         TLSv1    73     Alert (Level: Fatal, Description: Unknown CA)

I saved the decoded SSL Server Hello for each and compared, and they were identical, aside form the date, time, packet numbers, etc.  The important stuff (certificate, etc) was not different.

Here is the contents of /etc/openldap/ldap.conf
TLS_CACERTDIR /etc/openldap/cacerts
TLS_REQCERT never

Contents of cacert dir:
# ls -l /etc/openldap/cacerts/
total 4
lrwxrwxrwx 1 root root   11 Dec 13 11:31 b8f38771.0 -> certnew.cer
-rw-r--r-- 1 root root 1728 Dec 12 13:27 certnew.cer

That is the CA cert from the LDAP server.

Version-Release number of selected component (if applicable):
openldap-2.4.26-5.fc16.i686
openldap-clients-2.4.26-5.fc16.i686
php-common-5.3.8-3.fc16.i686
php-5.3.8-3.fc16.i686
httpd-2.2.21-1.fc16.i686

How reproducible:
This is a very specific setup for our production environment.  I'm not sure it can be produced or not.

Steps to Reproduce:
1.
2.
3.
  
Actual results:
SSL connection fails with "Unknown CA"


Expected results:
SSL connection works.

Additional info:

Comment 1 Gregory Gleason 2011-12-14 23:40:23 UTC
Created attachment 546981 [details]
packet capture of failed SSL connection

Comment 2 Gregory Gleason 2011-12-14 23:40:59 UTC
Created attachment 546982 [details]
packet capture of successful SSL connection

Comment 3 Kevin Fenzi 2011-12-15 00:39:14 UTC
php-ldap is a subpackage of php. 

this has nothing at all to do with p0f. ;) 
Reassigning.

Comment 4 Helgi Möller 2012-01-04 19:41:00 UTC
This problem is something that's also plaguing my system.

The difference is that we are running RHEL 6.2.  This problem showed up after an upgrade - prior to upgrading the same in-house code was running without this intermittency. [I do not recall the version we upgraded from]

This hasn't been a problem while connecting to a 389 Directory Server via ldaps, it seems to only belong to Active Directory connections via mod_php, using php-ldap. php-cli has not shown this behaviour.

The AD-ldap connections will be functioning fine and dandy for the first couple of thousands requests made, where a bind is established.  Afterwards the handshake will fail in a random fashion.

Additional information:
When a connection is successful, we can connect to the AD server as often as we want within that session.  If a connection fails, every single connection in that session will fail as well.  If apache is restarted, AD-ldap connections won't fail for another couple of thousands requests.

Another, perhaps not related, oddity is that the handshake will fail more often when connecting to a more remote(not in-house) AD server - we have not tested that specific connection using older packages, with whom things were working as intended.

Comment 5 Gregory Gleason 2012-01-04 21:39:54 UTC
I made two changes that seemed to have mitigated the issue.

First, I found that the CA cert in my /etc/openldap/cacerts directory was incorrect.  When I tried to validate the cert presented in the SSL server hello against that CA cert, it failed.

So I followed the instructions here:

http://publib.boulder.ibm.com/infocenter/tivihelp/v2r1/index.jsp?topic=%2Fcom.ibm.itame.doc_5.1%2Fam51_webinstall313.htm

to export the CA cert from the active directory server.  I saved it as a DER encoded file, then used openssl to convert it to PEM.  I then used the /etc/pki/tls/misc/c_hash tool to get the hash for that PEM CA cert, and renamed it to match, and put it in /etc/openldap/cacerts/

I also changed TLS_REQCERT in /etc/openldap/ldap.conf from 'never' to 'allow'

Since these two changes and restarting httpd, I haven't had the issue at all, and we're talking LDAP interactions per day.

Comment 6 Helgi Möller 2012-01-11 15:25:16 UTC
The changes you made had no effect on our setup - there's still an intermittent failure in accessing Active Directory over ldaps.

Is this not a recognized problem at all? I haven't been able to find any more references to this type of an issue with php-ldap(?)

At the moment we are setting up new machines to try and replicate the issue, to be certain why it's occurring.

Comment 7 Gregory Gleason 2012-01-13 14:26:50 UTC
You should probably provide an ldap debug log as I have to see if it's really the same failure:

TLS: error: connect - force handshake failure: errno 0 - moznss error -8179
TLS: can't connect: TLS error -8179:Unknown code ___f 13.
ldap_err2string

I should also point out that in my issue, executing the script with php cli never fails - it only fails when accessed through via apache mod_php.

Comment 8 Helgi Möller 2012-01-13 15:35:10 UTC
It does indeed not connect due to an error(-8179,) as you described.

A packet dump did also reveal an Unknown CA error during the handshake, identical to your packet logs - once I discovered that, I went on a hunt for similar bugs and found this report.

As per your issue, php-cli does not seem to have the same tendencies as mod_php.

I've temporarily remedied this on our side by running all Active Directory connections through a [not-too-elegant] wrapper that executes the requested ldap query through php cli.

I'll dig up the logs and post them once I can find our system administrators :)

Comment 9 Gregory Gleason 2012-01-13 15:59:20 UTC
You might try validating the certificate from your AD server against your CA cert in your web server.

This is how I found that the CA cert that was provided to me by the AD admin was not correct.

Here is what I did from the web server:

openssl s_client -connect x.x.x.x:636

x.x.x.x is the IP address of the AD server.  In the output, I copied and pasted the server certificate to a new file.

I then ran something like this:

openssl verify -CAfile /path/to/ca/cert <cert> 

Where <cert> is the new file with the server cert.

I don't remember the exact syntax but it was something like that. Man pages can help.

Comment 10 wdouglascampbell 2012-04-19 16:34:36 UTC
I am also experiencing this issue since upgrading to Fedora 16.

The problem is exhibiting itself when I try to bind to my Active Directory.  It is intermittent when access via Apache but when using the PHP command line, there is never a problem.

In searching around I found this bug report which seems to explain the problem existing in RHEL6.

https://bugzilla.redhat.com/show_bug.cgi?id=738456

The problem appears to be related to a collision between libcurl and OpenLDAP on NSS initialization and shutdown.


In my case I did find that if I disable the curl extension that I was able to bind 100% of the time to my Active Directory but unfortunately I also need to be able to use libcurl.


Is there any chance that the fix implemented in the other bug issue that I shared could be implemented for this and an updated package released?

I would do it myself but I don't know how.

I'm hoping this information can help someone else who does know how.

Thanks!

Comment 11 Remi Collet 2012-09-24 05:15:41 UTC
As this report is quite old, can you confirm if it still present in latest php version 5.3.16 in fedora 16, (or 5.3.17 in testing) ?

Comment 12 Gregory Gleason 2012-09-24 13:28:23 UTC
Remi,

Unfortunately I cannot.  I no longer work for the same organization that has this configuration, though it appears that there have been several other comments from other users who have had this issue.

I had mitigated this issue with some configuration changes as I mentioned in comment #5.

Comment 13 Remi Collet 2012-10-10 08:55:16 UTC
The fix for curl/openldap/nss issue is applied since
curl 7.25.0-2.fc18
curl 1.24.0-2.fc17

Comment 14 Remi Collet 2012-10-10 09:03:25 UTC
Reassign to curl.

@kdudka : do you think the patch could be applied to F16 ?

Comment 15 Kamil Dudka 2012-10-15 13:43:42 UTC
(In reply to comment #14)
> @kdudka : do you think the patch could be applied to F16 ?

Sure.  We have a new enough version of NSS in F16 now, so there should be no problem there...

Comment 16 Fedora Update System 2012-10-16 13:02:36 UTC
curl-7.21.7-8.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/curl-7.21.7-8.fc16

Comment 17 Fedora Update System 2012-10-17 00:24:03 UTC
Package curl-7.21.7-8.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing curl-7.21.7-8.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-16252/curl-7.21.7-8.fc16
then log in and leave karma (feedback).

Comment 18 Fedora Update System 2012-12-20 16:22:53 UTC
curl-7.21.7-8.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.