Bug 784928 - httpd (and php-ldap) fail to create new outgoing TLS sessions
httpd (and php-ldap) fail to create new outgoing TLS sessions
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: openldap (Show other bugs)
6.2
x86_64 All
high Severity high
: rc
: ---
Assigned To: Jan Synacek
BaseOS QE Security Team
: Regression
Depends On:
Blocks: 782183 836160 840699
  Show dependency treegraph
 
Reported: 2012-01-26 11:59 EST by George Shammas
Modified: 2013-11-01 06:37 EDT (History)
6 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-03-20 14:48:17 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments (Terms of Use)
php ldap test (1.02 KB, text/plain)
2012-02-01 10:16 EST, George Shammas
no flags Details

  None (edit)
Description George Shammas 2012-01-26 11:59:03 EST
Description of problem:
Random Apache threads stop being able to create TLS connections. This problem becomes very problematic when trying to contact an ldap server over ssl. And because its not all Apache threads, it intermittent, making users very frustrated.

Uping the php LDAP debug level (ldap_set_option(NULL, LDAP_OPT_DEBUG_LEVEL, 7);)  we can see the threads stop being able to create TLS sessions.

Sniffing traffic to and from the ldap server shows that each time a tcp connection is fully initiated, but in the case of a fail, a FIN is issued as soon as the connection is opened.

Another person seems to have the same problem, https://www.redhat.com/archives/rhelv6-list/2011-September/msg00023.html but the only response to that discussion is invalid. On my server having the problem, I launched 20 proccesses runing the same php script from the cli for 5 minutes staight, and not once did one fail. All the while apache threads still had trouble.

He links it to how many requests a thread has served is the cause of the problem.


Version-Release number of selected component (if applicable):
RHEL6.1 and RHEL6.2
RHEL6.0 and RHEL5.x Seem to be immune. 

How reproducible:
Very



Steps to Reproduce:
1. Install php-ldap and httpd
2. Try contacting a ldap server over ssl
2a. A freshly restarted httpd will take about 20 minutes before symptoms start.
3. Turn on php-ldap debug mode to see the usable error of failing to create tls sessions. With out doing this, the only error you get is failed to bind.
  


Actual results:
ldap_create
ldap_url_parse_ext(ldaps://redacted/)
ldap_bind_s
ldap_simple_bind_s
ldap_sasl_bind_s
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP redacted:636
ldap_new_socket: 14
ldap_prepare_socket: 14
ldap_connect_to_host: Trying redacted:636
ldap_pvt_connect: fd: 14 tm: -1 async: 0
TLS: error: could not initialize moznss security context - error -5925:The one-time function was previously called and failed. Its error code is no longer available
TLS: can't create ssl handle.
ldap_err2string



Expected results:
ldap_create
ldap_url_parse_ext(ldaps://redacted/)
ldap_bind_s
ldap_simple_bind_s
ldap_sasl_bind_s
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP redacted:636
ldap_new_socket: 13
ldap_prepare_socket: 13
ldap_connect_to_host: Trying redacted:636
ldap_pvt_connect: fd: 13 tm: -1 async: 0
TLS: certificate [CN=redacted,OU=ITS,O=New York University,L=New York,ST=New York,C=US] is valid
TLS certificate verification: subject: CN=redacted,OU=ITS,O=New York University,L=New York,ST=New York,C=US, issuer: CN=VeriSign Class 3 Secure Server CA - G2,OU=Terms of use at https://www.verisign.com/rpa (c)09,OU=VeriSign Trust Network,O="VeriSign, Inc.",C=US, cipher: RC4, security level: high, secret key bits: 128, total key bits: 128, cache hits: 0, cache misses: 0, cache not reusable: 0
ldap_open_defconn: successful
ldap_send_server_request
........
ldap_next_attribute
ldap_get_values_len
ldap_next_attribute
ldap_get_values_len
ldap_next_attribute
ldap_get_dn
ldap_msgfree
ldap_free_connection 1 1
ldap_send_unbind
ldap_free_connection: actually freed


Additional info:
Comment 2 George Shammas 2012-01-26 13:19:04 EST
Rich Graves mailing list thread, which linked to, gave me two more testing ideas, both with interesting results. 

1. I adjusted the MaxRequestsPerChild from the default of 4000 to 200 and waited for the problem to occur. Then I stressed the server to see if when a new thread spawned it would take the problem with it. And it did. After about 10 seconds, all the old apache threads died and respawned and everything even at 400 requests a second no more threads would error.If I waited for the problem again (without restarted httpd) and the same result happened. 

Once a thread reached its maxed requests and respawned, the problem goes away. This leads me to believe that its a time based problem.


2. If I further lower MaxRequestsPerChild from 200 to 10. The problem is not noticeable. Threads respawn too quickly to get diseased by it. However the side effect of this is that the server's max requests per second dropped from 400 to 50. So not an ideal solution.
Comment 3 Joe Orton 2012-01-31 12:14:24 EST
Given the indication this is a regression since 6.0, and the error message:

""
error: could not initialize moznss security context - error -5925:The
one-time function was previously called and failed. Its error code is no longer
available
""

I'd presume this is a problem with the switch to Mozilla NSS in openldap; re-assigning accordingly.  Can you provide a minimal repro case?
Comment 6 George Shammas 2012-02-01 10:16:59 EST
Created attachment 558850 [details]
php ldap test

I have not been able to reproduce it outside of Apache with mod_php. Even apache with php-cgi works.

I have attached a simple php script, that avoids being cached, and errors when it doesn't work. Outputs as much data as it can, however the error that ldap returns is '', which isn't very useful.


Also this bug is actually a dup of
https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=738456
which also has no solution.
Comment 8 Kamil Dudka 2012-04-02 03:35:42 EDT
(In reply to comment #6)
> Also this bug is actually a dup of
> https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=738456

Could you please check whether libcurl-7.19.7-26.el6_2.4 fixes this issue?

http://rhn.redhat.com/errata/RHBA-2012-0430.html
Comment 11 Jan Vcelak 2012-07-17 06:11:01 EDT
@George: ping
Comment 12 Jan Vcelak 2012-07-18 14:29:13 EDT
If not, please, can you also try with following scratch build:
http://people.redhat.com/jvcelak/bz707599/
Comment 16 Rich Graves 2013-03-19 17:34:10 EDT
Cleanup: possible duplicate bug https://bugzilla.redhat.com/show_bug.cgi?id=738456 is resolved.
Comment 17 Siddharth Nagar 2013-03-20 14:48:17 EDT
This is issue seems to be resolved as per Comment #16. Closing.

Note You need to log in before you can comment on or make changes to this bug.