Bug 966781 - new ldap connections can block ldaps and ldapi connections
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Rich Megginson
QA Contact: Sankar Ramalingam
Depends On: 1017418
Reported: 2013-05-23 19:44 EDT by Rich Megginson
Modified: 2013-11-21 16:07 EST (History)
Fixed In Version: 389-ds-base-1.2.11.15-22.el6
Doc Type: Bug Fix
Doc Text:
Cause: When a request for a new LDAP connection arrived at the same time as a request for a new LDAPS or LDAPI connection, the server processed only the LDAP request.
Consequence: The directory server was much slower to respond to new LDAPS or LDAPI connection requests.
Fix: The directory server was changed to process all listener requests at the same time.
Result: Processing times for new LDAP, LDAPS, and LDAPI connection requests no longer differ significantly.
Last Closed: 2013-11-21 16:07:55 EST


Description Rich Megginson 2013-05-23 19:44:07 EDT
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/47359

The poll loop in slapd_daemon() first looks for a new ldap connection from an ldap listener.  If it finds one, it does not look for a new ldaps or ldapi connection.  If the server is processing thousands of ldap connection requests per second, this can cause ldaps connections to take a long time, time out, or accumulate in the ldaps port TCP backlog queue.
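
The starvation described above can be illustrated with a toy model. The Python sketch below is not the slapd_daemon() code; the function name run_poll_loop, the backlog counters, and the arrival rate of one ldap and one ldaps request per pass are assumptions made purely for illustration. It compares a loop that stops at the first listener with a new connection against a loop that checks every listener on each pass.

```python
def run_poll_loop(check_all_listeners, passes=1000):
    """Toy model of a listener poll loop (hypothetical, not the server code).

    Each pass, one new ldap and one new ldaps connection request arrive.
    Returns (accepted, backlog) dictionaries keyed by listener name.
    """
    listeners = ["ldap", "ldaps", "ldapi"]  # checked in this order
    backlog = {name: 0 for name in listeners}
    accepted = {name: 0 for name in listeners}

    for _ in range(passes):
        backlog["ldap"] += 1
        backlog["ldaps"] += 1

        for name in listeners:
            if backlog[name] == 0:
                continue
            backlog[name] -= 1
            accepted[name] += 1
            if not check_all_listeners:
                # Behavior described in this report: stop after the first
                # listener that had a new connection.
                break

    return accepted, backlog


old_accepted, old_backlog = run_poll_loop(check_all_listeners=False)
new_accepted, new_backlog = run_poll_loop(check_all_listeners=True)

print("first listener only:", old_accepted, old_backlog)
print("all listeners:      ", new_accepted, new_backlog)
```

In this model the first variant never accepts an ldaps connection while ldap requests keep arriving, so its ldaps backlog grows by one on every pass. The second variant keeps both backlogs at zero.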
Comment 5 Ján Rusnačko 2013-08-01 08:08:48 EDT
Is it possible to automate this? If yes, could you please add verification steps?
Comment 6 Rich Megginson 2013-08-01 10:15:59 EDT
Steps:

You'll need at least 3 machines: one for the server and two for clients. To really stress the connections, the server machine and each client machine should have at least 4 cores.

Set up the server machine with RHDS, with SSL and LDAPI enabled.

On each client machine, run steps 1 and 2 at the same time.

1. Run ldclt for regular LDAP:

    ldclt -h "$HOST" -p "$LDAPPORT" -D "$BINDDN" -w "$BINDPW" \
        -b "$BASE" -f "$FILT" -e bindonly -e bindeach \
        -e esearch -n"$NTHREAD" -v -q

Where HOST is the server machine, LDAPPORT is the non-secure LDAP port, BINDDN and BINDPW are the bind identity and its password (you can use cn=directory manager and its password), BASE is your search base, FILT is a search filter that returns 1 entry (e.g. if you use Example.ldif, you can use something like uid=scarter), and NTHREAD is the number of cores on the machine (grep processor /proc/cpuinfo | wc -l).

2. Run ldclt for SSL. You cannot use threads with SSL, so you must run one process for each core on the machine; NPROC is that number of cores (grep processor /proc/cpuinfo | wc -l).
Run NPROC ldclt processes like this:

    ldclt -h "$HOST" -p "$LDAPSPORT" -D "$BINDDN" -w "$BINDPW" \
        -b "$BASE" -f "$FILT" -e bindonly -e bindeach \
        -e esearch -n1 -Z /path/to/certdir/cert8.db -v -q

You will need to get the CA cert from the server machine and use certutil on the client machine first:

    certutil -d /path/to/certdir -A -t CT,, -n "cacert" -a -i cacert.asc


Keep the output of ldclt somewhere.
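
One way to start one SSL ldclt process per core and keep each process's output is sketched below in Python. The helper names build_ldclt_cmd and launch_per_core, the log file naming, and the example host and credentials are assumptions; the ldclt options are the ones from step 2. os.cpu_count() is used here in place of counting processor lines in /proc/cpuinfo.

```python
import os
import subprocess


def build_ldclt_cmd(host, ldaps_port, bind_dn, bind_pw, base, filt, cert_db):
    """Return the step 2 ldclt command line as an argument list."""
    return [
        "ldclt", "-h", host, "-p", str(ldaps_port),
        "-D", bind_dn, "-w", bind_pw,
        "-b", base, "-f", filt,
        "-e", "bindonly", "-e", "bindeach", "-e", "esearch",
        "-n1", "-Z", cert_db, "-v", "-q",
    ]


def launch_per_core(cmd, log_dir="."):
    """Start one ldclt process per core, each writing to its own log file.

    Returns a list of (process, log_file) pairs. The caller is expected to
    stop the processes and close the files when the test run is over.
    """
    nproc = os.cpu_count() or 1
    started = []
    for i in range(nproc):
        log = open(os.path.join(log_dir, f"ldclt-ssl-{i}.log"), "w")
        proc = subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)
        started.append((proc, log))
    return started


# Example values only; substitute your own server, identity, and cert db.
cmd = build_ldclt_cmd(
    "ldap.example.com", 636, "cn=directory manager", "password",
    "dc=example,dc=com", "uid=scarter", "/path/to/certdir/cert8.db",
)
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with values that contain spaces, such as cn=directory manager. Calling launch_per_core(cmd) on a client machine would then start the processes; it is not called here so that the sketch can be read and checked without a server.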

If you run this against an older RHDS and against the latest RHDS, you will notice that the latest RHDS can support many more LDAPS connections than the older version.
Comment 8 Ján Rusnačko 2013-10-17 07:52:14 EDT
Used 3 VMs: a server with RHEL6.4/RHEL6.5 and 1 CPU, and client VMs with 8 CPUs.

On 389-ds-base-1.2.11.15-11.el6.x86_64:

[root@dstet ~]# ldclt -h 192.168.122.135 -p 636 -D "cn=tuser1,ou=people,dc=example,dc=com" -w Secret123 -b "ou=people,dc=example,dc=com" -f "cn=tuser2" -e bindeach -e bindonly -e esearch -n 1 -v -q -Z certdb/
ldclt version 4.23
/usr/bin/ldclt-bin -h 192.168.122.135 -p 636 -D cn=tuser1,ou=people,dc=example,dc=com -w Secret123 -b ou=people,dc=example,dc=com -f cn=tuser2 -e bindeach -e bindonly -e esearch -n 1 -v -q -Z certdb/
Process ID         = 2897
Host to connect    = 192.168.122.135
Port number        = 636
Bind DN            = cn=tuser1,ou=people,dc=example,dc=com
Passwd             = Secret123
Referral           = on
Base DN            = ou=people,dc=example,dc=com
Filter             = "cn=tuser2"
Max times inactive = 3
Max allowed errors = 1000
Number of samples  = -1
Number of threads  = 1
Total op. req.     = -1
Running mode       = 0xc0000029
Running mode       = quiet verbose bind_each_operation ssl bindonly exact_search
LDAP oper. timeout = 30 sec
Sampling interval  = 10 sec
Scope              = subtree
Attrsonly          = 0
ldclt[2897]: Starting at Thu Oct 17 12:24:22 2013

ldclt[2897]: Average rate:  759.00/thr  (  75.90/sec), total:    759
ldclt[2897]: Average rate:  759.00/thr  (  75.90/sec), total:    759
ldclt[2897]: Average rate:  741.00/thr  (  74.10/sec), total:    741
ldclt[2897]: Average rate:  180.00/thr  (  18.00/sec), total:    180
ldclt[2897]: Average rate:    2.00/thr  (   0.20/sec), total:      2
ldclt[2897]: Average rate:    2.00/thr  (   0.20/sec), total:      2
ldclt[2897]: Average rate:    2.00/thr  (   0.20/sec), total:      2
^C
ldclt[2897]: Global average rate: 2445.00/thr  ( 34.93/sec), total:   2445
ldclt[2897]: Global number times "no activity" reports: never
ldclt[2897]: Global no error occurs during this session.
Catch SIGINT - exit...
ldclt[2897]: Ending at Thu Oct 17 12:25:42 2013
ldclt[2897]: Exit status 0 - No problem during execution.

[root@localhost ~]# ldclt -h 192.168.122.135 -p 389 -D "cn=tuser1,ou=people,dc=example,dc=com" -w Secret123 -b "ou=people,dc=example,dc=com" -f "cn=tuser2" -e bindeach -e bindonly -e esearch -v -q -n 10
ldclt version 4.23
/usr/bin/ldclt-bin -h 192.168.122.135 -p 389 -D cn=tuser1,ou=people,dc=example,dc=com -w Secret123 -b ou=people,dc=example,dc=com -f cn=tuser2 -e bindeach -e bindonly -e esearch -v -q -n 10
Process ID         = 17750
Host to connect    = 192.168.122.135
Port number        = 389
Bind DN            = cn=tuser1,ou=people,dc=example,dc=com
Passwd             = Secret123
Referral           = on
Base DN            = ou=people,dc=example,dc=com
Filter             = "cn=tuser2"
Max times inactive = 3
Max allowed errors = 1000
Number of samples  = -1
Number of threads  = 10
Total op. req.     = -1
Running mode       = 0xc0000009
Running mode       = quiet verbose bind_each_operation bindonly exact_search
LDAP oper. timeout = 30 sec
Sampling interval  = 10 sec
Scope              = subtree
Attrsonly          = 0
ldclt[17750]: Starting at Thu Oct 17 11:24:55 2013

ldclt[17750]: Average rate: 1277.60/thr  (1277.60/sec), total:  12776
ldclt[17750]: Average rate: 1254.90/thr  (1254.90/sec), total:  12549
ldclt[17750]: Average rate: 1263.00/thr  (1263.00/sec), total:  12630
ldclt[17750]: Average rate: 1272.50/thr  (1272.50/sec), total:  12725
^C
ldclt[17750]: Global average rate: 5068.00/thr  (1267.00/sec), total:  50680
ldclt[17750]: Global number times "no activity" reports: never
ldclt[17750]: Global no error occurs during this session.
Catch SIGINT - exit...
ldclt[17750]: Ending at Thu Oct 17 11:25:41 2013
ldclt[17750]: Exit status 0 - No problem during execution.

When the non-SSL LDAP client starts stressing the server, the SSL client's rate drops to 2 operations per 10-second sample (0.20/sec).
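
The drop is easier to see when the per-sample rates are pulled out of the saved ldclt output. A minimal Python sketch follows, assuming the output format shown above; the function name sample_rates and the regular expression are illustrative and not part of ldclt.

```python
import re

# Matches per-sample lines only; "Global average rate" lines do not match.
RATE_LINE = re.compile(
    r"^ldclt\[\d+\]: Average rate:\s*([\d.]+)/thr\s*\(\s*([\d.]+)/sec\)"
)


def sample_rates(output):
    """Return the per-second rate of each 'Average rate' sample line."""
    rates = []
    for line in output.splitlines():
        match = RATE_LINE.match(line.strip())
        if match:
            rates.append(float(match.group(2)))
    return rates


# SSL client samples from the 389-ds-base-1.2.11.15-11 run above.
ssl_output = """\
ldclt[2897]: Average rate:  759.00/thr  (  75.90/sec), total:    759
ldclt[2897]: Average rate:  759.00/thr  (  75.90/sec), total:    759
ldclt[2897]: Average rate:  741.00/thr  (  74.10/sec), total:    741
ldclt[2897]: Average rate:  180.00/thr  (  18.00/sec), total:    180
ldclt[2897]: Average rate:    2.00/thr  (   0.20/sec), total:      2
ldclt[2897]: Average rate:    2.00/thr  (   0.20/sec), total:      2
ldclt[2897]: Average rate:    2.00/thr  (   0.20/sec), total:      2
ldclt[2897]: Global average rate: 2445.00/thr  ( 34.93/sec), total:   2445
"""

rates = sample_rates(ssl_output)
print(rates)
print("lowest sample:", min(rates), "ops/sec")
```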

On 389-ds-base-1.2.11.15-28.el6.x86_64:

[root@dstet ~]# ldclt -h 192.168.122.135 -p 636 -D "cn=tuser1,ou=people,dc=example,dc=com" -w Secret123 -b "ou=people,dc=example,dc=com" -f "cn=tuser2" -e bindeach -e bindonly -e esearch -n 1 -v -q -Z certdb/
ldclt version 4.23
/usr/bin/ldclt-bin -h 192.168.122.135 -p 636 -D cn=tuser1,ou=people,dc=example,dc=com -w Secret123 -b ou=people,dc=example,dc=com -f cn=tuser2 -e bindeach -e bindonly -e esearch -n 1 -v -q -Z certdb/
Process ID         = 2880
Host to connect    = 192.168.122.135
Port number        = 636
Bind DN            = cn=tuser1,ou=people,dc=example,dc=com
Passwd             = Secret123
Referral           = on
Base DN            = ou=people,dc=example,dc=com
Filter             = "cn=tuser2"
Max times inactive = 3
Max allowed errors = 1000
Number of samples  = -1
Number of threads  = 1
Total op. req.     = -1
Running mode       = 0xc0000029
Running mode       = quiet verbose bind_each_operation ssl bindonly exact_search
LDAP oper. timeout = 30 sec
Sampling interval  = 10 sec
Scope              = subtree
Attrsonly          = 0
ldclt[2880]: Starting at Thu Oct 17 12:17:57 2013

ldclt[2880]: Average rate:  709.00/thr  (  70.90/sec), total:    709
ldclt[2880]: Average rate:  726.00/thr  (  72.60/sec), total:    726
ldclt[2880]: Average rate:  865.00/thr  (  86.50/sec), total:    865
ldclt[2880]: Average rate:  881.00/thr  (  88.10/sec), total:    881
ldclt[2880]: Average rate:  896.00/thr  (  89.60/sec), total:    896
ldclt[2880]: Average rate:  868.00/thr  (  86.80/sec), total:    868
ldclt[2880]: Average rate:  860.00/thr  (  86.00/sec), total:    860
^C
ldclt[2880]: Global average rate: 5805.00/thr  ( 82.93/sec), total:   5805
ldclt[2880]: Global number times "no activity" reports: never
ldclt[2880]: Global no error occurs during this session.
Catch SIGINT - exit...
ldclt[2880]: Ending at Thu Oct 17 12:19:16 2013
ldclt[2880]: Exit status 0 - No problem during execution.


[root@localhost ~]# ldclt -h 192.168.122.135 -p 389 -D "cn=tuser1,ou=people,dc=example,dc=com" -w Secret123 -b "ou=people,dc=example,dc=com" -f "cn=tuser2" -e bindeach -e bindonly -e esearch -v -q -n 10
ldclt version 4.23
/usr/bin/ldclt-bin -h 192.168.122.135 -p 389 -D cn=tuser1,ou=people,dc=example,dc=com -w Secret123 -b ou=people,dc=example,dc=com -f cn=tuser2 -e bindeach -e bindonly -e esearch -v -q -n 10
Process ID         = 17725
Host to connect    = 192.168.122.135
Port number        = 389
Bind DN            = cn=tuser1,ou=people,dc=example,dc=com
Passwd             = Secret123
Referral           = on
Base DN            = ou=people,dc=example,dc=com
Filter             = "cn=tuser2"
Max times inactive = 3
Max allowed errors = 1000
Number of samples  = -1
Number of threads  = 10
Total op. req.     = -1
Running mode       = 0xc0000009
Running mode       = quiet verbose bind_each_operation bindonly exact_search
LDAP oper. timeout = 30 sec
Sampling interval  = 10 sec
Scope              = subtree
Attrsonly          = 0
ldclt[17725]: Starting at Thu Oct 17 11:18:21 2013

ldclt[17725]: Average rate:  992.10/thr  ( 992.10/sec), total:   9921
ldclt[17725]: Average rate: 1001.20/thr  (1001.20/sec), total:  10012
ldclt[17725]: Average rate:  996.00/thr  ( 996.00/sec), total:   9960
ldclt[17725]: Average rate:  972.00/thr  ( 972.00/sec), total:   9720
ldclt[17725]: Average rate:  973.60/thr  ( 973.60/sec), total:   9736
ldclt[17725]: Average rate: 1091.90/thr  (1091.90/sec), total:  10919
^C
ldclt[17725]: Global average rate: 6026.80/thr  (1004.47/sec), total:  60268
ldclt[17725]: Global number times "no activity" reports: never
ldclt[17725]: Global no error occurs during this session.
Catch SIGINT - exit...
ldclt[17725]: Ending at Thu Oct 17 11:19:22 2013
ldclt[17725]: Exit status 0 - No problem during execution.

Performance problem is not reproduced. Verified.
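
As a rough cross-check of the two runs above, the global averages can be recomputed from the totals and the number of 10-second samples in each run. This is a worked calculation on the figures in this comment only; the function and variable names are illustrative, and a single run on VMs should not be read as a benchmark.

```python
SAMPLE_SECONDS = 10  # "Sampling interval = 10 sec" in the ldclt output


def global_rate(total_ops, samples):
    """Operations per second over the whole run."""
    return total_ops / (samples * SAMPLE_SECONDS)


# (total operations, number of samples) taken from the ldclt output above.
ldaps_before = global_rate(2445, 7)   # SSL client, 389-ds-base-1.2.11.15-11
ldaps_after = global_rate(5805, 7)    # SSL client, 389-ds-base-1.2.11.15-28
ldap_before = global_rate(50680, 4)   # LDAP client, 389-ds-base-1.2.11.15-11
ldap_after = global_rate(60268, 6)    # LDAP client, 389-ds-base-1.2.11.15-28

print(f"LDAPS: {ldaps_before:.2f}/sec -> {ldaps_after:.2f}/sec")
print(f"LDAP:  {ldap_before:.2f}/sec -> {ldap_after:.2f}/sec")
print(f"LDAPS ratio after/before: {ldaps_after / ldaps_before:.2f}")
```

The recomputed values agree with the "Global average rate" lines reported by ldclt. Note that the LDAPS global average on the older build includes the samples taken before the LDAP client started; under contention the per-sample rate on that build was 0.20/sec.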
Comment 9 errata-xmlrpc 2013-11-21 16:07:55 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1653.html
