Created attachment 472701 [details]
replay script, configs, access log, strace results

Description of problem:
slapd stops responding to queries; the TCP port remains open, but new connections are unable to bind or issue LDAP queries. This condition persists until slapd is shut down and restarted.

Version-Release number of selected component (if applicable):
Stock, up-to-date RHEL system; RPMs installed from EPEL on a clean install.

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.5 (Tikanga)
# rpm -qa | grep 389
389-console-1.1.4-1.el5
389-dsgw-1.1.5-1.el5
389-ds-base-1.2.7.5-1.el5
389-admin-1.1.13-1.el5
389-admin-console-1.1.5-1.el5
389-admin-console-doc-1.1.5-1.el5
389-ds-1.2.1-1.el5
389-adminutil-1.1.8-4.el5
389-ds-console-1.2.3-1.el5
389-ds-console-doc-1.2.3-1.el5

How reproducible:
Difficult: causing the wedge requires multiple processes replaying production TCP sessions saved from tcpdump. A single replay process causes no issue, nor do multiple processes replaying the slapd log via ldapsearch against the server.

Steps to Reproduce:
1. Clean install of a standalone LDAP server, with minor configuration tweaks to enable Kerberos support, increase file descriptors, and allow anonymous binds. (See dirsrv.sysconfig and dse.ldif, attached.)
2. Collect production LDAP traffic (actual client requests against the current LDAP servers, which run an older fedora-ds release that does not exhibit this bug): `tcpdump -vv -c 100000 -s 0 -w ldap-traffic port 389`
3. Run strace against the slapd-* process on the server.
4. Replay the production LDAP traffic against the standalone test server (via the ldap-tcpdump-replay script, attached). Multiple concurrent runs of the script seem to be required; a single run does not suffice.
5. Wait for the server to stop responding.
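For reference, step 4 can be sketched as a small harness that launches several replays in parallel. The commented-out invocation of ldap-tcpdump-replay is an assumption about the attached script's interface (capture file and target host as arguments); an echo stand-in keeps the sketch runnable as-is.

```shell
# Hypothetical wrapper around the attached ldap-tcpdump-replay script.
# The wedge only appears with several concurrent replays, so launch
# $REPLAYS of them in parallel and wait for all to finish.
REPLAYS=4
CAPTURE=ldap-traffic
TARGET=127.0.0.1

replay_all() {
    for i in $(seq 1 "$REPLAYS"); do
        # ./ldap-tcpdump-replay "$CAPTURE" "$TARGET" &      # assumed interface
        echo "replay $i of $CAPTURE against $TARGET" &      # runnable stand-in
    done
    wait    # block until every replay process has exited
}

replay_all
```

The key point is the concurrency: several replay processes must be hitting the server at once for the hang to appear.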
Started strace at 20:45 UTC and began the test traffic around the same time. The server stopped responding and was automatically restarted by an every-15-seconds health check connecting via 127.0.0.1, which saw no problems at ~20:47:30 but found an unresponsive server at ~20:47:45.
This looks like a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=668548
I have experienced this issue as well under production load. I would also add that while the TCP port does remain open, connections to it are immediately disconnected.

# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.5 (Tikanga)
# rpm -qa | grep 389
389-admin-1.1.13-1.el5
389-ds-console-doc-1.2.3-1.el5
389-adminutil-1.1.8-4.el5
389-console-1.1.4-1.el5
389-dsgw-1.1.5-1.el5
389-ds-console-1.2.3-1.el5
389-admin-console-doc-1.1.5-1.el5
389-ds-base-1.2.7.5-1.el5
389-admin-console-1.1.5-1.el5
389-ds-1.2.1-1.el5

access:
[10/Jan/2011:12:04:26 -0600] conn=187 fd=69 slot=69 connection from 172.16.x.x to 172.16.x.x
[10/Jan/2011:12:04:26 -0600] conn=187 op=-1 fd=69 closed - T2
[10/Jan/2011:12:04:26 -0600] conn=188 fd=69 slot=69 connection from 172.16.x.x to 172.16.x.x
[10/Jan/2011:12:04:26 -0600] conn=188 op=-1 fd=69 closed - T2
[10/Jan/2011:12:04:26 -0600] conn=189 fd=69 slot=69 connection from 172.16.x.x to 172.16.x.x
[10/Jan/2011:12:04:26 -0600] conn=189 op=-1 fd=69 closed - T2
[10/Jan/2011:12:04:27 -0600] conn=190 fd=69 slot=69 connection from 172.16.x.x to 172.16.x.x
[10/Jan/2011:12:04:27 -0600] conn=190 op=-1 fd=69 closed - T2
[10/Jan/2011:12:04:27 -0600] conn=191 fd=69 slot=69 connection from 172.16.x.x to 172.16.x.x
[10/Jan/2011:12:04:27 -0600] conn=191 op=-1 fd=69 closed - T2
[10/Jan/2011:12:04:30 -0600] conn=192 fd=69 slot=69 connection from 172.16.x.x to 172.16.x.x
[10/Jan/2011:12:04:30 -0600] conn=192 op=-1 fd=69 closed - T2
[10/Jan/2011:12:04:31 -0600] conn=193 fd=69 slot=69 connection from 172.16.x.x to 172.16.x.x
[10/Jan/2011:12:04:31 -0600] conn=193 op=-1 fd=69 closed - T2

error:
[10/Jan/2011:12:04:26 -0600] - new connection on 69
[10/Jan/2011:12:04:26 -0600] - activity on 69r
[10/Jan/2011:12:04:26 -0600] - new connection on 69
[10/Jan/2011:12:04:26 -0600] - activity on 69r
[10/Jan/2011:12:04:26 -0600] - new connection on 69
[10/Jan/2011:12:04:26 -0600] - activity on 69r
[10/Jan/2011:12:04:27 -0600] - new connection on 69
[10/Jan/2011:12:04:27 -0600] - activity on 69r
[10/Jan/2011:12:04:27 -0600] - new connection on 69
[10/Jan/2011:12:04:27 -0600] - activity on 69r
[10/Jan/2011:12:04:30 -0600] - new connection on 69
[10/Jan/2011:12:04:31 -0600] - new connection on 69
[10/Jan/2011:12:04:31 -0600] - activity on 69r
*** Bug 668548 has been marked as a duplicate of this bug. ***
Created attachment 476252 [details]
git patch file (master)

Description: The server uses the c_timelimit field in the connection table element to detect stale Simple Paged Results requests. The field is supposed to be cleared when a stale Simple Paged Results connection is disconnected, but the cleanup was sometimes incomplete, which caused subsequent requests that happened to acquire the same connection table element to be timed out. This patch forces c_timelimit to be cleared every time a connection table element is acquired and reused, and performs the timeout check only for Simple Paged Results connections.
Reviewed by Rich (Thank you!!!)

Pushed to master.

$ git merge 668619
Updating 57dcdf1..68d53e3
Fast-forward
 ldap/servers/slapd/connection.c   |    2 ++
 ldap/servers/slapd/daemon.c       |    3 ++-
 ldap/servers/slapd/pagedresults.c |    4 ++--
 3 files changed, 6 insertions(+), 3 deletions(-)
$ git push
Counting objects: 15, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (8/8), 1.11 KiB, done.
Total 8 (delta 6), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/ds.git
   57dcdf1..68d53e3  master -> master
I had the same problem: slapd stopped responding a few minutes after being started. I've applied the patch and it's been running for 24 hours without any problem.
(In reply to comment #7)
> I had the same problem. Slapd stops responding after a few minutes of started.
> I've applied the patch and it's been running for 24 hours without any problem.

Thank you for your input! Please let us know if you run into the problem again.
I've also tested this on Fedora 13 (I installed 1.2.8.a2), and it seems to be running fine. Previously it was "crashing" very frequently (i.e. not responding, with the T2 messages); RHEL4 clients seemed especially hard on it.
No hang on RHEL 5.5 with 1.2.8 a2, but instead segmentation faults when tested with replayed production traffic. Gathering gdb output...
Created attachment 478119 [details]
gdb logs

gdb traces from two crashes of 389-ds-base-1.2.8-0.2.a2.el5 in response to replayed production traffic.
(In reply to comment #11)
> Created attachment 478119 [details]
> gdb logs
>
> gdb traces from two crashes of 389-ds-base-1.2.8-0.2.a2.el5 in response to
> replayed production traffic.

Thanks. The stack looks corrupted. Are you using SASL/GSSAPI Kerberos authentication? Can you attach the last few lines from your errors and access logs before the crash?
Yes, the servers are configured for Kerberos auth (via http://directory.fedoraproject.org/wiki/Howto:Kerberos). Our existing fedora-ds-1.0.4-1.RHEL4 production LDAP servers also use Kerberos auth and do not exhibit any problems (they were set up well before my time here, though).

The errors log shows just various peer resets, nothing unique at the time of the crash:

[10/Feb/2011:18:18:01 +0000] - PR_Write(118) Netscape Portable Runtime error -5961 (TCP connection reset by peer.)
[10/Feb/2011:18:18:02 +0000] - PR_Write(79) Netscape Portable Runtime error -5961 (TCP connection reset by peer.)

The access log shows nothing exciting, though that log is buffered. Trying again with unbuffered logs and a higher error log level.
Unbuffered logging appears to avoid the segmentation faults (over an overnight run of test traffic; buffered logging leads to crashes within minutes), though it hobbles performance to the log write speed. If production load is near or beyond that limit, client requests block or time out, which is not ideal. Shutting down the unbuffered-logging ns-slapd took 10 (!) minutes after all test traffic was cut off.
Yeah, that's why we buffer access logging by default. You can use a named pipe script for the access log instead - http://directory.fedoraproject.org/wiki/Named_Pipe_Log_Script
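The named-pipe idea from that wiki page can be sketched minimally: the server writes log lines into a FIFO and a separate reader drains them, so log writes never wait on disk. This is not the project's script, just the mechanism; the paths and the stand-in log line below are illustrative.

```shell
# Minimal sketch of named-pipe logging. A FIFO stands in for the access
# log; a background reader drains it to a plain file. (Assumed paths,
# and echo stands in for slapd writing a log line.)
PIPE=/tmp/demo-access.pipe
OUT=/tmp/demo-access.recent
rm -f "$PIPE" "$OUT"
mkfifo "$PIPE"

# Reader: the project's real script buffers and rotates; this just copies.
cat "$PIPE" > "$OUT" &

# Writer (stand-in for slapd): open the pipe, write, close.
echo 'conn=1 op=0 BIND dn="" method=sasl' > "$PIPE"

wait            # the reader exits when the writer closes the pipe (EOF)
rm -f "$PIPE"
```

Because the reader keeps the pipe drained, the writer's output never has to reach disk synchronously, which is what the linked script exploits for the access log.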
Created attachment 481105 [details]
gdb trace from 1.2.8 a3 segfault

gdb trace from a 1.2.8 a3 segfault added (alpha 3 is harder to segfault than alpha 2).
CentOS 5? 32-bit or 64-bit? How are you generating the load? If you are using some scripts or a load client, is it possible we could get a copy?
I've opened bug 683250 to track the crashing issue initially reported in comment #10. Please use that bug instead of this one.
Using the steps mentioned in comment 5, I am facing the same T2 error.

First I tested the given script against a suffix having only 6-7 entries; no issues found there. But when I executed the script you attached to the bug against a suffix with more than 50,000 entries, it crashed.

Here are a few lines from the access log:

[20/May/2011:15:20:02 +051800] conn=13 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(cn=*)" attrs="distinguishedName"
[20/May/2011:15:20:07 +051800] conn=13 op=1 RESULT err=3 tag=101 nentries=0 etime=8 notes=U,P
[20/May/2011:15:20:00 +051800] conn=94 fd=156 slot=156 connection from 127.0.0.1 to 127.0.0.1
[20/May/2011:15:20:07 +051800] conn=63 op=-1 fd=125 closed - T2
[20/May/2011:15:20:07 +051800] conn=59 op=-1 fd=121 closed - T2
[20/May/2011:15:20:07 +051800] conn=55 op=-1 fd=117 closed - T2
[20/May/2011:15:20:07 +051800] conn=51 op=-1 fd=113 closed - T2
[20/May/2011:15:20:07 +051800] conn=49 op=-1 fd=111 closed - T2
[20/May/2011:15:20:07 +051800] conn=47 op=-1 fd=109 closed - T2
[20/May/2011:15:20:07 +051800] conn=38 op=-1 fd=100 closed - T2
[20/May/2011:15:20:07 +051800] conn=95 fd=68 slot=68 connection from 127.0.0.1 to 127.0.0.1
[20/May/2011:15:20:07 +051800] conn=96 fd=73 slot=73 connection from 127.0.0.1 to 127.0.0.1
[20/May/2011:15:20:00 +051800] conn=47 op=1 RESULT err=0 tag=101 nentries=5 etime=1 notes=U,P
[20/May/2011:15:20:07 +051800] conn=74 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(cn=*)" attrs="distinguishedName"
[20/May/2011:15:20:07 +051800] conn=74 op=1 RESULT err=0 tag=101 nentries=5 etime=0 notes=U,P
[20/May/2011:15:20:07 +051800] conn=73 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(cn=*)" attrs="distinguishedName"
[20/May/2011:15:20:07 +051800] conn=73 op=1 RESULT err=0 tag=101 nentries=5 etime=0 notes=U,P
[20/May/2011:15:20:08 +051800] conn=65 op=2 UNBIND
[20/May/2011:15:20:08 +051800] conn=65 op=2 fd=127 closed - U1
[20/May/2011:15:20:08 +051800] conn=64 op=2 UNBIND
[20/May/2011:15:20:08 +051800] conn=64 op=2 fd=126 closed - U1
[20/May/2011:15:20:08 +051800] conn=62 op=2 UNBIND
[20/May/2011:15:20:08 +051800] conn=62 op=2 fd=124 closed - U1
[20/May/2011:15:20:08 +051800] conn=61 op=3 UNBIND
[20/May/2011:15:20:08 +051800] conn=61 op=3 fd=123 closed - U1
[20/May/2011:15:20:08 +051800] conn=58 op=2 UNBIND
[20/May/2011:15:20:08 +051800] conn=58 op=2 fd=120 closed - U1
[20/May/2011:15:20:07 +051800] conn=97 fd=77 slot=77 connection from 127.0.0.1 to 1
[20/May/2011:15:20:08 +051800] conn=109 fd=125 slot=125 connection from 127.0.0.1 to 127.0.0.1
[20/May/2011:15:20:09 +051800] conn=91 op=-1 fd=153 closed - T2
[20/May/2011:15:20:09 +051800] conn=86 op=-1 fd=148 closed - T2
[20/May/2011:15:20:09 +051800] conn=74 op=-1 fd=136 closed - T2
[20/May/2011:15:20:09 +051800] conn=73 op=-1 fd=135 closed - T2
[20/May/2011:15:20:09 +051800] conn=71 op=-1 fd=133 closed - T2
[20/May/2011:15:20:09 +051800] conn=70 op=-1 fd=132 closed - T2
[20/May/2011:15:20:09 +051800] conn=69 op=-1 fd=131 closed - T2
[20/May/2011:15:20:09 +051800] conn=68 op=-1 fd=130 closed - T2
[20/May/2011:15:20:11 +051800] conn=28 op=1 RESULT err=3 tag=101 nentries=0 etime=12 notes=U,P
[20/May/2011:15:20:11 +051800] conn=14 op=2 UNBIND
[20/May/2011:15:20:11 +051800] conn=14 op=2 fd=76 closed - U1
[20/May/2011:15:20:11 +051800] conn=13 op=2 UNBIND
[20/May/2011:15:20:11 +051800] conn=13 op=2 fd=75 closed - U1
[20/May/2011:15:20:11 +051800] conn=12 op=2 UNBIND
[20/May/2011:15:20:11 +051800] conn=12 op=2 fd=74 closed - U1
[20/May/2011:15:20:11 +051800] conn=10 op=2 UNBIND
[20/May/2011:15:20:11 +051800] conn=10 op=2 fd=72 closed - U1

From the errors log:

[20/May/2011:14:36:24 +051800] - PR_Write(64) Netscape Portable Runtime error -5961 (TCP connection reset by peer.)
[20/May/2011:15:13:50 +051800] - slapd shutting down - signaling operation threads
[20/May/2011:15:13:50 +051800] - slapd shutting down - closing down internal subsystems and plugins
[20/May/2011:15:13:52 +051800] - Waiting for 4 database threads to stop
[20/May/2011:15:13:52 +051800] - All database threads now stopped
[20/May/2011:15:13:53 +051800] - slapd stopped.
[20/May/2011:15:13:56 +051800] - 389-Directory/1.2.8.3 B2011.123.1759 starting up
[20/May/2011:15:13:56 +051800] - slapd started. Listening on All Interfaces port 389 for LDAP requests
[20/May/2011:15:13:56 +051800] - Listening on All Interfaces port 636 for LDAPS requests

Request you to please guide here.

Thanks and Regards,
Amita
(In reply to comment #19)
> using the steps mentioned at comment 5.
> and... I am facing the same T2 error.
>
> Actually,first I tested the given script against a suffix which is having only
> 6-7 entries.. No issues found there.
>
> But when I went ahead and executed the script attached by you in the bug with
> the suffix having entries more than 50,000.. It crashes..

What makes you think it crashes? Are you talking about the test program or the directory server? From the errors log you posted, the server shuts down cleanly and starts up cleanly - no crash there.

The original bug was that the server would stop responding to new requests after some period of activity. That is, ldapsearch would fail, but the server would still be running. Are you able to reproduce that problem?

Note that it is ok to get T2 errors - if the client is hung or there is a lot of network congestion, the server can time out attempting to read/write to the client. If this is a common occurrence for the customer, we usually recommend increasing the idle timeout.
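For completeness, the idle timeout mentioned above is the nsslapd-idletimeout attribute on cn=config (in seconds; 0 disables the timeout). A sketch of the change, using 7200 as an arbitrary example value:

```ldif
dn: cn=config
changetype: modify
replace: nsslapd-idletimeout
nsslapd-idletimeout: 7200
```

This would typically be applied with ldapmodify as Directory Manager.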
When you hit the bug, are you able to, in another window, run ldapsearch against the directory server? If so, then the server is not hung, and you have not reproduced the problem.

I tried running the script. For one, on my system, I got a lot of errors like this:

/share/scripts/bug668619.sh: fork: retry: Resource temporarily unavailable

For another - the script runs with -E pr=5 - by default, ldapsearch will prompt for the next page, so I have several hundred ldapsearch processes waiting for input on stdin - is this right? I guess the purpose of the test is to leave several hundred paged searches unfinished? Otherwise, they might finish before the timeout is hit.
Hi Rich and Noriko,

I am using the attached script - "script to reproduce the problem" - to verify.

1. After some time, I am getting the following:

ldap_result: Can't contact LDAP server (-1)
# extended LDIF
#
# LDAPv3
# base <dc=example,dc=com> with scope subtree
# filter: (cn=*)
# requesting: dn
# with pagedResults control: size=5
#

ldap_result: Can't contact LDAP server (-1)
# extended LDIF
#
# LDAPv3
# base <dc=example,dc=com> with scope subtree
# filter: (cn=*)
# requesting: dn
# with pagedResults control: size=5
#

ldap_result: Can't contact LDAP server (-1)
# extended LDIF
#
# LDAPv3
# base <dc=example,dc=com> with scope subtree
# filter: (cn=*)
# requesting: dn
# with pagedResults control: size=5
#

ldap_result: Can't contact LDAP server (-1)

Also, I was not able to get out of this loop; it was infinite, and Control+C failed.

2. I am able to execute ldapsearch in another window successfully.

3. The access log shows some T2:

[07/Jul/2011:14:15:22 +051800] conn=142 op=-1 fd=164 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=141 op=-1 fd=163 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=140 op=-1 fd=162 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=139 op=-1 fd=136 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=138 op=-1 fd=161 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=137 op=-1 fd=160 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=136 op=-1 fd=159 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=135 op=-1 fd=158 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=134 op=-1 fd=157 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=133 op=-1 fd=156 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=131 op=-1 fd=154 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=130 op=-1 fd=153 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=129 op=-1 fd=152 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=128 op=-1 fd=151 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=127 op=-1 fd=150 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=126 op=-1 fd=149 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=125 op=-1 fd=148 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=124 op=-1 fd=147 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=123 op=-1 fd=146 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=121 op=-1 fd=145 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=120 op=-1 fd=144 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=119 op=-1 fd=143 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=118 op=-1 fd=142 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=117 op=-1 fd=141 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=116 op=-1 fd=140 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=115 op=-1 fd=139 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=114 op=-1 fd=138 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=113 op=-1 fd=137 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=112 op=-1 fd=112 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=110 op=-1 fd=135 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=109 op=-1 fd=134 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=108 op=-1 fd=133 closed - T2
[07/Jul/2011:14:15:22 +051800] conn=106 op=-1 fd=131 closed - T2

Is this as expected, and can the bug be marked as VERIFIED?
I am testing with the below build:

[root@rheltest ~]# rpm -qa | grep 389
389-console-1.1.7-1.el6.noarch
389-ds-console-doc-1.2.6-1.el6.noarch
389-ds-base-devel-1.2.8.2-1.el6_1.6.x86_64
389-adminutil-1.1.14-1.el6.x86_64
389-admin-console-1.1.7-1.el6.noarch
389-adminutil-devel-1.1.14-1.el6.x86_64
389-ds-base-debuginfo-1.2.8.2-1.el6_1.3.x86_64
389-ds-base-libs-1.2.8.2-1.el6_1.6.x86_64
389-ds-base-1.2.8.2-1.el6_1.6.x86_64
389-ds-1.2.1-2.el6.noarch
389-admin-1.1.20-1.el6.x86_64
389-admin-console-doc-1.1.7-1.el6.noarch
389-ds-console-1.2.6-1.el6.noarch
[root@rheltest ~]# rpm -qa | grep redhat
redhat-ds-admin-9.0.0-0.2.el6.x86_64
redhat-idm-console-9.0.0-1.el6.x86_64
redhat-logos-60.0.14-1.el6.noarch
redhat-menus-14.0.0-3.el6.noarch
redhat-ds-console-9.0.0-1.el6.noarch
redhat-ds-console-doc-9.0.0-1.el6.noarch
redhat-lsb-4.0-3.el6.x86_64
redhat-ds-base-9.0.0-0.17.el6dsrv.x86_64
redhat-admin-console-9.0.0-1.el6.noarch
redhat-admin-console-doc-9.0.0-1.el6.noarch
redhat-bookmarks-6-1.el6.noarch
redhat-lsb-graphics-4.0-3.el6.x86_64
redhat-lsb-printing-4.0-3.el6.x86_64
redhat-ds-9.0.0-1.el6.x86_64
redhat-indexhtml-6-1.el6.noarch
redhat-release-server-6Server-6.1.0.2.el6.x86_64
(In reply to comment #30)
> Hi Rich and Noriko,
>
> I am using attached script - "script to reproduce the problem" to verify.
>
> 1). I am getting below after some time :
> ldap_result: Can't contact LDAP server (-1)
> # extended LDIF
> #
> # LDAPv3
> # base <dc=example,dc=com> with scope subtree
> # filter: (cn=*)
> # requesting: dn
> # with pagedResults control: size=5
> #
[...]
> Also I was not able to come out of this loop, It was infinite.
> Control+C was fail.
>
> 2. I am able to execute ldapsearch on another window successfully.

Ok. Then the bug is fixed. The other things are ok, as long as you can connect to the server from another process.
Based on comment #31, I am marking the bug as VERIFIED.