I noticed that FDS (1.0.4 and 1.1) is not able to handle the close of a LdapContext, if you use Control objects from the JNDI/LDAP Booster Pack, like VirtualListViewControl or SortControl, instantiated with the Control.CRITICAL. When you close the context, you get the following entry in the access logs: [09/Jun/2008:13:48:02 -0400] conn=278 op=3 UNBIND, error processing controls - error 12 (Unavailable critical extension). This ldapconnection will not be closed and stays open forever..... Here is the code: LdapContext ctx = new InitialLdapContext(); SortKey key = new SortKey("sn"); Control control = new SortControl(new SortKey[] { key }, Control.CRITICAL); Control[] controls = { control }; ctx.setRequestControls(controls); ctx.close(); BTW, if you use Control.NONCRITICAL, it works....
Any ideas, when this will be fixed? I compiled and build FedoraDirSvr111 yesterday and the CRITICAL option is still not supported (which is obvious), but I also tried the NONCRITICAL option (which works for 104 and 110. With 111 this does not work and I see the following in access: [10/Jun/2008:13:22:16 -0400] conn=0 op=1 VLV 1:0:0:0 1:11468 (0) [10/Jun/2008:13:22:16 -0400] conn=0 op=1 RESULT err=0 tag=101 nentries=1 etime=0 [10/Jun/2008:13:22:16 -0400] conn=0 op=2 ABANDON targetop=NOTFOUND msgid=2 [10/Jun/2008:13:22:16 -0400] conn=0 op=3 UNBIND [10/Jun/2008:13:22:16 -0400] conn=0 op=3 fd=64 closed - U1 You see that the connection is closed, but the client throws the ABANDON is.
(In reply to comment #1) > Any ideas, when this will be fixed? No, not sure. > > I compiled and build FedoraDirSvr111 yesterday and the CRITICAL option is still > not supported (which is obvious), but I also tried the NONCRITICAL option (which > works for 104 and 110. With 111 this does not work and I see the following in > access: > [10/Jun/2008:13:22:16 -0400] conn=0 op=1 VLV 1:0:0:0 1:11468 (0) > [10/Jun/2008:13:22:16 -0400] conn=0 op=1 RESULT err=0 tag=101 nentries=1 etime=0 > [10/Jun/2008:13:22:16 -0400] conn=0 op=2 ABANDON targetop=NOTFOUND msgid=2 > [10/Jun/2008:13:22:16 -0400] conn=0 op=3 UNBIND > [10/Jun/2008:13:22:16 -0400] conn=0 op=3 fd=64 closed - U1 > > You see that the connection is closed, but the client throws the ABANDON is. So 111 makes this problem worse somehow? I'm not aware of any changes around this area. In 110, what does the access log look like for this case? Note that the connection should not hang around forever, if it is idle - the server will automatically close idle connections after some period of time (can't remember the setting, but it's in cn=config).
[10/Jun/2008:13:23:20 -0400] conn=2 op=2 SORT -modifyTimestamp [10/Jun/2008:13:23:20 -0400] conn=2 op=2 VLV 1:0:0:0 1:11468 (0) [10/Jun/2008:13:23:20 -0400] conn=2 op=2 RESULT err=0 tag=101 nentries=1 etime=0 [10/Jun/2008:13:23:20 -0400] conn=2 op=3 UNBIND, error processing controls - err or 12 (Unavailable critical extension) I guess you mean nslapd-idletimeout. Yes, this is set to 0, which leaves it open....
Rich, I did some tests with nsslapd-idletimeout. The server closes the connection, when no critical extensions were used and I leave the connection just open (idle). It does not close those hanging connections, though....
Yes, that's correct. I forgot - idle connection processing is only done when processing new connections. If you use ldapsearch or something like that, that should trigger the cleanup of the idle connections.
Any progress on this one? Let me know, when you intend to fix this one. Thanks
No progress. It could be a couple of months before we get around to this. What is the severity? Does decreasing the idle timeout clean up those connections fast enough?
I did some further tests with our applications. I think I can live with getting this fixed in a timeframe of 2 month.
There are 2 issues in this bug. 1) Unavailable critical extension and 2) the connection is not closed even after the client exits and the server logs "closed". 1) Unavailable critical extention I'm testing the DS from the latest source code with the LDAP C SDK example srvrsort.c with this small modification to test both critical and noncritical. 88 /* Create the sort control. */ 89 #if 1 90 rc = ldap_create_sort_control( ld, sortkeylist, 1, &sortctrl ); 91 #else 92 rc = ldap_create_sort_control( ld, sortkeylist, 0, &sortctrl ); 93 #endif I had to increase the values of these config parameters to adjust my test database size. Otherwise, the operation failed with err=12 (Unavailable critical extention). cn=config: nsslapd-sizelimit: 100000 dn: cn=config,cn=ldbm database,cn=plugins,cn=config nsslapd-lookthroughlimit: 100000 nsslapd-idlistscanlimit: 100000 With the modification, the test program srvrsort successfully ran both marked critical mode (== 1) and noncritical (== 0). [..] conn=1 fd=64 slot=64 connection from 10.14.54.100 to 10.14.54.100 [..] conn=1 op=0 BIND dn="uid=TVradmin0, ou=Accounting, dc=example,dc=com" method=128 version=3 [..] conn=1 op=0 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=tvradmin0,ou=accounting,dc=example,dc=com" [..] conn=1 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(objectClass=person)" attrs=ALL [..] conn=1 op=1 SORT description -givenName (10000) [..] conn=1 op=1 RESULT err=0 tag=101 nentries=10000 etime=21 notes=U [..] conn=1 op=2 UNBIND [..] conn=1 op=2 fd=64 closed - U1 2) I set the default value 4000 to nsslapd-idlistscanlimit to make the test fail. Then, the connection remains for a while: tcp 0 0 <host>.redhat.c:43243 <host>.redhat.c:10389 TIME_WAIT I'm not setting nsslapd-idletimeout. Thus, the value should be 0 (== never). But the leftover connection is cleaned up in a minute or so. So, my behavior is a little different from the reported one. May I ask you to run "netstat -a" and check the state of the connection?
http://www.ietf.org/rfc/rfc4511.txt 4.1.11. Controls "The criticality field only has meaning in controls attached to request messages (except UnbindRequest). For controls attached to response messages and the UnbindRequest, the criticality field SHOULD be FALSE, and MUST be ignored by the receiving protocol peer." In the case of the UnbindRequest, the receiving protocol peer is the directory server, not the client. So I think the correct behavior is for the server to ignore the control criticality when processing the unbind request.
Created attachment 333433 [details] python-ldap test script test script to reproduce the problem Uses dsadmin.py from http://github.com/richm/scripts/tree/master in conjunction with python-ldap - could probably be made to work without dsadmin.py When you run the script, you can see the number of connections keeps going up.
Created attachment 333434 [details] patch Running the test script with the patch, the unbind request is processed, and the number of connections does not increase
Created attachment 333507 [details] cvs commit log Reviewed by: nkinder (Thanks!) Fix Description: When an unbind request is received that has controls marked as critical, and we do not support those controls, we must treat them as non-critical controls and continue processing. Platforms tested: RHEL4, RHEL5 Flag Day: no Doc impact: no
Test with the python script that Rich provided. [the dsadmin.py is here: http://github.com/richm/scripts/blob/702a161b5ad83c47d03ff5ac37ab52dcf691e16f/dsadmin.py ] =========== running result : =================== [root@mv32a-vm ~]# python ./bug333433.py instdir= /usr/lib/dirsrv/slapd-mv32a-vm dn: cn=config nsslapd-certdir: /etc/dirsrv/slapd-mv32a-vm nsslapd-errorlog: /var/log/dirsrv/slapd-mv32a-vm/errors nsslapd-instancedir:: nsslapd-schemadir: /etc/dirsrv/slapd-mv32a-vm/schema show active connections . . . dn: cn=monitor connection: 64:20090415223534Z:6:6::cn=directory manager connection: 65:20090416213745Z:4:3::cn=directory manager currentconnections: 2 start search request . . . search returned 6 entries send abandon with controls . . . send abandon without controls . . . send unbind with controls . . . try a search after the unbind . . . caught exception LDAP connection invalid open new connection . . . instdir= /usr/lib/dirsrv/slapd-mv32a-vm dn: cn=config nsslapd-certdir: /etc/dirsrv/slapd-mv32a-vm nsslapd-errorlog: /var/log/dirsrv/slapd-mv32a-vm/errors nsslapd-instancedir:: nsslapd-schemadir: /etc/dirsrv/slapd-mv32a-vm/schema show active connections . . . dn: cn=monitor connection: 64:20090415223534Z:6:6::cn=directory manager connection: 66:20090416213745Z:4:3::cn=directory manager currentconnections: 2 ====> result evaluation: since the before and after the abandon request is same, this test is consider as pass.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHEA-2009-0455.html