Note: This bug is displayed in read-only format because
the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Description of problem:
client psearch with multiple threads hangs if nsslapd-maxthreadsperconn is under sized.
the situation does NOT happen on RHEL-5 with RHDS-8.2 , and seem related to the change:
https://github.com/389ds/389-ds-base/commit/bbe198d8406b64b06bc52e21f378eb0d96e140f8https://github.com/389ds/389-ds-base/issues/719
"Add a warning message when a connection hits the max number of threads"
there was a decision at that time, to only have the statistic and status information, not to do anything to a connection, as it was only delayed, "harmless", along with a sizing configuration issue on the server side.
But this is different with the persistent search connections.
With this feature, the server is stuck, and hangs the client, so it becomes a problem.
Version-Release number of selected component (if applicable):
RHEL-8.2 with 389-ds-base-1.4.2.12-3.module+el8dsrv+6923+6ab1d5c5.x86_64
RHEL-7.8 with 389-ds-base-1.3.10.2-6.el7.x86_64
How reproducible:
on demand
Steps to Reproduce:
1. Have a LDAP server up and runing
2. tune nsslapd-maxthreadsperconn to a very small value, e.g. 2
dn: cn=config
changetype: modify
replace: nsslapd-maxthreadsperconn
nsslapd-maxthreadsperconn: 2
3. set some data
dn: ou=sf02712629,dc=example,dc=test
objectClass: organizationalUnit
objectClass: top
ou: people
ou: sf02712629
and
dn: ou=sf02712629,dc=example,dc=test
changetype: modify
replace: aci
aci: (targetattr="*")(version 3.0; acl "all access"; allow (all) userdn="ldap:///anyone";)
4. run a LDAP client to run persistent searches in a loop, set with more threads in the connection than the and nsslapd-maxthreadsperconn, and with a control for ADD, DEL
or run the test Java client provided:
./build.sh
time ./run.sh 2>&1 | tee ./out.txt
Actual results:
the Java client hangs quickly, the conn thread is in blocked state from cn=cmonitor
Expected results:
yes
Additional info:
ldapsearch -o ldif-wrap=no -xLLLD "cn=directory manager" -W -b cn=monitor -s base connection threads currentconnections currentconnectionsatmaxthreads maxthreadsperconnhits opsinitiated opscompleted dtablesize
I had connections hitting the max thread number per connection, and blocked, for example:
connection: 64:20201007184004Z:20806:20805:r:NULLDN:1:15:119191:150:ip=192.168.122.51
which means the conn id 150 is in the "max thread state", hit the max thread value 15 times, and had operation attempts blocked 119191 times.
then if I tune up nsslapd-maxthreadsperconn on the fly to 100 , the Java client "resumes" ( do not remember if server or client started to speak first), and progresses again, the corresponding connection is no longer in a "max thread state"
in a TCP dump, the last LDAP activity is from the LDAP client:
47 1 192.168.122.51 -> 192.168.122.51 LDAP 212 searchRequest(10) "ou=sf02712629,dc=example,dc=test" singleLevel
Frame 47: 212 bytes on wire (1696 bits), 212 bytes captured (1696 bits)
Linux cooked capture
Internet Protocol Version 4, Src: 192.168.122.51 (192.168.122.51), Dst: 192.168.122.51 (192.168.122.51)
Transmission Control Protocol, Src Port: 44126 (44126), Dst Port: ldap (389), Seq: 947, Ack: 523, Len: 144
Lightweight Directory Access Protocol
LDAPMessage searchRequest(10) "ou=sf02712629,dc=example,dc=test" singleLevel
messageID: 10
protocolOp: searchRequest (3)
searchRequest
baseObject: ou=sf02712629,dc=example,dc=test
scope: singleLevel (1)
derefAliases: derefAlways (3)
sizeLimit: 0
timeLimit: 0
typesOnly: False
Filter: (objectclass=*)
filter: present (7)
present: objectclass
attributes: 0 items
controls: 2 items
Control
controlType: 2.16.840.1.113730.3.4.3 (joint-iso-itu-t.16.840.1.113730.3.4.3)
criticality: True
controlValue: 300902010f0101ff0101ff
Control
controlType: 2.16.840.1.113730.3.4.2 (joint-iso-itu-t.16.840.1.113730.3.4.2)
48 1 192.168.122.51 -> 192.168.122.51 TCP 68 ldap > 44126 [ACK] Seq=523 Ack=1091 Win=46976 Len=0 TSval=3694135261 TSecr=3694135261
Frame 48: 68 bytes on wire (544 bits), 68 bytes captured (544 bits)
Linux cooked capture
Internet Protocol Version 4, Src: 192.168.122.51 (192.168.122.51), Dst: 192.168.122.51 (192.168.122.51)
Transmission Control Protocol, Src Port: ldap (389), Dst Port: 44126 (44126), Seq: 523, Ack: 1091, Len: 0
49 7 192.168.122.51 -> 192.168.122.51 TCP 68 44126 > ldap [FIN, ACK] Seq=1091 Ack=523 Win=43776 Len=0 TSval=3694140820 TSecr=3694135261
Frame 49: 68 bytes on wire (544 bits), 68 bytes captured (544 bits)
Linux cooked capture
Internet Protocol Version 4, Src: 192.168.122.51 (192.168.122.51), Dst: 192.168.122.51 (192.168.122.51)
Transmission Control Protocol, Src Port: 44126 (44126), Dst Port: ldap (389), Seq: 1091, Ack: 523, Len: 0
50 7 192.168.122.51 -> 192.168.122.51 TCP 68 ldap > 44126 [ACK] Seq=523 Ack=1092 Win=46976 Len=0 TSval=3694140860 TSecr=3694140820
Frame 50: 68 bytes on wire (544 bits), 68 bytes captured (544 bits)
Linux cooked capture
Internet Protocol Version 4, Src: 192.168.122.51 (192.168.122.51), Dst: 192.168.122.51 (192.168.122.51)
Transmission Control Protocol, Src Port: ldap (389), Dst Port: 44126 (44126), Seq: 523, Ack: 1092, Len: 0
(END)
less /var/log/dirsrv/slapd-m1/access
...
[13/Oct/2020:22:10:58.059364477 +0000] conn=274 fd=66 slot=66 connection from 192.168.122.51 to 192.168.122.51
[13/Oct/2020:22:10:58.062426524 +0000] conn=274 op=0 BIND dn="" method=128 version=3
[13/Oct/2020:22:10:58.062488982 +0000] conn=274 op=0 RESULT err=0 tag=97 nentries=0 etime=0.000132738 dn=""
[13/Oct/2020:22:10:58.066787227 +0000] conn=274 op=1 SRCH base="ou=sf02712629,dc=example,dc=test" scope=0 filter="(objectClass=*)" attrs=ALL
[13/Oct/2020:22:10:58.068677808 +0000] conn=274 op=1 RESULT err=0 tag=101 nentries=1 etime=0.002671829
[13/Oct/2020:22:10:58.082943767 +0000] conn=274 op=2 SRCH base="ou=sf02712629,dc=example,dc=test" scope=0 filter="(objectClass=*)" attrs=ALL
[13/Oct/2020:22:10:58.083571509 +0000] conn=274 op=2 RESULT err=0 tag=101 nentries=1 etime=0.001513465
[13/Oct/2020:22:10:58.087329407 +0000] conn=274 op=3 SRCH base="ou=sf02712629,dc=example,dc=test" scope=0 filter="(objectClass=*)" attrs=ALL
[13/Oct/2020:22:10:58.088454889 +0000] conn=274 op=3 RESULT err=0 tag=101 nentries=1 etime=0.002549361
[13/Oct/2020:22:10:58.089116514 +0000] conn=274 op=4 SRCH base="ou=sf02712629,dc=example,dc=test" scope=1 filter="(objectClass=*)" attrs=ALL options=persistent
[13/Oct/2020:22:10:58.128375675 +0000] conn=274 op=5 SRCH base="ou=sf02712629,dc=example,dc=test" scope=0 filter="(objectClass=*)" attrs=ALL
[13/Oct/2020:22:10:58.128909872 +0000] conn=274 op=5 RESULT err=0 tag=101 nentries=1 etime=0.001004205
[13/Oct/2020:22:10:58.134667523 +0000] conn=274 op=6 SRCH base="ou=sf02712629,dc=example,dc=test" scope=1 filter="(objectClass=*)" attrs=ALL options=persistent
[13/Oct/2020:22:10:58.136104758 +0000] conn=274 op=7 SRCH base="ou=sf02712629,dc=example,dc=test" scope=1 filter="(objectClass=*)" attrs=ALL options=persistent
(END)
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (389-ds-base bug fix and enhancement update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2021:4203