Bug 1974242

Summary: Paged search impacts performance [9.3.0]
Product: Red Hat Enterprise Linux 9 Reporter: thierry bordaz <tbordaz>
Component: 389-ds-base Assignee: Pierre Rogier <progier>
Status: CLOSED ERRATA QA Contact: LDAP QA Team <idm-ds-qe-bugs>
Severity: high Docs Contact: Evgenia Martynyuk <emartyny>
Priority: high    
Version: 9.1 CC: bsmejkal, emartyny, idm-ds-dev-bugs, jonmoore, mreynolds, mrhodes, pasik, progier, tbordaz, tmihinto, vashirov
Target Milestone: rc Keywords: Reopened, Triaged, ZStream
Target Release: 9.3   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: sync-to-jira
Fixed In Version: 389-ds-base-2.3.4-3.el9 Doc Type: Bug Fix
Doc Text:
.Paged searches from a regular user no longer impact performance
Previously, when Directory Server was under search load, paged searches from a regular user could degrade server performance because a lock conflicted with the thread that polls for network events. In addition, if a network issue occurred while a paged search result was being sent, the whole server was unresponsive until the `nsslapd-ioblocktimeout` parameter expired. With this update, the lock is split into several parts to avoid contention with the network events. As a result, paged searches from a regular user no longer impact performance.
Story Points: ---
Clone Of:
: 2224505 2224507 2231841 2251374 Environment:
Last Closed: 2023-11-07 08:25:17 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2224505, 2224507, 2231841, 2251374, 2265544    

Description thierry bordaz 2021-06-21 07:38:40 UTC
Description of problem:
When a server is under search load, a paged search from a regular user degrades performance.
The same paged search requested by Directory Manager (DM) has a much smaller impact.

Version-Release number of selected component (if applicable):
since 7.x

How reproducible:
systematic

Steps to Reproduce:
    1. Create a database with 40000 users.
    2. Run a search load using:
       ldclt -D "uid=test,ou=people,dc=example,dc=com" -w test -e bindeach,esearch -b "ou=people,dc=example,dc=com" -f "uid=00001"
    3. While the search load is running, run a paged search in a loop that requests all ids:
       while : ; do ldapsearch -D "uid=test,ou=people,dc=example,dc=com" -w test -b dc=example,dc=com 'uid=*' -E pr=100/noprompt; done


Actual results:
ldclt[1353]: Average rate: 2415.70/thr  (2415.70/sec), total:  24157 -- only ldclt is running 
ldclt[1353]: Average rate: 2342.50/thr  (2342.50/sec), total:  23425
ldclt[1353]: Average rate: 1048.90/thr  (1048.90/sec), total:  10489 -\
ldclt[1353]: Average rate:  413.10/thr  ( 413.10/sec), total:   4131   | paged search from a regular user
ldclt[1353]: Average rate:  461.00/thr  ( 461.00/sec), total:   4610   |
ldclt[1353]: Average rate: 1759.30/thr  (1759.30/sec), total:  17593 -/
ldclt[1353]: Average rate: 2374.70/thr  (2374.70/sec), total:  23747
ldclt[1353]: Average rate: 1952.70/thr  (1952.70/sec), total:  19527 -\
ldclt[1353]: Average rate: 1783.00/thr  (1783.00/sec), total:  17830   | paged search from DM
ldclt[1353]: Average rate: 1749.70/thr  (1749.70/sec), total:  17497 -/
ldclt[1353]: Average rate: 2319.80/thr  (2319.80/sec), total:  23198
ldclt[1353]: Average rate: 2378.80/thr  (2378.80/sec), total:  23788
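
To put numbers on the drop, the rates above can be compared against the ldclt-only baseline. The following is a minimal sketch and not part of the original report: the helper names `parse_rates` and `drop_percent` are hypothetical, and taking the first sample as the baseline and the lowest sample of each paged-search window as the worst case is an assumption made for illustration.

```python
import re

# Three sample lines copied from the ldclt output above:
# the ldclt-only baseline, the lowest rate during the regular-user
# paged search, and the lowest rate during the DM paged search.
LDCLT_OUTPUT = """\
ldclt[1353]: Average rate: 2415.70/thr  (2415.70/sec), total:  24157
ldclt[1353]: Average rate:  413.10/thr  ( 413.10/sec), total:   4131
ldclt[1353]: Average rate: 1749.70/thr  (1749.70/sec), total:  17497
"""

RATE_RE = re.compile(r"Average rate:\s*([\d.]+)/thr")


def parse_rates(text):
    """Return the per-thread average rates found in ldclt output."""
    return [float(m.group(1)) for m in RATE_RE.finditer(text)]


def drop_percent(baseline, rate):
    """Percentage by which `rate` is below `baseline`, rounded to 0.1."""
    return round(100.0 * (baseline - rate) / baseline, 1)


baseline, regular_user_low, dm_low = parse_rates(LDCLT_OUTPUT)
print("regular user drop (%):", drop_percent(baseline, regular_user_low))
print("DM drop (%):", drop_percent(baseline, dm_low))
```

With these three samples, the worst regular-user rate is roughly 83% below the baseline, while the worst DM rate is roughly 28% below it.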

Expected results:
There shouldn't be a significant drop in performance.

Additional info:
@wisebaldone, who reported the issue, also mentioned that 1.2.11.15-48 does not have the issue and 1.2.11.15-97 does.

Comment 3 RHEL Program Management 2022-12-21 07:27:55 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 10 RHEL Program Management 2023-06-21 07:28:16 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 11 Pierre Rogier 2023-07-17 07:56:29 UTC
Reopening the bug as the root cause is finally understood and a fix is in progress.

Comment 13 Pierre Rogier 2023-07-17 08:10:55 UTC
The issue is due to a very small lock contention that affects 0.3% of the server CPU time and 5% of the listening thread time, but that was enough to decrease performance by 60%.
Even worse, we have seen a case (after a network issue in which a TCP router was restarted) where the server was fully unresponsive until the nsslapd-ioblocktimeout expired.
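
The Doc Text for this bug describes the fix as splitting the contended lock into several parts. The general technique can be sketched as follows. This is only an illustration in Python, not the 389-ds-base change itself (the server is written in C): the `StripedLock` class, the stripe count, and the `handle_paged_search` function are all hypothetical names chosen for the example.

```python
import threading


class StripedLock:
    """Hypothetical illustration: several locks instead of one global lock."""

    def __init__(self, stripes=16):
        self._locks = [threading.Lock() for _ in range(stripes)]

    def lock_for(self, conn_id):
        # Connections that map to different stripes never wait on each other,
        # so a slow holder only blocks the connections sharing its stripe.
        return self._locks[conn_id % len(self._locks)]


striped = StripedLock(stripes=4)
results = {}


def handle_paged_search(conn_id):
    # Assumed stand-in for the per-connection paged-search bookkeeping:
    # only the stripe that owns this connection is held during the work.
    with striped.lock_for(conn_id):
        results[conn_id] = f"page sent on connection {conn_id}"


threads = [
    threading.Thread(target=handle_paged_search, args=(i,)) for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results), "connections handled")
```

With a single global lock, one thread stuck while holding it (for example during a blocked network write) stalls every other thread that needs it. With striping, only the threads that map to the same stripe are affected.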

Comment 24 errata-xmlrpc 2023-11-07 08:25:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (389-ds-base bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6350