Bug 1974242 - Paged search impacts performance
Summary: Paged search impacts performance
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: 389-ds-base
Version: 9.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 9.3
Assignee: Pierre Rogier
QA Contact: LDAP QA Team
URL:
Whiteboard: sync-to-jira
Depends On:
Blocks: 2224505 2224507 2231841
Reported: 2021-06-21 07:38 UTC by thierry bordaz
Modified: 2023-08-14 12:07 UTC (History)

Fixed In Version: 389-ds-base-2.3.4-3.el9
Doc Type: Bug Fix
Doc Text:
Cause: When sending paged results, lock contention occurs with the thread that polls for network events.
Consequence: Performance drops by a factor of 4 to 5 while a paged search is in progress. In addition, if a network issue occurs while paged results are being sent, the whole server may become unresponsive until nsslapd-ioblocktimeout expires.
Fix: The lock has been split into several locks to avoid the contention.
Result: Paged searches no longer impact performance.
Clone Of:
: 2224505 2224507 2231841
Environment:
Last Closed: 2023-06-21 07:28:16 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System                           ID          Private  Priority  Status  Summary                           Last Updated
Github 389ds 389-ds-base issues  4551        0        None      open    Paged search impacts performance  2021-06-21 07:39:36 UTC
Red Hat Issue Tracker            IDMDS-3476  0        None      None    None                              2023-07-26 11:35:52 UTC

Description thierry bordaz 2021-06-21 07:38:40 UTC
Description of problem:
When a server is under search load, a paged search from a regular user significantly degrades performance.
The same paged search requested by Directory Manager (DM) has a much smaller impact.

Version-Release number of selected component (if applicable):
since 7.x

How reproducible:
systematic

Steps to Reproduce:
    1. Create a database with 40000 users.
    2. Run a search load:
       ldclt -D "uid=test,ou=people,dc=example,dc=com" -w test -e bindeach,esearch -b "ou=people,dc=example,dc=com" -f "uid=00001"
    3. While the search load is running, run a paged search in a loop that requests all ids:
       while : ; do ldapsearch -D "uid=test,ou=people,dc=example,dc=com" -w test -b dc=example,dc=com 'uid=*' -E pr=100/noprompt; done
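
For anyone who prefers a bounded run over the endless shell loop, the paged-search step can be sketched in Python as below. This is only an illustration: the helper names (paged_search_cmd, run_paged_searches) are made up here, and the ldapsearch arguments are copied from the step above without any added connection options.

```python
import subprocess

# Bind DN and password are the test values used in the steps above.
BIND_DN = "uid=test,ou=people,dc=example,dc=com"
BIND_PW = "test"


def paged_search_cmd(page_size=100):
    """Build the ldapsearch argv used in the reproduction steps."""
    return [
        "ldapsearch",
        "-D", BIND_DN,
        "-w", BIND_PW,
        "-b", "dc=example,dc=com",
        "uid=*",
        "-E", f"pr={page_size}/noprompt",
    ]


def run_paged_searches(iterations, run=subprocess.run):
    """Run the paged search `iterations` times; return how many runs failed.

    `run` is injectable so the loop can be exercised without a server.
    """
    failures = 0
    for _ in range(iterations):
        result = run(
            paged_search_cmd(),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures
```

Calling run_paged_searches(50) while ldclt is running should give a paged-search window of known length, which makes the drop in the ldclt rate easier to line up with the paged-search activity.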


Actual results:
ldclt[1353]: Average rate: 2415.70/thr  (2415.70/sec), total:  24157 -- only ldclt is running 
ldclt[1353]: Average rate: 2342.50/thr  (2342.50/sec), total:  23425
ldclt[1353]: Average rate: 1048.90/thr  (1048.90/sec), total:  10489 -\
ldclt[1353]: Average rate:  413.10/thr  ( 413.10/sec), total:   4131   | paged search from a regular user
ldclt[1353]: Average rate:  461.00/thr  ( 461.00/sec), total:   4610   |
ldclt[1353]: Average rate: 1759.30/thr  (1759.30/sec), total:  17593 -/
ldclt[1353]: Average rate: 2374.70/thr  (2374.70/sec), total:  23747
ldclt[1353]: Average rate: 1952.70/thr  (1952.70/sec), total:  19527 -\
ldclt[1353]: Average rate: 1783.00/thr  (1783.00/sec), total:  17830   | paged search from DM
ldclt[1353]: Average rate: 1749.70/thr  (1749.70/sec), total:  17497 -/
ldclt[1353]: Average rate: 2319.80/thr  (2319.80/sec), total:  23198
ldclt[1353]: Average rate: 2378.80/thr  (2378.80/sec), total:  23788
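
To put numbers on the drop, the rates can be pulled out of the ldclt output and compared. The sketch below is an illustration only: it takes the first ldclt-only sample as the baseline and the lowest sample from each paged-search window, which is one possible choice of samples, and the function names are made up for this example.

```python
import re

# Three samples copied from the output above: ldclt alone, the lowest rate
# during the regular-user paged search, the lowest rate during the DM one.
LOG = """\
ldclt[1353]: Average rate: 2415.70/thr  (2415.70/sec), total:  24157
ldclt[1353]: Average rate:  413.10/thr  ( 413.10/sec), total:   4131
ldclt[1353]: Average rate: 1749.70/thr  (1749.70/sec), total:  17497
"""

# Matches the per-second rate inside the parentheses, e.g. "( 413.10/sec)".
RATE = re.compile(r"\(\s*([\d.]+)/sec\)")


def rates(log):
    """Return every per-second rate found in ldclt output, in order."""
    return [float(m.group(1)) for m in RATE.finditer(log)]


def slowdown(baseline, loaded):
    """How many times slower the loaded rate is compared to the baseline."""
    return baseline / loaded


baseline, regular_user, dm = rates(LOG)
print(f"regular user: {slowdown(baseline, regular_user):.2f}x slower")  # 5.85x
print(f"DM:           {slowdown(baseline, dm):.2f}x slower")            # 1.38x
```

With these samples the regular-user paged search costs roughly a factor of 5.8, while the DM paged search costs roughly a factor of 1.4.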

Expected results:
There shouldn't be a significant drop in performance.

Additional info:
@wisebaldone, who reported the issue, also mentioned that 1.2.11.15-48 does not have the issue and 1.2.11.15-97 does.

Comment 3 RHEL Program Management 2022-12-21 07:27:55 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 10 RHEL Program Management 2023-06-21 07:28:16 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 11 Pierre Rogier 2023-07-17 07:56:29 UTC
Reopening the bug as root cause is finally understood and a fix is in progress.

Comment 13 Pierre Rogier 2023-07-17 08:10:55 UTC
The issue is caused by a very small lock contention that affects 0.3% of the server CPU time and 5% of the listening thread time, but that was enough to decrease performance by 60%.
Even worse, we have seen a case (after a network issue: a TCP router was restarted) where the server was fully unresponsive until nsslapd-ioblocktimeout expired.
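
The general effect can be illustrated with a toy model. The sketch below is not the 389-ds-base code and does not claim to match the actual fix: the ConnectionTable class, the per-slot locks, and the timings are all assumptions chosen to show how one shared lock lets a slow send on one connection stall a listener that only wants to look at another connection, and how splitting the lock removes that stall.

```python
import threading
import time


class ConnectionTable:
    """Toy connection table (hypothetical, not the server's data structure)."""

    def __init__(self, slots, split):
        shared = threading.Lock()
        # split=False: every slot aliases one table-wide lock.
        # split=True: every slot gets its own lock.
        self.locks = [threading.Lock() if split else shared for _ in range(slots)]

    def send_page(self, slot, seconds, holding):
        # Worker thread: hold the slot's lock while "writing" a page
        # of results to a slow client.
        with self.locks[slot]:
            holding.set()
            time.sleep(seconds)

    def poll(self, slot):
        # Listener thread: briefly take the slot's lock to inspect it.
        with self.locks[slot]:
            pass


def listener_stall(split, send_seconds=0.2):
    """Return how long polling slot 1 takes while slot 0 is sending a page."""
    table = ConnectionTable(slots=2, split=split)
    holding = threading.Event()
    worker = threading.Thread(
        target=table.send_page, args=(0, send_seconds, holding)
    )
    worker.start()
    holding.wait()  # the worker now holds the lock for slot 0
    start = time.perf_counter()
    table.poll(1)  # a different connection than the one being written to
    stall = time.perf_counter() - start
    worker.join()
    return stall


if __name__ == "__main__":
    print(f"single lock: listener stalled {listener_stall(split=False):.3f}s")
    print(f"split locks: listener stalled {listener_stall(split=True):.3f}s")
```

In this model the listener stalls for about the whole send time with the single lock and for almost no time with split locks. If the send blocks on a dead network peer instead of sleeping, the stall lasts until the I/O timeout, which is consistent with the unresponsive-server case described above.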