Description of problem:
When a server is under search load, a paged search from a regular user impacts the performance. Same paged search requested by DM has much smaller impact.

Version-Release number of selected component (if applicable):
since 7.x

How reproducible:
systematic

Steps to Reproduce:
1. Create a db with 40000 users
2. Run search load using
    ldclt -D "uid=test,ou=people,dc=example,dc=com" -w test -e bindeach,esearch -b "ou=people,dc=example,dc=com" -f "uid=00001"
3. While the search load is running, run paged search in a loop that requests all ids:
    while : ; do ldapsearch -D "uid=test,ou=people,dc=example,dc=com" -w test -b dc=example,dc=com 'uid=*' -E pr=100/noprompt; done

Actual results:
    ldclt[1353]: Average rate: 2415.70/thr  (2415.70/sec), total:  24157   -- only ldclt is running
    ldclt[1353]: Average rate: 2342.50/thr  (2342.50/sec), total:  23425
    ldclt[1353]: Average rate: 1048.90/thr  (1048.90/sec), total:  10489   -\
    ldclt[1353]: Average rate:  413.10/thr  ( 413.10/sec), total:   4131    | paged search from a regular user
    ldclt[1353]: Average rate:  461.00/thr  ( 461.00/sec), total:   4610    |
    ldclt[1353]: Average rate: 1759.30/thr  (1759.30/sec), total:  17593   -/
    ldclt[1353]: Average rate: 2374.70/thr  (2374.70/sec), total:  23747
    ldclt[1353]: Average rate: 1952.70/thr  (1952.70/sec), total:  19527   -\
    ldclt[1353]: Average rate: 1783.00/thr  (1783.00/sec), total:  17830    | paged search from DM
    ldclt[1353]: Average rate: 1749.70/thr  (1749.70/sec), total:  17497   -/
    ldclt[1353]: Average rate: 2319.80/thr  (2319.80/sec), total:  23198
    ldclt[1353]: Average rate: 2378.80/thr  (2378.80/sec), total:  23788

Expected results:
There shouldn't be a significant drop in performance.

Additional info:
@wisebaldone who reported the issue, also mentioned that 1.2.11.15-48 doesn't have the issue and 1.2.11.15-97 does have.
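To compare runs like the ones in the actual results, the ldclt rate lines can be turned into a relative drop. The following is only a minimal sketch: the helper names `parse_rates` and `percent_drop` are made up for illustration, and the three sample lines are picked by hand from the output above (one baseline line, one line during the regular-user paged search, one during the DM paged search).

```python
import re

# Matches ldclt summary lines such as:
#   ldclt[1353]: Average rate: 2415.70/thr (2415.70/sec), total: 24157
# and captures the per-second rate inside the parentheses.
RATE_RE = re.compile(r"Average rate:\s*[\d.]+/thr\s*\(\s*([\d.]+)/sec\)")


def parse_rates(text):
    """Return the per-second rates found in ldclt output, in order."""
    return [float(m.group(1)) for m in RATE_RE.finditer(text)]


def percent_drop(baseline, loaded):
    """Relative drop of `loaded` versus `baseline`, in percent."""
    return (baseline - loaded) / baseline * 100.0


# Hand-picked lines from the report (an assumption about which
# samples are representative, not an official measurement method).
sample = """\
ldclt[1353]: Average rate: 2415.70/thr  (2415.70/sec), total:  24157
ldclt[1353]: Average rate:  413.10/thr  ( 413.10/sec), total:   4131
ldclt[1353]: Average rate: 1783.00/thr  (1783.00/sec), total:  17830
"""

baseline, regular_user, dm = parse_rates(sample)
print(f"regular user: {percent_drop(baseline, regular_user):.1f}% drop")
print(f"DM:           {percent_drop(baseline, dm):.1f}% drop")
```

Picking different lines from the same windows changes the percentages, so the result is best read as a rough indicator.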
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.
Reopening the bug, as the root cause is finally understood and a fix is in progress.
The issue is caused by a very small amount of lock contention: it accounts for only 0.3% of the server CPU time and 5% of the listening thread's time, yet that was enough to decrease performance by 60%. Even worse, we have seen a case (after a network issue in which a TCP router was restarted) where the server was fully unresponsive until nsslapd-ioblocktimeout expired.