Bug 1897248

Summary: performance search rate: nagle triggers high rate of setsockopt
Product: Red Hat Enterprise Linux 8 Reporter: thierry bordaz <tbordaz>
Component: 389-ds-base Assignee: thierry bordaz <tbordaz>
Status: CLOSED ERRATA QA Contact: RHDS QE <ds-qe-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 8.3 CC: bsmejkal, mreynolds, sgouvern
Target Milestone: rc Keywords: Triaged
Target Release: 8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: sync-to-jira
Fixed In Version: 389-ds-1.4-8040020201216214810.866effaa Doc Type: Bug Fix
Doc Text:
Cause: To optimize network traffic and avoid sending partial LDAP responses, 389-ds uses the TCP_CORK socket option. This option brings no benefit because the LDAP server always sends complete responses/results.
Consequence: To use TCP_CORK, the LDAP server issues one setsockopt syscall per response/result, for no gain in the case of 389-ds.
Fix: Set nagle=off by default.
Result: The number of setsockopt calls is reduced.
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-05-18 15:45:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description thierry bordaz 2020-11-12 15:47:19 UTC
Description of problem:

nagle is enabled by default. To prevent returned search entries/results from getting stuck, the server sets/resets TCP_CORK with setsockopt. This can reduce the overhead of transmitted data, but the cost is heavy use of the setsockopt system call.

The benefit of nagle should be limited, because DS always writes a complete PDU (response or entries).

Switching nagle on or off has no significant impact on throughput, but switching it off reduces the number of syscalls.
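
The trade-off described above can be modelled with a small sketch. It is not the 389-ds code: `CountingSocket` and `send_responses` are hypothetical names, and the stand-in socket only records calls rather than touching the network. It assumes that nagle on costs one TCP_CORK setsockopt per response (in line with the roughly 1:1 setsockopt/sendto ratio in the "NAGLE: on" trace) and that nagle off amounts to a single TCP_NODELAY setsockopt per connection.

```python
import socket

# TCP_CORK is Linux-only; fall back to its Linux value (3) so the sketch imports elsewhere.
TCP_CORK = getattr(socket, "TCP_CORK", 3)


class CountingSocket:
    """Hypothetical stand-in for a connected TCP socket: records calls, sends nothing."""

    def __init__(self):
        self.setsockopt_calls = 0
        self.send_calls = 0

    def setsockopt(self, level, optname, value):
        self.setsockopt_calls += 1

    def sendall(self, data):
        self.send_calls += 1


def send_responses(sock, pdus, nagle_on):
    """Send each complete PDU and return (setsockopt calls, send calls)."""
    if not nagle_on:
        # Assumption: nagle off means TCP_NODELAY is set once per connection.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    for pdu in pdus:
        sock.sendall(pdu)
        if nagle_on:
            # Assumption: one TCP_CORK toggle per response so the PDU is not held back.
            sock.setsockopt(socket.IPPROTO_TCP, TCP_CORK, 0)
    return sock.setsockopt_calls, sock.send_calls


pdus = [b"entry"] * 1000
print(send_responses(CountingSocket(), pdus, nagle_on=True))   # setsockopt grows with responses
print(send_responses(CountingSocket(), pdus, nagle_on=False))  # setsockopt stays constant
```

In this model the number of setsockopt calls scales with the number of responses only when nagle is on, which is the overhead the bug is about.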

tracing: perf trace -t <worker> -s

NAGLE: off

 ns-slapd (108026), 1527476 events, 100.0%

   syscall            calls    total       min       avg       max      stddev
                               (msec)    (msec)    (msec)    (msec)        (%)
   --------------- -------- --------- --------- --------- ---------     ------
   futex             328122  3059.967     0.000     0.009   360.607     19.06%
   poll              124478   854.697     0.002     0.007   100.142     20.29%
   sendto            124149   785.173     0.002     0.006     0.041      0.11%
   write             123627   483.753     0.002     0.004     0.080      0.10%
   recvfrom           62295   245.209     0.002     0.004     0.040      0.12%
   madvise                9     0.211     0.016     0.023     0.042     12.46%
   sched_yield           34     0.146     0.004     0.004     0.005      1.32%



NAGLE: on
 ns-slapd (108026), 1784094 events, 100.0%

   syscall            calls    total       min       avg       max      stddev
                               (msec)    (msec)    (msec)    (msec)        (%)
   --------------- -------- --------- --------- --------- ---------     ------
   futex             328704  3054.546     0.000     0.009   346.687     17.23%
   setsockopt        124702   656.760     0.001     0.005     0.034      0.14%
   poll              125437   655.709     0.002     0.005   100.141     15.28%
   sendto            124501   464.307     0.001     0.004     0.039      0.14%
   write             125026   462.438     0.001     0.004     0.031      0.10%
   recvfrom           62799   248.383     0.002     0.004     0.030      0.11%
   madvise                3     0.066     0.014     0.022     0.034     26.97%
   sched_yield           10     0.043     0.004     0.004     0.004      1.69%
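
As a quick check of the two summaries above, the figures can be compared directly. This sketch only reuses numbers copied from the tables; the variable names are illustrative.

```python
# Figures copied from the two perf trace summaries above.
events_off, events_on = 1527476, 1784094
setsockopt_calls, setsockopt_msec = 124702, 656.760
sendto_calls_on = 124501

# Additional traced events recorded for the worker when nagle is on.
extra_events = events_on - events_off
# How many setsockopt calls the worker makes per sendto with nagle on.
calls_per_sendto = setsockopt_calls / sendto_calls_on

print(f"extra events with nagle on: {extra_events}")
print(f"setsockopt calls per sendto: {calls_per_sendto:.3f}")
print(f"time spent in setsockopt: {setsockopt_msec} msec")
```

With nagle on, the worker issues about one setsockopt per sendto, while setsockopt does not appear at all in the nagle-off summary.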

Version-Release number of selected component (if applicable):


How reproducible:
Will be described later

Actual results:
With searchrate, we see as many setsockopt calls as write calls.

Expected results:
See a lower number of setsockopt calls.

Comment 3 bsmejkal 2021-01-08 16:35:11 UTC
============================================================================================================ test session starts =============================================================================================================
platform linux -- Python 3.6.8, pytest-6.2.1, py-1.10.0, pluggy-0.13.1 -- /usr/bin/python3.6
cachedir: .pytest_cache
metadata: {'Python': '3.6.8', 'Platform': 'Linux-4.18.0-269.el8.x86_64-x86_64-with-redhat-8.4-Ootpa', 'Packages': {'pytest': '6.2.1', 'py': '1.10.0', 'pluggy': '0.13.1'}, 'Plugins': {'metadata': '1.11.0', 'html': '3.1.1', 'libfaketime': '0.1.2'}}
389-ds-base: 1.4.3.16-7.module+el8.4.0+9324+a82a8f71
nss: 3.53.1-17.el8_3
nspr: 4.25.0-2.el8_2
openldap: 2.4.46-16.el8
cyrus-sasl: 2.1.27-5.el8
FIPS: disabled
rootdir: /mnt/tests/rhds/tests/upstream/ds/dirsrvtests, configfile: pytest.ini
plugins: metadata-1.11.0, html-3.1.1, libfaketime-0.1.2
collected 11 items / 10 deselected / 1 selected                                                                                                                                                                                              

dirsrvtests/tests/suites/config/config_test.py::test_nagle_default_value PASSED                                                                                                                                                        [100%]

================================================================================================ 1 passed, 10 deselected in 8.69s ===========================================================================================================

Marking as Verified:Tested.

Comment 6 sgouvern 2021-01-18 14:15:32 UTC
wrong build attached to the errata -> moving to ITM12

Comment 7 sgouvern 2021-01-18 14:48:20 UTC
Correct build now attached to the errata: as per comment 3, marking as VERIFIED and moving back to ITM11

Comment 9 errata-xmlrpc 2021-05-18 15:45:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (389-ds:1.4 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1835