Bug 2210491

Summary: dtablesize being set to soft maxfiledescriptor limit causing massive slowdown in large environments.
Product: Red Hat Enterprise Linux 8 Reporter: Jerone Young <jyoung>
Component: 389-ds-base Assignee: Jamie Chapman <jachapma>
Status: MODIFIED --- QA Contact: LDAP QA Team <idm-ds-qe-bugs>
Severity: high Docs Contact:
Priority: unspecified    
Version: 8.8 CC: afarley, apeddire, gkimetto, idm-ds-dev-bugs, jachapma, jonmoore, mreynolds, mrhodes, msauton, tbordaz, tmihinto, vashirov
Target Milestone: rc Keywords: Triaged
Target Release: 8.9 Flags: tbordaz: needinfo? (jachapma)
Hardware: All   
OS: Linux   
Whiteboard: sync-to-jira
Fixed In Version: 389-ds-1.4-820230816162424-17499975 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jerone Young 2023-05-27 22:02:36 UTC
Description of problem:
 It looks like dtablesize is being set to the maxfiledescriptors soft limit. 

 This problem was found while migrating from RHEL 7 IDM to RHEL 8 IDM at a large client site. At first I used a systemd override that set maxfiledescriptors to 65535, which resolved the issues at the client site. After further investigation, though, I could see that the issue was not the value of maxfiledescriptors but that dtablesize was being set to the soft limit.


Version-Release number of selected component (if applicable):
     Affects both RHEL 8 & RHEL 9.


How reproducible:

It looks like dtablesize is being set to the maxfiledescriptors soft limit. 

Problem:
=======
     389ds is not setting dtablesize properly when systemd sets maxfiledescriptors with its default settings.
     dtablesize stays at 1024, which causes a massive slowdown once you hit around 950 connections: queries become very slow, and the
     server can't go above ~970 concurrent connections.

      It looks like dtablesize is being set to the soft limit when it needs to be set to the actual (hard) limit.

      With no settings changed, running the following commands shows:
                systemctl show dirsrv@<INSTANCE> |grep -i Limitnofile
                -----------------------------------------------------------------------------
                - RHEL 8 (Directory Server)
                      LimitNOFILE=262144
                      LimitNOFILESoft=1024      < ---- Notice

                - RHEL 9 (IDM system)
                     LimitNOFILE=524288
                     LimitNOFILESoft=1024        <--- Notice

               dsconf  <instance> config get nsslapd-maxdescriptors
               --------------------------------------------------------------------------
                - RHEL 8 
                      nsslapd-maxdescriptors: 262144

                - RHEL 9
                       nsslapd-maxdescriptors: 524288


 (THE PROBLEM)     dsconf <instance> monitor server |grep dtablesize 
                                   --------------------------------------------------------------------
                                     - RHEL 8
                                            dtablesize: 1024

                                     - RHEL 9
                                            dtablesize: 1024
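
A quick way to cross-check the soft limit the running process actually received is to read its /proc/<pid>/limits file. The helper below is only a sketch: the function name nofile_limits is made up for illustration, and it assumes the standard Linux layout of the "Max open files" row (soft value in the fourth field, hard value in the fifth).

```shell
# Hypothetical helper (not part of 389-ds): print the soft and hard
# "Max open files" limits from a /proc/<pid>/limits style file.
nofile_limits() {
    # Row layout: "Max open files  <soft>  <hard>  files"
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "$1"
}

# Example: limits of the current shell. For Directory Server, pass the
# limits file of the ns-slapd process instead (finding its PID is left
# to the reader).
nofile_limits /proc/self/limits
```

If the soft value printed for the server process is 1024 while the hard value matches nsslapd-maxdescriptors, that would be consistent with the dtablesize value shown above.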


WorkAround:
==========
        Once the workaround is in place, connections can go much higher. Testing showed 7,000+ at the client site, and queries were very fast.

        This appears to work because the override sets both the hard and soft limits, so dtablesize gets set properly.
         
         Workaround
         -----------------
         - RHEL 8
                    mkdir -p /etc/systemd/system/dirsrv@.service.d/
                    cat > /etc/systemd/system/dirsrv@.service.d/filelimts.conf << EOF
                    [Service]
                    LimitNOFILE=262144
                    EOF
                  
                    systemctl daemon-reload
                    systemctl restart dirsrv@<instance>
                      
          - RHEL 9
                    mkdir -p /etc/systemd/system/dirsrv@.service.d/
                    cat > /etc/systemd/system/dirsrv@.service.d/filelimts.conf << EOF
                    [Service]
                    LimitNOFILE=524288
                    EOF
                  
                    systemctl daemon-reload
                    systemctl restart dirsrv@<instance>       
          
               systemctl show dirsrv@<INSTANCE> |grep -i Limitnofile
                -----------------------------------------------------------------------------
                - RHEL 8 (Directory Server)
                          LimitNOFILE=262144
                          LimitNOFILESoft=262144    <---- THIS !!
 
                - RHEL 9 (IDM system)
                           LimitNOFILE=524288             
                           LimitNOFILESoft=524288   <-- THIS !!
                           
                    
               dsconf  <instance> config get nsslapd-maxdescriptors
               --------------------------------------------------------------------------
                - RHEL 8 
                      nsslapd-maxdescriptors: 262144

                - RHEL 9
                       nsslapd-maxdescriptors: 524288

              
                dsconf <instance> monitor server |grep dtablesize 
                --------------------------------------------------------------------
                - RHEL 8
                         dtablesize:  262144

                 - RHEL 9
                         dtablesize: 524288
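
The before/after comparison of the unit limits can be scripted. This is a minimal sketch, assuming the `systemctl show` output contains the LimitNOFILE= and LimitNOFILESoft= lines as shown above; the function name check_nofile_limits is hypothetical.

```shell
# Hypothetical helper: read `systemctl show` output on stdin and report
# whether the soft open-files limit matches the hard limit.
check_nofile_limits() {
    awk -F= '
        $1 == "LimitNOFILE"     { hard = $2 }
        $1 == "LimitNOFILESoft" { soft = $2 }
        END {
            if (soft == hard) print "ok soft=" soft " hard=" hard
            else              print "mismatch soft=" soft " hard=" hard
        }'
}

# Example with the default RHEL 8 values from this report:
printf 'LimitNOFILE=262144\nLimitNOFILESoft=1024\n' | check_nofile_limits
```

On a live system the input would come from `systemctl show dirsrv@<INSTANCE> | check_nofile_limits`; a "mismatch" result corresponds to the state in which dtablesize stays at 1024.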



Possible FIX
==========
     Looks like the fix would be to have dtablesize set to maxfiledescriptors or the hard maxfiledescriptor limit.
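
     To illustrate the idea only (this is not the upstream patch, and the real change would belong in the server code rather than in a shell script), the sketch below raises the soft open-files limit of the current shell to its hard limit using the shell's ulimit builtin. The function name raise_nofile_soft_to_hard is made up for this example.

```shell
# Illustration only: raise the soft open-files limit of the current
# shell (and its future children) to the hard limit.
raise_nofile_soft_to_hard() {
    hard=$(ulimit -Hn)    # hard limit: the ceiling for this process
    ulimit -Sn "$hard"    # soft limit may always be raised up to the hard limit
}

raise_nofile_soft_to_hard
echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn)"
```

     A process started from that shell then sees the same value for both limits, which is the condition the systemd override workaround creates.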

Comment 6 Marc Sauton 2023-07-20 17:29:47 UTC
the dtablesize has been documented in the RHDS perf guide until RHDS-11 ( no longer in RHDS-12 )
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/pdf/performance_tuning_guide/red_hat_directory_server-11-performance_tuning_guide-en-us.pdf
2.1.1. Monitoring the Directory Server Using the Command Line

a note could be added in

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/pdf/tuning_performance_in_identity_management/red_hat_enterprise_linux-9-tuning_performance_in_identity_management-en-us.pdf
for
a system general configuration with sysctl for
somaxconn
dtablesize
and/or
Chapter 6. Adjusting IdM Directory Server performance
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/tuning_performance_in_identity_management/adjusting-idm-directory-server-performance_tuning-performance-in-idm
and/or
Chapter 7. Adjusting the performance of the KDC
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/tuning_performance_in_identity_management/assembly_adjusting-the-performance-of-the-kdc_tuning-performance-in-idm

but note such tuning may displace the bottleneck: where connections were previously throttled, they will now hammer the LDAP worker threads and cause a lot more contention with plug-ins.

Comment 7 Jerone Young 2023-07-20 17:35:34 UTC
@Marc
     Upstream has already confirmed this as a bug and is working on it now.
 
     dtablesize is in RHDS-12; I show it in the example.

     The workaround provided is being used and has been proven by customers.

Comment 8 Jerone Young 2023-07-20 17:41:59 UTC
@Marc
     What I get for multi-tasking. Yes you are correct that notes should be added to documentation.

Comment 10 Viktor Ashirov 2023-07-24 14:57:46 UTC
Patch was merged upstream, moving to POST.