Bug 799929

Summary: Raise limits for max num of files sssd_nss/sssd_pam can use
Product: Red Hat Enterprise Linux 6
Reporter: Stephen Gallagher <sgallagh>
Component: sssd
Assignee: Stephen Gallagher <sgallagh>
Status: CLOSED ERRATA
QA Contact: IDM QE LIST <seceng-idm-qe-list>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 6.3
CC: apeetham, chad, grajaiya, jgalipea, mniranja, prc, tuphill
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: sssd-1.8.0-12.el6
Doc Type: Bug Fix
Doc Text:
Cause: SSSD was limited to using 1024 file descriptors for its sssd_nss and sssd_pam responder processes. Consequence: On very busy systems with many user lookups and/or authentications, SSSD could run out of descriptors and stop responding to requests until it was restarted. Fix: SSSD has had its limit increased to 4096 descriptors. Result: Users should not experience the resource exhaustion described above.
Story Points: ---
Clone Of:
: 815154
Environment:
Last Closed: 2012-06-20 11:55:34 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 799968, 834621    
Bug Blocks: 815154    

Description Stephen Gallagher 2012-03-05 12:50:24 UTC
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/sssd/ticket/1197

We currently keep the default max open files limit, which is set at around 1000 files for normal processes.
Since we let clients keep their connections open for a long time, this limit is too low on very large servers, which can easily have more than 1000 client processes connected.

We should use setrlimit() and raise this limit to 8k files for now.
We can tune it back down a bit once we have the shared memory cache.
At that point keeping the socket open should no longer be necessary, and we should change the clients (or even just the server) to close the socket once a request is done.
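
As a rough illustration (a standalone sketch, not SSSD's actual code), raising the per-process descriptor limit with setrlimit(2) could look like the following; without CAP_SYS_RESOURCE the soft limit cannot be raised above the existing hard limit, so the request gets clamped:

/* Minimal sketch: raise the RLIMIT_NOFILE soft limit toward a
 * desired value, clamping to the hard limit if we lack the
 * privilege (CAP_SYS_RESOURCE) to raise the hard limit itself. */
#include <stdio.h>
#include <sys/resource.h>

static int raise_fd_limit(rlim_t desired)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return -1;
    }

    rl.rlim_cur = (desired < rl.rlim_max) ? desired : rl.rlim_max;

    if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("setrlimit");
        return -1;
    }

    printf("Max open files now %lu (hard limit %lu)\n",
           (unsigned long)rl.rlim_cur, (unsigned long)rl.rlim_max);
    return 0;
}

int main(void)
{
    return raise_fd_limit(8192) == 0 ? 0 : 1;
}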

Because we do not control the clients, we should also keep track of file descriptors and periodically prune inactive ones when we are close to the max files limit, in order to avoid starving the system (when we run out of FDs we are incapable of serving new client processes).
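
That pruning idea could look roughly like the sketch below (the client bookkeeping structures and names here are purely illustrative, not SSSD's): track the last activity time of each client connection and, when the number of open sockets gets close to the RLIMIT_NOFILE soft limit, close the connection that has been idle the longest.

/* Illustrative sketch of idle-connection pruning; field and
 * function names are hypothetical, not taken from SSSD. */
#include <stddef.h>
#include <time.h>
#include <unistd.h>
#include <sys/resource.h>

struct client {
    int fd;                 /* -1 when the slot is unused */
    time_t last_activity;   /* updated on every request */
};

#define MAX_CLIENTS 8192
static struct client clients[MAX_CLIENTS];
static size_t num_open;

static void prune_if_near_limit(void)
{
    struct rlimit rl;
    struct client *oldest = NULL;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        return;
    }

    /* Keep some headroom below the soft limit before pruning. */
    if (rl.rlim_cur == RLIM_INFINITY || num_open + 64 < rl.rlim_cur) {
        return;
    }

    /* Find the connection that has been inactive the longest. */
    for (size_t i = 0; i < MAX_CLIENTS; i++) {
        if (clients[i].fd < 0) {
            continue;
        }
        if (oldest == NULL ||
            clients[i].last_activity < oldest->last_activity) {
            oldest = &clients[i];
        }
    }

    if (oldest != NULL) {
        close(oldest->fd);
        oldest->fd = -1;
        num_open--;
    }
}

int main(void)
{
    prune_if_near_limit();  /* no-op here, since no clients are open */
    return 0;
}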

Comment 1 Stephen Gallagher 2012-03-05 14:15:36 UTC
Before this fix, the sssd_nss and sssd_pam processes would be limited to 1024 open file descriptors. After this patch, they will be limited to either 4096 or 8192 file descriptors, depending on SELinux configuration.

The code internally will try to request 8192 descriptors, but if SELinux is enforcing (and not manually configured to grant SSSD the CAP_SYS_RESOURCE capability), it will be clamped down to 4096.

You can test this with:
cat /proc/<PID>/limits | grep "Max open files"
The first number after "Max open files" is the soft limit currently in effect; the second is the hard limit.


BZ #799968 has been opened to track adding the SELinux policy to allow this value to be 8192 by default.

Comment 4 Amith 2012-04-27 21:31:50 UTC
Verified on sssd-1.8.0-22.el6.x86_64.
Changeset link: https://engineering.redhat.com/trac/SSSDtetframework/changeset/897

The output of the beaker automation script is given below:

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [   LOG    ] :: Verify BZ release ticket #355 :- Raise limits for max num of files sssd_nss/sssd_pam can use
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: [   PASS   ] :: Running 'cat /proc/5162/limits | grep "Max open files" > /tmp/limit_file'
:: [   PASS   ] :: Running 'cat /tmp/limit_file | tr -s " " > /tmp/lmt_file'
:: [   PASS   ] :: Verifying whether sssd_nss limit val = 4096
:: [   PASS   ] :: Running 'cat /proc/5163/limits | grep "Max open files" > /tmp/limit_file'
:: [   PASS   ] :: Running 'cat /tmp/limit_file | tr -s " " > /tmp/lmt_file'
:: [   PASS   ] :: Verifying whether sssd_pam limit val = 4096
:: [   LOG    ] :: Duration: 1s
:: [   LOG    ] :: Assertions: 13 good, 0 bad
:: [   PASS   ] :: RESULT: Verify BZ release ticket #355 :- Raise limits for max num of files sssd_nss/sssd_pam can use

Comment 5 Stephen Gallagher 2012-06-12 13:45:13 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: SSSD was limited to using 1024 file descriptors for its sssd_nss and sssd_pam responder processes.

Consequence: On very busy systems with many user lookups and/or authentications, SSSD could run out of descriptors and stop responding to requests until it was restarted.

Fix: SSSD has had its limit increased to 4096 descriptors.

Result: Users should not experience the resource exhaustion described above.

Comment 7 errata-xmlrpc 2012-06-20 11:55:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0747.html

Comment 8 Thomas Uphill 2013-10-02 20:31:28 UTC
Still seeing the issue on a 6.1 box with 1.8.0-32.el6

Worth noting that increasing the limit on the running process causes sssd to recover:

# cat /proc/$(pidof sssd_pam)/limits |grep files
Max open files            1024                 1024                 files
# echo -n "Max open files=8192:8192" >/proc/$(pidof sssd_pam)/limits
# cat /proc/$(pidof sssd_pam)/limits |grep files
Max open files            8192                 8192                 files 

sssd responds after this.
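
For reference, on systems where the kernel and glibc provide prlimit(2) (mainline Linux >= 2.6.36 and glibc >= 2.13; whether that applies to a given box is an assumption to verify), the same adjustment can be made from a small helper instead of writing to /proc/<pid>/limits. A minimal sketch:

#define _GNU_SOURCE
/* Hypothetical helper: raise RLIMIT_NOFILE for a running process
 * identified by PID using prlimit(2). Raising another process's
 * hard limit requires CAP_SYS_RESOURCE (or root). */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/resource.h>

int main(int argc, char *argv[])
{
    struct rlimit rl = { .rlim_cur = 8192, .rlim_max = 8192 };
    pid_t pid;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    pid = (pid_t)atol(argv[1]);

    if (prlimit(pid, RLIMIT_NOFILE, &rl, NULL) != 0) {
        perror("prlimit");
        return 1;
    }
    return 0;
}

Invoking it with $(pidof sssd_pam) as the PID argument would have the same effect as the /proc write shown above.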