Bug 2195919
| Summary: | sssd-be tends to run out of system resources, hitting the maximum number of open files | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Abhijit Roy <abroy> |
| Component: | sssd | Assignee: | Sumit Bose <sbose> |
| Status: | VERIFIED | QA Contact: | Anuj Borah <aborah> |
| Severity: | medium | Priority: | unspecified |
| Version: | 8.7 | CC: | aborah, aboscatt, atikhono, john.sincock, pbrezina, pkulkarn, sbose, sgadekar |
| Target Milestone: | rc | Keywords: | Triaged |
| Hardware: | All | OS: | Linux |
| Whiteboard: | sync-to-jira | Fixed In Version: | sssd-2.9.1-1.el8 |
| Type: | Bug | Deadline: | 2023-07-03 |

Description
Abhijit Roy
2023-05-06 15:57:58 UTC
Created attachment 1962799: lsof
Created attachment 1962800: proc_pgrep sssd_be_fd

Hi,

can you check with the 'ps' command if there are actually many ldap_child/krb5_child processes running? Can you share the ldap_child/krb5_child log file to check if there are any issues?

bye,
Sumit

Hello Sumit,

Log details: https://drive.google.com/file/d/1vLBrigx6tAn98PSyrB9mSfCdwnWHxTIA/view?usp=sharing

(In reply to Sumit Bose from comment #5)
> Hi,
>
> some of the relevant log files are truncated in the sos-report. Please ask
> to create a tar ball with the logs from /var/log/sssd and attach it to the
> case.
>
> bye,
> Sumit

Hi Sumit,

Did you get a chance to look into this issue?

For now, I have asked the customer to check the following:

1) Log in to the problematic host as root.
2) vi /etc/security/limits.conf and set the values as below.
--------------------------------------
root soft nofile 20480
root hard nofile 20480
--------------------------------------
3) Log out and log in again as root and check that the output of the command below shows "20480".
# ulimit -n
4) Restart the SSSD service.
# systemctl restart sssd
# systemctl status sssd
5) Check if the issue is still observed.

Hi,

to reproduce the issue I replaced /usr/libexec/sssd/krb5_child with a shell script like

#!/bin/bash
sleep 10

to create a reliable timeout when the backend calls krb5_child. Then I ran authentications from one shell while watching the output of `ls -al /proc/$(pidof sssd_be)/fd` in another window.

HTH

bye,
Sumit

I also am seeing this after recent updates.

# cat /var/log/messages | grep -i "open files" | cut -f 5- -d ' '
sssd_be[4114231]: Could not open file [/var/log/sssd/krb5_child.log]. Error: [24][Too many open files]
...

Necessary to restart sssd to temporarily fix.

Disgraceful to see bugs like this in an "enterprise" OS, but, this is what we have grown to expect from Red Hat, and from awful components like SSSD.

For me, issue first observed in same version:
sssd-2.7.3-4.el8_7.3.x86_64

(In reply to John from comment #13)
> I also am seeing this after recent updates.

This bug (that is being fixed by https://github.com/SSSD/sssd/pull/6745) was there for ages. Something else must have changed for this bug to be triggered now in your environment.

(In reply to John from comment #14)
> For me, issue first observed in same version:
> sssd-2.7.3-4.el8_7.3.x86_64

Please look into 'krb5_child.log' to figure out why (if) it started failing often.

(In reply to Alexey Tikhonov from comment #15)
> Please look into 'krb5_child.log' to figure out why (if) it started failing
> often.

Thanks for the suggestion but I would rather stab myself in the face with a fork.

Never, in the 10 years or so I've had occasion to look at SSSD logs, have I ever found SSSD logs useful to debug any of the many issues I've had with SSSD.

I have over 30 years of experience with a variety of unixes, and SSSD logs are the most obfuscated and useless logs I have ever seen. It doesn't matter how low or high your debug level is, all you get is more and more misleading noise, with a million things failing that have no relevance to your issue, and which occur even when sssd is "working". The logs never, ever, contain anything useful.

The only way for anyone to resolve any issues with sssd is just by randomly changing settings in the config file and praying.

I'll just ignore the problem and hope it goes away, since that seems to work for Red Hat on a regular basis, maybe it'll work for me just this once.

(In reply to John from comment #16)
>
> I'll just ignore the problem and hope it goes away

Another option could be to reach out to your customer support point of contact.

Hi John,

Thank you for bringing your concerns/experience to our attention. We understand that you are encountering this issue with SSSD and are disappointed with the overall debugging experience. We apologize for any inconvenience you have experienced so far.

While we appreciate your feedback, it would be immensely helpful if you could provide us with more specific details about the problems you are facing.
This will enable us to investigate the issue more effectively and find a resolution. We would like to work collaboratively with you, so any specific information or log excerpts you can share would greatly assist us in identifying the root cause.

In case you are still not interested in going through the logs, we would like to understand your experience in more detail. Could you please provide specific examples of how the logs and debugging experience were not useful and appeared obfuscated? This will help us understand what aspects you found challenging or missed during your troubleshooting process.

Of course, another option is to engage the customer support point of contact, as Alexey previously mentioned. However, it's important to note that simply reaching out to customer support may not necessarily contribute to improving the overall user experience, which is why we are asking for details about debugging/logs.

Software issues can be frustrating, but our teams are dedicated to continuously improving our OS and its components. Ignoring feedback is not part of our approach; instead, we actively seek ways to enhance our system based on input from users like you.

Please let us know if you can provide any further details or if you would like us to assist you in any specific way. We are here to help and ensure a positive user experience.

Best regards,
Andre Boscatto
Product Owner, Identity and Access Management Department

Upstream PR: https://github.com/SSSD/sssd/pull/6745

Pushed PR: https://github.com/SSSD/sssd/pull/6745

* `master`
    * 455611952f90ed0cefaff1e840623ea14ac06be1 - krb5: make sure sockets are closed on timeouts
* `sssd-2-9`
    * 4d2cf0b62bbf0386755550bfad684cf36b36eccd - krb5: make sure sockets are closed on timeouts
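
The reproduction described earlier in this thread watches the descriptor list of sssd_be by hand. The same check can be sketched as a few shell functions that compare the number of open descriptors with the process's soft limit. This is only a rough sketch, assuming a Linux /proc filesystem; the function names `count_fds`, `soft_nofile` and `report_fds` are hypothetical helpers and not part of SSSD.

```shell
#!/bin/bash
# Hypothetical helpers (not shipped with SSSD); they rely on Linux /proc.

# count_fds PID: print how many descriptors the process currently has open.
count_fds() {
    local pid=$1
    ls -1 "/proc/${pid}/fd" 2>/dev/null | wc -l
}

# soft_nofile PID: print the soft "Max open files" limit of the process.
# In /proc/PID/limits the soft limit is the 4th whitespace-separated field.
soft_nofile() {
    local pid=$1
    awk '/^Max open files/ {print $4}' "/proc/${pid}/limits"
}

# report_fds PID: print one summary line for the process.
report_fds() {
    local pid=$1
    printf 'pid=%s open=%s soft_limit=%s\n' \
        "$pid" "$(count_fds "$pid")" "$(soft_nofile "$pid")"
}
```

It could be used, for example, as `for p in $(pidof sssd_be); do report_fds "$p"; done`, run as root and repeated while authentications are in progress. With the leak discussed here, the `open` value would be expected to keep growing toward `soft_limit` after krb5_child timeouts.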