Red Hat Bugzilla – Bug 645437
NSS responder dies if DP dies during a request
Last modified: 2015-01-04 18:44:41 EST
+++ This bug was initially created as a clone of Bug #645434 +++
Description of problem:
If a data provider (DP) dies during an NSS request, the NSS responder also dies once the timeout of the open, unhandled requests is reached.
Version-Release number of selected component (if applicable):
At least sssd-1.2 and above
There is no known error in the LDAP provider that can be used to trigger this issue, so the sssd_be process must be killed manually.
Steps to Reproduce:
1. Configure sssd with id_provider=ldap.
2. Choose a slow LDAP server and a very large group or find some other way to make the LDAP request last long.
3. getent group very_large_group
4. kill sssd_be immediately after calling getent
5. wait until the timeout is reached (couple of minutes)
Actual results:
NSS responder dies.
Expected results:
NSS responder returns an error to the client.
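The reproduction steps above could be scripted roughly as follows. This is only a sketch and not part of the original report: the helper names, the half-second delay, and the ten-minute wait are illustrative assumptions, and it assumes the NSS responder runs as a process named sssd_nss. It has to run as root on a host already configured as in steps 1 and 2. Because the sssd monitor may restart a dead responder, the script compares responder PIDs before and after instead of only checking that a process exists.

```python
import subprocess
import time


def reproduction_commands(group="very_large_group"):
    """Return the lookup (step 3) and the provider kill (step 4) as argv lists."""
    lookup = ["getent", "group", group]
    # pkill sends SIGTERM by default; -x matches the exact process name.
    kill_provider = ["pkill", "-x", "sssd_be"]
    return lookup, kill_provider


def responder_pids():
    """PIDs of the NSS responder (process name sssd_nss is an assumption)."""
    result = subprocess.run(
        ["pgrep", "-x", "sssd_nss"], capture_output=True, text=True, check=False
    )
    return set(result.stdout.split())


def reproduce(group="very_large_group", kill_delay=0.5, wait_timeout=600):
    """Run steps 3 to 5 and report whether the responder survived."""
    lookup, kill_provider = reproduction_commands(group)
    pids_before = responder_pids()

    # Step 3: start the slow lookup in the background.
    client = subprocess.Popen(lookup, stdout=subprocess.DEVNULL)

    # Step 4: kill the data provider while the request is still open.
    time.sleep(kill_delay)  # hypothetical delay, tune for the test server
    subprocess.run(kill_provider, check=False)

    # Step 5: wait for the request timeout to expire.
    try:
        client_status = client.wait(timeout=wait_timeout)
    except subprocess.TimeoutExpired:
        client.kill()
        client.wait()
        client_status = None

    pids_after = responder_pids()
    return {
        "client_exit_status": client_status,
        # True only if a responder existed and the same PIDs are still alive.
        "responder_survived": bool(pids_before) and pids_before <= pids_after,
    }


if __name__ == "__main__":
    print(reproduce())
```

With the fix applied, getent would be expected to finish with an error status while the responder PID stays the same. As the report notes, the failure depends on timing, so a single run may not trigger it.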
The upstream bug can be found here: https://fedorahosted.org/sssd/ticket/654
QE and Dev have spent days trying to reproduce and verify this bug. It is extremely hard to do. Development has been successful a few times, but very inconsistently. Given the nature of the bug, the fix is not very complicated, and from an engineering perspective it's more or less obvious. Since this fix has not caused any regressions in any automated or manual testing, I will mark the bug verified.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.