Bug 645437

Summary: NSS responder dies if DP dies during a request
Product: Red Hat Enterprise Linux 5
Reporter: Stephen Gallagher <sgallagh>
Component: sssd
Assignee: Stephen Gallagher <sgallagh>
Status: CLOSED ERRATA
QA Contact: Chandrasekar Kannan <ckannan>
Severity: medium
Docs Contact:
Priority: low
Version: 5.6
CC: benl, dpal, jgalipea, jhrozek, sbose, sgallagh, ssorce
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: sssd-1.2.1-35.el5
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 645434
Environment:
Last Closed: 2011-01-13 22:37:25 UTC
Type: ---
Bug Depends On: 645434, 645438    
Bug Blocks: 640580    

Description Stephen Gallagher 2010-10-21 13:48:54 UTC
+++ This bug was initially created as a clone of Bug #645434 +++

Description of problem:
If a data provider dies during an NSS request, the NSS responder dies once the timeout for the open, unhandled requests is reached.

Version-Release number of selected component (if applicable):
At least sssd-1.2 and above

How reproducible:
There is no known error in the LDAP provider that can be used to trigger this issue, so the sssd_be process must be killed manually.

Steps to Reproduce:
1. Configure sssd with id_provider=ldap.
2. Choose a slow LDAP server and a very large group, or find some other way to make the LDAP request take a long time.
3. Run: getent group very_large_group
4. Kill sssd_be immediately after calling getent.
5. Wait until the timeout is reached (a couple of minutes).
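
The steps above can be sketched as a small shell helper. This is only an illustration, not a script from the original report: it assumes sssd is already configured with id_provider=ldap, that pkill and pgrep are available, and that the NSS responder runs as a process named sssd_nss. TARGET_GROUP, WAIT_SECS, DRY_RUN, run and reproduce are hypothetical names, and the 300-second wait is a guess at "a couple of minutes". By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical reproduction helper; all names and defaults are assumptions.
TARGET_GROUP="${TARGET_GROUP:-very_large_group}"  # placeholder group from step 3
WAIT_SECS="${WAIT_SECS:-300}"                     # assumed timeout, not from the report
DRY_RUN="${DRY_RUN:-1}"                           # 1 = print commands, do not run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    # Step 3: start the slow group lookup in the background.
    run getent group "$TARGET_GROUP" &
    lookup_pid=$!

    # Step 4: kill the data provider right after the lookup starts.
    run pkill -x sssd_be

    # Step 5: wait for the pending request to time out.
    run sleep "$WAIT_SECS"

    # Collect the client's result; an error here is the expected outcome.
    wait "$lookup_pid"
    echo "getent exit status: $?"

    # Check whether the NSS responder process is still alive.
    run pgrep -x sssd_nss
}
```

To run it for real, source the file as root, set DRY_RUN=0, and call reproduce. Because the failure depends on timing, several attempts may be needed.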
  
Actual results:
NSS responder dies.

Expected results:
NSS responder returns an error to the client.

Additional info:
The upstream bug can be found here: https://fedorahosted.org/sssd/ticket/654

Comment 2 Jenny Severance 2010-11-12 13:27:51 UTC
QE and Dev have spent days trying to reproduce and verify this bug. It is extremely hard to do. Development has succeeded a few times, but very inconsistently. Given the nature of the bug, the fix is not very complicated, and from an engineering perspective it is more or less obvious. Since this fix has not caused any regressions in any automated or manual testing, I will mark the bug verified.

Comment 4 errata-xmlrpc 2011-01-13 22:37:25 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0044.html