Bug 645434 - NSS responder dies if DP dies during a request
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: sssd
Version: 14
Hardware: All
OS: All
Priority: low
Severity: medium
Assigned To: Stephen Gallagher
QA Contact: Fedora Extras Quality Assurance
Depends On:
Blocks: 645437 645438
Reported: 2010-10-21 09:44 EDT by Sumit Bose
Modified: 2010-11-16 18:19 EST
Fixed In Version: sssd-1.4.1-1.fc14
Doc Type: Bug Fix
Clones: 645437 645438
Last Closed: 2010-11-16 18:19:42 EST
Description Sumit Bose 2010-10-21 09:44:33 EDT
Description of problem:
If a data provider (DP) dies during an NSS request, the NSS responder also dies once the timeout for the open, unhandled requests is reached.

Version-Release number of selected component (if applicable):
At least sssd-1.2 and above

How reproducible:
There is no known error in the LDAP provider that can be used to trigger this issue, so the sssd_be process must be killed manually.

Steps to Reproduce:
1. Configure sssd with id_provider=ldap.
2. Choose a slow LDAP server and a very large group, or find some other way to make the LDAP request take a long time.
3. Run: getent group very_large_group
4. Kill sssd_be immediately after calling getent.
5. Wait until the timeout is reached (a couple of minutes).
  
Actual results:
NSS responder dies.

Expected results:
NSS responder returns an error to the client.
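
From the client side, one way to observe the expected behaviour is to look at the exit status of getent once it returns. The snippet below is only a sketch and is not part of the original report: describe_getent_status is a hypothetical helper, and the status meanings are the ones listed in the getent(1) manual page. Which status getent actually reports when the responder returns an error is an assumption (most likely 2, key not found); the point is that the call returns and sssd_nss keeps running.

```shell
# Hypothetical helper (not from the original report): translate a getent
# exit status into text, following the exit statuses listed in getent(1).
describe_getent_status() {
    case "$1" in
        0) echo "success" ;;
        1) echo "missing arguments or unknown database" ;;
        2) echo "key not found" ;;
        3) echo "enumeration not supported" ;;
        *) echo "unexpected status $1" ;;
    esac
}
```

It could be called right after the lookup, e.g. getent group very_large_group; describe_getent_status $?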

Additional info:
The upstream bug can be found here: https://fedorahosted.org/sssd/ticket/654
Comment 1 Sumit Bose 2010-10-22 07:45:59 EDT
Based on an idea from Jan Zelený <jzeleny@redhat.com> I found an easier way to reproduce this issue:

Steps to Reproduce:
1. Configure sssd with id_provider=ldap; any LDAP server will do.
2. Start sssd, preferably with an empty cache (rm -f /var/lib/sss/db/*).
3. Find the PID of sssd_nss:
   pgrep sssd_nss
4. Add a delay on the interface used to contact the LDAP server. If the LDAP server runs locally, use lo, e.g.
   tc qdisc add dev lo root netem delay 3s
5. Run
   while /bin/true; do if pgrep getent 1> /dev/null; then killall -9 /usr/libexec/sssd/sssd_be; break; fi; sleep 1; done
   (this kills sssd_be as soon as a getent command is running).
6. In a different shell, call
   getent group some_group_which_is_not_in_the_cache
7. Wait until the getent call returns; this can take up to 5 minutes.
8. Call
   pgrep sssd_nss
   again.
9. Remove the delay:
   tc qdisc del dev lo root

Actual results:
The two PIDs differ, i.e. sssd_nss died and was restarted.

Expected results:
The two PIDs are the same, i.e. sssd_nss didn't die
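
The PID comparison from steps 3 and 8 can also be scripted. The following is a minimal sketch and not part of the original report: nss_pid and check_responder are hypothetical helper names, and it assumes that the oldest sssd_nss process reported by pgrep -o is the responder of interest.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original report) for steps 3 and 8.

# Print the PID of the oldest running sssd_nss process, or nothing if none.
nss_pid() {
    pgrep -o sssd_nss
}

# Compare the PID recorded before the test with the one seen afterwards.
# Prints a verdict; returns 0 if the PID is unchanged, 1 if it changed,
# and 2 if either PID is missing.
check_responder() {
    before=$1
    after=$2
    if [ -z "$before" ] || [ -z "$after" ]; then
        echo "unknown: sssd_nss was not running"
        return 2
    fi
    if [ "$before" = "$after" ]; then
        echo "ok: sssd_nss survived (PID $before)"
        return 0
    fi
    echo "bug: sssd_nss was restarted (PID $before -> $after)"
    return 1
}
```

Record before=$(nss_pid) at step 3, then call check_responder "$before" "$(nss_pid)" at step 8; a non-zero return with the "bug:" message corresponds to the actual results described above.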
Comment 2 Fedora Update System 2010-11-05 14:34:42 EDT
sssd-1.4.1-1.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/sssd-1.4.1-1.fc14
Comment 3 Fedora Update System 2010-11-06 19:40:59 EDT
sssd-1.4.1-1.fc14 has been pushed to the Fedora 14 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
   su -c 'yum --enablerepo=updates-testing update sssd'
You can provide feedback for this update here: https://admin.fedoraproject.org/updates/sssd-1.4.1-1.fc14
Comment 4 Fedora Update System 2010-11-16 18:19:33 EST
sssd-1.4.1-1.fc14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.
