Description of problem:
If a data provider dies during an NSS request, the NSS responder also dies once the timeout of the open, unhandled requests is reached.
Version-Release number of selected component (if applicable):
At least sssd-1.2 and above
There is no known error in the LDAP provider that can be used to trigger this issue, so the sssd_be process must be killed manually.
Steps to Reproduce:
1. Configure sssd with id_provider=ldap.
2. Choose a slow LDAP server and a very large group, or find some other way to make the LDAP request take a long time.
3. getent group very_large_group
4. Kill sssd_be immediately after calling getent.
5. Wait until the timeout is reached (a couple of minutes).
Actual results:
NSS responder dies.
Expected results:
NSS responder returns an error to the client.
The upstream bug can be found here: https://fedorahosted.org/sssd/ticket/654
Based on an idea from Jan Zelený <email@example.com>, I found an easier way to reproduce this issue:
Steps to Reproduce:
1. Configure sssd with id_provider=ldap, any LDAP server is ok
2. Start sssd, preferably with an empty cache (rm -f /var/lib/sss/db/*)
3. Find the pid of sssd_nss
4. Define a delay on the interface that is used to contact the LDAP server. If the LDAP server runs locally, use lo, e.g.
tc qdisc add dev lo root netem delay 3s
5. Run the following loop in one shell:
while /bin/true; do if pgrep getent 1> /dev/null; then killall -9 /usr/libexec/sssd/sssd_be; break; fi; sleep 1; done
(this will kill sssd_be as soon as a getent command is running)
6. In a different shell, call
getent group some_group_which_is_not_in_the_cache
7. Wait until the getent call returns; this can take up to 5 minutes.
8. Find the pid of sssd_nss again
9. Remove the delay
tc qdisc del dev lo root
Actual results:
The two PIDs differ, i.e. sssd_nss died and was restarted.
Expected results:
The two PIDs are the same, i.e. sssd_nss didn't die.
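The PID comparison can be scripted so the result does not have to be read by eye. The following is only a minimal sketch: check_responder is a hypothetical helper name (not part of sssd), and it assumes the sssd_nss PID is recorded once in step 3 and once more after the getent call has returned.

```shell
#!/bin/sh
# Hypothetical helper, not part of sssd: compare the sssd_nss PID recorded
# before the test with the PID recorded after the getent call returned.
check_responder() {
    before=$1
    after=$2
    # Without both PIDs no statement about the responder is possible.
    if [ -z "$before" ] || [ -z "$after" ]; then
        echo "unknown: missing PID"
        return 2
    fi
    # Same PID: the responder survived the death of sssd_be.
    if [ "$before" = "$after" ]; then
        echo "ok: sssd_nss survived (PID $before)"
        return 0
    fi
    # Different PID: the responder died and the monitor restarted it.
    echo "bug: sssd_nss restarted (PID $before -> $after)"
    return 1
}

# Possible usage (assumes a single sssd_nss process is running):
#   pid_before=$(pgrep -x sssd_nss)
#   ... run the steps above ...
#   pid_after=$(pgrep -x sssd_nss)
#   check_responder "$pid_before" "$pid_after"
```

A non-zero exit status of 1 corresponds to the buggy behaviour described above, so the helper can be used directly in a test loop.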
sssd-1.4.1-1.fc14 has been submitted as an update for Fedora 14.
sssd-1.4.1-1.fc14 has been pushed to the Fedora 14 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update sssd'. You can provide feedback for this update here: https://admin.fedoraproject.org/updates/sssd-1.4.1-1.fc14
sssd-1.4.1-1.fc14 has been pushed to the Fedora 14 stable repository. If problems still persist, please make note of it in this bug report.