Bug 808597

Summary: sssd_nss crashes on request when no back end is running
Product: Red Hat Enterprise Linux 6 Reporter: Jakub Hrozek <jhrozek>
Component: sssdAssignee: Stephen Gallagher <sgallagh>
Status: CLOSED ERRATA QA Contact: IDM QE LIST <seceng-idm-qe-list>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.3CC: grajaiya, jgalipea, kbanerje, prc
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: sssd-1.8.0-22.el6 Doc Type: Bug Fix
Doc Text:
No documentation required
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-06-20 11:56:39 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jakub Hrozek 2012-03-30 19:35:42 UTC
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/sssd/ticket/1270

uname -r
3.2.10-3.fc16.x86_64

After a few minutes when sssd is launched :
systemctl restart sssd.service

Process are :
{{{
 5628 ?        Ss     0:00 /usr/sbin/sssd -D -f
 5630 ?        S      0:00 /usr/libexec/sssd/sssd_nss --debug-to-files
 5631 ?        S      0:00 /usr/libexec/sssd/sssd_pam --debug-to-files
 5663 ?        R      0:00 /usr/libexec/sssd/sssd_be --domain LDAP --debug-to-files
}}}

The log says :
{{{
[root@lame4: 382] Mar 20 18:07:36 lame4 kernel: [11304.391716] sssd_be[5663]: segfault at 4000 ip 00007fc4188e64a2 sp 00007ffffbf0ce90 error 4 in sssd_be[7fc41888d000+75000]
Mar 20 18:07:36 lame4 abrtd: Directory 'ccpp-2012-03-20-18:07:36-5663' creation detected
Mar 20 18:07:36 lame4 abrt[5682]: Saved core dump of pid 5663 (/usr/libexec/sssd/sssd_be) to /var/spool/abrt/ccpp-2012-03-20-18:07:36-5663 (16244736 bytes)
Mar 20 18:07:36 lame4 sssd[be[LDAP]]: Starting up
Mar 20 18:07:36 lame4 abrtd: DUP_OF_DIR: /var/spool/abrt/ccpp-2012-03-20-15:04:40-828
Mar 20 18:07:36 lame4 abrtd: Problem directory is a duplicate of /var/spool/abrt/ccpp-2012-03-20-15:04:40-828
Mar 20 18:07:36 lame4 abrtd: Deleting problem directory ccpp-2012-03-20-18:07:36-5663 (dup of ccpp-2012-03-20-15:04:40-828)
}}}

Process are :
{{{
 5628 ?        Ss     0:00 /usr/sbin/sssd -D -f
 5630 ?        S      0:00 /usr/libexec/sssd/sssd_nss --debug-to-files
 5631 ?        S      0:00 /usr/libexec/sssd/sssd_pam --debug-to-files
}}}

A few minutes later log says :
{{{
Mar 20 18:29:01 lame4 abrtd: Directory 'ccpp-2012-03-20-18:29:01-5630' creation detected
Mar 20 18:29:01 lame4 abrt[6320]: Saved core dump of pid 5630 (/usr/libexec/sssd/sssd_nss) to /var/spool/abrt/ccpp-2012-03-20-18:29:01-5630 (1622016 bytes)
Mar 20 18:29:01 lame4 sssd[nss]: Starting up
Mar 20 18:29:01 lame4 sssd[nss]: Starting up
Mar 20 18:29:01 lame4 sssd[nss]: Starting up
Mar 20 18:29:01 lame4 abrtd: DUP_OF_DIR: /var/spool/abrt/ccpp-2012-03-20-15:11:34-833
Mar 20 18:29:01 lame4 abrtd: Problem directory is a duplicate of /var/spool/abrt/ccpp-2012-03-20-15:11:34-833
Mar 20 18:29:01 lame4 abrtd: Deleting problem directory ccpp-2012-03-20-18:29:01-5630 (dup of ccpp-2012-03-20-15:11:34-833)
}}}

Process are :
{{{
 5628 ?        Ss     0:00 /usr/sbin/sssd -D -f
 5631 ?        S      0:00 /usr/libexec/sssd/sssd_pam --debug-to-files
}}}
and getent passwd user doesn't work anymore

In a first time sssd_be segfaults and getent passwd user works,
ans in a second time sssd_nss try to start up and getent passwd user fails.

Comment 1 Jakub Hrozek 2012-03-30 19:40:45 UTC
This bugzilla is tracking the sssd_nss problem. The sssd_be crash is being tracked separately in bug #808597.

To reproduce:
1. kill sssd_be so that is stops definitely
for x in $(seq 1 4); do kill -9 $(pidof sssd_be); done
2. make sure no sssd_be process is running now
pidof sssd_be
3. getent passwd someuser

With the current packages, the sssd_nss process crashes with SIGABRT. It should not crash, but rather revert to returning data from cache.

Comment 4 Kaushik Banerjee 2012-04-30 11:10:51 UTC
# getent passwd puser1
puser1:*:1001:1001:Posix User1:/home/puser1:
# for x in $(seq 1 4); do kill -9 $(pidof sssd_be); done
# pidof sssd_be
# getent passwd puser1
puser1:*:1001:1001:Posix User1:/home/puser1:


Verified with sssd-1.8.0-23

Comment 5 Stephen Gallagher 2012-06-12 13:51:21 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
No documentation required

Comment 7 errata-xmlrpc 2012-06-20 11:56:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0747.html