Bug 675007

Summary: sssd corrupts group cache
Product: Red Hat Enterprise Linux 5 Reporter: Jeff Schroeder <jeffschroeder>
Component: sssd Assignee: Stephen Gallagher <sgallagh>
Status: CLOSED ERRATA QA Contact: Chandrasekar Kannan <ckannan>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 5.6 CC: a9016009, benl, dpal, grajaiya, jgalipea, jwest, kbanerje, msvoboda, ovitters, prc
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: sssd-1.5.1-7.el5 Doc Type: Bug Fix
Doc Text:
While running the LDAP cache cleanup task, an issue with a corrupted group cache occurred, and the user was stripped of membership of every group except his primary group. This issue has been fixed and the aforementioned problem now no longer occurs.
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-07-21 08:10:30 UTC Type: ---
Bug Depends On:    
Bug Blocks: 712134    

Description Jeff Schroeder 2011-02-03 21:47:19 UTC
Description of problem: After recently converting a git server to run sssd for ldap authentication, we ran into a problem where a user would lose every group they were in *except* for their primary group.

Version-Release number of selected component (if applicable):
[root@git ~]# rpm -q sssd
sssd-1.2.1-39.el5
[root@git ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 5.6 (Tikanga)

How reproducible:
Run sssd with the config below

Steps to Reproduce:
1. Let the cache cleanup task purge some expired entries
2. Watch it eat itself
  
Actual results:
User matthiasc was in these groups:
matthiasc

$ groups matthiasc
matthiasc : matthiasc

Expected results:
User matthiasc should be in the groups:
matthiasc gcvsadm webusers gnomeweb ftpadmin gnomecvs
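
A quick way to spot the lost memberships is to diff the `groups` output against the expected list. The sketch below is only an illustration: the helper names `parse_groups_line` and `missing_groups` are made up for this example, and it assumes the `user : group1 group2 ...` output format shown above.

```python
def parse_groups_line(line):
    """Split `groups <user>` output ("user : g1 g2 ...") into (user, set of groups)."""
    user, _, rest = line.partition(":")
    return user.strip(), set(rest.split())


def missing_groups(groups_output, expected):
    """Return the expected groups that are absent from the `groups` output, sorted."""
    _, actual = parse_groups_line(groups_output)
    return sorted(set(expected) - actual)


# Values taken from the actual and expected results in this report.
expected = "matthiasc gcvsadm webusers gnomeweb ftpadmin gnomecvs".split()
print(missing_groups("matthiasc : matthiasc", expected))
# prints ['ftpadmin', 'gcvsadm', 'gnomecvs', 'gnomeweb', 'webusers']
```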

Additional info:

sssd log snippet with debug_level=7 in /etc/sssd/sssd.conf:
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(&(!(dataExpireTimestamp=0))(dataExpireTimestamp<=1296768653)(!(lastLogin=*))))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_entry_done] (6): Error: Entry not Found!
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_groups_check_handle] (6): Search groups with filter: (&(objectclass=group)(&(!(dataExpireTimestamp=0))(dataExpireTimestamp<=1296768653)))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [cleanup_groups_process] (4): Found 12 expired group entries!
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(|(memberOf=name=buildmaster,cn=groups,cn=LDAP,cn=sysdb)(gidNumber=522)))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(|(memberOf=name=buildslave,cn=groups,cn=LDAP,cn=sysdb)(gidNumber=523)))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(|(memberOf=name=halloween,cn=groups,cn=LDAP,cn=sysdb)(gidNumber=512)))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(|(memberOf=name=gnomecvs,cn=groups,cn=LDAP,cn=sysdb)(gidNumber=70)))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(|(memberOf=name=mpopovic,cn=groups,cn=LDAP,cn=sysdb)(gidNumber=7886)))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(|(memberOf=name=treitter,cn=groups,cn=LDAP,cn=sysdb)(gidNumber=7644)))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(|(memberOf=name=kmaraas,cn=groups,cn=LDAP,cn=sysdb)(gidNumber=2183)))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(|(memberOf=name=otaylor,cn=groups,cn=LDAP,cn=sysdb)(gidNumber=2150)))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(|(memberOf=name=rhughes,cn=groups,cn=LDAP,cn=sysdb)(gidNumber=7265)))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(|(memberOf=name=yarrr,cn=groups,cn=LDAP,cn=sysdb)(gidNumber=516)))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(|(memberOf=name=sri,cn=groups,cn=LDAP,cn=sysdb)(gidNumber=6865)))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(|(memberOf=name=xan,cn=groups,cn=LDAP,cn=sysdb)(gidNumber=6973)))
(Thu Feb  3 21:30:53 2011) [sssd[be[LDAP]]] [ldap_id_cleanup_set_timer] (6): Scheduling next cleanup at 1296768773.1609268
(Thu Feb  3 21:32:54 2011) [sssd[be[LDAP]]] [sysdb_search_users_check_handle] (6): Search users with filter: (&(objectclass=user)(&(!(dataExpireTimestamp=0))(dataExpireTimestamp<=1296768774)(!(lastLogin=*))))
(Thu Feb  3 21:32:54 2011) [sssd[be[LDAP]]] [cleanup_users_process] (4): Found 956 expired user entries!
(Thu Feb  3 21:32:54 2011) [sssd[be[LDAP]]] [sldb_request_callback] (6): LDB Error: 1 (ltdb modify without transaction)
(Thu Feb  3 21:32:54 2011) [sssd[be[LDAP]]] [sysdb_op_default_done] (6): Error: 5 (Input/output error)
(Thu Feb  3 21:32:54 2011) [sssd[be[LDAP]]] [cleanup_users_delete_done] (2): User delete returned 5 (Input/output error)
(Thu Feb  3 21:32:54 2011) [sssd[be[LDAP]]] [ldap_id_cleanup_users_done] (1): Failed to cleanup users (5 [Input/output error]), retrying later!
(Thu Feb  3 21:32:54 2011) [sssd[be[LDAP]]] [ldap_id_cleanup_set_timer] (6): Scheduling next cleanup at 1296768894.1220292
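
To read the log above, it helps to decode the epoch values in the search filters and in the "Scheduling next cleanup" lines. The snippet below is a minimal sketch using only the Python standard library; the helper name `epoch_to_utc` is made up for this example. The decoded cutoff matches the log timestamp exactly, which suggests the log times are in UTC.

```python
from datetime import datetime, timezone


def epoch_to_utc(ts):
    """Format a Unix timestamp (int, float, or numeric string) as a UTC date string."""
    seconds = int(float(ts))  # drop the fractional part seen in the scheduler lines
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


cutoff = "1296768653"              # dataExpireTimestamp<= value in the first search
next_cleanup = "1296768773.1609268"  # from the first "Scheduling next cleanup" line

print(epoch_to_utc(cutoff))        # prints 2011-02-03 21:30:53
print(epoch_to_utc(next_cleanup))  # prints 2011-02-03 21:32:53
print(int(float(next_cleanup)) - int(cutoff))  # prints 120 (seconds between runs)
```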



Our sssd.conf:
[sssd]
config_file_version = 2
domains = LDAP
reconnection_retries = 3
sbus_timeout = 30
services = nss, pam

[nss]
# From the previous /etc/ldap.conf
filter_groups = root,ldap,named,avahi,haldaemon,dbus,radvd,tomcat,radiusd,news,mailman,nscd,gdm
filter_users = root,ldap,named,avahi,haldaemon,dbus,radvd,tomcat,radiusd,news,mailman,nscd,gdm
reconnection_retries = 3

[pam]
reconnection_retries = 3

[domain/LDAP]
auth_provider = ldap
cache_credentials = TRUE
# So password changing works
chpass_provider = ldap
# Only cache credentials of users who login
enumerate = FALSE
id_provider = ldap
ldap_search_base = dc=gnome,dc=org
# Ignore ldap pwd policies so password changes work with shadow* attributes in ldap
ldap_pwd_policy = none
ldap_tls_reqcert = allow
ldap_uri = ldap://ldap-back
timeout = 60
# Since we don't use the RFC2307 'gecos' ldap attribute
ldap_user_gecos = cn

Comment 1 Stephen Gallagher 2011-02-09 18:47:52 UTC
There are actually two issues revealed here.

1) During the ldap cleanup task, we're attempting to write to the LDB without a transaction active, which is throwing an error.

2) This also reveals that we're apparently making some bad decisions somewhere about which groups need to be removed. We shouldn't be seeing the above error because none of the groups in question should be on the purge list anyway.
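
Issue 1 can be illustrated with a toy model. This is not sssd or LDB code: `ToyCache` and `purge_expired` are hypothetical names, and the in-memory dictionary only stands in for the real cache. The sketch shows a store that refuses modifications outside a transaction, and a purge that wraps its deletes in one so the batch either applies completely or not at all. The expiry test mirrors the filter in the log (non-zero `dataExpireTimestamp` that is `<=` the current time).

```python
class ToyCache:
    """Hypothetical in-memory stand-in for a transactional cache (not the LDB API)."""

    def __init__(self, entries):
        self.entries = dict(entries)  # name -> expiry timestamp (0 = never expires)
        self._pending = None          # working copy while a transaction is open

    def begin(self):
        self._pending = dict(self.entries)

    def delete(self, name):
        if self._pending is None:
            # Analogous to the "modify without transaction" error in the log.
            raise RuntimeError("modify without transaction")
        del self._pending[name]

    def commit(self):
        self.entries, self._pending = self._pending, None

    def cancel(self):
        self._pending = None


def purge_expired(cache, now):
    """Delete entries with a non-zero expiry <= now, all-or-nothing."""
    expired = [n for n, exp in cache.entries.items() if exp != 0 and exp <= now]
    cache.begin()
    try:
        for name in expired:
            cache.delete(name)
    except Exception:
        cache.cancel()  # leave the committed entries untouched on failure
        raise
    cache.commit()
    return expired


cache = ToyCache({"root": 0, "stale": 100, "fresh": 300})
print(purge_expired(cache, now=200))  # prints ['stale']
print(cache.entries)                  # prints {'root': 0, 'fresh': 300}
```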

Comment 3 Kaushik Banerjee 2011-05-12 08:42:37 UTC
Verified in version:
# rpm -qi sssd | head
Name        : sssd                         Relocations: (not relocatable)
Version     : 1.5.1                             Vendor: Red Hat, Inc.
Release     : 34.el5                        Build Date: Tue 03 May 2011 10:46:09 PM IST
Install Date: Wed 11 May 2011 02:07:53 PM IST      Build Host: x86-004.build.bos.redhat.com
Group       : Applications/System           Source RPM: sssd-1.5.1-34.el5.src.rpm
Size        : 3508089                          License: GPLv3+
Signature   : (none)
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
URL         : http://fedorahosted.org/sssd/
Summary     : System Security Services Daemon


1. User6 added as follows:
dn: cn=user6,ou=People,dc=example,dc=com
objectClass: posixAccount
objectClass: top
loginShell: /bin/bash
uidNumber: 5203
gidNumber: 5203
uid: user6
cn: user6
homeDirectory: /home/user6
userPassword: {SSHA}Y4jrPr0Lx/1byf5/amkJhdVYEgsOwjtX

dn: cn=grp1_user6,ou=Groups,dc=example,dc=com
objectClass: posixGroup
objectClass: groupofuniquenames
objectClass: top
gidNumber: 5203
cn: grp1_user6
memberUid: user6

dn: cn=grp2_user6,ou=Groups,dc=example,dc=com
objectClass: posixGroup
objectClass: groupofuniquenames
objectClass: top
gidNumber: 6203
cn: grp2_user6
memberUid: user6

dn: cn=grp3_user6,ou=Groups,dc=example,dc=com
objectClass: posixGroup
objectClass: groupofuniquenames
objectClass: top
gidNumber: 7203
cn: grp3_user6
memberUid: user6

dn: cn=grp4_user6,ou=Groups,dc=example,dc=com
objectClass: posixGroup
objectClass: groupofuniquenames
objectClass: top
gidNumber: 8203
cn: grp4_user6
memberUid: user6

dn: cn=grp5_user6,ou=Groups,dc=example,dc=com
objectClass: posixGroup
objectClass: groupofuniquenames
objectClass: top
gidNumber: 9203
cn: grp5_user6
memberUid: user6

dn: cn=parent_grp1_user6,ou=Groups,dc=example,dc=com
objectClass: posixGroup
objectClass: groupofuniquenames
objectClass: top
gidNumber: 10203
cn: parent_grp1_user6
memberUid: grp1_user6
memberUid: user6


2. Enumerate user6 and its groups
# getent -s sss passwd user6
user6:*:5203:5203:user6:/home/user6:/bin/bash

# getent -s sss group grp1_user6
grp1_user6:*:5203:user6

# id user6
uid=5203(user6) gid=5203(grp1_user6) groups=5203(grp1_user6),8203(grp4_user6),7203(grp3_user6),10203(parent_grp1_user6),6203(grp2_user6),9203(grp5_user6) context=root:system_r:unconfined_t:SystemLow-SystemHigh

# id -g user6
5203

# groups user6
user6 : grp1_user6 grp4_user6 grp3_user6 parent_grp1_user6 grp2_user6 grp5_user6

Comment 5 Miroslav Svoboda 2011-07-15 13:24:33 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
While running the LDAP cache cleanup task, an issue with a corrupted group cache occurred, and the user was stripped of membership of every group except his primary group. This issue has been fixed and the aforementioned problem now no longer occurs.

Comment 6 errata-xmlrpc 2011-07-21 08:10:30 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0975.html