709178 – "groups user" and "finger gecos" fails

Summary: "groups user" and "finger gecos" fails

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	sssd
Sub Component:
Version:	14
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Stephen Gallagher
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	725868 748857

Reported:	2011-05-31 00:21 UTC by Peter Glassenbury
Modified:	2020-05-02 16:21 UTC
CC List:	4 users
Fixed In Version:	sssd-1.5.12-1.fc15
Clone Of:
Clones:	725868
Environment:
Last Closed:	2011-08-17 01:15:36 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
sanitised version of sssd_LDAP.log (580.07 KB, text/plain) 2011-06-02 02:20 UTC, Peter Glassenbury	no flags	Details
Just the id -G user (25.19 KB, application/octet-stream) 2011-06-07 02:28 UTC, Peter Glassenbury	no flags	Details
sssd_ldap.log with the cache_LDAP.ldb deleted (126.84 KB, application/octet-stream) 2011-06-07 03:25 UTC, Peter Glassenbury	no flags	Details
working diagnostic log (113.72 KB, application/octet-stream) 2011-06-28 23:41 UTC, Peter Glassenbury	no flags	Details
diagnostic log of id -G that fails to return correct data (118.15 KB, application/octet-stream) 2011-06-28 23:44 UTC, Peter Glassenbury	no flags	Details
log associated with comment21 (23.10 KB, application/octet-stream) 2011-06-30 02:06 UTC, Peter Glassenbury	no flags	Details
sssd.log file associated with comment21 (104 bytes, application/octet-stream) 2011-06-30 02:07 UTC, Peter Glassenbury	no flags	Details
nss.log associated with comment 21 (15.06 KB, application/octet-stream) 2011-06-30 02:09 UTC, Peter Glassenbury	no flags	Details
nss.log associated with comment 27 (17.77 KB, application/octet-stream) 2011-07-01 00:29 UTC, Peter Glassenbury	no flags	Details
comment 30 -- sssd_nss.log (15.55 KB, application/octet-stream) 2011-07-04 03:42 UTC, Peter Glassenbury	no flags	Details
comment 30 -- sssd_LDAP.log (23.37 KB, application/octet-stream) 2011-07-04 03:43 UTC, Peter Glassenbury	no flags	Details

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	SSSD sssd issues 1958	0	None	closed	Explicitly ignore groups with gidNumber = 0	2020-06-10 05:36:01 UTC

Description Peter Glassenbury 2011-05-31 00:21:51 UTC

Description of problem:
groups usercode returns with ONLY the default group
finger peter (gecos) fails with "no such user" but on usercode works

Version-Release number of selected component (if applicable):
sssd-1.5.7-1 and sssd-1.5.8-1(from updates testing) - tried with both
to a remote ldap server centos-ds-base-8.1.0-0.14.el5.centos.2
(sorry - hasn't shifted to redhat yet)

How reproducible:
 groups always for certain users
 finger always

Steps to Reproduce:
1. /etc/nsswitch.conf has "files sss" for passwd, shadow, group
2. groups pg123 
3. finger pg123 ; finger peter
  
Actual results:
# groups pg123
pg123 : staff
# finger pg123
Login: pg123        	Name: Peter Testing
# finger peter
finger: peter: no such user.
#

Expected results:
groups pg123
pg123 : staff root group1 test2 team3 research .... and so on for about 20 groups

finger peter should produce the same result as finger pg123 (as well as any others
with peter in the gecos field).

Additional info:
Because groups only returns the one group, any file permissions that depend on the other groups fail.
Some users do get the full list of groups back, but AFAIK they are only users with fewer than 16 groups. (I remember that being a limitation YEARS ago; has it come back?)

This may confuse matters, but if we add ldap after sss in nsswitch.conf,
then groups and finger work, but there is a long delay (5 min or so) on a login or "su - user".

sssd.conf has
[sssd]
config_file_version = 2
sbus_timeout = 30
services = nss, pam
domains = LDAP
[nss]
filter_groups = root
filter_users = root
reconnection_retries = 3
[pam]
reconnection_retries = 3
[domain/LDAP]
debug_level = 1
# (this had been set anywhere from 0 to 10, but showed nothing obvious)
id_provider = ldap
auth_provider = ldap
ldap_schema = rfc2307
ldap_uri = ldap://ldapserver1.domain.ac.nz/,ldap://ldapserver2.domain.in.nz/
ldap_search_base = dc=Our,dc=domain,dc=in,dc=nz
ldap_tls_reqcert = demand
ldap_id_use_start_tls = TRUE
cache_credentials = TRUE
enumerate = FALSE
entry_cache_timeout = 5400



Please ask for more info to supply...

Comment 1 Jakub Hrozek 2011-05-31 06:39:29 UTC

Please set debug_level to 10 in your /etc/sssd/sssd.conf, run the getent/finger cases and then attach /var/log/sssd/sssd_LDAP.log 

By adding ldap to nsswitch, you enable nss-pam-ldapd for user/group requests, which for some reason is able to return the info.

There is no artificial limitation on the number of groups, so all of them should be returned, not just 16 etc.

Comment 2 Jakub Hrozek 2011-05-31 09:33:02 UTC

(In reply to comment #0)
> finger peter to produce the same result as finger pg123 (as well as any others
> with peter in the gecos field. 
> 

As Sumit educated me off-list, this can only work if you set enumerate=TRUE. 

The reason being that finger iterates through all user accounts and tries to match the finger argument against the gecos field.
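
As a rough illustration of why enumeration matters here: matching on the gecos field means walking every account, which is only possible when NSS can list all users. The sketch below uses Python's standard pwd module. The helper name find_by_gecos, the simple substring match, and the sample accounts are assumptions for illustration, not taken from finger's source.

```python
import pwd
from collections import namedtuple

# Hypothetical stand-in for passwd entries, used only for the sample data.
Account = namedtuple("Account", "pw_name pw_gecos")

SAMPLE = [
    Account("pg123", "Peter Testing"),
    Account("ab456", "Alice Brown"),
]


def find_by_gecos(entries, needle):
    """Return login names whose gecos field contains `needle` (case-insensitive)."""
    needle = needle.lower()
    return [e.pw_name for e in entries if needle in e.pw_gecos.lower()]


if __name__ == "__main__":
    # Against the sample data the match works on the real-name field.
    print(find_by_gecos(SAMPLE, "peter"))
    # pwd.getpwall() lists every account NSS can enumerate. With
    # enumerate = FALSE, LDAP users are not listed, so a gecos match
    # against this list cannot find them.
    print(find_by_gecos(pwd.getpwall(), "peter"))
```

A lookup by login name does not need the full list, which is consistent with "finger pg123" working while "finger peter" fails.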

Comment 3 Stephen Gallagher 2011-05-31 12:12:05 UTC

There are two common reasons why 'groups userid' would fail.

1) Your LDAP schema is actually RFC2307bis, not RFC2307. This is because the two schemas store groups differently. RFC2307 uses the memberuid attribute, which stores the name of the members of the groups. RFC2307bis however uses the member attribute, which stores the DN of the members of the groups (which allows for nested groups). To test this possibility, try using 'ldap_schema = rfc2307bis'

2) The other possibility is that your ldap_search_base is not set correctly (which I can't verify here because you over-sanitized it above). It's a common mistake to use
cn=Users,dc=example,dc=com
for the LDAP search base, which means that the groups will never be located, because they exist in the cn=Groups,dc=example,dc=com tree.
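
To make the schema difference in 1) concrete, here is a sketch of the kind of LDAP filter a client might build to find a user's groups under each schema. The function name, the example user DN, and the exact filter strings are assumptions for illustration; they are not SSSD's actual queries.

```python
def group_membership_filter(schema, username, user_dn):
    """Build an illustrative LDAP filter for finding a user's groups.

    Values are inserted without escaping; real code should escape them
    as described in RFC 4515.
    """
    if schema == "rfc2307":
        # RFC2307: groups list their members by login name in memberUid.
        return f"(&(objectClass=posixGroup)(memberUid={username}))"
    if schema == "rfc2307bis":
        # RFC2307bis: groups list their members by full DN in member.
        return f"(&(objectClass=posixGroup)(member={user_dn}))"
    raise ValueError(f"unknown schema: {schema}")


if __name__ == "__main__":
    dn = "uid=pg123,cn=Users,dc=example,dc=com"  # hypothetical user DN
    print(group_membership_filter("rfc2307", "pg123", dn))
    print(group_membership_filter("rfc2307bis", "pg123", dn))
```

If the server stores members one way and the client searches the other way, the filter matches nothing and the user appears to have no supplementary groups.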

Comment 4 Peter Glassenbury 2011-05-31 22:06:28 UTC

(In reply to comment #2)
> (In reply to comment #0)
> > finger peter to produce the same result as finger pg123 (as well as any others
> > with peter in the gecos field. 
> > 
> 
> As Sumit educated me off-list, this can only work if you set enumerate=TRUE. 
> 
> The reason being that finger iterates through all user accounts and tries to
> match the finger argument against the gecos field.

Ahh .. that fixed the finger one OK.

(In reply to comment #3)
> (In reply to comment #0)
> >1) Your LDAP schema is actually RFC2307bis, not RFC2307.
Checked this... definitely RFC2307, using memberuid. (Looked on the ldap server, then changed to rfc2307bis just to test, and it doesn't work.)

> >2) The other possibility is that your ldap_search_base is not set correctly
It is correct... it is working for some people (with a small number of group memberships).
Here is the actual search base rather than the sanitized one :-)
ldap_search_base = dc=csse,dc=canterbury,dc=ac,dc=nz

Comment 5 Peter Glassenbury 2011-06-01 00:25:18 UTC

(In reply to comment #1)
> Please set debug_level to 10 in your /etc/sssd/sssd.conf, run the getent/finger
> cases and then attach /var/log/sssd/sssd_LDAP.log 
> 
Done this. I have partly sanitised it by removing most of the user list download (from, I assume, enumerate=true), BUT how do I attach the rest without uploading our whole password file's worth of usernames in the groups debug output?

File size left .. 600K 8500 lines.

Comment 6 Stephen Gallagher 2011-06-01 11:57:12 UTC

(In reply to comment #5)
> (In reply to comment #1)
> > Please set debug_level to 10 in your /etc/sssd/sssd.conf, run the getent/finger
> > cases and then attach /var/log/sssd/sssd_LDAP.log 
> > 
> Done this.. ..Have part sanitised by removing most of the user list download
> from, I assume, the enumerate=true, BUT  How do I attach the rest in without
> uploading our whole password file worth of usernames in the groups  debug?
> 
> File size left .. 600K 8500 lines.

Let's try a simple case first. Execute the following commands as root:

service sssd stop
rm -f /var/log/sssd/sssd_LDAP.log
service sssd start
getent group <a real group name>

Please report what the 'getent' command returns (including whether it correctly reports the members of the group). If it is incorrect, please send us the sssd_LDAP.log for this one action.

Second simple case:
service sssd stop
rm -f /var/log/sssd/sssd_LDAP.log
service sssd start
id -G <a real user name>

This should provide a list of the GIDs of all groups the user belongs to. If this list is incomplete (or consists only of primary GID), please send us this sssd_LDAP.log for this one action.

You can upload them as private attachments so that only Red Hat employees have access to it.
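
For reference, the "id -G" check above can be approximated from Python, which may be handy for scripting the test across several users. This is a sketch under the assumption that os.getgrouplist (a wrapper around getgrouplist(3)) goes through the same NSS group lookup that "id -G" relies on; the helper name gid_list is made up for this example.

```python
import os
import pwd


def gid_list(username):
    """Return the GIDs for `username`, primary GID first, similar to `id -G`."""
    primary = pwd.getpwnam(username).pw_gid
    # getgrouplist returns the supplementary groups known to NSS and
    # always includes the GID passed in as the second argument.
    gids = os.getgrouplist(username, primary)
    return [primary] + [g for g in gids if g != primary]


if __name__ == "__main__":
    print(" ".join(str(g) for g in gid_list("root")))
```

A result that contains only the primary GID for a user known to be in other groups would match the failure reported in this bug.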

Comment 7 Peter Glassenbury 2011-06-02 02:08:06 UTC

(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #1)
> > > Please set debug_level to 10 in your /etc/sssd/sssd.conf, run the getent/finger
> > > cases and then attach /var/log/sssd/sssd_LDAP.log 
> > > 
> > Done this.. ..Have part sanitised by removing most of the user list download
> > from, I assume, the enumerate=true, BUT  How do I attach the rest in without
> > uploading our whole password file worth of usernames in the groups  debug?
> > 
> > File size left .. 600K 8500 lines.
> 
> Let's try a simple case first. Execute the following commands as root:
> getent group <a real group name>
Done this ... it works ...
getent group lmadmin
lmadmin:*:111:pj123
>  If it is incorrect, please send us the
so this bit works .. haven't done a sssd_LDAP.log

> 
> Second simple case:
> service sssd stop
> rm -f /var/log/sssd/sssd_LDAP.log
> service sssd start
> id -G <a real user name>
This is the one I worked on before you posted this... I used 'groups pj123'
and the groups command is the same as id -Gn

> 
> This should provide a list of the GIDs of all groups the user belongs to. If
> this list is incomplete (or consists only of primary GID), please send us this
> sssd_LDAP.log for this one action.
This is the problem .. consists of ONLY the primary GID
> 
> You can upload them as private attachments so that only Red Hat employees have
> access to it.
Ran the command on a RHEL 5.6 machine to see if it worked..
and got what was supposed to be the result ... same ldap server.. so issue 
definitely with fedora 14
# groups pg123  
pg123: staff database dba linux lmadmin macuser matlab research root simuser uucp tutor webadmin webccc
#

Will attach the part sanitised file..

Comment 8 Peter Glassenbury 2011-06-02 02:20:59 UTC

Created attachment 502414 [details]
sanitised version of sssd_LDAP.log

This was from the debug=10
service sssd stop
service sssd start
groups pjg34 (my real user id... changed to pg123 in the log)
service sssd stop

There was over a hundred thousand lines I think.. from the enumerate=true..
which I believe caches the password file locally. I have removed almost all
and sanitised the 3 or 4 left... including mine. This still shows the ldap structure where there are some ou structures that hold staff and students and other groupings of people. Then at around line 720 I have commented that I
deleted all the rest of the users which were just a repeat of the 4 examples.

Then was all the group stuff which I didn't know how to sanitise... so have left as is... can you check it is redhat staff only please.. the only way
I could see to do this was as a private bug...

Cheers
Pete .

Comment 9 Peter Glassenbury 2011-06-02 02:23:55 UTC

(In reply to comment #7)
> (In reply to comment #6)

> Ran the command on a RHEL 5.6 machine to see if it worked..and it did..

Brain fade... of course it worked.. RHEL 5.6 system was ldap .... not sssd so 
different system of ldap lookup...

Comment 10 Stephen Gallagher 2011-06-02 13:56:49 UTC

(In reply to comment #8)
> Created attachment 502414 [details]
> sanitised version of sssd_LDAP.log
> 
> This was from the debug=10
> service sssd stop
> service sssd start
> groups pjg34 (my real user id... changed to pg123 in the log)
> service sssd stop
> 
> There was over a hundred thousand lines I think.. from the enumerate=true..

Sorry, I was unclear before. I wanted you to do this with enumerate=false (that way the logs would ONLY contain the information relevant to this specific lookup).

Comment 11 Peter Glassenbury 2011-06-07 02:28:58 UTC

Created attachment 503357 [details]
Just the id -G user

/sbin/service sssd start
 id -G pjg34
/sbin/service sssd stop

with enumerate=False
and  debug_level=10

result was 9600 (default group from passwd entry in ldap server) 
none of other groups.

Comment 12 Stephen Gallagher 2011-06-07 03:06:30 UTC

(In reply to comment #11)
> Created attachment 503357 [details]
> Just the id -G user
> 
> /sbin/service sssd start
>  id -G pjg34
> /sbin/service sssd stop
> 
> with enumerate=False
> and  debug_level=10
> 
> result was 9600 (default group from passwd entry in ldap server) 
> none of other groups.


Could you do
rm -f /var/lib/sss/db/cache_LDAP.ldb
before starting SSSD and running the ID?

I think the cache answered the request above, because it contained no attempts to contact LDAP (and I need to see what it's trying to get, and what it's actually getting)

Comment 13 Peter Glassenbury 2011-06-07 03:25:04 UTC

Created attachment 503368 [details]
sssd_ldap.log with the cache_LDAP.ldb deleted

service sssd stop
rm -f /var/lib/sss/db/cache_LDAP.ldb
rm -f /var/log/sssd/sssd_LDAP.log
/sbin/service sssd start
id -G pjg34
/sbin/service sssd stop

Comment 14 Jakub Hrozek 2011-06-07 09:06:32 UTC

From the log it seems that the cleanup task tried to remove the groups that were just created. And because they were fake groups, they had a GID of 0, so the cleanup failed.

Comment 15 Peter Glassenbury 2011-06-10 04:08:36 UTC

Do I need to do any further testing or debugging to help diagnose this to get a fix?

Comment 16 Peter Glassenbury 2011-06-27 01:31:56 UTC

Has there been enough diagnostics to call this a bug ?? 
Do you need anything else?
or is it being looked at in the background?

Comment 17 Stephen Gallagher 2011-06-27 11:39:24 UTC

(In reply to comment #16)
> Has there been enough diagnostics to call this a bug ?? 
> Do you need anything else?
> or is it being looked at in the background?

Sorry Peter, the engineer that was looking into this went on vacation and I didn't realize this bug was going unattended.

I think the cleanup task error is a red herring, because that's occurring AFTER the initgroups() lookup has already returned success. Please disable it by setting "ldap_purge_cache_timeout = 0" in the [domain/LDAP] section of sssd.conf. This will allow us to rule it out.

You never specified clearly whether the command in Comment 13 returned all of the correct groups, some of them, or none of them (except primary GID).

Could you please rerun that test (with the cache-cleanup disabled) and report
1) What is the output of the id -G pjg34?
2) yum install ldb-tools and run the command (as root):
ldbsearch -H /var/lib/sss/db/cache_LDAP.ldb "(&(objectclass=user)(name=pjg34))"

Please include the output of both 1) and 2).

Comment 18 Peter Glassenbury 2011-06-28 23:39:10 UTC

All set in sssd.conf
debug_level = 10
enumerate = FALSE
ldap_purge_cache_timeout = 0

# service sssd stop
# rm -f /var/lib/sss/db/cache_LDAP.ldb
# rm -f /var/log/sssd/sssd_LDAP.log
# /sbin/service sssd start
# id -G pjg34
9600
# service sssd stop
# ldbsearch -H /var/lib/sss/db/cache_LDAP.ldb "(&(objectclass=user)(name=pjg34))"
asq: Unable to register control with rootdse!
# record 1
dn: name=pjg34,cn=users,cn=LDAP,cn=sysdb
createTimestamp: 1309303388
fullName: Peter Glassenbury, MSCS 223, 7762
gecos: Peter Glassenbury, MSCS 223, 7762
gidNumber: 9600
homeDirectory: /home/cosc/staff/pjg34
loginShell: /bin/bash
name: pjg34
objectClass: user
uidNumber: 5500
originalDN: cn=pjg34,ou=Staff,ou=People,dc=csse,dc=canterbury,dc=ac,dc=nz
originalModifyTimestamp: 20110208214050Z
nsAccountLock: False
lastUpdate: 1309303388
dataExpireTimestamp: 1309308788
initgrExpireTimestamp: 1309308788
memberof: name=webccc,cn=groups,cn=LDAP,cn=sysdb
memberof: name=database,cn=groups,cn=LDAP,cn=sysdb
memberof: name=lmadmin,cn=groups,cn=LDAP,cn=sysdb
memberof: name=research,cn=groups,cn=LDAP,cn=sysdb
memberof: name=simuser,cn=groups,cn=LDAP,cn=sysdb
memberof: name=dba,cn=groups,cn=LDAP,cn=sysdb
memberof: name=matlab,cn=groups,cn=LDAP,cn=sysdb
memberof: name=tutor,cn=groups,cn=LDAP,cn=sysdb
memberof: name=root,cn=groups,cn=LDAP,cn=sysdb
memberof: name=webadmin,cn=groups,cn=LDAP,cn=sysdb
memberof: name=macuser,cn=groups,cn=LDAP,cn=sysdb
memberof: name=sysadmin,cn=groups,cn=LDAP,cn=sysdb
memberof: name=linux,cn=groups,cn=LDAP,cn=sysdb
distinguishedName: name=pjg34,cn=users,cn=LDAP,cn=sysdb

# returned 1 records
# 1 entries
# 0 referrals

# 

As an extra diagnostic... Here is someone that works... (so it may be compared?)
# /sbin/service sssd start
# id -G ysu17
9600 9010 9110 9122 9024 9000 9131 9100 9121 9200 9300 9004 9036 9021
# /sbin/service sssd stop
#
# ldbsearch -H /var/lib/sss/db/cache_LDAP.ldb "(&(objectclass=user)(name=ysu17))"
asq: Unable to register control with rootdse!
# record 1
dn: name=ysu17,cn=users,cn=LDAP,cn=sysdb
createTimestamp: 1309303790
fullName: Yalini Sundralingam, MSCS 322, 8207
gecos: Yalini Sundralingam, MSCS 322, 8207
gidNumber: 9600
homeDirectory: /home/cosc/staff/ysu17
loginShell: /bin/bash
name: ysu17
objectClass: user
uidNumber: 5523
originalDN: cn=ysu17,ou=Staff,ou=People,dc=csse,dc=canterbury,dc=ac,dc=nz
originalModifyTimestamp: 20110208214142Z
nsAccountLock: False
initgrExpireTimestamp: 1309309191
lastUpdate: 1309303791
dataExpireTimestamp: 1309309191
memberof: name=mark110,cn=groups,cn=LDAP,cn=sysdb
memberof: name=team110,cn=groups,cn=LDAP,cn=sysdb
memberof: name=team122,cn=groups,cn=LDAP,cn=sysdb
memberof: name=research,cn=groups,cn=LDAP,cn=sysdb
memberof: name=stage0,cn=groups,cn=LDAP,cn=sysdb
memberof: name=mark121,cn=groups,cn=LDAP,cn=sysdb
memberof: name=stage1,cn=groups,cn=LDAP,cn=sysdb
memberof: name=team121,cn=groups,cn=LDAP,cn=sysdb
memberof: name=stage2,cn=groups,cn=LDAP,cn=sysdb
memberof: name=stage3,cn=groups,cn=LDAP,cn=sysdb
memberof: name=webadmin,cn=groups,cn=LDAP,cn=sysdb
memberof: name=macuser,cn=groups,cn=LDAP,cn=sysdb
memberof: name=phoenix,cn=groups,cn=LDAP,cn=sysdb
distinguishedName: name=ysu17,cn=users,cn=LDAP,cn=sysdb

# returned 1 records
# 1 entries
# 0 referrals
#

TWO files attached... one for the not working pjg34 (sssd_LDAP.log.20110629) and one for the working ysu17(sssd_LDAP.log.working20110629)
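
For comparing a working and a failing user, the memberof lines in ldbsearch output like the above can be pulled out with a short script. This is a minimal sketch: the function name cached_groups is made up, the sample is a shortened copy of the pjg34 record, and it assumes plain (not base64-encoded or folded) memberof lines.

```python
import re

# Matches lines such as: memberof: name=linux,cn=groups,cn=LDAP,cn=sysdb
MEMBEROF_RE = re.compile(r"^memberof:\s*name=([^,]+),", re.IGNORECASE)

SAMPLE = """\
dn: name=pjg34,cn=users,cn=LDAP,cn=sysdb
gidNumber: 9600
memberof: name=webccc,cn=groups,cn=LDAP,cn=sysdb
memberof: name=root,cn=groups,cn=LDAP,cn=sysdb
memberof: name=linux,cn=groups,cn=LDAP,cn=sysdb
"""


def cached_groups(ldif_text):
    """Extract group names from the memberof lines of ldbsearch output."""
    names = []
    for line in ldif_text.splitlines():
        match = MEMBEROF_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


if __name__ == "__main__":
    print(cached_groups(SAMPLE))
```

Running it on the output for both users gives two name lists that can be diffed to spot a group present for one user only.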

Comment 19 Peter Glassenbury 2011-06-28 23:41:26 UTC

Created attachment 510374 [details]
working diagnostic log

Comment 20 Peter Glassenbury 2011-06-28 23:44:04 UTC

Created attachment 510375 [details]
diagnostic log of id -G that fails to return correct data

Comment 21 Stephen Gallagher 2011-06-29 12:41:32 UTC

Both of those logs appear to be identical.

Moreover, the ldbsearch of pjg34 actually contains all of the group memberships. They should be returning properly...

Could you run the failing test one more time and provide me with the following data:

The sssd_nss.log (just the section with the id -G request)
The output of:
ldbsearch -H /var/lib/sss/db/cache_LDAP.ldb "(&(objectclass=group)(name=linux))"

I want to see if the bug is in the NSS responder instead of the LDAP back-end. And the ldbsearch will show me whether the group is missing data for some reason. It should contain its name, its GID and pjg34 as a member.

Please rerun this test from a purged cache, so I know that the group hasn't been populated by some other user lookup.

Sorry for all the back-and-forth. This is definitely a bug, but I'm not sure where it is yet.

Comment 22 Peter Glassenbury 2011-06-30 02:03:53 UTC

No worries about the back and forth.. need to find the bug... We can't upgrade a server from f12 until we get the ldap/sssd delays issue fixed.

Initially I just ran the ldapsearch -H listed in comment 21 and got ...
# ldapsearch  -H /var/lib/sss/db/cache_LDAP.ldb  "(&(objectclass=group)(name=linux))"
Could not parse LDAP URI(s)=/var/lib/sss/db/cache_LDAP.ldb (3)
#

Went back and Stop sssd; edited sssd.conf to have
debug_level = 10 (in the [nss] section )
and in the [domain/LDAP] section
debug_level = 10
enumerate = FALSE
ldap_purge_cache_timeout = 0

restart sssd
ran id -G pjg34
stop sssd 
And I will attach the three log files(The command still returned 9600 only -- which is the default group from the password entry.)

In this state (i.e. stopped), I ran the search again, this time with ldbsearch, and got what appears to be a working search...
# ldbsearch -H /var/lib/sss/db/cache_LDAP.ldb "(&(objectclass=group)(name=linux))"
asq: Unable to register control with rootdse!
# record 1
dn: name=linux,cn=groups,cn=LDAP,cn=sysdb
createTimestamp: 1309303388
gidNumber: 9020
name: linux
objectClass: group
originalDN: cn=linux,ou=UnixGroups,dc=csse,dc=canterbury,dc=ac,dc=nz
memberuid: pjg34
memberuid: sanitised2
memberuid: sanitised3
memberuid: sanitised4
memberuid: sanitised5
memberuid: sanitised6
originalModifyTimestamp: 20110127203725Z
member: name=sanitised2,cn=users,cn=LDAP,cn=sysdb
member: name=sanitised3,cn=users,cn=LDAP,cn=sysdb
member: name=sanitised4,cn=users,cn=LDAP,cn=sysdb
member: name=pjg34,cn=users,cn=LDAP,cn=sysdb
member: name=sanitised5,cn=users,cn=LDAP,cn=sysdb
member: name=sanitised6,cn=users,cn=LDAP,cn=sysdb
lastUpdate: 1309397848
dataExpireTimestamp: 1309403248
distinguishedName: name=linux,cn=groups,cn=LDAP,cn=sysdb

# returned 1 records
# 1 entries
# 0 referrals
#
(the list in "member:" grouping appears to be in alphabetical order 
whereas "memberuid:" is in a different order...but ALL are there)

Comment 23 Peter Glassenbury 2011-06-30 02:06:21 UTC

Created attachment 510559 [details]
log associated with comment21

Comment 24 Peter Glassenbury 2011-06-30 02:07:21 UTC

Created attachment 510561 [details]
sssd.log file associated with comment21

Comment 25 Peter Glassenbury 2011-06-30 02:09:14 UTC

Created attachment 510562 [details]
nss.log associated with comment 21

I think this caught the extra debug level = 10 I put in the nss section

Comment 26 Stephen Gallagher 2011-06-30 14:46:00 UTC

(Thu Jun 30 13:43:27 2011) [sssd[nss]] [fill_initgr] (1): Incomplete group object for initgroups! Aborting

That's definitely the problem. One of the groups appears to be incomplete. That log message isn't helpful enough (it doesn't tell us which group), so I've created a scratch-build to get us a little more information. Please re-run the above test and attach the nss.log again.
http://koji.fedoraproject.org/koji/taskinfo?taskID=3171967

Look for the "Incomplete group object" line again, near the end of the nss.log. Please run the ldbsearch command from above on the group that it reports and include that output. (Note, ldbsearch != ldapsearch, which was the cause of that first failure in comment 22)

If that search reveals that there is no gidNumber attribute, please also run:
ldapsearch -x -H ldap://ldap.example.com -b dc=example,dc=com "(&(objectClass=posixGroup)(cn=groupname))"

(Substitute your server and base search as appropriate) I want to see if the entry on the LDAP server is incorrect. Mostly I'm looking to see if it has no GID number.

Comment 27 Peter Glassenbury 2011-07-01 00:28:55 UTC

Don't know what I am doing with koji... This is what was done...
Went to ...
http://koji.fedoraproject.org/koji/taskinfo?taskID=3171968
downloaded these
  sssd-1.5.8-1.fc14.1.x86_64.rpm
  sssd-client-1.5.8-1.fc14.1.x86_64.rpm
  sssd-debuginfo-1.5.8-1.fc14.1.x86_64.rpm
  sssd-tools-1.5.8-1.fc14.1.x86_64.rpm

yum install --nogpgcheck sssd-1.5.8-1.fc14.1.x86_64.rpm  sssd-client-1.5.8-1.fc14.1.x86_64.rpm  sssd-debuginfo-1.5.8-1.fc14.1.x86_64.rpm  sssd-tools-1.5.8-1.fc14.1.x86_64.rpm 

forgot the debug_level = 10 in domain/LDAP but since you only want the nss.log
it is OK with debug_level=10 and is attached.

sssd.log has the one line message of 
(Fri Jul  1 12:02:03 2011) [sssd] [monitor_quit] (0): Monitor received Terminated: terminating children

could not do the ldbsearch or ldapsearch as the last three lines of log have the groupname as "null"

(Fri Jul  1 12:01:58 2011) [sssd[nss]] [nss_cmd_initgroups_search] (6): Initgroups for [pjg34@LDAP] completed
(Fri Jul  1 12:01:58 2011) [sssd[nss]] [fill_initgr] (1): Incomplete group object [(null)] for initgroups! Aborting
(Fri Jul  1 12:01:58 2011) [sssd[nss]] [nss_cmd_initgroups_dp_callback] (1): Fatal error, killing connection!

but did run the ldbsearch you asked for in comment22...just for completeness

# ldbsearch -H /var/lib/sss/db/cache_LDAP.ldb "(&(objectclass=group)(name=linux))"
asq: Unable to register control with rootdse!
# record 1
dn: name=linux,cn=groups,cn=LDAP,cn=sysdb
createTimestamp: 1309478517
gidNumber: 9020
name: linux
objectClass: group
lastUpdate: 1309478517
dataExpireTimestamp: 1309478516
isPosix: TRUE
originalDN: cn=linux,ou=UnixGroups,dc=csse,dc=canterbury,dc=ac,dc=nz
member: name=pjg34,cn=users,cn=LDAP,cn=sysdb
memberuid: pjg34
distinguishedName: name=linux,cn=groups,cn=LDAP,cn=sysdb

# returned 1 records
# 1 entries
# 0 referrals

Comment 28 Peter Glassenbury 2011-07-01 00:29:53 UTC

Created attachment 510775 [details]
nss.log associated with comment 27

Comment 29 Stephen Gallagher 2011-07-01 11:36:30 UTC

Sorry, I made a mistake in the patch I included in that last build. It was always going to show (null) there...

I've created a new scratch build here:
http://koji.fedoraproject.org/koji/taskinfo?taskID=3173716

Please download and install it as you did above (that was exactly correct, for the record) and try it again. Hopefully this will be more useful.

Comment 30 Peter Glassenbury 2011-07-04 03:29:29 UTC

Well.... ran this again and got similar results. The root group is in ldap and
has groupid = 0 (it showed up as "(null)" in the old run).
Dumping out the contents for group root on the ldap server shows it as a standard group with a bunch of members and a groupid of zero.
# extended LDIF
#
# LDAPv3
# base <dc=csse,dc=canterbury,dc=ac,dc=nz> (default) with scope subtree
# filter: cn=root
# requesting: ALL
#

# root, UnixGroups, csse.canterbury.ac.nz
dn: cn=root,ou=UnixGroups,dc=csse,dc=canterbury,dc=ac,dc=nz
objectClass: posixGroup
objectClass: top 
memberUid: pjg34 
And others...
gidNumber: 0
cn: root

I will attach both the sssd files

# tail sssd_nss.log
(Mon Jul  4 14:53:23 2011) [sssd[nss]] [nss_cmd_initgroups_search] (6): Initgroups for [pjg34@LDAP] completed
(Mon Jul  4 14:53:23 2011) [sssd[nss]] [fill_initgr] (1): Incomplete group object [root] for initgroups! Aborting
(Mon Jul  4 14:53:23 2011) [sssd[nss]] [client_recv] (0): Failed to execute request, aborting client!

# ldbsearch -H /var/lib/sss/db/cache_LDAP.ldb "(&(objectclass=group)(name=root))"
asq: Unable to register control with rootdse!
# record 1
dn: name=root,cn=groups,cn=LDAP,cn=sysdb
createTimestamp: 1309478517
gidNumber: 0
name: root
objectClass: group
lastUpdate: 1309478517
dataExpireTimestamp: 1309478516
isPosix: TRUE
originalDN: cn=root,ou=UnixGroups,dc=csse,dc=canterbury,dc=ac,dc=nz
member: name=pjg34,cn=users,cn=LDAP,cn=sysdb
memberuid: pjg34
distinguishedName: name=root,cn=groups,cn=LDAP,cn=sysdb

# returned 1 records
# 1 entries
# 0 referrals
#

Comment 31 Peter Glassenbury 2011-07-04 03:42:48 UTC

Created attachment 511102 [details]
comment 30 -- sssd_nss.log

Comment 32 Peter Glassenbury 2011-07-04 03:43:23 UTC

Created attachment 511103 [details]
comment 30 -- sssd_LDAP.log

Comment 33 Simo Sorce 2011-07-04 12:20:14 UTC

I think we explicitly disallow users and groups with uidNumber = 0 or gidNumber = 0 from ever being downloaded from ldap.

Is this the only group that gives you errors?
Do you have issues with users that are not members of that group?

Comment 34 Peter Glassenbury 2011-07-05 00:28:22 UTC

Ah....That looks to be the bug...
I have done a sampling of 40 or 50 of the users and the only
ones that show up as having ONLY the default group are the ones
that are in group root.

I can understand the user root issue... A main part of that
is that if networking is down a local root is needed to login!!

but not for group root ???
How do other people manage group root permissions on hundreds of machines?
We have used the ldap group root for years.

Comment 35 Stephen Gallagher 2011-07-05 11:42:10 UTC

(In reply to comment #34)
> How do other people manage group root permissions on hundreds of machines?

Primarily sudo. This is a FAR better way to manage root permissions.

> We have used the ldap group root for years.

Using any "standard" group name is strongly discouraged in LDAP, except in those rare cases where you're dealing with a product that cannot be configured to use a different group name.

The group "root" is not supposed to be used as a "group of users treated as root". It's meant to be a user-private group for the root user. (The intent being that when root creates a file, the default group is set to the private group so that it is readable ONLY by root. This is the same reason for user-private groups for end users, but it's far more important for the root user).

The correct (and much safer and more audit-friendly) solution is to create role-based groups in LDAP (ideally with GID values > 2000 so as not to ever conflict with local accounts). Then you set the group ownership of files and processes for those roles.

This also makes it much easier to delegate SPECIFIC root tasks, rather than grant some users carte blanche to make changes to everything on the system.

For example, you might create the group 'bind-admins' and set the group access permissions for all of the BIND DNS configuration files to be editable by the bind-admins group.



As an official statement, we're not planning to "fix" this problem. We designed the SSSD from the very beginning to leave UID and GID 0 alone, and several places in the code rely on being able to treat UID or GID 0 as an error condition. Please fix your LDAP server.

Comment 36 Stephen Gallagher 2011-07-05 11:47:07 UTC

Actually, I'll amend that. I think we may want to modify our search filters so that we explicitly suppress lookups for gidNumber = 0. This way the group will simply be missing instead of causing an error.

I've opened ticket https://fedorahosted.org/sssd/ticket/916 upstream to track this.
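
The difference between the current behaviour and the proposed one can be modelled in a few lines. This is only a toy model of the effect seen in the logs, not SSSD code: the function names are invented, and the actual change is proposed for the search filters rather than the place shown here. The GIDs for linux and lmadmin are the ones reported earlier in this bug.

```python
# (group name, gidNumber) pairs as they might come back for one user.
SAMPLE = [("linux", 9020), ("root", 0), ("lmadmin", 111)]


def fill_initgroups_strict(groups):
    """Model of the reported behaviour: one GID-0 group aborts the whole reply."""
    gids = []
    for name, gid in groups:
        if gid == 0:
            raise ValueError(f"Incomplete group object [{name}] for initgroups")
        gids.append(gid)
    return gids


def fill_initgroups_skipping(groups):
    """Model of the proposed behaviour: groups with GID 0 are simply left out."""
    return [gid for _name, gid in groups if gid != 0]


if __name__ == "__main__":
    print(fill_initgroups_skipping(SAMPLE))
    try:
        fill_initgroups_strict(SAMPLE)
    except ValueError as err:
        print(err)
```

In the strict model the caller gets no supplementary groups at all, which matches the "only the primary GID" symptom; in the skipping model only the GID 0 group is missing.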

Comment 37 Peter Glassenbury 2011-07-06 02:58:48 UTC

comment 36 would be preferable since at the moment we get an error and
NO valid groups are returned(other than default from password entry).
At least if we have the suppress in place, then our other groups 
would work..
A query from your previous comment... our groups and users have ID values > 1000
(for the whole campus)..Has the lower limit for system accounts changed to 2000 now?? I thought system accounts were uid/gid less than 500?

Comment 38 Simo Sorce 2011-07-06 11:36:56 UTC

The limit for system accounts is being changed from 500 to 1000 in Rawhide, so you should have no problem.

Comment 39 Stephen Gallagher 2011-07-06 12:43:57 UTC

(In reply to comment #37)
> A query from your previous comment... our groups and users have ID values >
> 1000
> (for the whole campus)..Has the lower limit for system accounts changed to 2000
> now?? I thought system accounts were uid/gid less than 500?

That was a guideline, not a rule. I usually recommend that people keep their LDAP users and groups above 2000 so as not to have to worry about conflicts with any users and groups in /etc/passwd or /etc/group

Comment 40 Fedora Update System 2011-08-05 14:36:31 UTC

sssd-1.5.12-1.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/sssd-1.5.12-1.fc15

Comment 41 Fedora Update System 2011-08-05 23:56:42 UTC

Package sssd-1.5.12-1.fc15:
* should fix your issue,
* was pushed to the Fedora 15 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing sssd-1.5.12-1.fc15'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/sssd-1.5.12-1.fc15
then log in and leave karma (feedback).

Comment 42 Fedora Update System 2011-08-17 01:15:29 UTC

sssd-1.5.12-1.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.
