Bug 1283477

Summary: Improve sssd nss multiple groups message
Product: Red Hat Enterprise Linux 7 Reporter: Paul Wayper <pwayper>
Component: sssd Assignee: Lukas Slebodnik <lslebodn>
Status: CLOSED WORKSFORME QA Contact: Namita Soman <nsoman>
Severity: high Docs Contact:
Priority: unspecified    
Version: 7.1 CC: ekeck, grajaiya, jhrozek, lslebodn, mkosek, mzidek, pbrezina, preichl, pwayper
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-01-07 08:40:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Paul Wayper 2015-11-19 05:18:22 UTC
Description of problem:

Several functions in the src/responder/nss/nsssrv_cmd.c code detect if they get more than one result and return an error if they were expecting only one.  The code follows this pattern:

        if (dctx->res->count > 1) {
            DEBUG(SSSDBG_FATAL_FAILURE,
                  "getpwuid call returned more than one result !?!\n");
            sss_log(SSS_LOG_ERR,
                    "More users have the same UID [%"PRIu32"] in directory "
                    "server. SSSD will not work correctly.\n", cmdctx->id);
            ret = ENOENT;
            goto done;
        }

In the case of user names it is easy to find the user by name and investigate their properties.  However, in the case of UIDs and GIDs, this is harder to diagnose.

Version-Release number of selected component (if applicable):

sssd-1.12.4-47.el6.x86_64
sssd-ad-1.12.4-47.el6.x86_64
sssd-client-1.12.4-47.el6.x86_64
sssd-common-1.12.4-47.el6.x86_64
sssd-common-pac-1.12.4-47.el6.x86_64
sssd-ipa-1.12.4-47.el6.x86_64
sssd-krb5-1.12.4-47.el6.x86_64
sssd-krb5-common-1.12.4-47.el6.x86_64
sssd-ldap-1.12.4-47.el6.x86_64
sssd-proxy-1.12.4-47.el6.x86_64
sssd-tools-1.12.4-47.el6.x86_64

How reproducible:

Quite reliable

Steps to Reproduce:
1. Do something that causes two groups to map to the same ID.
2. getent group onegroup

Actual results:

3. Get message "More groups have the same GID [1234567890] in directory server. SSSD will not work correctly."

Expected results:

3. Get something like:

"Groups 1592648730, 1357924680 and 1472583690 map to the same GID 1234567890 in directory server.  SSSD will not work correctly."
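A minimal sketch of how such a message could be assembled, assuming the caller already has the names (or other identifiers) of the conflicting entries. The helper name format_dup_id_msg and its parameters are hypothetical and are not part of the SSSD code base; in nsssrv_cmd.c the identifiers would have to be extracted from dctx->res first.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not an SSSD function: builds a message that names
 * every entry sharing one numerical ID. Returns 0 on success, -1 if the
 * arguments are invalid or the buffer is too small. */
static int format_dup_id_msg(char *buf, size_t len, const char *kind,
                             const char *id_label,
                             const char *const *names, size_t count,
                             uint32_t id)
{
    size_t used;
    int n;

    /* A conflict needs at least two entries. */
    if (buf == NULL || len == 0 || names == NULL || count < 2) {
        return -1;
    }

    n = snprintf(buf, len, "%s ", kind);
    if (n < 0 || (size_t)n >= len) {
        return -1;
    }
    used = (size_t)n;

    for (size_t i = 0; i < count; i++) {
        /* "a, b and c": comma between entries, "and" before the last. */
        const char *sep = (i == 0) ? "" : (i == count - 1) ? " and " : ", ";

        n = snprintf(buf + used, len - used, "%s%s", sep, names[i]);
        if (n < 0 || (size_t)n >= len - used) {
            return -1;
        }
        used += (size_t)n;
    }

    n = snprintf(buf + used, len - used,
                 " map to the same %s %" PRIu32 " in directory server. "
                 "SSSD will not work correctly.", id_label, id);
    if (n < 0 || (size_t)n >= len - used) {
        return -1;
    }
    return 0;
}
```

The resulting buffer could then be passed to sss_log() in place of the fixed string, so the log names the entries an administrator needs to look up.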

Additional info:

Comment 2 Jakub Hrozek 2015-11-19 08:38:37 UTC
This error means that there are two (or more) objects in the cache with the same numerical ID. This can be a reflection of a server misconfiguration, where the admin assigns the IDs manually and gives the same one to two objects.

But neither of the two linked cases uses manual POSIX IDs; both use algorithmic ID mapping. There, SSSD derives IDs from Windows SIDs on its own, so conflicts shouldn't happen.

btw I don't think this is a mapping conflict per se, but a failure to rename a group correctly. During initgroups() with the ID mapping schema, we derive group GIDs from Windows SIDs and store group "stubs" with just the SID and the GID (no name) in the cache. Later, when the groups are resolved into full objects with a name etc., we should remove the stub object and insert a full object instead. It looks like we have an error there.
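The stub handling described above can be illustrated with a toy in-memory cache. This is only a sketch under stated assumptions: the real SSSD cache is an ldb database, and the struct and function names below (group_entry, group_cache, cache_add_stub, cache_store_full, cache_count_gid) are hypothetical. It shows the intended behaviour, where resolving a group upgrades the existing stub rather than adding a second object with the same GID.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_MAX 16

/* Hypothetical, simplified cache entry (not the SSSD data model). */
struct group_entry {
    const char *sid;   /* always set */
    const char *name;  /* NULL while the entry is still a stub */
    uint32_t gid;
};

struct group_cache {
    struct group_entry entries[CACHE_MAX];
    size_t count;
};

/* Store a stub with only SID and GID, as done during initgroups. */
static int cache_add_stub(struct group_cache *c, const char *sid, uint32_t gid)
{
    if (c->count >= CACHE_MAX) {
        return -1;
    }
    c->entries[c->count++] =
        (struct group_entry){ .sid = sid, .name = NULL, .gid = gid };
    return 0;
}

/* Store a fully resolved group. An existing entry with the same SID
 * (stub or not) is updated in place, so no duplicate GID is created. */
static int cache_store_full(struct group_cache *c, const char *sid,
                            const char *name, uint32_t gid)
{
    for (size_t i = 0; i < c->count; i++) {
        if (strcmp(c->entries[i].sid, sid) == 0) {
            c->entries[i].name = name;
            c->entries[i].gid = gid;
            return 0;
        }
    }
    if (c->count >= CACHE_MAX) {
        return -1;
    }
    c->entries[c->count++] =
        (struct group_entry){ .sid = sid, .name = name, .gid = gid };
    return 0;
}

/* Count entries sharing a GID; more than one triggers the reported error. */
static size_t cache_count_gid(const struct group_cache *c, uint32_t gid)
{
    size_t n = 0;

    for (size_t i = 0; i < c->count; i++) {
        if (c->entries[i].gid == gid) {
            n++;
        }
    }
    return n;
}
```

If the full object were appended without the SID lookup, cache_count_gid() would return 2 for that GID, which corresponds to the "more than one result" condition in the bug description.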

In the e-mail conversation earlier you indicated the bug was reproducible. Could you please attach debug logs with a high debug level to this bug? If there are some steps that help reproduce the bug (like login or running id), then please also run "date" when running those commands so that we can match the debug logs with the commands.

See https://fedorahosted.org/sssd/wiki/Reporting_sssd_bugs#Includenecessarydebuggingdata and https://fedorahosted.org/sssd/wiki/Troubleshooting for some more details.

Comment 3 Lukas Slebodnik 2015-11-19 08:49:37 UTC
I would suspect colliding GIDs on the LDAP server if you could see the messages in syslog (or sssd_nss.log).

If you can see such messages only in the sssd domain log file, then it can be the same case as Jakub described.

I agree with Jakub that we need to see log files + sssd.conf. The LDIF of problematic groups from LDAP server (AD) might be useful as well.

Comment 4 Jakub Hrozek 2015-11-19 08:54:43 UTC
An ldb cache dump might also be useful to have:
ldbsearch -H /var/lib/sss/db/cache_$ > dump.txt

Comment 13 Lukas Slebodnik 2015-12-10 14:59:21 UTC
Sumit had a good point.
We changed the default attribute for group name in sssd-1.13.0
https://git.fedorahosted.org/cgit/sssd.git/commit/?id=adb148603344a42d6edffdda0786a10af715dacb
http://file.brq.redhat.com/lslebodn/SRPMS/freeipa-4.2.3.201512101352GIT64928e9-0.fc23.src.rpm

So if they have different strings in the attributes "name" and "sAMAccountName",
it might solve the issue with two groups with the same name but different DNs.

Could they test with sssd on RHEL 7.2?
Or could they test with the following line in the domain section of sssd.conf?
ldap_group_name = sAMAccountName
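
For context, a sketch of where that line would sit in sssd.conf. The domain name example.com is a placeholder, not taken from this report, and the rest of the existing domain section is assumed to stay unchanged:

```
# /etc/sssd/sssd.conf (fragment; the domain name is a placeholder)
[domain/example.com]
# ... existing options for this domain stay as they are ...
ldap_group_name = sAMAccountName
```

SSSD needs to be restarted for the change to take effect, and already cached group entries may still carry the old name until they are refreshed.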

Comment 15 Jakub Hrozek 2015-12-16 13:22:31 UTC
ping, any news?