Bug 1514061

Summary: ID override GID from Default Trust View is not properly resolved in case domain resolution order is set
Product: Red Hat Enterprise Linux 7
Reporter: Thorsten Scherf <tscherf>
Component: sssd
Assignee: Fabiano Fidêncio <fidencio>
Status: CLOSED ERRATA
QA Contact: Ganna Kaihorodova <gkaihoro>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.4
CC: fidencio, gkaihoro, grajaiya, jhrozek, lslebodn, mkosek, mpanaous, mzidek, ndehadra, pbrezina, sbose, sgoveas, tscherf
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: sssd-1.16.2-1.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-30 10:40:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Logs of manual verification none

Description Thorsten Scherf 2017-11-16 15:15:26 UTC
Description of problem:

Setup: IdM with AD Trust. A POSIX group 'ad_admins' (GID 732000006) exists with one member, the external group 'ad_admins_external', whose member is the Windows Domain Admins group. There is also a Windows domain user 'aduser' with UID/GID 702801104.

Without any ID override or domain resolution order defined, 'id aduser.local' gives the following output:

# id aduser.local
uid=702801104(aduser.local) gid=702801104(aduser.local) groups=702801104(aduser.local),732000005(ad_users),702800513(domain users.local)

Now I define an ID override in the Default Trust View for 'aduser' to change the GID to 732000006:

# ipa idoverrideuser-add 'Default Trust View' aduser.local --gidnumber=732000006

I clean the cache and verify that the user now uses the GID from the ID override:

# systemctl stop sssd; rm -rf /var/lib/sss/{db,mc}/* /var/log/sssd/*; systemctl start sssd
# id aduser.local
uid=702801104(aduser.local) gid=732000006(ad_admins) groups=732000006(ad_admins),732000005(ad_users),702800513(domain users.local)

This works as expected. 
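The stop/wipe/start sequence is repeated before every check in this report, so it can be wrapped in a small helper. This is only a sketch: the function name `reset_sssd_cache` and the `RUN` dry-run variable are hypothetical, and it chains the steps with `&&` (instead of `;` as in the commands above) so the cache is not removed while sssd is still running. It wipes only the db and mc directories, not /var/log/sssd.

```shell
# Sketch only: reset_sssd_cache and RUN are hypothetical names, not part of sssd.
# Set RUN=echo to print the commands instead of executing them.
reset_sssd_cache() {
    # Stop sssd first; abort if that fails so a live cache is never deleted.
    ${RUN:-} systemctl stop sssd &&
    # Remove the on-disk cache (db) and the memory cache files (mc).
    ${RUN:-} rm -rf /var/lib/sss/db/* /var/lib/sss/mc/* &&
    # Start sssd again with an empty cache.
    ${RUN:-} systemctl start sssd
}
```

Running `RUN=echo reset_sssd_cache` prints the three commands without touching the system; calling it without `RUN` as root performs the reset.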

Now I change the domain resolution order so that I don't have to use the domain name when I refer to the 'aduser' account:

# ipa config-mod --domain-resolution-order=windows.mylab.local:linux.mylab.local
# ipa config-show|grep -i resolution
  Domain resolution order: windows.mylab.local:linux.mylab.local

I clean the cache and verify again that the 'aduser' account still uses the GID from the ID override:

# systemctl stop sssd; rm -rf /var/lib/sss/{db,mc}/*; systemctl start sssd
# id aduser
uid=702801104(aduser.local) gid=732000006(aduser.local) groups=732000006(aduser.local),732000005(ad_users.local),702800513(domain users.local)

As we can see, the user still uses the correct GID from the ID override (732000006), but the GID is resolved to the wrong group name ('aduser' instead of 'ad_admins'). 

Using the user name together with the domain doesn't change this behaviour either:

# id aduser.local
uid=702801104(aduser.local) gid=732000006(aduser.local) groups=732000006(aduser.local),732000005(ad_users.local),702800513(domain users.local)
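To compare the good and bad cases mechanically, the primary group name can be pulled out of a line of `id` output. A minimal sketch follows; the function name `primary_group_from_id_output` is hypothetical and the parsing assumes the usual `uid=... gid=N(name) groups=...` layout shown above.

```shell
# Sketch only: primary_group_from_id_output is a hypothetical helper.
# Prints the name inside "gid=<number>(<name>)" from one line of `id` output.
primary_group_from_id_output() {
    printf '%s\n' "$1" | sed -n 's/.* gid=[0-9]*(\([^)]*\)).*/\1/p'
}
```

Fed the two outputs from this report, it prints `ad_admins` for the working case and `aduser.local` for the failing one. On a live system `id -gn <user>` prints the primary group name directly.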

The SSSD logs show that SSSD indeed tries to resolve the GID 732000005 (ad_users) rather than 732000006 (ad_admins):

(Thu Nov 16 16:03:19 2017) [sssd[be[linux.mylab.local]]] [dp_get_account_info_handler] (0x0200): Got request for [0x2][BE_REQ_GROUP][idnumber=732000005]

When I remove the domain resolution order, everything works as expected again:

# ipa config-mod --domain-resolution-order=
# systemctl stop sssd; rm -rf /var/lib/sss/{db,mc}/*; systemctl start sssd

# id aduser.local
uid=702801104(aduser.local) gid=732000006(ad_admins) groups=732000006(ad_admins),732000005(ad_users),702800513(domain users.local)

In the SSSD logs we can now also see that SSSD tries to resolve the correct GID (732000006):

(Thu Nov 16 16:06:07 2017) [sssd[be[linux.mylab.local]]] [dp_get_account_info_handler] (0x0200): Got request for [0x2][BE_REQ_GROUP][idnumber=732000006]
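To see at a glance which GIDs the backend was asked to resolve, the `idnumber` values can be extracted from the `BE_REQ_GROUP` log lines. This is a sketch that assumes the log format quoted above; the function name `requested_group_ids` is hypothetical.

```shell
# Sketch only: requested_group_ids is a hypothetical helper.
# Reads SSSD backend log lines (from files given as arguments, or stdin)
# and prints the GID of every "group by ID" request it finds.
requested_group_ids() {
    sed -n 's/.*\[BE_REQ_GROUP\]\[idnumber=\([0-9]*\)\].*/\1/p' "$@"
}
```

For example, `requested_group_ids /var/log/sssd/sssd_linux.mylab.local.log` (the log path is an assumption based on the domain name used here) would list the GIDs requested since the last log wipe, provided the debug level is high enough for these messages to be logged.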


Version-Release number of selected component (if applicable):
sssd-1.15.2-50.el7_4.6.x86_64
ipa-server-4.5.0-21.el7_4.2.2.x86_64



Comment 1 Fabiano Fidêncio 2017-11-27 20:31:20 UTC
Firstly, thanks for the well explained bug report and for providing a machine where I could dig into the issue.

If I understand correctly you explicitly have two groups with the very same gid (732000006), one in each domain (windows.mylab.local and linux.mylab.local).

So, when you run `id userad.local` it'll trigger a "Group by ID" request, which will use the first available domain in the request and depending on the domain resolution order you'll end up with 732000006.local and 732000006.local, thus the different results.

The cache_req code is doing exactly what's expected from it, looking up in the first found domain. I'd say that the solution would be to avoid having two groups with the very same gid in the different domains.

Jakub, would you suggest something different here?

Comment 2 Jakub Hrozek 2017-11-28 10:37:52 UTC
(In reply to Fabiano Fidêncio from comment #1)
> Firstly, thanks for the well explained bug report and for providing a
> machine where I could dig into the issue.
> 
> If I understand correctly you explicitly have two groups with the very same
> gid (732000006), one in each domain (windows.mylab.local and
> linux.mylab.local).
> 
> So, when you run `id userad.local` it'll trigger a "Group by
> ID" request, which will use the first available domain in the request and
> depending on the domain resolution order you'll end up with
> 732000006.local and 732000006.local, thus the
> different results.
> 
> The cache_req code is doing exactly what's expected from it, looking up in
> the first found domain. I'd say that the solution would be to avoid having
> two groups with the very same gid in the different domains.
> 
> Jakub, would you suggest something different here?

Yes, that's also how I would explain this. I think the expectation from the customer might be based on "id" being a seemingly atomic operation which is confined to the windows.mylab.local domain, but in fact, calling 'id' first internally calls getgrouplist() which just returns a list of unqualified GIDs. The GIDs are then translated into names as you said.

So I agree it is not a bug.

Comment 3 Thorsten Scherf 2017-11-28 13:35:50 UTC
(In reply to Fabiano Fidêncio from comment #1)
> If I understand correctly you explicitly have two groups with the very same
> gid (732000006), one in each domain (windows.mylab.local and
> linux.mylab.local).

No. There is only the group 'ad_admins' in the IdM domain that has the gid 732000006. There are no posix attributes used in the AD domain at all. But remember, the ID override from the Default Trust View changes the gid of an AD user ('aduser') to 732000006.
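The "first domain that answers wins" behaviour discussed in comments 1 to 3 can be illustrated with a toy model. This is not SSSD code: the function names are hypothetical, and the windows.mylab.local entry is an assumption that models the observed output, where the overridden GID comes back under the user's own name when that domain is searched first.

```shell
# Toy model only, not SSSD internals. All names here are hypothetical.
# Answers a "group by ID" lookup for one domain, or fails if the domain
# has no entry for that GID.
lookup_gid_in_domain() {
    case "$1:$2" in
        # Assumption: the AD user carrying the overridden GID answers here.
        windows.mylab.local:732000006) echo "aduser" ;;
        # The only real group with this GID lives in the IdM domain.
        linux.mylab.local:732000006) echo "ad_admins" ;;
        *) return 1 ;;
    esac
}

# Resolves an unqualified GID by asking the domains in the given order
# and returning the first answer.
resolve_gid() {
    gid=$1
    shift
    for domain in "$@"; do
        if name=$(lookup_gid_in_domain "$domain" "$gid"); then
            echo "$name"
            return 0
        fi
    done
    return 1
}
```

With this model, `resolve_gid 732000006 windows.mylab.local linux.mylab.local` prints `aduser`, while putting linux.mylab.local first prints `ad_admins`, matching the two results in the description. On a live system, `getent group 732000006` shows which name the GID currently resolves to.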

Comment 6 Jakub Hrozek 2017-12-05 14:24:40 UTC
Upstream ticket:
https://pagure.io/SSSD/sssd/issue/3595

Comment 12 Fabiano Fidêncio 2018-05-11 15:44:56 UTC
master:
 cf4f5e0

Comment 16 Ganna Kaihorodova 2018-08-20 09:07:04 UTC
Created attachment 1477106 [details]
Logs of manual verification

Comment 18 errata-xmlrpc 2018-10-30 10:40:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3158