Bug 1446535

Summary: Group resolution does not work in subdomain without ad_server option
Product: Red Hat Enterprise Linux 7 Reporter: shridhar <sgadekar>
Component: sssd Assignee: Michal Zidek <mzidek>
Status: CLOSED ERRATA QA Contact: shridhar <sgadekar>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.4 CC: grajaiya, jhrozek, lslebodn, mkosek, mzidek, pbrezina, sbose, sgadekar, sgoveas, tscherf
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: sssd-1.15.2-31.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-01 09:06:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description shridhar 2017-04-28 10:07:54 UTC
Description of problem:
In the subdomain section, if the ad_server option is not set, all group resolution stops working.
User resolution is only partially correct (the user's secondary groups are not shown).

Version-Release number of selected component (if applicable):

sssd-1.15.2-1.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
In an AD environment with at least one subdomain, create two or more users in the subdomain.
Environment: 
1. Root domain - sssd16.qe
2. child domain - first.sssd16.qe
3. Users from child domains:  fu1.qe, fu2.qe, fu3.qe

1. Configure the subdomain section in sssd.conf as follows:
# cat sssd.conf
[sssd]
domains = sssd16.qe
config_file_version = 2
services = nss, pam

[domain/sssd16.qe]
ad_domain = sssd16.qe
krb5_realm = SSSD16.QE
realmd_tags = manages-system joined-with-adcli 
cache_credentials = True
id_provider = ad
krb5_store_password_if_offline = True
default_shell = /bin/bash
ldap_id_mapping = True
use_fully_qualified_names = True
fallback_homedir = /home/%u@%d
access_provider = ad
ldap_user_search_base = DC=sssd16,DC=qe
ldap_group_search_base = DC=sssd16,DC=qe
debug_level = 9

[domain/sssd16.qe/first.sssd16.qe]
use_fully_qualified_names = True
ldap_user_search_base = CN=Users,DC=first,DC=sssd16,DC=qe
ldap_group_search_base = CN=Users,DC=first,DC=sssd16,DC=qe
debug_level = 9


2. Clear the SSSD cache and restart the service:
service sssd stop ; rm -rf /var/lib/sss/db/* ; service sssd start


Actual results:
Group resolution doesn't work.

[root@shr-7 sssd]# id fu1.qe
uid=130201108(fu1.qe) gid=130201108(fu1.qe) groups=130201108(fu1.qe),130200513
[root@shr-7 sssd]# getent passwd fu1.qe
fu1.qe:*:130201108:130201108:fu1:/home/fu1.qe:/bin/bash
[root@shr-7 sssd]# getent group fg2.qe
[root@shr-7 sssd]# getent group fg3.qe
[root@shr-7 sssd]# getent group fg4.qe
[root@shr-7 sssd]# getent group fgu4.qe


Expected results:
Groups should be resolved correctly without ad_server being set.

[root@shr-7 sssd]# getent group fg2.qe
fg2.qe:*:130201117:fu2.qe,fu1.qe
[root@shr-7 sssd]# getent group fg3.qe
fg3.qe:*:130201118:fu3.qe,fu1.qe
[root@shr-7 sssd]# getent group fg4.qe
[root@shr-7 sssd]# getent group fgu4.qe
fgu4.qe:*:130201121:fu1.qe

Additional info:
uploading logs and configuration
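
To repeat the check after each configuration change, the getent calls above can be wrapped in a small loop. This is only a sketch: check_groups is a hypothetical helper name, not part of SSSD, and all it does is report whether getent returns an entry for each group name.

```shell
# Report which of the given group names resolve through NSS.
# Prints "OK <name>" or "MISSING <name>" for each argument and
# returns non-zero if at least one group could not be resolved.
check_groups() {
    rc=0
    for g in "$@"; do
        if getent group "$g" >/dev/null 2>&1; then
            echo "OK $g"
        else
            echo "MISSING $g"
            rc=1
        fi
    done
    return "$rc"
}

# Example with the groups from this report:
#   check_groups fg2.qe fg3.qe fgu4.qe
```

Run it once with ad_server set and once without, clearing the cache in between as in step 2, to compare the two cases.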

Comment 4 shridhar 2017-04-28 10:21:16 UTC
Timestamps for the log without ad_server:
 sssd]# service sssd stop ; rm -rf /var/lib/sss/db/* ; rm -rf /var/log/sssd/* ; service sssd start
Redirecting to /bin/systemctl stop  sssd.service
Redirecting to /bin/systemctl start  sssd.service
[root@shr-7 sssd]# date ; getent group fg2.qe
Fri Apr 28 06:10:48 EDT 2017
[root@shr-7 sssd]# date ; getent group fg3.qe
Fri Apr 28 06:10:55 EDT 2017
[root@shr-7 sssd]# date ; getent group fgu4.qe
Fri Apr 28 06:11:00 EDT 2017


Timestamps for with_ad_server_log:
 log]# date ; getent group fgu4.qe
Fri Apr 28 06:14:00 EDT 2017
fgu4.qe:*:130201121:fu1.qe
[root@shr-7 log]# date ; getent group fgu2.qe
Fri Apr 28 06:14:06 EDT 2017
[root@shr-7 log]# date ; getent group fg2.qe
Fri Apr 28 06:14:13 EDT 2017
fg2.qe:*:130201117:fu1.qe
[root@shr-7 log]# date ; getent group fg3.qe
Fri Apr 28 06:14:17 EDT 2017
fg3.qe:*:130201118:fu1.qe

Comment 5 Lukas Slebodnik 2017-04-28 11:15:22 UTC
Assigning to author of this feature.

Comment 6 Jakub Hrozek 2017-04-28 12:24:55 UTC
From the logs without the server option:

(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [resolv_discover_srv_next_domain] (0x0400): SRV resolution of service 'ldap'. Will use DNS discovery domain 'first.sssd16.qe'
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [resolv_getsrv_send] (0x0100): Trying to resolve SRV record of '_ldap._tcp.first.sssd16.qe'
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [sdap_id_op_connect_done] (0x4000): caching successful connection after 1 notifies
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [be_run_unconditional_online_cb] (0x4000): List of unconditional online callbacks is empty, nothing to do.
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [be_run_online_cb] (0x0080): Going online. Running callbacks.
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [schedule_request_timeout] (0x2000): Scheduling a timeout of 6 seconds
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [schedule_timeout_watcher] (0x2000): Scheduling DNS timeout watcher
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [unschedule_timeout_watcher] (0x4000): Unscheduling DNS timeout watcher
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [request_watch_destructor] (0x0400): Deleting request watch
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [resolv_discover_srv_done] (0x0040): SRV query failed [4]: Domain name not found
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [fo_set_port_status] (0x0100): Marking port 0 of server '(no name)' as 'not working'
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [resolve_srv_done] (0x0040): Unable to resolve SRV [1432158235]: SRV record not found
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [set_srv_data_status] (0x0100): Marking SRV lookup of service 'sd_first.sssd16.qe' as 'not resolved'
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [be_resolve_server_process] (0x0080): Couldn't resolve server (SRV lookup meta-server), resolver returned [1432158235]: SRV record not found
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [be_resolve_server_process] (0x1000): Trying with the next one!
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [fo_resolve_service_send] (0x0100): Trying to resolve service 'sd_first.sssd16.qe'
(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [get_port_status] (0x1000): Port status of port 0 for server '(no name)' is 'not working'

Is it expected that the SRV records are missing? It still looks like a bug, but maybe it's about not cleaning the failed status properly.
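
For anyone triaging similar logs, the SRV failures quoted above can be pulled out of a backend log with grep. A rough sketch: srv_failures is a hypothetical helper, the pattern only covers the two failure messages seen in this excerpt, and the path in the usage comment assumes the log is named sssd_<domain>.log under /var/log/sssd.

```shell
# Print the lines where SRV discovery failed in an SSSD backend log.
# $1 = path to the log file
# Matches only the two messages quoted in this report:
#   [resolv_discover_srv_done] ... SRV query failed ...
#   [resolve_srv_done] ... Unable to resolve SRV ...
srv_failures() {
    grep -E '\[(resolv_discover_srv_done|resolve_srv_done)\] .*(SRV query failed|Unable to resolve SRV)' "$1"
}

# Example (log path is an assumption):
#   srv_failures /var/log/sssd/sssd_sssd16.qe.log
```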

Comment 7 shridhar 2017-04-28 12:47:24 UTC
The SRV records are there. Resyncing the child domain with the root domain (restarting the AD services on the child AD server) corrected the issue for now; that is, groups are resolved correctly without ad_server set.

Why were users resolved and not the groups?

Comment 8 Jakub Hrozek 2017-04-28 12:54:06 UTC
(In reply to shridhar from comment #7)
> srv records are there. After resyncing child-domain with root domain
> (restart of ad-services from child-AD server) corrected issue for now. That
> is groups are resolved correctly without ad_server mentioned. 
> 

Well, SSSD couldn't find them:

(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [resolv_getsrv_send] (0x0100): Trying to resolve SRV record of '_ldap._tcp.first.sssd16.qe'

(Fri Apr 28 06:10:49 2017) [sssd[be[sssd16.qe]]] [resolv_discover_srv_done] (0x0040): SRV query failed [4]: Domain name not found

Did you try the same record from the client's command line with e.g. dig?

> Why were users resolved and not the groups?

I don't know, but my bet is that group lookups hit the Global Catalog, but the user lookups hit the LDAP server from that particular domain.
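
The manual check suggested above could look roughly like this. It is a sketch, not an official procedure: srv_record_name and check_srv are hypothetical helper names, and the record name is simply rebuilt from the service and discovery domain shown in the resolv_getsrv_send log line.

```shell
# Build the SRV record name the way it appears in the SSSD log,
# e.g. "_ldap._tcp.first.sssd16.qe".
# $1 = service (e.g. ldap), $2 = DNS discovery domain
srv_record_name() {
    printf '_%s._tcp.%s\n' "$1" "$2"
}

# Query that record with dig, if dig is installed. With +short, dig
# prints one line per SRV target and nothing when no record is found.
check_srv() {
    name=$(srv_record_name "$1" "$2")
    if command -v dig >/dev/null 2>&1; then
        dig +short -t SRV "$name"
    else
        echo "dig not found; would query: $name" >&2
        return 1
    fi
}

# Example, run from the SSSD client:
#   check_srv ldap first.sssd16.qe
```

Empty output from the client while the AD side shows the record would point at DNS on the client rather than at SSSD itself.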

Comment 16 Lukas Slebodnik 2017-05-12 14:15:52 UTC
Upstream ticket:
https://pagure.io/SSSD/sssd/issue/3397

Comment 20 Lukas Slebodnik 2017-05-19 15:02:56 UTC
master:
* c4ddb9ccab670f9c0d0377680237b62f9f91c496
* b4ca0da4d8d70bcfbd4f809f3b3b094d43d64cfc

Comment 22 shridhar 2017-05-24 08:11:20 UTC
user_search_base works correctly. However group_search_base still does not work correctly.

sssd.conf

[domain/sssd16.qe]
ad_domain = sssd16.qe
krb5_realm = SSSD16.QE
realmd_tags = manages-system joined-with-adcli 
cache_credentials = True
id_provider = ad
krb5_store_password_if_offline = True
default_shell = /bin/bash
#ldap_sasl_authid = SHR-R7-PERMANEN$
ldap_id_mapping = True
use_fully_qualified_names = True
fallback_homedir = /home/%u@%d
access_provider = ad
#ldap_user_search_base =  CN=Users,DC=sssd16,DC=qe
#ldap_group_search_base =  CN=Users,DC=sssd16,DC=qe
debug_level = 9

[domain/sssd16.qe/first.sssd16.qe]
ad_server = shr-w16-permane.first.sssd16.qe
use_fully_qualified_names = True
ldap_user_search_base = OU=finance,DC=first,DC=sssd16,DC=qe
dap_group_search_base = OU=finance,DC=first,DC=sssd16,DC=qe
cache_credentials = True
debug_level = 2



[root@shr-r7-permanent ~]# id fu1.qe
id: fu1.qe: no such user
[root@shr-r7-permanent ~]# id finu1.qe
uid=130201111(finu1.qe) gid=130201111(finu1.qe) groups=130201111(finu1.qe),130200513(domain users.qe),130201112(fing1.qe)
[root@shr-r7-permanent ~]# id salfinu1.qe
uid=130201114(salfinu1.qe) gid=130201114(salfinu1.qe) groups=130201114(salfinu1.qe),130200513(domain users.qe),130201124(salfing2.qe)

[root@shr-r7-permanent ~]# getent group fing1.qe
fing1.qe:*:130201112:finu1.qe
[root@shr-r7-permanent ~]# getent group salfing1.qe
salfing1.qe:*:130201113:

The following group is in 'CN=USERS', so it should not be returned:

[root@shr-r7-permanent ~]# getent group fg2.qe
fg2.qe:*:130201117:fu2.qe,fu1.qe

Comment 23 shridhar 2017-05-24 08:12:16 UTC
Tested with sssd-1.15.2-33.el7.x86_64

Comment 24 Michal Zidek 2017-05-24 09:45:03 UTC
Sorry, but I cannot reproduce it. Can you provide logs or access to the test machine?

Michal

Comment 26 Michal Zidek 2017-05-24 12:49:38 UTC
> [domain/sssd16.qe/first.sssd16.qe]
> ad_server = shr-w16-permane.first.sssd16.qe
> use_fully_qualified_names = True
> ldap_user_search_base = OU=finance,DC=first,DC=sssd16,DC=qe
> dap_group_search_base = OU=finance,DC=first,DC=sssd16,DC=qe
  ^^^

You have a typo here. When the correct option name is used, it works :)

> cache_credentials = True
> debug_level = 2
>

Comment 27 Michal Zidek 2017-05-24 12:54:03 UTC
Btw, a typo in an option name is a common mistake. You can use

$ sssctl config-check

to catch these types of errors.
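
As a quick eyeball check alongside sssctl, the option names set in a single section can also be listed with a few lines of awk. This is a minimal sketch: list_section_options is a hypothetical helper, it does no validation of its own, and it only prints the text to the left of the first "=" on each non-comment line of the chosen section.

```shell
# Print the option names set in one section of an sssd.conf-style file.
# $1 = config file, $2 = section name without the surrounding brackets
list_section_options() {
    awk -v want="[$2]" '
        # Section header: remember whether it is the section we want.
        /^[ \t]*\[/ {
            line = $0
            gsub(/^[ \t]+|[ \t]+$/, "", line)
            in_section = (line == want)
            next
        }
        # Skip comment lines.
        /^[ \t]*[#;]/ { next }
        # Inside the wanted section, print what is left of the first "=".
        in_section && index($0, "=") > 0 {
            name = substr($0, 1, index($0, "=") - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            print name
        }
    ' "$1"
}

# Example:
#   list_section_options /etc/sssd/sssd.conf 'domain/sssd16.qe/first.sssd16.qe'
```

With the configuration from comment 22, the output would include dap_group_search_base, which is easier to spot in a short list of names than in the full file.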

Comment 28 shridhar 2017-05-24 14:25:12 UTC
(In reply to Michal Zidek from comment #27)
> Btw. typo in option name is a common mistake. You can use
> 
> $ sssctl config-check
> 
> to catch these types of errors.

My bad. 

It is working correctly.
verified with sssd-1.15.2-33.el7.x86_64
[root@shr-r7-permanent ~]# getent group fg2.qe
[root@shr-r7-permanent ~]# getent group salfing1.qe
salfing1.qe:*:130201113:
[root@shr-r7-permanent ~]# getent group fing1.qe
fing1.qe:*:130201112:finu1.qe

Comment 29 errata-xmlrpc 2017-08-01 09:06:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:2294