Bug 1665867 - proxy provider is not working with enumerate=true when trying to fetch all groups
Summary: proxy provider is not working with enumerate=true when trying to fetch all groups
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: sssd
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.0
Assignee: Alexey Tikhonov
QA Contact: sssd-qe
URL:
Whiteboard:
Depends On: 1682305
Blocks:
 
Reported: 2019-01-14 10:05 UTC by Madhuri
Modified: 2020-05-02 19:05 UTC
CC List: 11 users

Fixed In Version: sssd-2.2.0-1.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-11-05 22:34:01 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments: none


Links
System: GitHub SSSD sssd issues | ID: 4911 | Status: closed | Summary: proxy provider is not working with enumerate=true when trying to fetch all groups | Last Updated: 2020-12-24 13:18:01 UTC
System: Red Hat Product Errata | ID: RHSA-2019:3651 | Last Updated: 2019-11-05 22:34:15 UTC

Description Madhuri 2019-01-14 10:05:57 UTC
Description of problem:
The proxy provider is not working with enumerate=true when trying to fetch all groups

Version-Release number of selected component (if applicable):
sssd-2.0.0-36.el8.x86_64

How reproducible:
always


Steps to Reproduce:
1. Configure sssd with a proxy provider
2. Fetch all groups using '# getent group' 


Actual results:
Fetching all groups takes too much time (approximately 80 to 90 seconds).

Expected results:
Fetching all groups should not take that long.

Additional info:
# cat /etc/sssd/sssd.conf
[sssd]
config_file_version = 2
domains = proxy, ldap
sbus_timeout = 30
services = nss, pam

[domain/proxy]
auth_provider = proxy
cache_credentials = True
enumerate = TRUE
id_provider = proxy
debug_level = 0xFFF0
proxy_lib_name = ldap
proxy_pam_target = sssdproxyldap
filter_users = puser10
use_fully_qualified_names = True

[domain/ldap]
id_provider = ldap
auth_provider = ldap
cache_credentials = FALSE
ldap_search_base = dc=bos,dc=redhat,dc=com
chpass_provider = ldap
ldap_id_use_start_tls = True
debug_level = 0xFFF0
min_id  = 1000
enumerate = TRUE
ldap_uri = ldaps://server.example.com:636
ldap_tls_cacert = /etc/openldap/certs/cacert2.pem

[nss]
filter_groups = root
filter_users = root
debug_level = 9

[pam]

Comment 1 Sumit Bose 2019-01-14 10:21:43 UTC
Enumeration caused a crash in the proxy provider:

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fa0b6875c95 in __GI_abort () at abort.c:79
#2  0x00007fa0b6e6ed31 in talloc_abort (reason=0x7fa0b6e7c838 "Bad talloc magic value - access after free") at ../talloc.c:500
#3  0x00007fa0b6e6f46d in talloc_abort_access_after_free () at ../talloc.c:525
#4  talloc_chunk_from_ptr (ptr=0x55abd058df10) at ../talloc.c:525
#5  _talloc_steal_loc (new_ctx=0x55abd05d37c0, ptr=0x55abd058df10, location=<optimized out>) at ../talloc.c:1329
#6  0x00007fa0a2730159 in remove_duplicate_group_members (_grp=<synthetic pointer>, orig_grp=0x55abd058df10, mem_ctx=0x55abd05d37c0) at src/providers/proxy/proxy_id.c:708
#7  save_group (sysdb=sysdb@entry=0x55abd0571770, dom=dom@entry=0x55abd05890f0, grp=grp@entry=0x55abd058df10, real_name=0x55abd05c6f20 "Group2@proxy", alias=alias@entry=0x0)
    at src/providers/proxy/proxy_id.c:738
#8  0x00007fa0a2732602 in enum_groups (dom=0x55abd05890f0, sysdb=0x55abd0571770, ctx=0x55abd05a36d0, mem_ctx=0x55abd058de40) at src/providers/proxy/proxy_id.c:1298
#9  proxy_account_info (domain=0x55abd05890f0, be_ctx=<optimized out>, data=<optimized out>, ctx=0x55abd05a36d0, mem_ctx=0x55abd058de40) at src/providers/proxy/proxy_id.c:1632
#10 proxy_account_info_handler_send (mem_ctx=<optimized out>, id_ctx=0x55abd05a36d0, data=<optimized out>, params=0x55abd05bb3e0) at src/providers/proxy/proxy_id.c:1763
#11 0x000055abcec1abf9 in file_dp_request (_dp_req=<synthetic pointer>, req=0x55abd05ad880, request_data=0x55abd059bb40, dp_flags=1, method=DPM_ACCOUNT_HANDLER, target=DPT_ID, 
    name=<optimized out>, domainname=0x55abd05badd0 "proxy", provider=0x55abd0571bf0, mem_ctx=<optimized out>) at src/providers/data_provider/dp_request.c:250
#12 dp_req_send (mem_ctx=0x55abd059baa0, provider=provider@entry=0x55abd0571bf0, domain=domain@entry=0x55abd05badd0 "proxy", name=<optimized out>, target=target@entry=DPT_ID, 
    method=method@entry=DPM_ACCOUNT_HANDLER, dp_flags=1, request_data=0x55abd059bb40, _request_name=0x55abd059baa0) at src/providers/data_provider/dp_request.c:295
#13 0x000055abcec1d90e in dp_get_account_info_send (mem_ctx=<optimized out>, ev=0x55abd0561ab0, sbus_req=<optimized out>, provider=0x55abd0571bf0, dp_flags=1, entry_type=<optimized out>, 
    filter=<optimized out>, domain=0x55abd05badd0 "proxy", extra=0x55abd05bae40 "") at src/providers/data_provider/dp_target_id.c:528
#14 0x00007fa0b76e57f2 in _sbus_sss_invoke_in_uusss_out_qus_step (ev=0x55abd0561ab0, te=<optimized out>, tv=..., private_data=<optimized out>) at src/sss_iface/sbus_sss_invokers.c:2837
#15 0x00007fa0b708bbd9 in tevent_common_invoke_timer_handler (te=te@entry=0x55abd0570370, current_time=..., removed=removed@entry=0x0) at ../tevent_timed.c:369
#16 0x00007fa0b708bd7e in tevent_common_loop_timer_delay (ev=ev@entry=0x55abd0561ab0) at ../tevent_timed.c:441
#17 0x00007fa0b708cf2b in epoll_event_loop_once (ev=0x55abd0561ab0, location=<optimized out>) at ../tevent_epoll.c:922
#18 0x00007fa0b708b1bb in std_event_loop_once (ev=0x55abd0561ab0, location=0x7fa0ba3489d9 "src/util/server.c:724") at ../tevent_standard.c:110
#19 0x00007fa0b7086395 in _tevent_loop_once (ev=ev@entry=0x55abd0561ab0, location=location@entry=0x7fa0ba3489d9 "src/util/server.c:724") at ../tevent.c:772
#20 0x00007fa0b708663b in tevent_common_loop_wait (ev=0x55abd0561ab0, location=0x7fa0ba3489d9 "src/util/server.c:724") at ../tevent.c:895
#21 0x00007fa0b708b14b in std_event_loop_wait (ev=0x55abd0561ab0, location=0x7fa0ba3489d9 "src/util/server.c:724") at ../tevent_standard.c:141
#22 0x00007fa0ba327a07 in server_loop (main_ctx=0x55abd0562f80) at src/util/server.c:724
#23 0x000055abcec0d38b in main (argc=8, argv=<optimized out>) at src/providers/data_provider_be.c:699


I think the reason is the talloc_steal() in the done-block of remove_duplicate_group_members().

 704 done:
 705     talloc_zfree(tmp_ctx);
 706 
 707     if (ret == ENOENT) {
 708         *_grp = talloc_steal(mem_ctx, orig_grp);
 709         ret = EOK;
 710     }
 711 
 712     return ret;
 713 }

because the address pointed to by orig_grp is reused in the do-loop of enum_groups(), but due to the talloc_steal() it will be freed in save_group() when the temporary talloc context is freed. To fix this '*_grp = orig_grp;' should be sufficient; in the worst case (if _grp is somewhere freed explicitly) copying the memory should help.
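
For illustration only, here is a minimal standalone talloc program (my own simplification under assumptions, not SSSD code; the variable names merely mirror the roles of enum_groups() and save_group()) that shows the lifetime problem described above:

#include <stdio.h>
#include <talloc.h>

int main(void)
{
    /* stands in for the long-lived buffer that enum_groups() keeps reusing */
    TALLOC_CTX *enum_ctx = talloc_new(NULL);
    char *orig_grp = talloc_strdup(enum_ctx, "Group2@proxy");

    /* stands in for save_group()'s temporary talloc context */
    TALLOC_CTX *tmp_ctx = talloc_new(NULL);

    /* what the talloc_steal() in remove_duplicate_group_members() does:
     * ownership of orig_grp moves to the short-lived context */
    char *grp = talloc_steal(tmp_ctx, orig_grp);
    printf("group: %s\n", grp);

    /* freeing the temporary context also frees the stolen chunk ... */
    talloc_free(tmp_ctx);

    /* ... so any further use of orig_grp, as on the next loop iteration in
     * enum_groups(), is an access after free ("Bad talloc magic value"). */

    talloc_free(enum_ctx);
    return 0;
}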

Comment 2 Alexey Tikhonov 2019-01-28 18:50:32 UTC
(In reply to Sumit Bose from comment #1)

> To fix this '*_grp = orig_grp;' should be sufficient

Technically, yes. But I feel it is a bad idea since it breaks the "promise" one could expect from the signature of `remove_duplicate_group_members(mem_ctx, orig_group, new_group)`.
It is not documented anywhere, but I would rather prefer the function to behave consistently and always return a copy of the group on the given memory context.


> in the worst case (if _grp is somewhere freed explicitly) copying the memory should help.

I would go that way. There is not much overhead (especially taking into account how inefficient the surrounding code is anyway).
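
As a rough sketch of that "copy the memory" approach (under my own assumptions; this is not the upstream patch, and the helper name proxy_copy_group() is hypothetical), the group would be deep-copied onto the caller's memory context instead of being stolen, so nothing owned by enum_groups() ever changes ownership:

#include <errno.h>
#include <grp.h>
#include <stddef.h>
#include <talloc.h>

static int proxy_copy_group(TALLOC_CTX *mem_ctx, const struct group *src,
                            struct group **_grp)
{
    struct group *dst;
    size_t n_mem = 0;

    dst = talloc_zero(mem_ctx, struct group);
    if (dst == NULL) {
        return ENOMEM;
    }

    dst->gr_gid = src->gr_gid;

    dst->gr_name = talloc_strdup(dst, src->gr_name);
    if (dst->gr_name == NULL) {
        goto fail;
    }

    if (src->gr_passwd != NULL) {
        dst->gr_passwd = talloc_strdup(dst, src->gr_passwd);
        if (dst->gr_passwd == NULL) {
            goto fail;
        }
    }

    if (src->gr_mem != NULL) {
        /* NULL-terminated member list; every string is owned by the copy */
        while (src->gr_mem[n_mem] != NULL) {
            n_mem++;
        }

        dst->gr_mem = talloc_zero_array(dst, char *, n_mem + 1);
        if (dst->gr_mem == NULL) {
            goto fail;
        }

        for (size_t i = 0; i < n_mem; i++) {
            dst->gr_mem[i] = talloc_strdup(dst->gr_mem, src->gr_mem[i]);
            if (dst->gr_mem[i] == NULL) {
                goto fail;
            }
        }
    }

    *_grp = dst;
    return 0; /* EOK in SSSD terms */

fail:
    talloc_free(dst);
    return ENOMEM;
}

Freeing save_group()'s temporary context then only releases the copy, and the buffer that enum_groups() reuses on the next iteration stays valid.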

Comment 3 Sumit Bose 2019-01-29 08:00:49 UTC
(In reply to Alexey Tikhonov from comment #2)
> (In reply to Sumit Bose from comment #1)
> 
> > To fix this '*_grp = orig_grp;' should be sufficient
> 
>  Technically, yes. But I feel it is a bad idea since it breaks the
> "promise" one could expect from the signature of
> `remove_duplicate_group_members(mem_ctx, orig_group, new_group)`.
> It is not documented anywhere, but I would rather prefer the function to
> behave consistently and always return a copy of the group on the given
> memory context.
> 
> 
> > in the worst case (if _grp is somewhere freed explicitly) copying the memory should help.
> 
> I would go that way. There is not much overhead (especially taking into
> account how inefficient the surrounding code is anyway).

I agree, thank you for taking care of this issue.

bye,
Sumit

Comment 4 Alexey Tikhonov 2019-01-29 17:07:57 UTC
Upstream ticket: https://pagure.io/SSSD/sssd/issue/3931

Comment 8 Alexey Tikhonov 2019-02-14 12:51:48 UTC
Upstream PR: https://github.com/SSSD/sssd/pull/737

Comment 9 Jakub Hrozek 2019-03-28 21:46:06 UTC
Fixed as part of:
 * 8efa202
 * cd1538b
 * 29ac739
 * cc9f0f4
 * 0f62cc9
 * feb0832

Comment 12 Madhuri 2019-08-22 07:25:54 UTC
Verified with
[root@ci-vm-10-0-146-233 ~]# rpm -qa sssd
sssd-2.2.0-16.el8.x86_64

Verification steps:

1. Configure sssd with a proxy provider

2. Add the enumerate = True option in the proxy domain section

[root@ci-vm-10-0-146-233 ~]# cat /etc/sssd/sssd.conf

[sssd]
config_file_version = 2
sbus_timeout = 30
services = pam, nss
domains = proxy, ldap2

[domain/proxy]
auth_provider = proxy
enumerate = True
id_provider = proxy
debug_level = 0xFFF0
proxy_lib_name = ldap
proxy_pam_target = sssdproxyldap
filter_users = puser10

[domain/ldap2]
id_provider = ldap
auth_provider = ldap
chpass_provider = ldap
ldap_id_use_start_tls = True
debug_level = 0xFFF0
enumerate = True
ldap_tls_cacert = /etc/openldap/cacerts/cacert.pem
ldap_uri = ldaps://server.example.com
ldap_search_base = dc=example1,dc=test

3. Stop sssd, remove the cache, and start it again

# systemctl stop sssd; rm -rf /var/lib/sss/db/*; rm -rf /var/log/sssd/*; systemctl start sssd

4. Fetch all groups using '# getent group' and calculate the time

[root@ci-vm-10-0-146-233 ~]# time getent group
pgroup12:*:2012:
pgroup3:*:2003:
pgroup5:*:2005:
pgroup15:*:2015:
duplicate:*:2019:
pgroup10:*:2010:
pgroup11:*:2011:
pgroup0:*:2000:
pgroup7:*:2007:
pgroup2:*:2002:
pgroup9:*:2009:
pgroup18:*:2018:
pgroup14:*:2014:
pgroup4:*:2004:
pgroup6:*:2006:
pgroup17:*:2017:
pgroup1:*:2001:
pgroup13:*:2013:
pgroup16:*:2016:
pgroup8:*:2008:
qgroup13:*:3013:
qgroup8:*:3008:
qgroup16:*:3016:
qgroup5:*:3005:
qgroup11:*:3011:
qgroup0:*:3000:
qgroup7:*:3007:
qgroup2:*:3002:
qgroup9:*:3009:
qgroup15:*:3015:
qgroup10:*:3010:
qgroup4:*:3004:
duplicate:*:3019:
qgroup6:*:3006:
qgroup18:*:3018:
qgroup14:*:3014:
.....
.....
.....

real	0m0.050s
user	0m0.002s
sys	0m0.002s

5. Repeat step 4 to calculate the average time:

i. Time taken to fetch all groups, first iteration after deleting the cache:
real	0m0.042s
user	0m0.002s
sys	0m0.001s

ii. Time taken to fetch all groups, second iteration after deleting the cache:
real	0m0.043s
user	0m0.003s
sys	0m0.000s

iii. Time taken to fetch all groups, third iteration after deleting the cache:
real	0m0.040s
user	0m0.000s
sys	0m0.003s

iv. Time taken to fetch all groups, fourth iteration after deleting the cache:
real	0m0.044s
user	0m0.001s
sys	0m0.002s

v. Time taken to fetch all groups, fifth iteration after deleting the cache:
real	0m0.045s
user	0m0.003s
sys	0m0.001s


Fetching all groups now takes far less time (average real time of roughly 0.043 s over the five iterations, compared to the 80 to 90 seconds originally reported).
Based on the above observations, marking this bug as verified.

Comment 14 errata-xmlrpc 2019-11-05 22:34:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:3651

