Bug 1019272 - With 20000+ users, adding a user to a group intermittently throws Internal server error
With 20000+ users, adding a user to a group intermittently throws Internal se...
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: ipa (Show other bugs)
7.0
Unspecified Unspecified
medium Severity unspecified
: rc
: 7.1
Assigned To: Jan Cholasta
Namita Soman
:
Depends On: 1204637
Blocks:
  Show dependency treegraph
 
Reported: 2013-10-15 07:55 EDT by Namita Soman
Modified: 2015-11-19 07:00 EST (History)
10 users (show)

See Also:
Fixed In Version: ipa-4.2.0-0.1.alpha1.el7
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-19 07:00:24 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
IPA Error 907: Network Error (121.84 KB, image/png)
2015-01-15 11:11 EST, Varun Mylaraiah
no flags Details
DS errors log (84.14 KB, text/plain)
2015-01-21 08:59 EST, Jan Cholasta
no flags Details

  None (edit)
Description Namita Soman 2013-10-15 07:55:58 EDT
Description of problem:
This is a follow up of bz970541. After adding 20000 users, when using the UI to add a user to a group, saw "internal Server Error". 

Version-Release number of selected component (if applicable):
ipa-3.0.0-37.el6

How reproducible:
almost always in UI, sometimes in CLI

Steps to Reproduce:
1. Add 20000 users
2. In UI, click on ipausers or any other group to edit it
3. Add a user to this group

Actual results:
Throws Internal Server Error

Expected results:
Be able to find a user, then add it to group

Additional info:
Refer to bz970541 for more background
Comment 1 Martin Kosek 2013-10-15 08:05:02 EDT
Upstream ticket:
https://fedorahosted.org/freeipa/ticket/3971
Comment 3 Petr Viktorin 2013-11-27 07:48:12 EST
Fixed upstream
master: https://fedorahosted.org/freeipa/changeset/4050e553c30d8c8d93c7045f871f8c1cef65aa71/
Comment 5 Varun Mylaraiah 2015-01-15 11:11:11 EST
Created attachment 980551 [details]
IPA Error 907: Network Error

issue is still reproducible. Hence, reopening the bug.

IPA Version:
ipa-server-4.1.0-15.el7.x86_64

1. Added 20000 users
2. In WebUI, click on group to edit it
3. Then Error pops up
Comment 6 Martin Kosek 2015-01-16 03:43:16 EST
Jan, Petr - I am not quite confused with the referred commit that is supposed to be the fix (Comment 3). Can you please check if the issue is fixed or not?
Comment 10 Jan Cholasta 2015-01-21 08:20:30 EST
Disabling the compat plugin fixes this. It seems it can't handle such a large number of users gracefully.
Comment 11 Martin Kosek 2015-01-21 08:45:21 EST
Thanks for investigation. It seems that the root cause is that DS/Schema Compatibility plugin starts really slow.

I think the next steps are:
- Jan to create a slapi-nis bug to investigate/fix how Schema Compatiblity Plugin performs/blocks DS with high number of users
- Varun to see if the error goes away after Schema Compatibility plugin is fully initialized (it may take a while)
Comment 12 Jan Cholasta 2015-01-21 08:59:58 EST
Created attachment 982337 [details]
DS errors log

Attaching an excerpt from /var/log/dirsrv/slapd-TESTRELM-COM capturing execution of "ipa group-show newgrp" with nsslapd-errorlog-level of 81920.
Comment 19 thierry bordaz 2015-01-28 06:23:59 EST
The searches pointed in https://bugzilla.redhat.com/show_bug.cgi?id=1019272#c15 are very slow (5 sec).

If the 'cn=changelog' search is needed, then it needs to be indexed (if possible). Can I get the dse.ldif ?

Now something I do not understand is why the base search 'anonymous-limit' is so slow. It takes 5sec for a base search that seems too high.
[22/Jan/2015:00:48:07 +051800] schema-compat-plugin - searching from "cn=anonymous-limits,cn=etc,dc=testrelm,dc=com" for "(|(objectclass=*)(objectclass=ldapsubentry))" with scope 0 (base)
[22/Jan/2015:00:48:12 +051800] schema-compat-plugin - searching from "cn=config" for "(|(objectclass=*)(objectclass=ldapsubentry))" with scope 0 (base)

Would it be possible to enable internal op logging ('nsslapd-accesslog-level: 260') and redo the test. I would be interested by access/errors logs, dse.ldif, as well as some pstacks taken when the problem occurs (add user to the group).

thanks
thierry
Comment 21 thierry bordaz 2015-01-28 13:08:00 EST
	When running the command 'ipa group-show newgrp' there is a slow SRCH request

		[29/Jan/2015:04:35:56 +051800] conn=770 op=0 BIND dn="" method=sasl version=3 mech=GSSAPI
		[29/Jan/2015:04:35:56 +051800] conn=770 op=2 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=admin,cn=users,cn=accounts,dc=testrelm,dc=com"
		[29/Jan/2015:04:35:56 +051800] conn=770 op=5 SRCH base="dc=testrelm,dc=com" scope=2 filter="(memberOf=cn=newgrp,cn=groups,cn=accounts,dc=testrelm,dc=com)" attrs="member"
		[29/Jan/2015:04:35:59 +051800] conn=770 op=5 RESULT err=3 tag=101 nentries=0 etime=3 notes=P

	This request is processing one or more long aci
		Thread 27 (Thread 0x7f9d487f8700 (LWP 2264)):
		#0  0x00007f9d75a93772 in slapi_attr_basetype () from /usr/lib64/dirsrv/libslapd.so.0
		#1  0x00007f9d75b1eca3 in vattr_map_lookup () from /usr/lib64/dirsrv/libslapd.so.0
		#2  0x00007f9d75b1fc74 in vattr_map_namespace_sp_getlist () from /usr/lib64/dirsrv/libslapd.so.0
		#3  0x00007f9d75b1fd9e in vattr_test_filter () from /usr/lib64/dirsrv/libslapd.so.0
		#4  0x00007f9d75ab704f in slapi_vattr_filter_test_ext_internal () from /usr/lib64/dirsrv/libslapd.so.0
		#5  0x00007f9d75ab804a in slapi_vattr_filter_test_ext () from /usr/lib64/dirsrv/libslapd.so.0
		#6  0x00007f9d6c91214f in acl__resource_match_aci () from /usr/lib64/dirsrv/plugins/libacl-plugin.so
		#7  0x00007f9d6c9134e0 in acl_access_allowed () from /usr/lib64/dirsrv/plugins/libacl-plugin.so
		#8  0x00007f9d6c926ba7 in acl_access_allowed_main () from /usr/lib64/dirsrv/plugins/libacl-plugin.so
		#9  0x00007f9d75aec86c in plugin_call_acl_plugin () from /usr/lib64/dirsrv/libslapd.so.0
		#10 0x00007f9d75ab6a06 in test_ava_filter () from /usr/lib64/dirsrv/libslapd.so.0
		#11 0x00007f9d75ab7a8f in slapi_filter_test_ext_internal () from /usr/lib64/dirsrv/libslapd.so.0
		#12 0x00007f9d75ab7f7a in slapi_filter_test_ext () from /usr/lib64/dirsrv/libslapd.so.0
		#13 0x00007f9d67bf2f84 in backend_search_entry_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
		#14 0x00007f9d67c06113 in map_data_foreach_entry_id () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
		#15 0x00007f9d67bf54db in backend_search_set_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
		#16 0x00007f9d67c0624f in map_data_foreach_map () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
		#17 0x00007f9d67bf5216 in backend_search_group_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
		#18 0x00007f9d67c06190 in map_data_foreach_domain () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
		#19 0x00007f9d67bf4cf4 in backend_search_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
		#20 0x00007f9d75ae9db0 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0
		#21 0x00007f9d75aea008 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0
		#22 0x00007f9d75add7e6 in op_shared_search () from /usr/lib64/dirsrv/libslapd.so.0
		#23 0x00007f9d75fc808e in do_search ()
		#24 0x00007f9d75fb7599 in connection_threadmain ()
		#25 0x00007f9d73ee09eb in _pt_root () from /lib64/libnspr4.so
		#26 0x00007f9d73881df3 in start_thread () from /lib64/libpthread.so.0
		#27 0x00007f9d735af05d in clone () from /lib64/libc.so.6

	An other thread (BIND) may appear (not systematic) during this SRCH. This BIND is likely waiting
	for the search to release the lock:

		Thread 22 (Thread 0x7f9d45ff3700 (LWP 2269)):
		#0  0x00007f9d7388506e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
		#1  0x00007f9d67bf7ba3 in backend_shr_add_cb.part.13 () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
		#2  0x00007f9d67bf7d51 in backend_shr_betxn_post_add_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
		#3  0x00007f9d75ae9db0 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0
		#4  0x00007f9d75aea008 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0
		#5  0x00007f9d6999b0e4 in ldbm_back_add () from /usr/lib64/dirsrv/plugins/libback-ldbm.so
		#6  0x00007f9d75a90d8a in op_shared_add () from /usr/lib64/dirsrv/libslapd.so.0
		#7  0x00007f9d75a915e3 in add_internal_pb () from /usr/lib64/dirsrv/libslapd.so.0
		#8  0x00007f9d6822732d in retrocl_postob () from /usr/lib64/dirsrv/plugins/libretrocl-plugin.so
		#9  0x00007f9d75ae9db0 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0
		#10 0x00007f9d75aea008 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0
		#11 0x00007f9d699b4d19 in ldbm_back_modify () from /usr/lib64/dirsrv/plugins/libback-ldbm.so
		#12 0x00007f9d75ad6f51 in op_shared_modify () from /usr/lib64/dirsrv/libslapd.so.0
		#13 0x00007f9d75ad7a14 in modify_internal_pb () from /usr/lib64/dirsrv/libslapd.so.0
		#14 0x00007f9d6b056bd3 in ipalockout_postop () from /usr/lib64/dirsrv/plugins/libipa_lockout.so
		#15 0x00007f9d75ae9db0 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0
		#16 0x00007f9d75aea008 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0
		#17 0x00007f9d75fb0a0f in do_bind ()
		#18 0x00007f9d75fb744f in connection_threadmain ()
		#19 0x00007f9d73ee09eb in _pt_root () from /lib64/libnspr4.so
		#20 0x00007f9d73881df3 in start_thread () from /lib64/libpthread.so.0
		#21 0x00007f9d735af05d in clone () from /lib64/libc.so.6

	Increasing the errorlog level changes the dynamic. Enabling plugin logging may hide the issue.
	

	Next step is to identify which ACIs are killing the performance.
Comment 23 thierry bordaz 2015-01-29 11:01:36 EST
	The most direct reason of the performance hit, is the time spent in ACI for each individual entry. Of course the total number of entries to evaluate increases the performance hit. 



Only one ACI is found to be evaluated. It grants the right to read the entry:
	aci: (targetattr="ipaUniqueId || memberOf || enrolledBy || krbExtraData || krb
	 PrincipalName || krbCanonicalName || krbPasswordExpiration || krbLastPwdChang
	 e || krbLastSuccessfulAuth || krbLastFailedAuth")(version 3.0; acl "Admin rea
	 d-only attributes"; allow (read, search, compare) groupdn = "ldap:///cn=admin
	 s,cn=groups,cn=accounts,dc=testrelm,dc=com";)


 it takes (3sec) to get the right of one single entry (here cn=newauser19965)

[30/Jan/2015:00:35:09 +051800] NSACLPlugin - #### conn=1361 op=6 binddn="uid=admin,cn=users,cn=accounts,dc=testrelm,dc=com"
[30/Jan/2015:00:35:09 +051800] NSACLPlugin - Searching AVL tree for update:cn=newauser19965,cn=groups,cn=compat,dc=testrelm,dc=com: container:-1
[30/Jan/2015:00:35:09 +051800] NSACLPlugin - Searching AVL tree for update:cn=groups,cn=compat,dc=testrelm,dc=com: container:-1
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Searching AVL tree for update:cn=compat,dc=testrelm,dc=com: container:-1
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Searching AVL tree for update:dc=testrelm,dc=com: container:1
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Searching AVL tree for update:dc=com: container:-1
[30/Jan/2015:00:35:10 +051800] NSACLPlugin -     ************ RESOURCE INFO STARTS *********
[30/Jan/2015:00:35:10 +051800] NSACLPlugin -     Client DN: uid=admin,cn=users,cn=accounts,dc=testrelm,dc=com
[30/Jan/2015:00:35:10 +051800] NSACLPlugin -     resource type:256(search target_DN )
[30/Jan/2015:00:35:10 +051800] NSACLPlugin -     Slapi_Entry DN: cn=newauser19965,cn=groups,cn=compat,dc=testrelm,dc=com
[30/Jan/2015:00:35:10 +051800] NSACLPlugin -     ATTR: memberOf
[30/Jan/2015:00:35:10 +051800] NSACLPlugin -     rights:search 
[30/Jan/2015:00:35:10 +051800] NSACLPlugin -     ************ RESOURCE INFO ENDS   *********
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Using ACL Container:0 for evaluation
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - ***BEGIN ACL INFO[ Name: "Admin read-only attributes"]***
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - ACL Index:22   ACL_ELEVEL:6
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - ACI type:(compare search read target_attr acltxt allow_rule )
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - ACI RULE type:(groupdn )
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Slapi_Entry DN:dc=testrelm,dc=com
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - ***END ACL INFO*****************************
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Num of ALLOW Handles:1, DENY handles:0
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Processed attr:memberOf for entry:cn=newauser19965,cn=groups,cn=compat,dc=testrelm,dc=com
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - 1. Evaluating ALLOW aci(22) " "Admin read-only attributes""
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Found SEARCH ALLOW in cache
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - conn=1361 op=6 (main): Allow search on entry(cn=newauser19965,cn=groups,cn=compat,dc=testrelm,dc=com).attr(memberOf) to uid=admin,cn=users,cn=accounts,dc=testrelm,dc=com: cached allow by aci(22)
[30/Jan/2015:00:35:11 +051800] NSACLPlugin - #### conn=1361 op=6 binddn="uid=admin,cn=users,cn=accounts,dc=testrelm,dc=com"

According to the pstack, most of this time was spent to identify that ACI. Its evaluation looks a priori rapid.

	To get this results I needed to:
		- change DS timelimit
		- ipaSearchTimelimit (cn=ipaconfig,cn=etc..)
		- update /usr/lib/python2.7/site-packages/ipaserver/plugins/ldap2.py to increase the time_limit in get_ipa_config:find_entries


    The next step would be to understand why it is so costly to find this aci.
Comment 24 Martin Kosek 2015-01-30 04:38:07 EST
This is strange. One of the goals of the PermissionV2 refactoring taken in FreeIPA 4.0 was to improve performance by moving ACIs out of SUFFIX to separate containers.

So the number of ACIs on entries on entries cn=newauser19965 -> cn=groups -> cn=compat -> dc=testrelm,dc=com should be lower than in FreeIPA < 4.0.
Comment 25 thierry bordaz 2015-01-30 04:48:31 EST
That is correct, the number of selected low.
Actually to evaluate the right to access this entry, only one ACI was selected (Admin read-only attributes). 
It seems it is evaluated very fast.

The problem, is that it takes a long time to select that ACI.
The selection is looking through the DIT which ACI are in the scope, are related to the appropriate target (attributes, entries...), are granting the correct right. All this select take 1-3 seconds.

Apparently it spends time into evaluating a filter.

I doubt that this ACI is guilty. It is just the selected one.
But it scans others ACIs. One/more of them may have a costly filter.
I have not identified which of them are responsible of this.
Comment 26 Martin Kosek 2015-01-30 05:12:10 EST
Ok, thanks for info. It would be nice to pin point the faulty ACI as I am afraid it may affect performance - like as in this Bug.
Comment 27 Jan Cholasta 2015-02-02 05:35:16 EST
A little bit of grepping shows that there is an ACI with dnahostname in targetfilter and an ACI with ipaconfigstring in targetfilter, neither of which is indexed.
Comment 28 Martin Kosek 2015-02-12 03:43:06 EST
As far as I know (Thierry or Ludwig may correct me), indices does not matter for ACI evaluation, given DS always evaluates ACI and the target object itself, having access to it's attribute.
Comment 30 thierry bordaz 2015-02-12 06:06:34 EST
I built a debug version but that was not precise enough to determine if some aci are costly to match the resource.
I need to redo some measure. Varun, please could you keep the host for me until end of next week.

Currently the request is scanning ~40 ACIs for each individual entry (20000) in the group.
The overall time it takes may be due to the number of ACI to scan (800K). An other possibility is that some specific ACI are expensive to scan that is why I need more investigation.
Comment 31 Varun Mylaraiah 2015-02-12 06:26:15 EST
Sure Thierry.
Comment 32 thierry bordaz 2015-02-13 04:37:28 EST
Surprisingly the ACI that is the most costly is
aci: (targetattr="ipaUniqueId || memberOf || enrolledBy || krbExtraData || krb
 PrincipalName || krbCanonicalName || krbPasswordExpiration || krbLastPwdChang
 e || krbLastSuccessfulAuth || krbLastFailedAuth")(version 3.0; acl "Admin rea
 d-only attributes"; allow (read, search, compare) groupdn = "ldap:///cn=admin
 s,cn=groups,cn=accounts,dc=testrelm,dc=com";)

Here are the measure of aci consumption (scan) after having evaluated the first ~5000 entries (from the group) out of 20000.

evaluation witout delay is the number of time the aci was scan in 0ms
evaluation with delay considers each time the aci was scan in more than 0ms
        - total: is the cumul of the scan duration for this aci
        - nb: is the number of time this aci was scan
        - avg: is the average duration of the scan (total/nb)
        - [x..y] is the range of duration of the scan (min/max)

So we can see that "Admin read-only attributes" is scanned for each entry (nb=4983). Its scanning almost always costs more than 150ms and can cost up to 12s !!!!. The only scanning of this aci cost 1758sec. The overall duration of the samples was 30000sec (8h20).

So the problem with this ACI is that it is systematically scanned and each individual scan takes too much time.


aci: "Admin read-only attributes" 
	evaluation: without delay: 26 , 	with delay 	avg 352 [149..12840]	- total 1758183 - nb 4983
aci: "permission:System: Read Global Configuration" 
	evaluation: without delay: - 	, 	with delay 	avg 255 [200..616]	- total 5358 	- nb 21
aci: "Admins can change GUI config" 
	evaluation: without delay: - 	, 	with delay 	avg 247 [200..417]	- total 5204 	- nb 21
aci: "permission:System: Read Groups" 
	evaluation: without delay: 1 	, 	with delay 	avg 210 [200..226]	- total 843 	- nb 4
aci: "anonymous, no acis" 
	evaluation: without delay: - 	, 	with delay 	avg 175 [167..192]	- total 701 	- nb 4
aci: "permission:System: Read Group Membership" 
	evaluation: without delay: 3 	, 	with delay 	avg 100 [1..200]	- total 201 	- nb 2
aci: "Admin can manage any entry" 
	evaluation: without delay: 4914 , 	with delay 	avg 68 [1..409]		- total 6514 	- nb 95
aci: "Anonymous read access to containers" 
	evaluation: without delay: 4755 , 	with delay 	avg 3 [1..225]		- total 893 	- nb 254
aci: "Users/managers can read basic token info" 
	evaluation: without delay: 4776 , 	with delay 	avg 1 [1..1]		- total 233 	- nb 233
aci: "permission:System: Read Group Compat Tree" 
	evaluation: without delay: 4929 , 	with delay 	avg 1 [1..1]		- total 80 	- nb 80
aci: "permission:System: Read Sudoers compat tree" 
	evaluation: without delay: 4940 , 	with delay 	avg 1 [1..1]		- total 69 	- nb 69
aci: "selfservice:User Self service" 
	evaluation: without delay: 5001 , 	with delay 	avg 1 [1..1]		- total 8 	- nb 8
aci: "permission:System: Read Group Views Compat Tree" 
	evaluation: without delay: 4789 , 	with delay 	avg 1 [1..3]		- total 222 	- nb 220
aci: "permission:System: Read User Compat Tree" 
	evaluation: without delay: 4953 , 	with delay 	avg 1 [1..1]		- total 56 	- nb 56
aci: "Users can create self-managed tokens" 
	evaluation: without delay: 5006 , 	with delay 	avg 1 [1..1]		- total 3 	- nb 3
aci: "Users/managers can see TOTP details" 
	evaluation: without delay: 4758 , 	with delay 	avg 1 [1..1]		- total 251 	- nb 251
aci: "permission:Retrieve Certificates from the CA"  
	evaluation: without delay: 5003 , 	with delay 	avg 1 [1..1]		- total 6 	- nb 6
aci: "permission:System: Write DNS Configuration" 
	evaluation: without delay: 4999 , 	with delay 	avg 1 [1..1]		- total 10 	- nb 10
aci: "Samba system principals can read and write NT passwords" 
	evaluation: without delay: 4979 , 	with delay 	avg 1 [1..1]		- total 30 	- nb 30
aci: "permission:System: Manage DNSSEC keys" 
	evaluation: without delay: 4944 , 	with delay 	avg 1 [1..1]		- total 65 	- nb 65
aci: "Users/managers can see HOTP details" 
	evaluation: without delay: 4746 , 	with delay 	avg 1 [1..1]		- total 263 	- nb 263
aci: "permission:System: Remove DNS Entries" 
	evaluation: without delay: 4999 , 	with delay 	avg 1 [1..1]		- total 10 	- nb 10
aci: "permission:Request Certificates from a different host"  
	evaluation: without delay: 5004 , 	with delay 	avg 1 [1..1]		- total 5 	- nb 5
aci: "Managers can write basic token info" 
	evaluation: without delay: 5005 , 	with delay 	avg 1 [1..1]		- total 4 	- nb 4
aci: "Managers can delete tokens" 
	evaluation: without delay: 5000 , 	with delay 	avg 1 [1..1]		- total 9 	- nb 9
aci: "permission:System: Read DNS Configuration" 
	evaluation: without delay: 4949 , 	with delay 	avg 1 [1..1]		- total 60 	- nb 60
aci: "Admins can write passwords" 
	evaluation: without delay: 4993 , 	with delay 	avg 1 [1..1]		- total 16 	- nb 16
aci: "Anonymous read access to DIT root" 
	evaluation: without delay: 4751 , 	with delay 	avg 1 [1..1]		- total 258 	- nb 258
aci: "selfservice:Users can manage their own SSH public keys" 
	evaluation: without delay: 5004 , 	with delay 	avg 1 [1..1]		- total 5 	- nb 5
aci: "permission:Certificate Remove Hold" 
	evaluation: without delay: 5001 , 	with delay 	avg 1 [1..1]		- total 8 	- nb 8
aci: "permission:Write IPA Configuration" 
	evaluation: without delay: 5006 , 	with delay 	avg 1 [1..1]		- total 3 	- nb 3
aci: "Admins can write password policies" 
	evaluation: without delay: 4765 , 	with delay 	avg 1 [1..1]		- total 244 	- nb 244
aci: "permission:Revoke Certificate" 
	evaluation: without delay: 5002 , 	with delay 	avg 1 [1..1]		- total 7 	- nb 7
aci: "permission:System: Read DNS Entries" 
	evaluation: without delay: 4808 , 	with delay 	avg 1 [1..4]		- total 204 	- nb 201
aci: "permission:System: Read DNSSEC metadata" 
	evaluation: without delay: 4957 , 	with delay 	avg 1 [1..1]		- total 52 	- nb 52
aci: "permission:Request Certificate"  
	evaluation: without delay: 4999 , 	with delay 	avg 1 [1..1]		- total 10 	- nb 10
aci: "permission:System: Read Netgroup Compat Tree" 
	evaluation: without delay: 4939 , 	with delay 	avg 1 [1..1]		- total 70 	- nb 70
aci: "permission:Request Certificate with SubjectAltName" 
	evaluation: without delay: 5002 , 	with delay 	avg 1 [1..1]		- total 7 	- nb 7
aci: "permission:System: Read ACIs" 
	evaluation: without delay: 4976 , 	with delay 	avg 1 [1..1]		- total 33 	- nb 33
aci: "permission:Get Certificates status from the CA"  
	evaluation: without delay: 5001 , 	with delay 	avg 1 [1..1]		- total 8 	- nb 8
aci: "permission:System: Add DNS Entries" 
	evaluation: without delay: 4999 , 	with delay 	avg 1 [1..1]		- total 10 	- nb 10
aci: "permission:System: Update DNS Entries" 
	evaluation: without delay: 4999 , 	with delay 	avg 1 [1..1]		- total 10 	- nb 10
aci: "permission:System: Read User Views Compat Tree" 
	evaluation: without delay: 4803 , 	with delay 	avg 1 [1..1]		- total 206 	- nb 206
aci: "permission:System: Manage DNSSEC metadata" 
	evaluation: without delay: 4953 , 	with delay 	avg 1 [1..1]		- total 56 	- nb 56
aci: "permission:System: Compat Tree ID View targets" 
	evaluation: without delay: 4636 , 	with delay 	avg 1 [1..2]		- total 374 	- nb 373
aci: "selfservice:Self can write own password" 
	evaluation: without delay: 4998 , 	with delay 	avg 1 [1..1]		- total 11 	- nb 11
aci: "permission:System: Read Host Compat Tree" 
	evaluation: without delay: 4950 , 	with delay 	avg 1 [1..1]		- total 59 	- nb 59





Next steps would be to do the test without that ACI and without some targetattr attribute (especially those that can be virtual attribute).
Comment 33 thierry bordaz 2015-02-17 11:52:35 EST
	- where are we spending time
	  With the instrumented version , the pstacks are quite different but still that
	  the PRE-SEARCH callback backend_search_cb triggers an aci evaluation
	  for each entry of the group.
#0  0x00007f34efe6287d in fsync () from /lib64/libpthread.so.0
#1  0x00007f34f04b72e3 in pt_Fsync () from /lib64/libnspr4.so
#2  0x00007f34f20a3408 in vslapd_log_error (fp=0x7f34f36ca3f0, subsystem=subsystem@entry=0x7f34e8f072d9 "NSACLPlugin", 
    fmt=fmt@entry=0x7f34e8f03b88 "scaning (%lx) duration %s: %d\n", ap=ap@entry=0x7f34cfff7970, locked=locked@entry=1) at ldap/servers/slapd/log.c:1953
#3  0x00007f34f20a85fd in slapd_log_error_proc_internal (subsystem=0x7f34e8f072d9 "NSACLPlugin", fmt=0x7f34e8f03b88 "scaning (%lx) duration %s: %d\n", 
    ap_err=ap_err@entry=0x7f34cfff7958, ap_file=ap_file@entry=0x7f34cfff7970) at ldap/servers/slapd/log.c:1809
#4  0x00007f34f20a8c40 in slapi_log_error (severity=severity@entry=8, subsystem=<optimized out>, fmt=fmt@entry=0x7f34e8f03b88 "scaning (%lx) duration %s: %d\n")
    at ldap/servers/slapd/log.c:1994
#5  0x00007f34e8eed681 in acl__scan_for_acis (err=<synthetic pointer>, aclpb=0x7f34f9d922e0) at ldap/servers/plugins/acl/acl.c:2029
#6  acl_access_allowed (pb=<optimized out>, e=e@entry=0x7f34f9b4e830, attr=attr@entry=0x7f34b8023a20 "memberOf", val=<optimized out>, access=access@entry=2)
    at ldap/servers/plugins/acl/acl.c:586
#7  0x00007f34e8f00c67 in acl_access_allowed_main (pb=<optimized out>, e=0x7f34f9b4e830, attrs=<optimized out>, val=<optimized out>, access=2, 
    flags=<optimized out>, errbuf=0x0) at ldap/servers/plugins/acl/aclplugin.c:392
#8  0x00007f34f20c68cc in plugin_call_acl_plugin (pb=pb@entry=0x7f34cfffeae0, e=e@entry=0x7f34f9b4e830, attrs=attrs@entry=0x7f34cfff7c10, 
    val=val@entry=0x7f34b8003b98, access=access@entry=2, flags=flags@entry=0, errbuf=errbuf@entry=0x0) at ldap/servers/slapd/plugin_acl.c:90
#9  0x00007f34f2090a36 in test_ava_filter (pb=0x7f34cfffeae0, e=e@entry=0x7f34f9b4e830, a=0x7f34f9bf1c60, ava=0x7f34b8003b90, ftype=163, verify_access=1, 
    only_check_access=0, access_check_done=0x7f34cfff7cd4) at ldap/servers/slapd/filterentry.c:330
#10 0x00007f34f2091abf in slapi_filter_test_ext_internal (pb=<optimized out>, e=0x7f34f9b4e830, f=0x7f34b8003b70, verify_access=1, only_check_access=0, 
    access_check_done=access_check_done@entry=0x7f34cfff7cd4) at ldap/servers/slapd/filterentry.c:160
#11 0x00007f34f2091faa in slapi_filter_test_ext (pb=0x7f34cfffeae0, e=0x7f34f9b4e830, f=0x7f34b8003b70, verify_access=1, only_check_access=<optimized out>)
    at ldap/servers/slapd/filterentry.c:274
#12 0x00007f34e41cbf84 in backend_search_entry_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#13 0x00007f34e41df113 in map_data_foreach_entry_id () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#14 0x00007f34e41ce4db in backend_search_set_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#15 0x00007f34e41df24f in map_data_foreach_map () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#16 0x00007f34e41ce216 in backend_search_group_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#17 0x00007f34e41df190 in map_data_foreach_domain () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#18 0x00007f34e41cdcf4 in backend_search_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#19 0x00007f34f20c3e10 in plugin_call_func (list=0x7f34f3950df0, operation=operation@entry=403, pb=pb@entry=0x7f34cfffeae0, call_one=call_one@entry=0)
    at ldap/servers/slapd/plugin.c:1952
#20 0x00007f34f20c4068 in plugin_call_list (pb=0x7f34cfffeae0, operation=403, list=<optimized out>) at ldap/servers/slapd/plugin.c:1886
#21 plugin_call_plugins (pb=pb@entry=0x7f34cfffeae0, whichfunction=whichfunction@entry=403) at ldap/servers/slapd/plugin.c:459
#22 0x00007f34f20b7836 in op_shared_search (pb=pb@entry=0x7f34cfffeae0, send_result=send_result@entry=1) at ldap/servers/slapd/opshared.c:581
#23 0x00007f34f25a307e in do_search (pb=pb@entry=0x7f34cfffeae0) at ldap/servers/slapd/search.c:378
#24 0x00007f34f2592599 in connection_dispatch_operation (pb=0x7f34cfffeae0, op=0x7f35004500e0, conn=0x7f34dcc52bf0) at ldap/servers/slapd/connection.c:684
#25 connection_threadmain () at ldap/servers/slapd/connection.c:2534
#26 0x00007f34f04ba9eb in _pt_root () from /lib64/libnspr4.so

	- the Filter checks by backend_search_cb is the same than the request
	  Is that usefull ?

	- The filter is checked enforcing access right (and so aci evaluation)
          Is that usefull ?

	- The aci is scanned for each of the 20000 entries.
          It appears that the 26 first times it is scanned, the scanning is immediate.
          Then it cost ~150ms.
	  At that time, others plugins do not interfere with the scanning.

	

	The next steps
		- get a finer debugging version (especially aroung 
	          slapi_vattr_filterxxx see https://bugzilla.redhat.com/show_bug.cgi?id=1019272#c21)
		- test with modified aci
Comment 34 thierry bordaz 2015-02-17 12:49:55 EST
schema-compat PRE-search callback tests if each entry of the group match a given filter (actually the same filter than the search req).

The test of the filter is done with access_right.
I do not know if it is required. In fact, if I change:
dn: cn=users,cn=Schema Compatibility,cn=plugins,cn=config
schema-compat-check-access: no

Then the etime of the request returns to 0:
18/Feb/2015:04:54:52 +051800] conn=30 op=5 SRCH base="dc=testrelm,dc=com" scope=2 filter="(memberOf=cn=newgrp,cn=groups,cn=accounts,dc=testrelm,dc=com)" attrs="member"
[18/Feb/2015:04:54:52 +051800] conn=30 op=5 RESULT err=0 tag=101 nentries=100 etime=0 notes=P

An additional next step, is to check if schema-compat-check-access could be set to 'no'.
Comment 35 thierry bordaz 2015-02-18 09:38:17 EST
(In reply to thierry bordaz from comment #34)
> schema-compat PRE-search callback tests if each entry of the group match a
> given filter (actually the same filter than the search req).
> 
> The test of the filter is done with access_right.
> I do not know if it is required. In fact, if I change:
> dn: cn=users,cn=Schema Compatibility,cn=plugins,cn=config
> schema-compat-check-access: no
> 
> Then the etime of the request returns to 0:
> 18/Feb/2015:04:54:52 +051800] conn=30 op=5 SRCH base="dc=testrelm,dc=com"
> scope=2
> filter="(memberOf=cn=newgrp,cn=groups,cn=accounts,dc=testrelm,dc=com)"
> attrs="member"
> [18/Feb/2015:04:54:52 +051800] conn=30 op=5 RESULT err=0 tag=101
> nentries=100 etime=0 notes=P
> 
> An additional next step, is to check if schema-compat-check-access could be
> set to 'no'.


My understanding is that check-access should be enforced and so that change is not valid.

In fact, pre-search callback tests entries against a filter and if they match the filter (slapi_filter_test) it will send back the entry (and the requested attributes) to the ldap client (slapi_send_ldap_search_entry).
If the search right is not granted for attributes in the filter, then the entry should be skipped. For example if 'memberof' is not granted, it should not return the entry/attributes even if search access was granted for 'member'.
Comment 40 thierry bordaz 2015-03-26 17:39:26 EDT
I reproduced the problem on an other VM.

The performance issue is fixed by an optimisation of the 'group-find' command:
"implementation tried to get all entries which are member of group.
That means also user. User can't have any members therefore this costly
processing was unnecessary."

I tested the fix: freeipa-pvoborni-0810-speed-up-indirect-member-processing.patch

With a group of 4000 users, the duration of the command dropped from 1m to less than 3s.
The same response time is measured with 8000 users.
Comment 41 Martin Kosek 2015-04-24 08:27:45 EDT
Ok, thanks for confirmation Thierry (Comment 40). I am thus adding respective Bugzilla as a blocker for this one.

This looks as an upstream ticket fixing it:
https://fedorahosted.org/freeipa/ticket/4965
Comment 42 Petr Vobornik 2015-05-28 09:31:17 EDT
ticket 4965 was handled in bug 1204637 therefore changing state to POST
Comment 44 Abhijeet Kasurde 2015-09-15 03:53:48 EDT
Verified fix using IPA server  :ipa-server-4.2.0-8.el7.x86_64
Marking bug as verified.
Comment 45 errata-xmlrpc 2015-11-19 07:00:24 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2362.html

Note You need to log in before you can comment on or make changes to this bug.