| Summary: | With 20000+ users, adding a user to a group intermittently throws Internal server error | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Namita Soman <nsoman> | ||||||
| Component: | ipa | Assignee: | Jan Cholasta <jcholast> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | Namita Soman <nsoman> | ||||||
| Severity: | unspecified | Docs Contact: | |||||||
| Priority: | medium | ||||||||
| Version: | 7.0 | CC: | abokovoy, akasurde, ldelouw, lkrispen, mkosek, mvarun, pvoborni, rcritten, tbordaz, tscherf | ||||||
| Target Milestone: | rc | ||||||||
| Target Release: | 7.1 | ||||||||
| Hardware: | Unspecified | ||||||||
| OS: | Unspecified | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | ipa-4.2.0-0.1.alpha1.el7 | Doc Type: | Bug Fix | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2015-11-19 12:00:24 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Bug Depends On: | 1204637 | ||||||||
| Bug Blocks: | |||||||||
| Attachments: |
|
||||||||
|
Description
Namita Soman
2013-10-15 11:55:58 UTC
Upstream ticket: https://fedorahosted.org/freeipa/ticket/3971 Fixed upstream master: https://fedorahosted.org/freeipa/changeset/4050e553c30d8c8d93c7045f871f8c1cef65aa71/ Created attachment 980551 [details]
IPA Error 907: Network Error
issue is still reproducible. Hence, reopening the bug.
IPA Version:
ipa-server-4.1.0-15.el7.x86_64
1. Added 20000 users
2. In WebUI, click on group to edit it
3. Then Error pops up
Jan, Petr - I am not quite confused with the referred commit that is supposed to be the fix (Comment 3). Can you please check if the issue is fixed or not? Sorry, I only mentioned the top commit. All commits for this issue are: https://fedorahosted.org/freeipa/changeset/73df6150e52012ae427d8911fb8e31739c3379ce https://fedorahosted.org/freeipa/changeset/8013056194755d44105b512960951d723cc706d7 https://fedorahosted.org/freeipa/changeset/a9bf18ab9554cb541f6f5f31c6860a3610df5e13 https://fedorahosted.org/freeipa/changeset/97445b2df3ce121ebb76ccf36b3781f93b6d73c2 https://fedorahosted.org/freeipa/changeset/4050e553c30d8c8d93c7045f871f8c1cef65aa71 Disabling the compat plugin fixes this. It seems it can't handle such a large number of users gracefully. Thanks for investigation. It seems that the root cause is that DS/Schema Compatibility plugin starts really slow. I think the next steps are: - Jan to create a slapi-nis bug to investigate/fix how Schema Compatiblity Plugin performs/blocks DS with high number of users - Varun to see if the error goes away after Schema Compatibility plugin is fully initialized (it may take a while) Created attachment 982337 [details]
DS errors log
Attaching an excerpt from /var/log/dirsrv/slapd-TESTRELM-COM capturing execution of "ipa group-show newgrp" with nsslapd-errorlog-level of 81920.
The searches pointed in https://bugzilla.redhat.com/show_bug.cgi?id=1019272#c15 are very slow (5 sec). If the 'cn=changelog' search is needed, then it needs to be indexed (if possible). Can I get the dse.ldif ? Now something I do not understand is why the base search 'anonymous-limit' is so slow. It takes 5sec for a base search that seems too high. [22/Jan/2015:00:48:07 +051800] schema-compat-plugin - searching from "cn=anonymous-limits,cn=etc,dc=testrelm,dc=com" for "(|(objectclass=*)(objectclass=ldapsubentry))" with scope 0 (base) [22/Jan/2015:00:48:12 +051800] schema-compat-plugin - searching from "cn=config" for "(|(objectclass=*)(objectclass=ldapsubentry))" with scope 0 (base) Would it be possible to enable internal op logging ('nsslapd-accesslog-level: 260') and redo the test. I would be interested by access/errors logs, dse.ldif, as well as some pstacks taken when the problem occurs (add user to the group). thanks thierry When running the command 'ipa group-show newgrp' there is a slow SRCH request [29/Jan/2015:04:35:56 +051800] conn=770 op=0 BIND dn="" method=sasl version=3 mech=GSSAPI [29/Jan/2015:04:35:56 +051800] conn=770 op=2 RESULT err=0 tag=97 nentries=0 etime=0 dn="uid=admin,cn=users,cn=accounts,dc=testrelm,dc=com" [29/Jan/2015:04:35:56 +051800] conn=770 op=5 SRCH base="dc=testrelm,dc=com" scope=2 filter="(memberOf=cn=newgrp,cn=groups,cn=accounts,dc=testrelm,dc=com)" attrs="member" [29/Jan/2015:04:35:59 +051800] conn=770 op=5 RESULT err=3 tag=101 nentries=0 etime=3 notes=P This request is processing one or more long aci Thread 27 (Thread 0x7f9d487f8700 (LWP 2264)): #0 0x00007f9d75a93772 in slapi_attr_basetype () from /usr/lib64/dirsrv/libslapd.so.0 #1 0x00007f9d75b1eca3 in vattr_map_lookup () from /usr/lib64/dirsrv/libslapd.so.0 #2 0x00007f9d75b1fc74 in vattr_map_namespace_sp_getlist () from /usr/lib64/dirsrv/libslapd.so.0 #3 0x00007f9d75b1fd9e in vattr_test_filter () from /usr/lib64/dirsrv/libslapd.so.0 #4 0x00007f9d75ab704f in slapi_vattr_filter_test_ext_internal () from /usr/lib64/dirsrv/libslapd.so.0 #5 0x00007f9d75ab804a in slapi_vattr_filter_test_ext () from /usr/lib64/dirsrv/libslapd.so.0 #6 0x00007f9d6c91214f in acl__resource_match_aci () from /usr/lib64/dirsrv/plugins/libacl-plugin.so #7 0x00007f9d6c9134e0 in acl_access_allowed () from /usr/lib64/dirsrv/plugins/libacl-plugin.so #8 0x00007f9d6c926ba7 in acl_access_allowed_main () from /usr/lib64/dirsrv/plugins/libacl-plugin.so #9 0x00007f9d75aec86c in plugin_call_acl_plugin () from /usr/lib64/dirsrv/libslapd.so.0 #10 0x00007f9d75ab6a06 in test_ava_filter () from /usr/lib64/dirsrv/libslapd.so.0 #11 0x00007f9d75ab7a8f in slapi_filter_test_ext_internal () from /usr/lib64/dirsrv/libslapd.so.0 #12 0x00007f9d75ab7f7a in slapi_filter_test_ext () from /usr/lib64/dirsrv/libslapd.so.0 #13 0x00007f9d67bf2f84 in backend_search_entry_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so #14 0x00007f9d67c06113 in map_data_foreach_entry_id () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so #15 0x00007f9d67bf54db in backend_search_set_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so #16 0x00007f9d67c0624f in map_data_foreach_map () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so #17 0x00007f9d67bf5216 in backend_search_group_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so #18 0x00007f9d67c06190 in map_data_foreach_domain () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so #19 0x00007f9d67bf4cf4 in backend_search_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so #20 0x00007f9d75ae9db0 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0 #21 0x00007f9d75aea008 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0 #22 0x00007f9d75add7e6 in op_shared_search () from /usr/lib64/dirsrv/libslapd.so.0 #23 0x00007f9d75fc808e in do_search () #24 0x00007f9d75fb7599 in connection_threadmain () #25 0x00007f9d73ee09eb in _pt_root () from /lib64/libnspr4.so #26 0x00007f9d73881df3 in start_thread () from /lib64/libpthread.so.0 #27 0x00007f9d735af05d in clone () from /lib64/libc.so.6 An other thread (BIND) may appear (not systematic) during this SRCH. This BIND is likely waiting for the search to release the lock: Thread 22 (Thread 0x7f9d45ff3700 (LWP 2269)): #0 0x00007f9d7388506e in pthread_rwlock_wrlock () from /lib64/libpthread.so.0 #1 0x00007f9d67bf7ba3 in backend_shr_add_cb.part.13 () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so #2 0x00007f9d67bf7d51 in backend_shr_betxn_post_add_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so #3 0x00007f9d75ae9db0 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0 #4 0x00007f9d75aea008 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0 #5 0x00007f9d6999b0e4 in ldbm_back_add () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #6 0x00007f9d75a90d8a in op_shared_add () from /usr/lib64/dirsrv/libslapd.so.0 #7 0x00007f9d75a915e3 in add_internal_pb () from /usr/lib64/dirsrv/libslapd.so.0 #8 0x00007f9d6822732d in retrocl_postob () from /usr/lib64/dirsrv/plugins/libretrocl-plugin.so #9 0x00007f9d75ae9db0 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0 #10 0x00007f9d75aea008 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0 #11 0x00007f9d699b4d19 in ldbm_back_modify () from /usr/lib64/dirsrv/plugins/libback-ldbm.so #12 0x00007f9d75ad6f51 in op_shared_modify () from /usr/lib64/dirsrv/libslapd.so.0 #13 0x00007f9d75ad7a14 in modify_internal_pb () from /usr/lib64/dirsrv/libslapd.so.0 #14 0x00007f9d6b056bd3 in ipalockout_postop () from /usr/lib64/dirsrv/plugins/libipa_lockout.so #15 0x00007f9d75ae9db0 in plugin_call_func () from /usr/lib64/dirsrv/libslapd.so.0 #16 0x00007f9d75aea008 in plugin_call_plugins () from /usr/lib64/dirsrv/libslapd.so.0 #17 0x00007f9d75fb0a0f in do_bind () #18 0x00007f9d75fb744f in connection_threadmain () #19 0x00007f9d73ee09eb in _pt_root () from /lib64/libnspr4.so #20 0x00007f9d73881df3 in start_thread () from /lib64/libpthread.so.0 #21 0x00007f9d735af05d in clone () from /lib64/libc.so.6 Increasing the errorlog level changes the dynamic. Enabling plugin logging may hide the issue. Next step is to identify which ACIs are killing the performance. The most direct reason of the performance hit, is the time spent in ACI for each individual entry. Of course the total number of entries to evaluate increases the performance hit.
Only one ACI is found to be evaluated. It grants the right to read the entry:
aci: (targetattr="ipaUniqueId || memberOf || enrolledBy || krbExtraData || krb
PrincipalName || krbCanonicalName || krbPasswordExpiration || krbLastPwdChang
e || krbLastSuccessfulAuth || krbLastFailedAuth")(version 3.0; acl "Admin rea
d-only attributes"; allow (read, search, compare) groupdn = "ldap:///cn=admin
s,cn=groups,cn=accounts,dc=testrelm,dc=com";)
it takes (3sec) to get the right of one single entry (here cn=newauser19965)
[30/Jan/2015:00:35:09 +051800] NSACLPlugin - #### conn=1361 op=6 binddn="uid=admin,cn=users,cn=accounts,dc=testrelm,dc=com"
[30/Jan/2015:00:35:09 +051800] NSACLPlugin - Searching AVL tree for update:cn=newauser19965,cn=groups,cn=compat,dc=testrelm,dc=com: container:-1
[30/Jan/2015:00:35:09 +051800] NSACLPlugin - Searching AVL tree for update:cn=groups,cn=compat,dc=testrelm,dc=com: container:-1
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Searching AVL tree for update:cn=compat,dc=testrelm,dc=com: container:-1
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Searching AVL tree for update:dc=testrelm,dc=com: container:1
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Searching AVL tree for update:dc=com: container:-1
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - ************ RESOURCE INFO STARTS *********
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Client DN: uid=admin,cn=users,cn=accounts,dc=testrelm,dc=com
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - resource type:256(search target_DN )
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Slapi_Entry DN: cn=newauser19965,cn=groups,cn=compat,dc=testrelm,dc=com
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - ATTR: memberOf
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - rights:search
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - ************ RESOURCE INFO ENDS *********
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Using ACL Container:0 for evaluation
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - ***BEGIN ACL INFO[ Name: "Admin read-only attributes"]***
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - ACL Index:22 ACL_ELEVEL:6
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - ACI type:(compare search read target_attr acltxt allow_rule )
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - ACI RULE type:(groupdn )
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Slapi_Entry DN:dc=testrelm,dc=com
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - ***END ACL INFO*****************************
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Num of ALLOW Handles:1, DENY handles:0
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Processed attr:memberOf for entry:cn=newauser19965,cn=groups,cn=compat,dc=testrelm,dc=com
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - 1. Evaluating ALLOW aci(22) " "Admin read-only attributes""
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - Found SEARCH ALLOW in cache
[30/Jan/2015:00:35:10 +051800] NSACLPlugin - conn=1361 op=6 (main): Allow search on entry(cn=newauser19965,cn=groups,cn=compat,dc=testrelm,dc=com).attr(memberOf) to uid=admin,cn=users,cn=accounts,dc=testrelm,dc=com: cached allow by aci(22)
[30/Jan/2015:00:35:11 +051800] NSACLPlugin - #### conn=1361 op=6 binddn="uid=admin,cn=users,cn=accounts,dc=testrelm,dc=com"
According to the pstack, most of this time was spent to identify that ACI. Its evaluation looks a priori rapid.
To get this results I needed to:
- change DS timelimit
- ipaSearchTimelimit (cn=ipaconfig,cn=etc..)
- update /usr/lib/python2.7/site-packages/ipaserver/plugins/ldap2.py to increase the time_limit in get_ipa_config:find_entries
The next step would be to understand why it is so costly to find this aci.
This is strange. One of the goals of the PermissionV2 refactoring taken in FreeIPA 4.0 was to improve performance by moving ACIs out of SUFFIX to separate containers. So the number of ACIs on entries on entries cn=newauser19965 -> cn=groups -> cn=compat -> dc=testrelm,dc=com should be lower than in FreeIPA < 4.0. That is correct, the number of selected low. Actually to evaluate the right to access this entry, only one ACI was selected (Admin read-only attributes). It seems it is evaluated very fast. The problem, is that it takes a long time to select that ACI. The selection is looking through the DIT which ACI are in the scope, are related to the appropriate target (attributes, entries...), are granting the correct right. All this select take 1-3 seconds. Apparently it spends time into evaluating a filter. I doubt that this ACI is guilty. It is just the selected one. But it scans others ACIs. One/more of them may have a costly filter. I have not identified which of them are responsible of this. Ok, thanks for info. It would be nice to pin point the faulty ACI as I am afraid it may affect performance - like as in this Bug. A little bit of grepping shows that there is an ACI with dnahostname in targetfilter and an ACI with ipaconfigstring in targetfilter, neither of which is indexed. As far as I know (Thierry or Ludwig may correct me), indices does not matter for ACI evaluation, given DS always evaluates ACI and the target object itself, having access to it's attribute. I built a debug version but that was not precise enough to determine if some aci are costly to match the resource. I need to redo some measure. Varun, please could you keep the host for me until end of next week. Currently the request is scanning ~40 ACIs for each individual entry (20000) in the group. The overall time it takes may be due to the number of ACI to scan (800K). An other possibility is that some specific ACI are expensive to scan that is why I need more investigation. Sure Thierry. Surprisingly the ACI that is the most costly is
aci: (targetattr="ipaUniqueId || memberOf || enrolledBy || krbExtraData || krb
PrincipalName || krbCanonicalName || krbPasswordExpiration || krbLastPwdChang
e || krbLastSuccessfulAuth || krbLastFailedAuth")(version 3.0; acl "Admin rea
d-only attributes"; allow (read, search, compare) groupdn = "ldap:///cn=admin
s,cn=groups,cn=accounts,dc=testrelm,dc=com";)
Here are the measure of aci consumption (scan) after having evaluated the first ~5000 entries (from the group) out of 20000.
evaluation witout delay is the number of time the aci was scan in 0ms
evaluation with delay considers each time the aci was scan in more than 0ms
- total: is the cumul of the scan duration for this aci
- nb: is the number of time this aci was scan
- avg: is the average duration of the scan (total/nb)
- [x..y] is the range of duration of the scan (min/max)
So we can see that "Admin read-only attributes" is scanned for each entry (nb=4983). Its scanning almost always costs more than 150ms and can cost up to 12s !!!!. The only scanning of this aci cost 1758sec. The overall duration of the samples was 30000sec (8h20).
So the problem with this ACI is that it is systematically scanned and each individual scan takes too much time.
aci: "Admin read-only attributes"
evaluation: without delay: 26 , with delay avg 352 [149..12840] - total 1758183 - nb 4983
aci: "permission:System: Read Global Configuration"
evaluation: without delay: - , with delay avg 255 [200..616] - total 5358 - nb 21
aci: "Admins can change GUI config"
evaluation: without delay: - , with delay avg 247 [200..417] - total 5204 - nb 21
aci: "permission:System: Read Groups"
evaluation: without delay: 1 , with delay avg 210 [200..226] - total 843 - nb 4
aci: "anonymous, no acis"
evaluation: without delay: - , with delay avg 175 [167..192] - total 701 - nb 4
aci: "permission:System: Read Group Membership"
evaluation: without delay: 3 , with delay avg 100 [1..200] - total 201 - nb 2
aci: "Admin can manage any entry"
evaluation: without delay: 4914 , with delay avg 68 [1..409] - total 6514 - nb 95
aci: "Anonymous read access to containers"
evaluation: without delay: 4755 , with delay avg 3 [1..225] - total 893 - nb 254
aci: "Users/managers can read basic token info"
evaluation: without delay: 4776 , with delay avg 1 [1..1] - total 233 - nb 233
aci: "permission:System: Read Group Compat Tree"
evaluation: without delay: 4929 , with delay avg 1 [1..1] - total 80 - nb 80
aci: "permission:System: Read Sudoers compat tree"
evaluation: without delay: 4940 , with delay avg 1 [1..1] - total 69 - nb 69
aci: "selfservice:User Self service"
evaluation: without delay: 5001 , with delay avg 1 [1..1] - total 8 - nb 8
aci: "permission:System: Read Group Views Compat Tree"
evaluation: without delay: 4789 , with delay avg 1 [1..3] - total 222 - nb 220
aci: "permission:System: Read User Compat Tree"
evaluation: without delay: 4953 , with delay avg 1 [1..1] - total 56 - nb 56
aci: "Users can create self-managed tokens"
evaluation: without delay: 5006 , with delay avg 1 [1..1] - total 3 - nb 3
aci: "Users/managers can see TOTP details"
evaluation: without delay: 4758 , with delay avg 1 [1..1] - total 251 - nb 251
aci: "permission:Retrieve Certificates from the CA"
evaluation: without delay: 5003 , with delay avg 1 [1..1] - total 6 - nb 6
aci: "permission:System: Write DNS Configuration"
evaluation: without delay: 4999 , with delay avg 1 [1..1] - total 10 - nb 10
aci: "Samba system principals can read and write NT passwords"
evaluation: without delay: 4979 , with delay avg 1 [1..1] - total 30 - nb 30
aci: "permission:System: Manage DNSSEC keys"
evaluation: without delay: 4944 , with delay avg 1 [1..1] - total 65 - nb 65
aci: "Users/managers can see HOTP details"
evaluation: without delay: 4746 , with delay avg 1 [1..1] - total 263 - nb 263
aci: "permission:System: Remove DNS Entries"
evaluation: without delay: 4999 , with delay avg 1 [1..1] - total 10 - nb 10
aci: "permission:Request Certificates from a different host"
evaluation: without delay: 5004 , with delay avg 1 [1..1] - total 5 - nb 5
aci: "Managers can write basic token info"
evaluation: without delay: 5005 , with delay avg 1 [1..1] - total 4 - nb 4
aci: "Managers can delete tokens"
evaluation: without delay: 5000 , with delay avg 1 [1..1] - total 9 - nb 9
aci: "permission:System: Read DNS Configuration"
evaluation: without delay: 4949 , with delay avg 1 [1..1] - total 60 - nb 60
aci: "Admins can write passwords"
evaluation: without delay: 4993 , with delay avg 1 [1..1] - total 16 - nb 16
aci: "Anonymous read access to DIT root"
evaluation: without delay: 4751 , with delay avg 1 [1..1] - total 258 - nb 258
aci: "selfservice:Users can manage their own SSH public keys"
evaluation: without delay: 5004 , with delay avg 1 [1..1] - total 5 - nb 5
aci: "permission:Certificate Remove Hold"
evaluation: without delay: 5001 , with delay avg 1 [1..1] - total 8 - nb 8
aci: "permission:Write IPA Configuration"
evaluation: without delay: 5006 , with delay avg 1 [1..1] - total 3 - nb 3
aci: "Admins can write password policies"
evaluation: without delay: 4765 , with delay avg 1 [1..1] - total 244 - nb 244
aci: "permission:Revoke Certificate"
evaluation: without delay: 5002 , with delay avg 1 [1..1] - total 7 - nb 7
aci: "permission:System: Read DNS Entries"
evaluation: without delay: 4808 , with delay avg 1 [1..4] - total 204 - nb 201
aci: "permission:System: Read DNSSEC metadata"
evaluation: without delay: 4957 , with delay avg 1 [1..1] - total 52 - nb 52
aci: "permission:Request Certificate"
evaluation: without delay: 4999 , with delay avg 1 [1..1] - total 10 - nb 10
aci: "permission:System: Read Netgroup Compat Tree"
evaluation: without delay: 4939 , with delay avg 1 [1..1] - total 70 - nb 70
aci: "permission:Request Certificate with SubjectAltName"
evaluation: without delay: 5002 , with delay avg 1 [1..1] - total 7 - nb 7
aci: "permission:System: Read ACIs"
evaluation: without delay: 4976 , with delay avg 1 [1..1] - total 33 - nb 33
aci: "permission:Get Certificates status from the CA"
evaluation: without delay: 5001 , with delay avg 1 [1..1] - total 8 - nb 8
aci: "permission:System: Add DNS Entries"
evaluation: without delay: 4999 , with delay avg 1 [1..1] - total 10 - nb 10
aci: "permission:System: Update DNS Entries"
evaluation: without delay: 4999 , with delay avg 1 [1..1] - total 10 - nb 10
aci: "permission:System: Read User Views Compat Tree"
evaluation: without delay: 4803 , with delay avg 1 [1..1] - total 206 - nb 206
aci: "permission:System: Manage DNSSEC metadata"
evaluation: without delay: 4953 , with delay avg 1 [1..1] - total 56 - nb 56
aci: "permission:System: Compat Tree ID View targets"
evaluation: without delay: 4636 , with delay avg 1 [1..2] - total 374 - nb 373
aci: "selfservice:Self can write own password"
evaluation: without delay: 4998 , with delay avg 1 [1..1] - total 11 - nb 11
aci: "permission:System: Read Host Compat Tree"
evaluation: without delay: 4950 , with delay avg 1 [1..1] - total 59 - nb 59
Next steps would be to do the test without that ACI and without some targetattr attribute (especially those that can be virtual attribute).
- where are we spending time
With the instrumented version , the pstacks are quite different but still that
the PRE-SEARCH callback backend_search_cb triggers an aci evaluation
for each entry of the group.
#0 0x00007f34efe6287d in fsync () from /lib64/libpthread.so.0
#1 0x00007f34f04b72e3 in pt_Fsync () from /lib64/libnspr4.so
#2 0x00007f34f20a3408 in vslapd_log_error (fp=0x7f34f36ca3f0, subsystem=subsystem@entry=0x7f34e8f072d9 "NSACLPlugin",
fmt=fmt@entry=0x7f34e8f03b88 "scaning (%lx) duration %s: %d\n", ap=ap@entry=0x7f34cfff7970, locked=locked@entry=1) at ldap/servers/slapd/log.c:1953
#3 0x00007f34f20a85fd in slapd_log_error_proc_internal (subsystem=0x7f34e8f072d9 "NSACLPlugin", fmt=0x7f34e8f03b88 "scaning (%lx) duration %s: %d\n",
ap_err=ap_err@entry=0x7f34cfff7958, ap_file=ap_file@entry=0x7f34cfff7970) at ldap/servers/slapd/log.c:1809
#4 0x00007f34f20a8c40 in slapi_log_error (severity=severity@entry=8, subsystem=<optimized out>, fmt=fmt@entry=0x7f34e8f03b88 "scaning (%lx) duration %s: %d\n")
at ldap/servers/slapd/log.c:1994
#5 0x00007f34e8eed681 in acl__scan_for_acis (err=<synthetic pointer>, aclpb=0x7f34f9d922e0) at ldap/servers/plugins/acl/acl.c:2029
#6 acl_access_allowed (pb=<optimized out>, e=e@entry=0x7f34f9b4e830, attr=attr@entry=0x7f34b8023a20 "memberOf", val=<optimized out>, access=access@entry=2)
at ldap/servers/plugins/acl/acl.c:586
#7 0x00007f34e8f00c67 in acl_access_allowed_main (pb=<optimized out>, e=0x7f34f9b4e830, attrs=<optimized out>, val=<optimized out>, access=2,
flags=<optimized out>, errbuf=0x0) at ldap/servers/plugins/acl/aclplugin.c:392
#8 0x00007f34f20c68cc in plugin_call_acl_plugin (pb=pb@entry=0x7f34cfffeae0, e=e@entry=0x7f34f9b4e830, attrs=attrs@entry=0x7f34cfff7c10,
val=val@entry=0x7f34b8003b98, access=access@entry=2, flags=flags@entry=0, errbuf=errbuf@entry=0x0) at ldap/servers/slapd/plugin_acl.c:90
#9 0x00007f34f2090a36 in test_ava_filter (pb=0x7f34cfffeae0, e=e@entry=0x7f34f9b4e830, a=0x7f34f9bf1c60, ava=0x7f34b8003b90, ftype=163, verify_access=1,
only_check_access=0, access_check_done=0x7f34cfff7cd4) at ldap/servers/slapd/filterentry.c:330
#10 0x00007f34f2091abf in slapi_filter_test_ext_internal (pb=<optimized out>, e=0x7f34f9b4e830, f=0x7f34b8003b70, verify_access=1, only_check_access=0,
access_check_done=access_check_done@entry=0x7f34cfff7cd4) at ldap/servers/slapd/filterentry.c:160
#11 0x00007f34f2091faa in slapi_filter_test_ext (pb=0x7f34cfffeae0, e=0x7f34f9b4e830, f=0x7f34b8003b70, verify_access=1, only_check_access=<optimized out>)
at ldap/servers/slapd/filterentry.c:274
#12 0x00007f34e41cbf84 in backend_search_entry_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#13 0x00007f34e41df113 in map_data_foreach_entry_id () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#14 0x00007f34e41ce4db in backend_search_set_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#15 0x00007f34e41df24f in map_data_foreach_map () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#16 0x00007f34e41ce216 in backend_search_group_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#17 0x00007f34e41df190 in map_data_foreach_domain () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#18 0x00007f34e41cdcf4 in backend_search_cb () from /usr/lib64/dirsrv/plugins/schemacompat-plugin.so
#19 0x00007f34f20c3e10 in plugin_call_func (list=0x7f34f3950df0, operation=operation@entry=403, pb=pb@entry=0x7f34cfffeae0, call_one=call_one@entry=0)
at ldap/servers/slapd/plugin.c:1952
#20 0x00007f34f20c4068 in plugin_call_list (pb=0x7f34cfffeae0, operation=403, list=<optimized out>) at ldap/servers/slapd/plugin.c:1886
#21 plugin_call_plugins (pb=pb@entry=0x7f34cfffeae0, whichfunction=whichfunction@entry=403) at ldap/servers/slapd/plugin.c:459
#22 0x00007f34f20b7836 in op_shared_search (pb=pb@entry=0x7f34cfffeae0, send_result=send_result@entry=1) at ldap/servers/slapd/opshared.c:581
#23 0x00007f34f25a307e in do_search (pb=pb@entry=0x7f34cfffeae0) at ldap/servers/slapd/search.c:378
#24 0x00007f34f2592599 in connection_dispatch_operation (pb=0x7f34cfffeae0, op=0x7f35004500e0, conn=0x7f34dcc52bf0) at ldap/servers/slapd/connection.c:684
#25 connection_threadmain () at ldap/servers/slapd/connection.c:2534
#26 0x00007f34f04ba9eb in _pt_root () from /lib64/libnspr4.so
- the Filter checks by backend_search_cb is the same than the request
Is that usefull ?
- The filter is checked enforcing access right (and so aci evaluation)
Is that usefull ?
- The aci is scanned for each of the 20000 entries.
It appears that the 26 first times it is scanned, the scanning is immediate.
Then it cost ~150ms.
At that time, others plugins do not interfere with the scanning.
The next steps
- get a finer debugging version (especially aroung
slapi_vattr_filterxxx see https://bugzilla.redhat.com/show_bug.cgi?id=1019272#c21)
- test with modified aci
schema-compat PRE-search callback tests if each entry of the group match a given filter (actually the same filter than the search req). The test of the filter is done with access_right. I do not know if it is required. In fact, if I change: dn: cn=users,cn=Schema Compatibility,cn=plugins,cn=config schema-compat-check-access: no Then the etime of the request returns to 0: 18/Feb/2015:04:54:52 +051800] conn=30 op=5 SRCH base="dc=testrelm,dc=com" scope=2 filter="(memberOf=cn=newgrp,cn=groups,cn=accounts,dc=testrelm,dc=com)" attrs="member" [18/Feb/2015:04:54:52 +051800] conn=30 op=5 RESULT err=0 tag=101 nentries=100 etime=0 notes=P An additional next step, is to check if schema-compat-check-access could be set to 'no'. (In reply to thierry bordaz from comment #34) > schema-compat PRE-search callback tests if each entry of the group match a > given filter (actually the same filter than the search req). > > The test of the filter is done with access_right. > I do not know if it is required. In fact, if I change: > dn: cn=users,cn=Schema Compatibility,cn=plugins,cn=config > schema-compat-check-access: no > > Then the etime of the request returns to 0: > 18/Feb/2015:04:54:52 +051800] conn=30 op=5 SRCH base="dc=testrelm,dc=com" > scope=2 > filter="(memberOf=cn=newgrp,cn=groups,cn=accounts,dc=testrelm,dc=com)" > attrs="member" > [18/Feb/2015:04:54:52 +051800] conn=30 op=5 RESULT err=0 tag=101 > nentries=100 etime=0 notes=P > > An additional next step, is to check if schema-compat-check-access could be > set to 'no'. My understanding is that check-access should be enforced and so that change is not valid. In fact, pre-search callback tests entries against a filter and if they match the filter (slapi_filter_test) it will send back the entry (and the requested attributes) to the ldap client (slapi_send_ldap_search_entry). If the search right is not granted for attributes in the filter, then the entry should be skipped. For example if 'memberof' is not granted, it should not return the entry/attributes even if search access was granted for 'member'. I reproduced the problem on an other VM. The performance issue is fixed by an optimisation of the 'group-find' command: "implementation tried to get all entries which are member of group. That means also user. User can't have any members therefore this costly processing was unnecessary." I tested the fix: freeipa-pvoborni-0810-speed-up-indirect-member-processing.patch With a group of 4000 users, the duration of the command dropped from 1m to less than 3s. The same response time is measured with 8000 users. Ok, thanks for confirmation Thierry (Comment 40). I am thus adding respective Bugzilla as a blocker for this one. This looks as an upstream ticket fixing it: https://fedorahosted.org/freeipa/ticket/4965 ticket 4965 was handled in bug 1204637 therefore changing state to POST Verified fix using IPA server :ipa-server-4.2.0-8.el7.x86_64 Marking bug as verified. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-2362.html |