Bug 742288

Summary: RFC2307bis initgroups calls are slow
Product: Red Hat Enterprise Linux 6 Reporter: Jakub Hrozek <jhrozek>
Component: sssd Assignee: Stephen Gallagher <sgallagh>
Status: CLOSED ERRATA QA Contact: IDM QE LIST <seceng-idm-qe-list>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.2 CC: grajaiya, jgalipea, jzeleny, kbanerje, prc
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: sssd-1.5.1-59.el6 Doc Type: Bug Fix
Doc Text:
Cause: SSSD stores all users and groups retrieved from the remote server in its local cache. When writing to this cache, transactions are used. When the RFC 2307bis schema was used, one transaction was used for each entity stored in the cache.
Consequence: The initgroups operation performed too many disk writes when the RFC2307bis schema was used, slowing the operation down.
Fix: All entities retrieved from the remote server are first stored in an internal hash table and then only a single transaction is used to store all the groups and their memberships.
Result: The initgroups operation is now faster, especially for users who are members of a large number of groups.
Story Points: ---
Clone Of:
: 748820 Environment:
Last Closed: 2011-12-06 16:40:37 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 748554, 748820    

Description Jakub Hrozek 2011-09-29 15:32:01 UTC
Description of problem:
Several users have reported recently that using RFC2307bis performs very slowly on initgroups() requests. One user stated that moving the cache to /dev/shm resulted in significant performance gains*.

While going through the RFC2307bis code to address ticket #868, I think I have identified the real problem here. We are performing far too many transactions when dealing with nested groups.

We have a transaction for saving the original user, then another to save the user's direct groups, then one transaction per group above the first level for saving nested groups. This results in a large number of probably unnecessary disk writes.

We need to save the group information we gather until the very end of processing and save it in a single transaction. This should provide a marked performance improvement. 
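
The difference between the two patterns can be sketched roughly as follows. This is only an illustration and not SSSD code: CountingCache, save_per_entity, save_batched and the membership mapping are hypothetical names, and the "cache" is an in-memory dictionary that merely counts commits instead of writing to disk.

```python
from contextlib import contextmanager


class CountingCache:
    """Hypothetical stand-in for the on-disk cache; it only counts commits."""

    def __init__(self):
        self.transactions = 0
        self.entries = {}

    @contextmanager
    def transaction(self):
        pending = {}
        yield pending
        # Commit point: a real cache would write to disk here.
        self.entries.update(pending)
        self.transactions += 1


def parents_of(group, membership):
    """Return the groups that `group` is directly a member of."""
    return membership.get(group, [])


def save_per_entity(cache, user, direct_groups, membership):
    """Pattern described above: one transaction for the user, one for the
    direct groups, then one per nested group above the first level."""
    with cache.transaction() as txn:
        txn[user] = "user"
    with cache.transaction() as txn:
        for group in direct_groups:
            txn[group] = "group"
    seen = set(direct_groups)
    queue = list(direct_groups)
    while queue:
        for parent in parents_of(queue.pop(), membership):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
                with cache.transaction() as txn:
                    txn[parent] = "group"


def save_batched(cache, user, direct_groups, membership):
    """Proposed pattern: gather everything in memory, then commit once."""
    gathered = {user: "user"}
    queue = list(direct_groups)
    while queue:
        group = queue.pop()
        if group in gathered:
            continue
        gathered[group] = "group"
        queue.extend(parents_of(group, membership))
    with cache.transaction() as txn:
        txn.update(gathered)
```

For a user with two direct groups and three nested groups above them, the first function commits five times and the second commits once, while both leave the same entries in the cache.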

Version-Release number of selected component (if applicable):
sssd-1.5.1-52.el6

How reproducible:
performance issue

Steps to Reproduce:
1. remove SSSD caches
2. time id <user>
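
Step 2 can also be scripted if repeated measurements are wanted. The helper below is a minimal sketch, assuming only that the standard id utility is on the PATH; time_command is a hypothetical name, and the script does not clear the SSSD cache itself, so step 1 still has to be done by hand before each run.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode, time.perf_counter() - start


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python time_id.py <user>
    user = sys.argv[1]
    code, elapsed = time_command(["id", user])
    print(f"id {user}: exit={code} elapsed={elapsed:.3f}s")
```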
  
Actual results:
The number of transactions is high, resulting in a performance bottleneck.

Expected results:
Fewer transactions and better performance

Comment 1 Stephen Gallagher 2011-09-30 11:56:49 UTC
Upstream ticket:
https://fedorahosted.org/sssd/ticket/1006

Comment 3 Kaushik Banerjee 2011-10-25 15:28:36 UTC
Verified in version:
# rpm -qi sssd | head
Name        : sssd                         Relocations: (not relocatable)
Version     : 1.5.1                             Vendor: Red Hat, Inc.
Release     : 60.el6                        Build Date: Tue 18 Oct 2011 10:44:48 PM IST
Install Date: Wed 19 Oct 2011 07:42:05 PM IST      Build Host: x86-003.build.bos.redhat.com
Group       : Applications/System           Source RPM: sssd-1.5.1-60.el6.src.rpm
Size        : 3615306                          License: GPLv3+
Signature   : (none)
Packager    : Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>
URL         : http://fedorahosted.org/sssd/
Summary     : System Security Services Daemon


Verified that there were no regressions in the initgroups test cases from automation test suites.

Also, an id lookup for a user (rfc2307bis) belonging to 100 groups shows:

# time id user_2307bis_multi_grp100
uid=40000(user_2307bis_multi_grp100) gid=40000(multi_grp100_2307bis_0) groups=40000(multi_grp100_2307bis_0),400018(multi_grp100_2307bis_18),40004(multi_grp100_2307bis_4),400088(multi_grp100_2307bis_88),40005(multi_grp100_2307bis_5),400089(multi_grp100_2307bis_89),40006(multi_grp100_2307bis_6),400028(multi_grp100_2307bis_28),40007(multi_grp100_2307bis_7),400029(multi_grp100_2307bis_29),400011(multi_grp100_2307bis_11),400026(multi_grp100_2307bis_26),400084(multi_grp100_2307bis_84),40001(multi_grp100_2307bis_1),400010(multi_grp100_2307bis_10),400027(multi_grp100_2307bis_27),400085(multi_grp100_2307bis_85),40002(multi_grp100_2307bis_2),400013(multi_grp100_2307bis_13),400024(multi_grp100_2307bis_24),400086(multi_grp100_2307bis_86),40003(multi_grp100_2307bis_3),400012(multi_grp100_2307bis_12),400025(multi_grp100_2307bis_25),400087(multi_grp100_2307bis_87),400015(multi_grp100_2307bis_15),400022(multi_grp100_2307bis_22),400080(multi_grp100_2307bis_80),400014(multi_grp100_2307bis_14),400023(multi_grp100_2307bis_23),400081(multi_grp100_2307bis_81),400017(multi_grp100_2307bis_17),400020(multi_grp100_2307bis_20),400082(multi_grp100_2307bis_82),400016(multi_grp100_2307bis_16),400021(multi_grp100_2307bis_21),400083(multi_grp100_2307bis_83),40008(multi_grp100_2307bis_8),400033(multi_grp100_2307bis_33),400040(multi_grp100_2307bis_40),400055(multi_grp100_2307bis_55),400091(multi_grp100_2307bis_91),40009(multi_grp100_2307bis_9),400032(multi_grp100_2307bis_32),400041(multi_grp100_2307bis_41),400054(multi_grp100_2307bis_54),400090(multi_grp100_2307bis_90),400031(multi_grp100_2307bis_31),400042(multi_grp100_2307bis_42),400057(multi_grp100_2307bis_57),400068(multi_grp100_2307bis_68),400093(multi_grp100_2307bis_93),400030(multi_grp100_2307bis_30),400043(multi_grp100_2307bis_43),400056(multi_grp100_2307bis_56),400069(multi_grp100_2307bis_69),400092(multi_grp100_2307bis_92),400037(multi_grp100_2307bis_37),400044(multi_grp100_2307bis_44),400051(multi_grp100_2307bis_51),400095(multi_grp100_23
07bis_95),400036(multi_grp100_2307bis_36),400045(multi_grp100_2307bis_45),400050(multi_grp100_2307bis_50),400094(multi_grp100_2307bis_94),4000100(multi_grp100_2307bis_100),400035(multi_grp100_2307bis_35),400046(multi_grp100_2307bis_46),400053(multi_grp100_2307bis_53),400079(multi_grp100_2307bis_79),400097(multi_grp100_2307bis_97),400034(multi_grp100_2307bis_34),400047(multi_grp100_2307bis_47),400052(multi_grp100_2307bis_52),400078(multi_grp100_2307bis_78),400096(multi_grp100_2307bis_96),400048(multi_grp100_2307bis_48),400062(multi_grp100_2307bis_62),400077(multi_grp100_2307bis_77),400099(multi_grp100_2307bis_99),400049(multi_grp100_2307bis_49),400063(multi_grp100_2307bis_63),400076(multi_grp100_2307bis_76),400098(multi_grp100_2307bis_98),400039(multi_grp100_2307bis_39),400060(multi_grp100_2307bis_60),400075(multi_grp100_2307bis_75),400038(multi_grp100_2307bis_38),400061(multi_grp100_2307bis_61),400074(multi_grp100_2307bis_74),400059(multi_grp100_2307bis_59),400066(multi_grp100_2307bis_66),400073(multi_grp100_2307bis_73),400058(multi_grp100_2307bis_58),400067(multi_grp100_2307bis_67),400072(multi_grp100_2307bis_72),400064(multi_grp100_2307bis_64),400071(multi_grp100_2307bis_71),400065(multi_grp100_2307bis_65),400070(multi_grp100_2307bis_70),400019(multi_grp100_2307bis_19)

real	2m22.961s
user	0m0.004s
sys	0m0.011s

Comment 4 Jan Zeleny 2011-10-27 12:04:06 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Cause: SSSD stores all users and groups retrieved from remote server in its local cache. When writing to this cache, transactions are used. In case RFC 2307bis schema was used, one transaction was used for each entity stored in the cache.
Consequence: Storing groups with many users took an extensive amount of time when using RFC 2307bis.
Fix: All entities retrieved from remote server are first stored in internal hash table and then only a few transactions are used to store everything, thus reducing complexity of the operation from linear to constant.
Result: Storing groups with many users is now faster.

Comment 5 Jakub Hrozek 2011-10-27 12:16:45 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,4 +1,4 @@
 Cause: SSSD stores all users and groups retrieved from remote server in its local cache. When writing to this cache, transactions are used. In case RFC 2307bis schema was used, one transaction was used for each entity stored in the cache.
-Consequence: Storing groups with many users took an extensive amount of time when using RFC 2307bis.
+Consequence: The initgroups operation performed too many disk writes when the RFC2307bis schema was used, slowing the operation down.
-Fix: All entities retrieved from remote server are first stored in internal hash table and then only a few transactions are used to store everything, thus reducing complexity of the operation from linear to constant.
+Fix: All entities retrieved from remote server are first stored in an internal hash table and then only a single  transaction is used to store all the groups and their memberships
-Result: Storing groups with many users is now faster.
+Result: The initgroups operation is now faster, especially for users who are members of large number of groups.
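
The final wording of the Fix entry (collect entities in a hash table, then store them in a single transaction) can be illustrated with a small analogy. SSSD's cache is not SQLite, so this is a sketch of the general technique using Python's sqlite3 module; the membership table, its columns and the store_groups function are assumptions made up for the example.

```python
import sqlite3


def store_groups(conn, memberships):
    """Store (group, member) pairs gathered in memory in one transaction."""
    # Hash table keyed by group name: a pair seen twice is kept once.
    table = {}
    for group, member in memberships:
        table.setdefault(group, set()).add(member)

    rows = [(g, m) for g, members in table.items() for m in sorted(members)]
    # The connection context manager commits once on success and
    # rolls back if an exception is raised.
    with conn:
        conn.executemany(
            "INSERT INTO membership (grp, member) VALUES (?, ?)", rows
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE membership (grp TEXT, member TEXT, UNIQUE (grp, member))"
)
```

Gathering into the dictionary first also means a group reached through more than one nesting path is written only once, which is an inference about why a hash table suits this job rather than something stated in the technical note.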

Comment 6 errata-xmlrpc 2011-12-06 16:40:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1529.html