Bug 805920

Summary: [RFE] Introduce concept of Ghost User instead of using Fake User
Product: Red Hat Enterprise Linux 6 Reporter: Dmitri Pal <dpal>
Component: sssd Assignee: Jakub Hrozek <jhrozek>
Status: CLOSED ERRATA QA Contact: Kaushik Banerjee <kbanerje>
Severity: unspecified Docs Contact:
Priority: high    
Version: 6.3 CC: grajaiya, jgalipea, prc
Target Milestone: rc Keywords: FutureFeature
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: sssd-1.9.1-1.el6 Doc Type: Enhancement
Doc Text:
Do not document (internal task only)
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-02-21 09:21:53 UTC Type: ---
Bug Depends On:    
Bug Blocks: 782183, 840699    

Description Dmitri Pal 2012-03-22 13:04:59 UTC
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/sssd/ticket/1255

For lookup reasons we currently create fake user objects in order to complete operations like an 'id' command.
Fake users were introduced as a performance improvement to reduce the number of LDAP requests. Previously we resolved every user of every group we looked up, which in some pathological cases could make us download the whole database.

Although this change did indeed boost our performance, it can be improved further.
What we haven't properly taken into account is that by simply creating objects we are putting pressure on our local database.
Each user object involves many operations, including touching many indexes and memberof plugin operations.

The proposed solution is to not create fake users at all, and instead only add a ghost member attribute to the group. The attribute can be called 'ghost' and hold the usernames (just like memberuid) of users that are supposedly group members but haven't been fully resolved yet.
These ghost user names are derived from the DNs of the originalmemberof attribute, just as is done today when creating fake users.
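
A minimal sketch of the idea, in Python for brevity (SSSD itself is written in C). The dictionary layout and the helper names `ghost_name_from_dn` and `build_group_entry` are hypothetical, and the DN parsing is deliberately naive: it takes the value of the first RDN and ignores escaped commas and multi-valued RDNs.

```python
def ghost_name_from_dn(dn):
    """Return the value of the first RDN, e.g. 'uid=alice,...' -> 'alice'.

    Naive on purpose: escaped commas and multi-valued RDNs are not handled.
    """
    first_rdn = dn.split(",", 1)[0]
    _attr, _sep, value = first_rdn.partition("=")
    return value.strip()


def build_group_entry(name, member_dns):
    """Build a cached group entry whose unresolved members are only ghosts."""
    return {
        "name": name,
        "memberuid": [],  # fully resolved members (none yet)
        "ghost": [ghost_name_from_dn(dn) for dn in member_dns],
    }


group = build_group_entry(
    "ipausers",
    [
        "uid=alice,cn=users,dc=example,dc=com",
        "uid=bob,cn=users,dc=example,dc=com",
    ],
)
print(group["ghost"])  # ['alice', 'bob']
```

No user objects are created here; the group entry alone carries the names.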

Ghost lists will need to be updated (to remove the name) when the actual user
is looked up and saved in the cache. This prevents duplicates from showing up.

Ghosts are otherwise updated only when a group is explicitly looked up in
LDAP. We do not care if, in some cases, a parent group still shows a ghost that
has disappeared from a member group, because user memberships are relevant only
when an actual user is being evaluated in the system, and in that case the user
has to be stored in the cache.
After a user is added to the DB, the sysdb code should do an extra check,
searching for ghost=username, and remove any remaining mention of the user.
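
The cleanup step can be sketched as follows. This is an in-memory model, not the sysdb API: the `cache` layout and the `save_user` function are assumptions made for illustration. The final loop plays the role of the search for ghost=username.

```python
def save_user(cache, username, member_of):
    """Store a real user, record its memberships, then scrub its ghosts."""
    cache["users"][username] = {"name": username, "memberof": list(member_of)}

    # Record the user as a real member of the groups it belongs to now.
    for group_name in member_of:
        members = cache["groups"][group_name]["memberuid"]
        if username not in members:
            members.append(username)

    # Extra check: drop ghost=username from every cached group, including
    # groups the user no longer belongs to.
    for group in cache["groups"].values():
        group["ghost"] = [g for g in group["ghost"] if g != username]


cache = {
    "users": {},
    "groups": {
        "admins": {"memberuid": [], "ghost": ["alice", "bob"]},
        "staff": {"memberuid": [], "ghost": ["alice"]},
    },
}
# By the time alice is resolved she is only a member of 'admins'.
save_user(cache, "alice", ["admins"])
print(cache["groups"]["admins"])  # {'memberuid': ['alice'], 'ghost': ['bob']}
print(cache["groups"]["staff"])   # {'memberuid': [], 'ghost': []}
```

Because the scrub covers every group, the name never appears both as a member and as a ghost.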

Ghosts may be left behind if the user's membership changes between the time
the groups are looked up and the time the user is actually looked up, so,
albeit rarely, it is possible to have stale ghosts.

We also need to remove ghosts from non-stale groups. This can be done in two
ways: a) as part of the sysdb search, or b) by changing the memberof plugin to
remove values from ghost when it adds values to memberuid.

We should start by doing only a) and add b) only if combining the modifies
turns out to be an actual performance gain (b is more complex to handle and
touches an already complex plugin, so we should do it only as an additional
optimization step).
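
To make the difference between the two options concrete, here is a toy model that only counts write operations per group entry. The `Group` class and both functions are hypothetical; whether the saved modify matters in the real cache is exactly the open question above.

```python
class Group:
    def __init__(self, name, ghosts):
        self.name = name
        self.memberuid = set()
        self.ghost = set(ghosts)
        self.modifies = 0  # write operations applied to this entry

    def modify(self, add_member=None, del_ghost=None):
        """Apply one write; it may touch memberuid, ghost, or both."""
        if add_member is not None:
            self.memberuid.add(add_member)
        if del_ghost is not None:
            self.ghost.discard(del_ghost)
        self.modifies += 1


def option_a(group, username):
    """Membership update first, then a separate search-and-remove pass."""
    group.modify(add_member=username)
    if username in group.ghost:  # stands in for searching ghost=username
        group.modify(del_ghost=username)


def option_b(group, username):
    """One combined modify, as a changed memberof plugin might issue."""
    group.modify(add_member=username, del_ghost=username)


a = Group("ipausers", ["alice"])
b = Group("ipausers", ["alice"])
option_a(a, "alice")
option_b(b, "alice")
print(a.modifies, b.modifies)  # 2 1
```

Both variants end in the same state; only the number of writes differs.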

By using an attribute on the groups instead of creating actual objects, we
should be able to attain a significant performance benefit, at least in the
most pathological cases (like FreeIPA 2.0/2.1, where all users are part of the
'ipausers' group, so a simple 'id' command ends up creating one object for
every user in the IPA domain, which could be tens of thousands).
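
A back-of-the-envelope illustration of that case. The function name and the figure of 20,000 members are made up for the example, and it counts only cache entries written, not index or memberof plugin work.

```python
def entries_written(member_count, use_ghosts):
    """Cache entries written when one group with member_count members is stored."""
    group_entries = 1
    # Fake users cost one object per member; ghosts are attribute values only.
    user_entries = 0 if use_ghosts else member_count
    return group_entries + user_entries


members = 20_000  # hypothetical size of 'ipausers'
print(entries_written(members, use_ghosts=False))  # 20001
print(entries_written(members, use_ghosts=True))   # 1
```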

Comment 1 RHEL Program Management 2012-07-10 07:06:36 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 2 RHEL Program Management 2012-07-11 02:05:03 UTC
This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development.  This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

Comment 4 Kaushik Banerjee 2012-11-27 16:54:27 UTC
Verified with version 1.9.2-21

The introduction of ghost users has led to a significant performance improvement in group lookups with a large number of members.

Beaker automated performance run report for ghost users:
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: [   LOG    ] :: bz805920 Lookup a group with large no. of users
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: [   LOG    ] :: Sleeping for 5 seconds
:: [   PASS   ] :: Running '(time getent group bulkgroup1 ) > /tmp/output 2>&1'
:: [   PASS   ] :: rfc2307 Group bulkgroup1 with 1000 member users returned in less than 2 seconds
:: [   LOG    ] :: Sleeping for 5 seconds
:: [   PASS   ] :: Running '(time getent group bulkgroup2 ) > /tmp/output 2>&1'
:: [   PASS   ] :: rfc2307bis Group bulkgroup2 with 5000 member users returned in less than 8 seconds
:: [   PASS   ] :: Running '(time getent group bulkgroup3 ) > /tmp/output 2>&1'
:: [   PASS   ] :: rfc2307bis Group bulkgroup3 with 10000 Member users returned in less than 25 seconds
:: [   LOG    ] :: Duration: 38s
:: [   LOG    ] :: Assertions: 6 good, 0 bad
:: [   PASS   ] :: RESULT: bz805920 Lookup a group with large no. of users
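
A rough way to repeat this kind of measurement outside Beaker is sketched below, using Python's standard `grp` module in place of `getent`. The function name `timed_group_lookup` is an assumption, and this is not the test that produced the log above.

```python
import grp
import time


def timed_group_lookup(name, lookup=grp.getgrnam):
    """Return (member count, elapsed seconds) for one group lookup."""
    start = time.perf_counter()
    entry = lookup(name)  # raises KeyError if the group does not exist
    elapsed = time.perf_counter() - start
    return len(entry.gr_mem), elapsed
```

On a client where the group resolves through NSS, `timed_group_lookup("bulkgroup1")` would return the number of members and the lookup time. To measure a cold lookup, as the sleeps in the log suggest was intended, the cache would presumably have to be expired or cleared first.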

Comment 5 errata-xmlrpc 2013-02-21 09:21:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0508.html