Bug 1377112 - Inconsistency in invalidating sysdb user record after running id and getent command
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: sssd
Version: 7.3
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Assignee: SSSD Maintainers
QA Contact: Steeve Goveas
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-09-18 15:30 UTC by Amith
Modified: 2016-09-20 22:41 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-09-20 19:41:03 UTC
Target Upstream Version:



Description Amith 2016-09-18 15:30:42 UTC
Description of problem:
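Invalidating a user record in the sysdb cache with sss_cache behaves differently depending on which command populated the cache. If the user was first looked up with the id command, dataExpireTimestamp in the sysdb cache is never updated by the invalidation. If the user was first looked up with the getent command, invalidation works fine and dataExpireTimestamp is set to 1.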

Version-Release number of selected component (if applicable):
sssd-1.14.0-42.el7.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Setup sssd.conf as follows: 
[sssd]
config_file_version = 2
services = nss, pam
domains = LDAP
debug_level = 6

[domain/LDAP]
debug_level = 0xFFF0
id_provider = ldap
ldap_uri = ldap://hubcap.lab.eng.pnq.redhat.com
ldap_tls_cacert = /etc/openldap/certs/cacert.asc

2. Run the id command:
# id cachetestuser1
uid=121299(cachetestuser1) gid=10000 groups=10000

3. Invalidate the user with the sss_cache command:
sss_cache -U

4. Verify the sysdb cache:
# ldbsearch -H cache_LDAP.ldb "name=cachetestuser1@ldap" | grep dataExpireTimestamp
asq: Unable to register control with rootdse!
dataExpireTimestamp: 1474217591

5. Clean the cache:
# sssctl cache-remove -ops

6. Run the getent command:
# getent passwd -s sss cachetestuser1
cachetestuser1:*:121299:10000:Temp user:/home/cachetestuser1:/bin/bash

7. Invalidate the user with the sss_cache command:
sss_cache -U

8. Verify the sysdb cache:
# ldbsearch -H cache_LDAP.ldb "name=cachetestuser1@ldap" | grep dataExpireTimestamp
asq: Unable to register control with rootdse!
dataExpireTimestamp: 1
 
Actual results:
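The dataExpireTimestamp value in the sysdb cache after invalidation is inconsistent: it stays unchanged after an id lookup (step 4) but is set to 1 after a getent lookup (step 8).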

Expected results:
Invalidation should be consistent regardless of which command performed the lookup.

Additional info:

Comment 3 Jakub Hrozek 2016-09-20 09:16:37 UTC
I think this behavior is expected. Since 1.14/7.3, we have two levels of sysdb cache - the old sysdb (cache_$domain.ldb) and the timestamp cache (timestamps_$domain.ldb). The timestamp cache is much faster to write to, but in corner cases (like an sssd crash) some data written since the last sync might be lost. Therefore, we use the timestamp cache only for data that is not important for sssd runtime, like timestamps -- if we lose them, then we just update the entry again.

And to take advantage of the new timestamp cache, whenever we write in sssd to sysdb, we check if the attribute being written is one of the timestamp attributes (dataExpireTimestamp is) and if it is, we avoid writing to the main sysdb cache completely. But whenever an attribute is added that was not present, we try to be on the safe side and write everything to both caches to keep everything in sync.

And the detection of added attributes is the root cause I think. When you run sss_cache -U, the tool sets both "dataExpireTimestamp" and "initgrExpireTimestamp" to 1. Because the first lookup is done using the "id" command which also performs initgroups, we have both attributes in the cache before we call sss_cache and therefore nothing is added, just timestamp attributes are changed and we only write to the timestamp cache.

After the cleanup, only getent passwd is called, which doesn't add "initgrExpireTimestamp" to the cache. That means sss_cache setting "initgrExpireTimestamp" in fact adds a new attribute to the cache, and the cache management code treats that as a modification that needs to be propagated to both caches.
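
The two cases described above can be illustrated with a small Python model. This is only a sketch of the decision logic as explained in this comment, not SSSD source code: the class TwoLevelCache, its methods, and the helper function are hypothetical names, the two dictionaries merely stand in for cache_$domain.ldb and timestamps_$domain.ldb, and the initgroups expiry value is a placeholder.

```python
# Illustrative model only; names and structure are assumptions, not SSSD code.
TIMESTAMP_ATTRS = {"dataExpireTimestamp", "initgrExpireTimestamp"}
USER = "cachetestuser1@ldap"


class TwoLevelCache:
    def __init__(self):
        self.main = {}        # stands in for cache_$domain.ldb
        self.timestamps = {}  # stands in for timestamps_$domain.ldb

    def write(self, name, attrs):
        existing = self.main.get(name, {})
        # An attribute that was not present before forces a full write.
        adds_attr = any(key not in existing for key in attrs)
        # So does any change to a non-timestamp attribute.
        changes_data = any(
            key not in TIMESTAMP_ATTRS and existing.get(key) != value
            for key, value in attrs.items()
        )
        if adds_attr or changes_data:
            self.main.setdefault(name, {}).update(attrs)
        # Timestamp attributes always go to the timestamp cache.
        ts_only = {k: v for k, v in attrs.items() if k in TIMESTAMP_ATTRS}
        self.timestamps.setdefault(name, {}).update(ts_only)

    def invalidate(self, name):
        # Models sss_cache -U setting both expiry attributes to 1.
        self.write(name, {"dataExpireTimestamp": 1, "initgrExpireTimestamp": 1})


def main_cache_expiry_after_invalidation(first_lookup):
    """Return dataExpireTimestamp in the main cache after invalidation."""
    cache = TwoLevelCache()
    attrs = {"uidNumber": 121299, "dataExpireTimestamp": 1474217591}
    if first_lookup == "id":
        # id also performs initgroups, so this attribute already exists.
        attrs["initgrExpireTimestamp"] = 1474217591
    cache.write(USER, attrs)
    cache.invalidate(USER)
    return cache.main[USER]["dataExpireTimestamp"]


print(main_cache_expiry_after_invalidation("id"))      # 1474217591
print(main_cache_expiry_after_invalidation("getent"))  # 1
```

In this model the invalidation after an id lookup touches only the timestamp dictionary, while the invalidation after a getent lookup adds initgrExpireTimestamp and therefore rewrites the main entry too, which matches the ldbsearch output in steps 4 and 8.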

Maybe we could optimize the cache difference detection more so that if only timestamp cache attributes are added, no full write is performed, but I don't think this is needed.

Bottom line - I would like to close this as a NOTABUG.

Comment 4 Pavel Březina 2016-09-20 10:48:24 UTC
Can we actually stop storing dataExpireTimestamp in the main cache? Otherwise I would prefer to set those attributes in the main cache from sss_cache directly to be consistent. Since it is just a tool it won't have any performance impact.

Comment 5 Lukas Slebodnik 2016-09-20 11:17:14 UTC
Consistency is not very important. What matters is the result.

If "sss_cache -U" is called, then the data should be re-fetched from the LDAP server.
A problem could occur only in the following situation:
* sss_cache -U
* rm -f /var/lib/sss/db/timestamps_*
* restart sssd

But that's quite a corner case.
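
A minimal Python sketch of why that sequence is the problematic one. It assumes that a lookup prefers the expiry value from the timestamp cache when an entry exists there and otherwise falls back to the value stored in the main cache; that read-path behavior and the function name effective_expiry are assumptions made for illustration, not taken from SSSD code.

```python
# Illustrative model only; the read-path rule below is an assumption.
def effective_expiry(main_entry, ts_entry):
    """Pick the expiry a lookup would see for one user entry."""
    if ts_entry is not None and "dataExpireTimestamp" in ts_entry:
        return ts_entry["dataExpireTimestamp"]
    return main_entry["dataExpireTimestamp"]


# State after "id cachetestuser1" followed by "sss_cache -U":
# only the timestamp cache was updated, the main cache keeps the old value.
main_entry = {"dataExpireTimestamp": 1474217591}
ts_entry = {"dataExpireTimestamp": 1}

# Normal case: the entry looks expired, so it would be re-fetched.
print(effective_expiry(main_entry, ts_entry))  # 1

# After rm -f /var/lib/sss/db/timestamps_* and a restart, the timestamp
# entry is gone and the stale value from the main cache is used again.
print(effective_expiry(main_entry, None))  # 1474217591
```

Under that assumption, the invalidation is lost only when the timestamp cache file is deleted between sss_cache -U and the next lookup.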

Comment 6 Jakub Hrozek 2016-09-20 19:39:07 UTC
(In reply to Pavel Březina from comment #4)
> Can we actually stop storing dataExpireTimestamp in the main cache?
> Otherwise I would prefer to set those attributes in the main cache from
> sss_cache directly to be consistent. Since it is just a tool it won't have
> any performance impact.

I was considering this (and the first version of the timestamp feature was coded like this), but then I decided to be on the safe side and I was even considering making it possible to turn off the timestamp cache. In general, the logic is simpler with less special-casing and the extra writes are there only when the attribute is created (on first initgroups or first sss_cache call).

Comment 7 Jakub Hrozek 2016-09-20 19:40:23 UTC
btw if we ever move the timestamps away from the main cache, I think we should ditch the ldb layer completely and use just lmdb for the timestamps, /that/ would be fast..

Comment 8 Jakub Hrozek 2016-09-20 19:41:03 UTC
Amith, please reopen if the explanation in comment #3 doesn't make sense to you or if you see some other issues.

