1359208 – sssd does not refresh expired cache entries with enumerate=true

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1359208 - sssd does not refresh expired cache entries with enumerate=true

Summary: sssd does not refresh expired cache entries with enumerate=true

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	sssd
Sub Component:
Version:	7.2
Hardware:	All
OS:	Linux
Priority:	high
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Petr Čech
QA Contact:	Sudhir Menon
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2016-07-22 13:57 UTC by Thorsten Scherf
Modified:	2022-03-13 14:05 UTC (History)
CC List:	11 users (show)
Fixed In Version:	sssd-1.15.0-2.el7
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-08-01 08:58:07 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)
Potential quick-n-dirty fix (1.46 KB, patch) 2016-09-12 15:55 UTC, Sumit Bose	no flags	Details \| Diff
View All

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	SSSD sssd issues 4215	0	None	closed	sssd does not refresh expired cache entries with enumerate=true	2020-11-24 09:39:00 UTC
Red Hat Product Errata	RHEA-2017:2294	0	normal	SHIPPED_LIVE	sssd bug fix and enhancement update	2017-08-01 12:39:55 UTC

Description Thorsten Scherf 2016-07-22 13:57:21 UTC

Description of problem:

########################
enumerate = true
entry_cache_timeout = 30
enum_cache_timeout = 30
######################## 

# service sssd stop;rm -rf /var/lib/sss/db/* /var/lib/sss/mc/*;service sssd start
Redirecting to /bin/systemctl stop  sssd.service
Redirecting to /bin/systemctl start  sssd.service
 
Adding a member:
 
# ipa group-add-member --users tuser200 tgroup100

# date; SSS_NSS_USE_MEMCACHE=NO getent group tgroup100
Fri Jul 22 15:46:17 CEST 2016
tgroup100:*:1221800008:

# date; SSS_NSS_USE_MEMCACHE=NO getent group tgroup100
Fri Jul 22 15:46:27 CEST 2016
tgroup100:*:1221800008:tuser200

It took 10s to recognize the change.

Removing a member:

# ipa group-remove-member --users tuser200 tgroup100

# date; SSS_NSS_USE_MEMCACHE=NO getent group tgroup100
Fri Jul 22 15:49:53 CEST 2016
tgroup100:*:1221800008:tuser200

# ldbsearch -H /var/lib/sss/db/cache_coe.muc.redhat.com.ldb name=tgroup100 lastUpdate dataExpireTimestamp
asq: Unable to register control with rootdse!
# record 1
dn: name=tgroup100,cn=groups,cn=coe.muc.redhat.com,cn=sysdb
lastUpdate: 1469195380
dataExpireTimestamp: 1469195410

# date -d @1469195410
Fri Jul 22 15:50:10 CEST 2016

The entry should expire at 15:50:10, but it's not refreshed:

# date; SSS_NSS_USE_MEMCACHE=NO getent group tgroup100
Fri Jul 22 15:50:58 CEST 2016
tgroup100:*:1221800008:tuser200


Using ldap_enumeration_refresh_timeout as a workaround fixes the issue.

#####################################
enumerate = true
entry_cache_timeout = 30
enum_cache_timeout = 30
ldap_enumeration_refresh_timeout = 30
######################################

# service sssd stop;rm -rf /var/lib/sss/db/* /var/lib/sss/mc/*;service sssd start
Redirecting to /bin/systemctl stop  sssd.service
Redirecting to /bin/systemctl start  sssd.service
 
Adding a member:
 
# ipa group-add-member --users tuser200 tgroup100

# date; SSS_NSS_USE_MEMCACHE=NO getent group tgroup100
Fri Jul 22 15:53:09 CEST 2016
tgroup100:*:1221800008:

# date; SSS_NSS_USE_MEMCACHE=NO getent group tgroup100
Fri Jul 22 15:53:20 CEST 2016
tgroup100:*:1221800008:tuser200

It took 11s to recognize the change.

Removing a member:

# ipa group-remove-member --users tuser200 tgroup100

# date; SSS_NSS_USE_MEMCACHE=NO getent group tgroup100
Fri Jul 22 15:54:30 CEST 2016
tgroup100:*:1221800008:tuser200

# date; SSS_NSS_USE_MEMCACHE=NO getent group tgroup100
Fri Jul 22 15:54:35 CEST 2016
tgroup100:*:1221800008:

It took 5s to recognize the change.

I would still expect that sssd refreshes/removes expired cache entries.

Version-Release number of selected component (if applicable):
sssd-1.13.0-40.el7_2.4.x86_64


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Sumit Bose 2016-07-22 14:10:13 UTC

Iirc there is a special cleanup task for this which is currently disabled by default because it might cause some unexpected slow-downs.

Please check account_cache_expiration in man sssd.conf and ldap_purge_cache_timeout in sssd-ldap.

Comment 2 Thorsten Scherf 2016-07-22 14:27:18 UTC

HI Sumit. There is a workaround already with "ldap_enumeration_refresh_timeout" but I think it's still a bug when entries are not refreshed/removed after they have expired.

ldap_purge_cache_timeout and account_cache_expiration do not seem to be an option because I want to keep old entries in the cache. I only want to get rid of them after they expired - that is after they have been removed and cache timeout has been exceeded.

Comment 3 Petr Čech 2016-09-01 12:12:06 UTC

Hi Sumit and Thorsten,

I looked at man (sss-ldap) page. I am afraid of this is not a bug.

Thorsten, you wrote that you have a workaround with ldap_enumeration_refresh_timeout. Default value of this option is 300 seconds.
So the workaround works every 5 minutes.

There is another one mechanism, see ldap_purge_cache_timeout. Default behaviour with enumeration true is:

"Please note that if enumeration is enabled, the cleanup task is required in order to detect entries removed from the server and can't be disabled. By default, the cleanup task will run every 3 hours with enumeration enabled."

So... IMHO there is no issue. What do you think, Thorsten, Sumit?

Comment 4 Petr Čech 2016-09-05 13:59:24 UTC

Hi Thorsten,

I would like to ask you if you agree with my explanation at Comment #3

Comment 5 Sumit Bose 2016-09-05 15:23:35 UTC

I think Thorsten is right in his observation that it is odd that the group members of an expired entry in the cache are not update. Nevertheless I'm currently not sure if this is expected behavior with with enumeration or not. 

Thorsten, can you check in your first example (without ldap_enumeration_refresh_timeout = 30) if the dataExpireTimestamp is updated after the lookup when the entry is expired? Additionally can you attach the nss and backend logs with debug_level=10?

Comment 6 Thorsten Scherf 2016-09-05 15:50:18 UTC

Petr, Summit is right. I would expect an entry which is not valid anymore to be updated in the cache. 

Here is the requested data:

########################
enumerate = true
entry_cache_timeout = 30
enum_cache_timeout = 30
######################## 

# service sssd stop;rm -rf /var/lib/sss/db/* /var/lib/sss/mc/*;service sssd start
Redirecting to /bin/systemctl stop  sssd.service
Redirecting to /bin/systemctl start  sssd.service
 
# ipa group-add-member --users tuser200 tgroup100

# date; ipa group-remove-member --users tuser200 tgroup100
Mon Sep  5 17:41:50 CEST 2016
  Group name: tgroup100
  GID: 1221800008
---------------------------
Number of members removed 1
---------------------------

# date; SSS_NSS_USE_MEMCACHE=NO getent group tgroup100
Mon Sep  5 17:41:57 CEST 2016
tgroup100:*:1221800008:tuser200

# ldbsearch -H /var/lib/sss/db/cache_coe.muc.redhat.com.ldb name=tgroup100 
[...]
dataExpireTimestamp: 1473090134
member: name=tuser200,cn=users,cn=coe.muc.redhat.com,cn=sysdb
memberuid: tuser200

# date -d @1473090134
Mon Sep  5 17:42:14 CEST 2016

# date; SSS_NSS_USE_MEMCACHE=NO getent group tgroup100
Mon Sep  5 17:42:59 CEST 2016
tgroup100:*:1221800008:tuser200

# ldbsearch -H /var/lib/sss/db/cache_coe.muc.redhat.com.ldb name=tgroup100
[...]
dataExpireTimestamp: 1473090134
member: name=tuser200,cn=users,cn=coe.muc.redhat.com,cn=sysdb
memberuid: tuser200

# date -d @1473090134
Mon Sep  5 17:42:14 CEST 2016

So dataExpireTimestamp is not updated after the last lookup although the entry was already expired.

Comment 7 Petr Čech 2016-09-07 10:50:38 UTC

Thorsten, Sumit,

I went through the reproducer with last code in master. The attribute dataExpireTimestamp is updated, in ts_cache (timestamp cache), after lookup after its expiration.

I applied patch for 'better sysdb debugging' and I saw:

[sssd[be[beta]]] [sysdb_store_group] (0x1000): The group record of tgroup_1@beta did not change, only updated the timestamp cache
[sssd[be[beta]]] [sdap_save_groups] (0x4000): Group 0 processed!
[sssd[be[beta]]] [sdap_attrs_get_sid_str] (0x1000): No [objectSIDString] attribute. [0][Success]
[sssd[be[beta]]] [sdap_save_grpmem] (0x0400): Failed to get group sid
[sssd[be[beta]]] [sdap_get_primary_name] (0x0400): Processing object tgroup_1
[sssd[be[beta]]] [sdap_save_grpmem] (0x0400): Processing group tgroup_1@beta
[sssd[be[beta]]] [sdap_save_grpmem] (0x0400): No members for group [tgroup_1@beta]
[...]
[sssd[be[beta]]] [sss_ldb_ldif2log] (0x10000): ldif
[
dn: name=tgroup_1@beta,cn=groups,cn=beta,cn=sysdb
changetype: modify
replace: lastUpdate
lastUpdate: 1473231233
-
replace: dataExpireTimestamp
dataExpireTimestamp: 1473231263
-
]

I interpret it so that SSSD correctly identifies there is no member but group storing/updating fails after it.
So, it is a bug, I think.


Reproducer:

# prepare

ipa user-add --first=Test --last=User --email=tuser tuser
ipa group-add tgroup_1

# repeat

systemctl stop sssd
sudo su -c "rm -f /var/lib/sss/db/*"
sudo su -c "rm -f /var/lib/sss/mc/*"
sssctl logs-remove && systemctl start sssd

date; SSS_NSS_USE_MEMCACHE=NO getent group tgroup_1

ipa group-add-member --users=tuser tgroup_1

date; SSS_NSS_USE_MEMCACHE=NO getent group tgroup_1

ipa group-remove-member --users tuser tgroup_1
date; SSS_NSS_USE_MEMCACHE=NO getent group tgroup_1


Configuration:
# cat /etc/sssd/sssd.conf 
[domain/beta]
cache_credentials = True
krb5_store_password_if_offline = True
ipa_domain = beta
id_provider = ipa
auth_provider = ipa
access_provider = ipa
ipa_hostname = mirach.beta
chpass_provider = ipa
dyndns_update = True
ipa_server = _srv_, algol.beta
dyndns_iface = ens3
ldap_tls_cacert = /etc/ipa/ca.crt

enumerate = true
entry_cache_timeout = 30

debug_level = 0xFFFF0

[sssd]
services = nss, sudo, pam, ssh
domains = beta

debug_level = 0xFFFFFF0

[nss]
homedir_substring = /home
enum_cache_timeout = 30

Comment 8 Petr Čech 2016-09-12 13:31:43 UTC

I checked this issue again.
It occurs on SSSD 1.14 but I am not able to reproduce it on SSSD 1.13.

Comment 9 Petr Čech 2016-09-12 13:39:09 UTC

Upstream ticket:
https://fedorahosted.org/sssd/ticket/3182

Comment 10 Lukas Slebodnik 2016-09-12 13:40:04 UTC

(In reply to Petr Čech from comment #8)
> I checked this issue again.
> It occurs on SSSD 1.14 but I am not able to reproduce it on SSSD 1.13.

Which exact version of SSSD 1.13 did you use?
Was it sssd-1.13.0-40.el7_2.4.x86_64?

Comment 11 Petr Čech 2016-09-12 15:02:13 UTC

I read the reproducer again, I forgot wait on timeouts...
So, I took sssd-1.13.0-40.el7_2.12.x86_64... and the bug accured.
I need to write comments, or wait() to my reproducers :-)
I fix version in ticket.

Comment 12 Sumit Bose 2016-09-12 15:55:06 UTC

Created attachment 1200242 [details]
Potential quick-n-dirty fix

Hi Petr,

have you checked if this patch helps to fix the issue?

bye,
Sumit

Comment 13 Lukas Slebodnik 2016-09-12 16:20:53 UTC

(In reply to Sumit Bose from comment #12)
> Created attachment 1200242 [details]
> Potential quick-n-dirty fix
> 
> Hi Petr,
> 
> have you checked if this patch helps to fix the issue?
> 
IIUC; Petr cannot reproduce with SSSD 1.13. So I assume it is already fixed in
latest 1.13 and bug is only on master(1.14).

Comment 14 Petr Čech 2016-09-13 07:17:34 UTC

Hi Sumit and Lukas,

I am working on it. I have others bug which are connected to the groups.
I wrote some intg. tests for it. And I would like to be sure that I will not break anything.

Lukas, I did mistake. It is really broken on SSSD 1.13.0. I firstly checked 1.13.5 but wrong way.

Sumit, I am working with your "quick-n-dirty" fix. I will inform you during a day.

Comment 16 Petr Čech 2016-11-16 06:19:59 UTC

Ticket for this bug was
https://fedorahosted.org/sssd/ticket/3182

and the issue was solved in
https://fedorahosted.org/sssd/ticket/2940

by
e0903f41922721edf292a9f7e6605a4519db53a1 (sssd-1_14_2)

Comment 17 Lukas Slebodnik 2016-11-16 11:23:29 UTC

(In reply to Petr Čech from comment #16)
> Ticket for this bug was
> https://fedorahosted.org/sssd/ticket/3182
> 
> and the issue was solved in
> https://fedorahosted.org/sssd/ticket/2940
> 
> by
> e0903f41922721edf292a9f7e6605a4519db53a1 (sssd-1_14_2)

Are you sure that it is the same issue?
Have you tested with "enumerate = true"?

The patches for ticket #2940 do not have a test for
"enumerate = true".

Comment 18 Petr Čech 2016-11-16 12:00:29 UTC

(In reply to Lukas Slebodnik from comment #17)
> (In reply to Petr Čech from comment #16)
> > Ticket for this bug was
> > https://fedorahosted.org/sssd/ticket/3182
> > 
> > and the issue was solved in
> > https://fedorahosted.org/sssd/ticket/2940
> > 
> > by
> > e0903f41922721edf292a9f7e6605a4519db53a1 (sssd-1_14_2)
> 
> Are you sure that it is the same issue?
> Have you tested with "enumerate = true"?

Yes, I am sure. Yes, I have tested.

> The patches for ticket #2940 do not have a test for
> "enumerate = true".

If it is possible to add tests for "enumerate = true", I could write it. Is it better new ticket for it or reopen #3182?

Comment 19 Lukas Slebodnik 2016-11-16 12:07:05 UTC

(In reply to Petr Čech from comment #18)
> (In reply to Lukas Slebodnik from comment #17)
> > (In reply to Petr Čech from comment #16)
> > > Ticket for this bug was
> > > https://fedorahosted.org/sssd/ticket/3182
> > > 
> > > and the issue was solved in
> > > https://fedorahosted.org/sssd/ticket/2940
> > > 
> > > by
> > > e0903f41922721edf292a9f7e6605a4519db53a1 (sssd-1_14_2)
> > 
> > Are you sure that it is the same issue?
> > Have you tested with "enumerate = true"?
> 
> Yes, I am sure. Yes, I have tested.
> 
Thank you good to know.

> > The patches for ticket #2940 do not have a test for
> > "enumerate = true".
> 
> If it is possible to add tests for "enumerate = true", I could write it. Is
> it better new ticket for it or reopen #3182?

It's difficult to write reliable test for enumeration due to race conditions
and timing issues.

Comment 26 Sudhir Menon 2017-05-16 10:56:50 UTC

Fix is seen in sssd-1.15.2-29.el7.x86_64

Comment 27 errata-xmlrpc 2017-08-01 08:58:07 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:2294

Note You need to log in before you can comment on or make changes to this bug.