Bug 1435615

Summary:	nscd is not caching ldap netgroup data properly, hangs on nscd -i netgroup
Product:	Red Hat Enterprise Linux 7	Reporter:	Deepu K S <dkochuka>
Component:	glibc	Assignee:	DJ Delorie <dj>
Status:	CLOSED ERRATA	QA Contact:	Sergey Kolosov <skolosov>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	7.2	CC:	arawat, ashankar, codonell, cww, dkochuka, edward.goodwin, fweimer, glibc-bugzilla, kludhwan, mcermak, mnewsome, pfrankli, qe-baseos-tools-bugs, skolosov
Target Milestone:	rc
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	glibc-2.17-201.el7	Doc Type:	Bug Fix
Doc Text:	Cause: Incorrect use of locks in nscd. Consequence: On systems where netgroups are cached by nscd, nscd may occasionally hang and then fail to notice updates to cached information. Fix: nscd has been patched to properly release its internal locks when handling cache timeouts. Result: Cached data should now update properly.	Story Points:	---
Clone Of:	1277672	Environment:
Last Closed:	2018-04-10 13:58:28 UTC	Type:	Bug
Bug Depends On:	1277672
Bug Blocks:	1420851, 1473718

Comment 8 DJ Delorie 2017-09-26 15:56:18 UTC

To reproduce the core of this issue, the following steps must happen:

* Configure a service that nscd can cache and that provides netgroup maps. I use LDAP, but others should work as well.
* Ensure that at least one "in net group" query (a netgroup membership check) returns a positive result; this may require other maps.
* Perform multiple identical "in net group" queries within the TTL. If the cached service is remote, you have done enough once the network traffic stops, because later queries are then answered from the cache. I used: getent netgroup QAUsers "" testuser23461 ""
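
The repeated query in the last step can be scripted. The following is a minimal sketch, assuming getent is on PATH and reusing the netgroup and user names from the command above. The helper names, the repeat count, and the delay are illustrative assumptions, not values from this report; choose them so that all queries fall within the positive TTL.

```python
import subprocess
import time


def build_query(netgroup, host="", user="", domain=""):
    """Return the argv for one "in net group" query via getent."""
    # The host/user/domain strings are passed through unchanged, so the
    # defaults reproduce the empty "" arguments of the quoted command.
    return ["getent", "netgroup", netgroup, host, user, domain]


def repeat_query(argv, count=5, delay=1.0):
    """Run the same command `count` times and return each run's stdout.

    count and delay are assumptions for illustration only.
    """
    outputs = []
    for _ in range(count):
        result = subprocess.run(argv,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.DEVNULL,
                                universal_newlines=True)
        outputs.append(result.stdout)
        time.sleep(delay)
    return outputs
```

For example, repeat_query(build_query("QAUsers", user="testuser23461")) issues five identical queries one second apart.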

(the bug has now been triggered; the next steps show the effects)

* Wait for the entry to time out (the positive TTL).
* Either enable debug logging and watch the logs for the message where the cleanup task tries to purge the entry, or, once the TTL has expired, run "nscd -i netgroup" to force a purge.

If you let the cleanup task run, you can attach to nscd with gdb and run "info threads"; one thread will be waiting for a write lock.
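
The thread listing can also be captured non-interactively. This is a minimal sketch, assuming gdb is installed, the caller is allowed to attach to nscd, and the nscd process ID is already known. The helper names are hypothetical.

```python
import subprocess


def gdb_info_threads_argv(pid):
    """Return argv that attaches gdb to `pid` and prints its threads."""
    # -p attaches to a running process, -batch makes gdb exit once the
    # given commands have run, -ex supplies one gdb command.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "info threads"]


def dump_threads(pid):
    """Run gdb against `pid` and return its combined output as text."""
    result = subprocess.run(gdb_info_threads_argv(pid),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return result.stdout
```

In the returned text, look for the thread that is blocked acquiring a write lock.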

If you use "nscd -i netgroup" it will hang.
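
A hang of this kind can be detected automatically by running the command under a timeout. The sketch below is generic; the helper name and the 30-second default are assumptions, not values from this report.

```python
import subprocess


def command_hangs(argv, timeout=30.0):
    """Return True if `argv` has not finished after `timeout` seconds."""
    try:
        subprocess.run(argv, timeout=timeout,
                       stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising.
        return True
    return False
```

Run as root, command_hangs(["nscd", "-i", "netgroup"]) would be expected to return True on an affected system once the bug has been triggered.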

To "reset" this test, you need to stop nscd *and* manually remove the persistent cached databases it uses.
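
The cleanup half of the reset can be sketched as follows. This assumes the persistent databases live in /var/db/nscd, which is the usual location on RHEL but is not stated in this report, and that nscd has already been stopped (for example with systemctl stop nscd). The helper names are hypothetical, and dry_run defaults to True so that nothing is deleted by accident.

```python
import os


def persistent_db_files(db_dir="/var/db/nscd"):
    """List regular files in the (assumed) persistent database directory."""
    if not os.path.isdir(db_dir):
        return []
    return sorted(
        os.path.join(db_dir, name)
        for name in os.listdir(db_dir)
        if os.path.isfile(os.path.join(db_dir, name))
    )


def remove_persistent_dbs(db_dir="/var/db/nscd", dry_run=True):
    """Remove the database files, or only report them when dry_run is set."""
    paths = persistent_db_files(db_dir)
    if not dry_run:
        for path in paths:
            os.remove(path)
    return paths
```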

Comment 13 errata-xmlrpc 2018-04-10 13:58:28 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0805