Bug 706571

Summary: nscd is not reporting all secondary groups in /etc/group
Product: Red Hat Enterprise Linux 5 Reporter: charles.ng
Component: glibc Assignee: Jeff Law <law>
Status: CLOSED ERRATA QA Contact: Arjun Shankar <ashankar>
Severity: high Docs Contact:
Priority: high    
Version: 5.5 CC: aoliva, ashankar, ayadav, bugproxy, codonell, dbasant, fweimer, jkachuck, law, lmiksik, mfranc, pmuller, rpiddapa
Target Milestone: rc Keywords: OtherQA, Patch
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: glibc-2.5 uses compat_call, which in turn uses getgrent_r. getgrent_r is reentrant, but not thread-safe.
Consequence: If multiple threads called getgrent_r via compat_call, they could race against each other, so some groups were not reported.
Fix: Locking was added to compat_call to prevent multiple threads from racing.
Result: All groups should be reported correctly, even when nscd is using multiple threads.
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-09-30 18:15:37 EDT Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:    
Bug Blocks: 831765, 921048, 928849    
Description Flags
Patch that avoids multi-threaded issues with initgroups compat_call
/etc/group file from system on which problem is occurring
Hourly output of groups command for userid cladmin - most recent first none

Description charles.ng 2011-05-21 02:03:17 EDT
Description of problem:
nscd on EL5.5 occasionally does not report all groups for a user (oracle). Because of this, oracle processes fail intermittently.
The problem occurs unpredictably, so it is very difficult to reproduce.

Version-Release number of selected component (if applicable):
kernel: 2.6.18-
nscd: nscd-2.5-49

How reproducible:
Unable to reproduce. Attaching two straces: one with the expected result and one with missing groups.

Steps to Reproduce:
Actual results:

id oracle
uid=300(oracle) gid=300(dba) groups=300(dba)

Expected results:
 id oracle
uid=300(oracle) gid=300(dba) groups=300(dba),400(oinstall),402(asmdba)

Additional info:
Comment 1 Alexandre Oliva 2012-01-18 08:08:20 EST
Thanks for the bug report.  I'm afraid I've been unable to replicate this problem.  Are you still running into it?  Can you please attach your nsswitch.conf, nscd.conf, and nscd logs for a session in which both correct and incorrect results are displayed?  passwd and group files might also be important to try to figure this out.  Thanks in advance,
Comment 2 Alexandre Oliva 2012-01-18 20:31:43 EST
Wild guess: do you still get the problem if you set threads and max_threads to 1 in nscd.conf?  I suspect the problem might be multiple threads calling getgrouplist concurrently, interfering with each other while iterating over the group list using the not-really-reentrant getgrent_r.
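
One way to probe the hypothesis above is to issue many getgrouplist calls from several threads at once and compare each result against a single-threaded baseline. The following is a minimal sketch, not a confirmed reproducer: the names concurrent_group_check, count_groups, MAX_GROUPS and MAX_THREADS are made up for illustration, and it only compares group counts, not the group IDs themselves.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <grp.h>
#include <pthread.h>
#include <sys/types.h>

#define MAX_GROUPS  256   /* illustrative upper bound on groups per user */
#define MAX_THREADS 16    /* illustrative upper bound on worker threads */

struct check_args {
    const char *user;   /* account to look up */
    gid_t primary;      /* primary gid of that account */
    int expected;       /* group count from the single-threaded baseline */
    int iterations;     /* lookups to perform in this thread */
    int mismatches;     /* lookups whose count differed from the baseline */
};

/* Number of groups getgrouplist reports for user, or -1 if they do not fit. */
static int count_groups(const char *user, gid_t primary)
{
    gid_t groups[MAX_GROUPS];
    int n = MAX_GROUPS;

    if (getgrouplist(user, primary, groups, &n) == -1)
        return -1;
    return n;
}

static void *worker(void *p)
{
    struct check_args *a = p;

    for (int i = 0; i < a->iterations; i++)
        if (count_groups(a->user, a->primary) != a->expected)
            a->mismatches++;
    return NULL;
}

/* Returns the total number of mismatching lookups, or -1 on setup failure. */
int concurrent_group_check(const char *user, gid_t primary,
                           int nthreads, int iterations)
{
    pthread_t tids[MAX_THREADS];
    struct check_args args[MAX_THREADS];
    int started = 0, total = 0;

    assert(user != NULL);
    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    /* Baseline taken before any worker thread exists. */
    int expected = count_groups(user, primary);
    if (expected < 0)
        return -1;

    for (int i = 0; i < nthreads; i++) {
        args[i] = (struct check_args){ user, primary, expected, iterations, 0 };
        if (pthread_create(&tids[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += args[i].mismatches;
    }
    return started == nthreads ? total : -1;
}
```

With the values from this report, a call such as concurrent_group_check("oracle", 300, 8, 1000) returning a nonzero count would be consistent with the suspected race. A zero result does not rule it out, since the race is timing-dependent and the baseline lookup itself may be served from the nscd cache.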
Comment 3 Alexandre Oliva 2012-01-19 05:56:05 EST
Created attachment 556249 [details]
Patch that avoids multi-threaded issues with initgroups compat_call

This problem doesn't occur with glibc trunk because nss_files provides _nss_files_initgroups_dyn, which opens and reads /etc/group itself, whereas glibc 2.5 falls back to compat_call, which uses getgrent_r and is therefore vulnerable to other threads' concurrent use of the same enumeration.
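
To illustrate why a private read of the group file avoids the race, here is a rough sketch of the idea (it is not the nss_files code): each call opens its own stream and parses it with fgetgrent_r, so no enumeration state is shared between threads. The function name groups_from_file and the buffer size are assumptions made for this example.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <grp.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Store in out[] (at most max entries) the gids of all groups in the file
 * at path that list user as a member. Returns the number stored, or -1 if
 * the file cannot be opened. The primary group is not added here.
 */
int groups_from_file(const char *path, const char *user, gid_t *out, int max)
{
    struct group grp, *res;
    char buf[16384];   /* a longer line makes fgetgrent_r fail and ends the scan */
    int n = 0;

    assert(path != NULL && user != NULL && out != NULL);

    FILE *fp = fopen(path, "r");   /* private stream: no state shared with other threads */
    if (fp == NULL)
        return -1;

    while (fgetgrent_r(fp, &grp, buf, sizeof buf, &res) == 0 && res != NULL) {
        for (char **m = grp.gr_mem; *m != NULL; m++) {
            if (strcmp(*m, user) == 0) {
                if (n < max)
                    out[n++] = grp.gr_gid;
                break;
            }
        }
    }
    fclose(fp);
    return n;
}
```

Because the FILE pointer, the struct group and the buffer are all local to the call, two threads running groups_from_file("/etc/group", ...) at the same time cannot disturb each other's position in the file.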

This patch improves glibc's use of any nss implementation that lacks initgroups_dyn, by stopping multiple threads from interfering with each other within the compat_call that iterates over the group list supplied by the implementation.  I'll submit this improvement upstream.
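
The locking idea can be sketched in user code as follows. This is not the attached patch, only an illustration of the same principle: the setgrent/getgrent_r/endgrent cursor is shared by the whole process, so the complete enumeration is done under one mutex. The names locked_group_scan and group_enum_lock are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <grp.h>
#include <pthread.h>
#include <string.h>
#include <sys/types.h>

/* One process-wide lock, because the enumeration cursor is process-wide. */
static pthread_mutex_t group_enum_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Store in out[] (at most max entries) the gids of all groups that list
 * user as a member, walking the group database under a lock.
 * Returns the number of gids stored.
 */
int locked_group_scan(const char *user, gid_t *out, int max)
{
    struct group grp, *res;
    char buf[16384];   /* an entry that does not fit ends the scan early */
    int n = 0;

    assert(user != NULL && out != NULL);

    pthread_mutex_lock(&group_enum_lock);
    setgrent();        /* rewind the shared cursor */
    while (getgrent_r(&grp, buf, sizeof buf, &res) == 0 && res != NULL) {
        for (char **m = grp.gr_mem; *m != NULL; m++) {
            if (strcmp(*m, user) == 0) {
                if (n < max)
                    out[n++] = grp.gr_gid;
                break;
            }
        }
    }
    endgrent();
    pthread_mutex_unlock(&group_enum_lock);
    return n;
}
```

Without the mutex, a second thread calling setgrent or getgrent_r in the middle of the loop would move the shared cursor, and the first thread could skip entries, which matches the missing groups described in this report. The lock only helps if every enumeration in the process goes through it.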
Comment 4 Jeff Law 2012-02-03 16:37:05 EST
*** Bug 766786 has been marked as a duplicate of this bug. ***
Comment 5 IBM Bug Proxy 2012-02-03 16:44:10 EST
Created attachment 559373 [details]
/etc/group file from system on which problem is occurring
Comment 6 IBM Bug Proxy 2012-02-03 16:44:15 EST
Created attachment 559374 [details]
Hourly output of groups command for userid cladmin - most recent first
Comment 7 RHEL Product and Program Management 2012-04-02 09:10:00 EDT
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux release.  Product Management has
requested further review of this request by Red Hat Engineering, for
potential inclusion in a Red Hat Enterprise Linux release for currently
deployed products.  This request is not yet committed for inclusion in
a release.
Comment 9 Joseph Kachuck 2012-04-09 10:39:16 EDT
Do you still see this behaviour if you set threads and max_threads to 1 in

Thank You
Joe Kachuck
Comment 18 Jeff Law 2012-11-21 08:55:43 EST
If they're getting permission denied, then this is likely a different problem related to leaking file descriptors.  See bug 795674 and the bugs linked from it.

If they are only seeing some groups not being reported, then having them test their system with max_threads and threads set to 1 in nscd.conf would be greatly appreciated.
Comment 35 errata-xmlrpc 2013-09-30 18:15:37 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.