Bug 841752 - Regression with getent in glibc-2.5-81.el5_8.4 on 32 bit systems
Status: CLOSED DUPLICATE of bug 818585
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: glibc
Version: 5.8
Hardware: All Linux
Priority: unspecified
Severity: urgent
Assigned To: Jeff Law
QA Contact: qe-baseos-tools
Reported: 2012-07-20 03:08 EDT by Klaus Steinberger
Modified: 2016-11-24 11:02 EST
Doc Type: Bug Fix
Last Closed: 2012-07-23 11:14:10 EDT
Type: Bug

Description Klaus Steinberger 2012-07-20 03:08:04 EDT
Description of problem:

After updating to glibc-2.5-81.el5_8.4, getent fails in a very curious way, and only on 32-bit systems.

We use nss_ldap for our accounts.

After the update, "getent passwd" shows all accounts as usual, but "getent passwd loginname" shows nothing. Also, "ls ~loginname" fails with: file not found

After stopping nscd, the system works correctly again.

Version-Release number of selected component (if applicable):

glibc-2.5-81.el5_8.4

How reproducible:

Always, on 32-bit systems

Steps to Reproduce:
1. Use nscd together with LDAP
2. Update glibc and friends to 2.5-81.el5_8.4
3. Try "getent passwd loginname"
  

Workaround:
Stop nscd.
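
The symptom described above (an account that appears in the full listing but cannot be looked up by name) can be checked with a small script. This is only a sketch: the function name check_lookup and the LOOKUP_CMD override are hypothetical and not part of the original report; LOOKUP_CMD exists so the logic can be exercised without a real NSS setup and defaults to getent.

```shell
#!/bin/sh
# Sketch: report whether a login that shows up in the full passwd
# enumeration can also be resolved by name.
# LOOKUP_CMD is a hypothetical override for testing; it defaults to getent.
: "${LOOKUP_CMD:=getent}"

check_lookup() {
    login=$1
    # Is the login present when all accounts are enumerated?
    # (The login is used as a regex here; plain account names are assumed.)
    if ! $LOOKUP_CMD passwd | grep -q "^${login}:"; then
        echo "not-enumerated"
        return 2
    fi
    # Does a direct lookup by name return an entry?
    if [ -n "$($LOOKUP_CMD passwd "$login")" ]; then
        echo "ok"
        return 0
    fi
    # Listed, but not resolvable by name: the behaviour reported here.
    echo "enumerated-but-not-resolvable"
    return 1
}
```

Running "check_lookup loginname" once with nscd running and once with nscd stopped should show whether the cache daemon is involved.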
Comment 1 Klaus Steinberger 2012-07-20 04:21:46 EDT
I tracked the problem down a little bit more.

The problem is related to the "deref always" setting we use.

We have the following in our LDAP tree:


ou=Personen,o=physik  -> Accounts
ou=Gruppen,o=physik   -> Groups


ou=Location,ou=Einrichtungen,o=physik -> location specific setup

ou=Personen,ou=Location,ou=Einrichtungen,o=physik is a reference to 
ou=Personen,o=physik


In ldap.conf we set:

nss_base ou=Location,ou=Einrichtungen,o=physik
deref always

This special setup fails with glibc-2.5-81.el5_8.4 on 32-bit systems!


If we use:

nss_base ou=physik

it works as expected.

So the bug is probably somewhere in the deref code.
Comment 2 Klaus Steinberger 2012-07-20 04:44:26 EDT
Uhh, it looks like a curious interaction with SELinux!


With SELinux active, I always have the problem with nscd.
After I switch off SELinux (by echoing 0 into /selinux/enforce), I can get it working, but only if I delete the nscd database files first
(rm -f /var/db/nscd/*). Running "nscd -i passwd" does not help!


So it looks like the database files get corrupted when SELinux is active?
Comment 3 Klaus Steinberger 2012-07-20 04:56:31 EDT
I can now confirm that it is SELinux related, that it happens on both 32-bit and 64-bit systems, and that it is not related to our special setup mentioned in comment #1.

To reproduce the problem on a system with SELinux in permissive ("warning") mode:


service nscd stop
rm -f /var/db/nscd/*
echo 1 >  /selinux/enforce
service nscd start

After that, "getent passwd loginname" returns nothing.


To get the system working again:

service nscd stop
rm -f /var/db/nscd/*
echo 0 > /selinux/enforce
service nscd start

Just restarting nscd or invalidating the cache does not help! Only complete removal of the cache files does.
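
Because both sequences above depend on the value written to /selinux/enforce, it can help to confirm the current mode before each run. A minimal sketch follows, assuming the selinuxfs mount point /selinux used in this comment; the function name selinux_mode and its optional path argument are hypothetical, added only so the logic can be tried against an ordinary file.

```shell
#!/bin/sh
# Sketch: print the SELinux mode according to the selinuxfs "enforce" flag.
# $1 (optional, hypothetical) is the flag file; defaults to /selinux/enforce.
selinux_mode() {
    flag_file=${1:-/selinux/enforce}
    if [ ! -r "$flag_file" ]; then
        echo "unknown"
        return 1
    fi
    # The flag file holds 1 for enforcing and 0 for permissive.
    case "$(cat "$flag_file")" in
        1) echo "enforcing" ;;
        0) echo "permissive" ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

On a system with the SELinux tools installed, getenforce reports the same information.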
Comment 4 Jeff Law 2012-07-20 12:21:29 EDT
Reassigning to the selinux-policy folks. I'm not sure they're the best to work on this, but it's a start. To the best of my knowledge, nothing has changed in glibc that should affect SELinux at all.

Based on past experiences, is it possible /var/db/nscd has the wrong selinux permissions/context?
Comment 5 Daniel Walsh 2012-07-20 17:30:55 EDT
I would bet that it, or something else, is mislabeled.

restorecon -R -v /var/db

That would be the first thing I would try without having seen the AVC messages from /var/log/audit/audit.log.

You might also want to install setroubleshoot on this machine, which would give more information about the SELinux denials.
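
To collect the AVC messages mentioned above, the audit log can be filtered for denials that name nscd. This is a rough sketch that assumes the usual audit record layout, with a type=AVC prefix and a quoted comm="..." field; the function name avc_denials_for and its optional log-path argument are hypothetical.

```shell
#!/bin/sh
# Sketch: print AVC denial records that mention a given command name.
# $1: command name as it appears in the comm="..." field (e.g. nscd)
# $2: (optional, hypothetical) log path; defaults to the audit log.
avc_denials_for() {
    comm=$1
    log=${2:-/var/log/audit/audit.log}
    # Keep only AVC records, then only denials, then only the given command.
    grep 'type=AVC' "$log" | grep 'denied' | grep "comm=\"${comm}\""
}
```

For example, "avc_denials_for nscd" run as root right after a failed lookup should show which file and context the denial refers to, if a denial was logged at all.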
Comment 6 Tats Shibata 2012-07-21 07:05:04 EDT
I've run into a similar issue. In my case, postfix can't start because "localhost" fails to resolve while nscd is running.

[root@charlie ~]# service postfix start
Starting postfix:                                          [FAILED]
[root@charlie ~]# tail -2 /var/log/maillog
Jul 21 18:27:02 charlie postfix/sendmail[3487]: fatal: config variable inet_interfaces: host not found: localhost
Jul 21 18:27:03 charlie postfix[3488]: fatal: config variable inet_interfaces: host not found: localhost
[root@charlie ~]# ping localhost
ping: unknown host localhost

After nscd is stopped, the issue goes away.

[root@charlie ~]# service nscd stop
Stopping nscd:                                             [  OK  ]
[root@charlie ~]# ping localhost
PING charlie.rewse.jp (127.0.0.1) 56(84) bytes of data.
64 bytes from charlie.rewse.jp (127.0.0.1): icmp_seq=1 ttl=64 time=0.235 ms


The steps in comment #3 reproduce the problem completely.

[root@charlie ~]# getenforce
Enforcing
[root@charlie ~]# service nscd stop
Stopping nscd:                                             [  OK  ]
[root@charlie ~]# rm -f /var/db/nscd/*
[root@charlie ~]# service nscd start
Starting nscd:                                             [  OK  ]
[root@charlie ~]# ping localhost
ping: unknown host localhost

[root@charlie ~]# getenforce
Enforcing
[root@charlie ~]# setenforce 0
[root@charlie ~]# service nscd stop
Stopping nscd:                                             [  OK  ]
[root@charlie ~]# rm -f /var/db/nscd/*
[root@charlie ~]# service nscd start
Starting nscd:                                             [  OK  ]
[root@charlie ~]# ping localhost
PING charlie.rewse.jp (127.0.0.1) 56(84) bytes of data.
64 bytes from charlie.rewse.jp (127.0.0.1): icmp_seq=1 ttl=64 time=0.154 ms


I guess the root cause is bug 818585, because I can always reproduce the problem when the /etc/nsswitch.conf context is wrong.

[root@charlie ~]# chcon user_u:object_r:rpm_script_tmp_t /etc/nsswitch.conf
[root@charlie ~]# service nscd stop
Stopping nscd:                                             [  OK  ]
[root@charlie ~]# rm -f /var/db/nscd/*
[root@charlie ~]# service nscd start
Starting nscd:                                             [  OK  ]
[root@charlie ~]# ping localhost
ping: unknown host localhost

[root@charlie ~]# restorecon /etc/nsswitch.conf
[root@charlie ~]# ls -Z /etc/nsswitch.conf
-rw-r--r--  root root system_u:object_r:etc_t          /etc/nsswitch.conf
[root@charlie ~]# service nscd stop
Stopping nscd:                                             [  OK  ]
[root@charlie ~]# rm -f /var/db/nscd/*
[root@charlie ~]# service nscd start
Starting nscd:                                             [  OK  ]
[root@charlie ~]# ping localhost
PING charlie.rewse.jp (127.0.0.1) 56(84) bytes of data.
64 bytes from charlie.rewse.jp (127.0.0.1): icmp_seq=1 ttl=64 time=0.183 ms
Comment 7 Klaus Steinberger 2012-07-23 08:35:57 EDT
Indeed, the problem is related to corrupted labeling of the /etc/nsswitch.conf file.

After the glibc update the /etc/nsswitch.conf file is labeled incorrectly:

[root@git ~]# ls -Z /etc/nsswitch.conf 
-rw-r--r--  root root user_u:object_r:rpm_script_tmp_t /etc/nsswitch.conf
[root@git ~]# 


"restorecon /etc/nsswitch.conf" solves the problem.
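
The label check shown above can be scripted so that it is easy to repeat after each package update. This is a sketch only: the helper names selinux_type_of_line and label_ok are hypothetical, and the expected type etc_t is taken from the correctly labeled "ls -Z" output in comment 6.

```shell
#!/bin/sh
# Sketch: extract the SELinux type from one line of "ls -Z" output (stdin).
selinux_type_of_line() {
    # Take the first whitespace-separated field with at least three
    # colon-separated parts (user:role:type[:level]) and print the type.
    awk '{
        for (i = 1; i <= NF; i++) {
            n = split($i, part, ":")
            if (n >= 3) { print part[3]; exit }
        }
    }'
}

# Succeeds if the "ls -Z" line in $1 carries the expected type in $2.
label_ok() {
    [ "$(printf '%s\n' "$1" | selinux_type_of_line)" = "$2" ]
}

# Possible use (needs root for restorecon):
#   label_ok "$(ls -Z /etc/nsswitch.conf)" etc_t || restorecon -v /etc/nsswitch.conf
```

Parsing the listing keeps the check independent of any particular SELinux helper tool, at the cost of relying on the "ls -Z" column layout.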
Comment 8 Daniel Walsh 2012-07-23 10:53:27 EDT
So this is a bug in someone's post-install script?
Comment 9 Klaus Steinberger 2012-07-23 10:55:31 EDT
Yep, looks like it.
Comment 10 Miroslav Franc 2012-07-23 11:02:26 EDT
To me it looks like a duplicate of bug 818585. Are we sure it's not sudo causing this?
Comment 11 Daniel Walsh 2012-07-23 11:08:56 EDT
Looks like a likely candidate.
Comment 12 Jeff Law 2012-07-23 11:14:10 EDT

*** This bug has been marked as a duplicate of bug 818585 ***
