841752 – Regression with getent in glibc-2.5-81.el5_8.4 on 32 bit systems

Summary: Regression with getent in glibc-2.5-81.el5_8.4 on 32 bit systems

Keywords:
Status:	CLOSED DUPLICATE of bug 818585
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	glibc
Sub Component:
Version:	5.8
Hardware:	All
OS:	Linux
Priority:	unspecified
Severity:	urgent
Target Milestone:	rc
Target Release:	---
Assignee:	Jeff Law
QA Contact:	qe-baseos-tools-bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2012-07-20 07:08 UTC by Klaus Steinberger
Modified:	2016-11-24 16:02 UTC
CC List:	8 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2012-07-23 15:14:10 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Description Klaus Steinberger 2012-07-20 07:08:04 UTC

Description of problem:

After updating to glibc-2.5-81.el5_8.4, getent fails in a very curious way, but only on 32-bit systems.

We use nss_ldap for our accounts.

After the update, "getent passwd" lists all accounts as usual, but "getent passwd loginname" returns nothing. Also, "ls ~loginname" fails with: file not found

After stopping nscd, the system works correctly again.

Version-Release number of selected component (if applicable):

glibc-2.5-81.el5_8.4

How reproducible:

Always, on 32-bit systems

Steps to Reproduce:
1. Use nscd together with LDAP
2. Update glibc and friends to 2.5-81.el5_8.4
3. Try getent
  

Workaround: 
stop nscd
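
The symptom described above (enumeration works, lookup by name does not) can be checked across all accounts with a small shell sketch. The function name find_unresolvable_users is made up for illustration and is not part of glibc or nscd; the sketch relies only on getent exiting with a non-zero status when a key is not found.

```shell
#!/bin/sh
# Hypothetical helper: print every account name that shows up in the
# "getent passwd" enumeration but cannot be looked up by name.
find_unresolvable_users() {
    getent passwd | cut -d: -f1 | while read -r name; do
        # getent returns a non-zero exit status when the key is not found
        if ! getent passwd "$name" >/dev/null 2>&1; then
            printf '%s\n' "$name"
        fi
    done
}
```

Calling find_unresolvable_users once with nscd running and once with nscd stopped should show whether the by-name lookups are what breaks. On a healthy system it is expected to print nothing.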

Comment 1 Klaus Steinberger 2012-07-20 08:21:46 UTC

I tracked the problem down a little further.

The problem is related to the "deref always" setting we use:


We have the following in our LDAP tree:


ou=Personen,o=physik  -> Accounts
ou=Gruppen,o=physik   -> Gruppen


ou=Location,ou=Einrichtungen,o=physik -> location specific setup

ou=Personen,ou=Location,ou=Einrichtungen,o=physik is a reference to 
ou=Personen,o=physik


In ldap.conf we set:

nss_base ou=Location,ou=Einrichtungen,o=physik
deref always

This special setup fails with glibc-2.5-81.el5_8.4 on 32-bit systems!


If we use:

nss_base ou=physik

it works as expected.

So the bug is probably somewhere in the deref code.

Comment 2 Klaus Steinberger 2012-07-20 08:44:26 UTC

Uhh, it looks like a curious interaction with SELinux!


With SELinux active I always have the problem with nscd.
After I switch off SELinux enforcement (by echoing 0 into /selinux/enforce) I can get it working, but only if I delete the nscd database files first
(rm -f /var/db/nscd/*). Running "nscd -i passwd" does not help!


So it looks like the database files get corrupted when SELinux is active?

Comment 3 Klaus Steinberger 2012-07-20 08:56:31 UTC

I can now confirm that it is SELinux related, happens on both 32-bit and 64-bit systems, and is not related to our special setup mentioned in comment #1.

To reproduce the problem on a system with SELinux in permissive (warning) mode:


service nscd stop
rm -f /var/db/nscd/*
echo 1 >  /selinux/enforce
service nscd start

After that, "getent passwd loginname" returns nothing.


To get the system working again:

service nscd stop
rm -f /var/db/nscd/*
echo 0 > /selinux/enforce
service nscd start

Just restarting nscd or invalidating the cache does not help! Only complete removal of the cache files does.

Comment 4 Jeff Law 2012-07-20 16:21:29 UTC

Reassigning to the selinux-policy folks -- not sure they're the best to work on this, but it's a start.  To the best of my knowledge, nothing has changed in glibc which should affect SELinux at all.

Based on past experiences, is it possible /var/db/nscd has the wrong selinux permissions/context?

Comment 5 Daniel Walsh 2012-07-20 21:30:55 UTC

I would bet you it is mislabeled or something is mislabeled.

restorecon -R -v /var/db

Would be the first thing I would try without seeing the AVC messages from /var/log/audit/audit.log.

You might also want to install setroubleshoot on this machine, which would bring more info about SELinux.
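
A minimal sketch for pulling the relevant AVC messages out of the audit log, assuming the usual "avc:  denied" record format and that the denials are logged for the nscd process. The function name nscd_avc_denials is made up for illustration, and the default path is the one mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print AVC denial records whose command is nscd.
# $1 is the audit log to search; defaults to /var/log/audit/audit.log.
nscd_avc_denials() {
    log="${1:-/var/log/audit/audit.log}"
    # Keep only AVC denials, then only those issued for comm="nscd"
    grep 'avc:.*denied' "$log" | grep 'comm="nscd"'
}
```

In the matching records, the tcontext= field shows the label of the object nscd was denied access to, which should point at whatever is mislabeled.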

Comment 6 Tats Shibata 2012-07-21 11:05:04 UTC

I've run into a similar issue. In my case, postfix can't start because "localhost" fails to resolve while nscd is running.

[root@charlie ~]# service postfix start
Starting postfix:                                          [FAILED]
[root@charlie ~]# tail -2 /var/log/maillog
Jul 21 18:27:02 charlie postfix/sendmail[3487]: fatal: config variable inet_interfaces: host not found: localhost
Jul 21 18:27:03 charlie postfix[3488]: fatal: config variable inet_interfaces: host not found: localhost
[root@charlie ~]# ping localhost
ping: unknown host localhost

After nscd is stopped, the issue goes away.

[root@charlie ~]# service nscd stop
Stopping nscd:                                             [  OK  ]
[root@charlie ~]# ping localhost
PING charlie.rewse.jp (127.0.0.1) 56(84) bytes of data.
64 bytes from charlie.rewse.jp (127.0.0.1): icmp_seq=1 ttl=64 time=0.235 ms


Comment #3 can be reproduced completely.

[root@charlie ~]# getenforce
Enforcing
[root@charlie ~]# service nscd stop
Stopping nscd:                                             [  OK  ]
[root@charlie ~]# rm -f /var/db/nscd/*
[root@charlie ~]# service nscd start
Starting nscd:                                             [  OK  ]
[root@charlie ~]# ping localhost
ping: unknown host localhost

[root@charlie ~]# getenforce
Enforcing
[root@charlie ~]# setenforce 0
[root@charlie ~]# service nscd stop
Stopping nscd:                                             [  OK  ]
[root@charlie ~]# rm -f /var/db/nscd/*
[root@charlie ~]# service nscd start
Starting nscd:                                             [  OK  ]
[root@charlie ~]# ping localhost
PING charlie.rewse.jp (127.0.0.1) 56(84) bytes of data.
64 bytes from charlie.rewse.jp (127.0.0.1): icmp_seq=1 ttl=64 time=0.154 ms


I guess the root cause is Bug 818585, because I can always reproduce this when the /etc/nsswitch.conf context is wrong.

[root@charlie ~]# chcon user_u:object_r:rpm_script_tmp_t /etc/nsswitch.conf
[root@charlie ~]# service nscd stop
Stopping nscd:                                             [  OK  ]
[root@charlie ~]# rm -f /var/db/nscd/*
[root@charlie ~]# service nscd start
Starting nscd:                                             [  OK  ]
[root@charlie ~]# ping localhost
ping: unknown host localhost

[root@charlie ~]# restorecon /etc/nsswitch.conf
[root@charlie ~]# ls -Z /etc/nsswitch.conf
-rw-r--r--  root root system_u:object_r:etc_t          /etc/nsswitch.conf
[root@charlie ~]# service nscd stop
Stopping nscd:                                             [  OK  ]
[root@charlie ~]# rm -f /var/db/nscd/*
[root@charlie ~]# service nscd start
Starting nscd:                                             [  OK  ]
[root@charlie ~]# ping localhost
PING charlie.rewse.jp (127.0.0.1) 56(84) bytes of data.
64 bytes from charlie.rewse.jp (127.0.0.1): icmp_seq=1 ttl=64 time=0.183 ms

Comment 7 Klaus Steinberger 2012-07-23 12:35:57 UTC

Indeed, the problem is caused by incorrect labeling of the /etc/nsswitch.conf file.

After the glibc update the /etc/nsswitch.conf file is labeled incorrectly:

[root@git ~]# ls -Z /etc/nsswitch.conf 
-rw-r--r--  root root user_u:object_r:rpm_script_tmp_t /etc/nsswitch.conf
[root@git ~]# 
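
The check shown above can be scripted. This is only a sketch: the function names context_type and check_label are made up for illustration, and the expected type etc_t is taken from the correctly labeled file shown in comment 6.

```shell
#!/bin/sh
# Hypothetical helper: print the type field (third colon-separated field)
# of an SELinux context such as user_u:object_r:rpm_script_tmp_t.
context_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Hypothetical helper: compare the type in context $1 with expected type $2.
check_label() {
    actual=$(context_type "$1")
    if [ "$actual" = "$2" ]; then
        echo "ok"
    else
        echo "mislabeled: $actual (expected $2)"
    fi
}
```

With the "ls -Z" output format seen in this report, where the context is the fourth column, a call could look like: check_label "$(ls -Z /etc/nsswitch.conf | awk '{print $4}')" etc_t. Other versions of ls may print the columns differently, so the awk field is an assumption.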


"restorecon /etc/nsswitch.conf" solves the problem.

Comment 8 Daniel Walsh 2012-07-23 14:53:27 UTC

So this is a bug in someone's post-install script?

Comment 9 Klaus Steinberger 2012-07-23 14:55:31 UTC

Yep, looks like it.

Comment 10 Miroslav Franc 2012-07-23 15:02:26 UTC

To me it looks like a duplicate of bug 818585. Are we sure it's not sudo causing this?

Comment 11 Daniel Walsh 2012-07-23 15:08:56 UTC

Looks like a likely candidate.

Comment 12 Jeff Law 2012-07-23 15:14:10 UTC


*** This bug has been marked as a duplicate of bug 818585 ***
