Bug 478716 - Static /etc/hosts entries ignored after update to 5.3
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: glibc
Version: 5.3
Hardware: All Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Jakub Jelinek
QA Contact: BaseOS QE
Keywords: Regression
Duplicates: 478717
Reported: 2009-01-03 18:48 EST by Daniel Riek
Modified: 2010-07-13 09:20 EDT

Doc Type: Bug Fix
Last Closed: 2009-01-20 15:50:00 EST
Description Daniel Riek 2009-01-03 18:48:37 EST
Description of problem:
Ping ignores static host entries in /etc/hosts after the update to 5.3

How reproducible:
Always

Steps to Reproduce:
1. Create a known-to-work /etc/hosts file on a 5.3 installation (or update a working 5.2 machine to 5.3).
2. Make sure to add a hostname that does not otherwise resolve (through DNS).
3. Verify that the IP address is pingable.
4. Ping the unique name from /etc/hosts.
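
The steps above can be sketched as a small check, offered as an illustration rather than anything used in this report. The function names hosts_file_ip and check_static_entry are hypothetical, and the sketch assumes that `getent hosts` follows the same lookup path (NSS, and nscd when it is running) as ping, which this report does not confirm.

```shell
#!/bin/sh
# Print the first IP address that a hosts-format file maps to a given name.
# Arguments: $1 = name to look up, $2 = hosts file (defaults to /etc/hosts).
hosts_file_ip() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                      # drop comments
        { for (i = 2; i <= NF; i++)             # fields 2..NF are names and aliases
              if ($i == name) { print $1; exit } }
    ' "${2:-/etc/hosts}"
}

# Compare the static entry with what the system resolver returns.
# Assumption: `getent hosts` goes through the same resolver path as ping.
# Returns 0 on a match, 1 on a mismatch, 2 if the file has no such entry.
check_static_entry() {
    expected=$(hosts_file_ip "$1" "${2:-/etc/hosts}")
    if [ -z "$expected" ]; then
        echo "no static entry for $1"
        return 2
    fi
    actual=$(getent hosts "$1" | awk '{ print $1; exit }')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $1 -> $actual"
    else
        echo "MISMATCH: $1 expected $expected, resolver returned '${actual:-nothing}'"
        return 1
    fi
}
```

Running `check_static_entry <name>` for the name added in step 2 would then report a mismatch on an affected system, provided the assumption about `getent` holds.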
  
Actual results:
Ping reports "unknown host"

Expected results:
Ping pings the host

Additional info:
A similar issue seems to exist with autofs, while ssh, wget, curl, elinks, traceroute or telnet are not affected.

Opening against ping as it does not affect most other users of libresolv.
Comment 1 Daniel Riek 2009-01-03 19:15:17 EST
The autofs issue is in bug 478717. Might be related.
Comment 2 Daniel Riek 2009-01-05 10:26:36 EST
The original issue was seen in a Xen guest on x86_64. The issue does not show on i386 bare metal.
Comment 3 Daniel Riek 2009-01-05 10:28:57 EST
SELinux was in permissive mode on the x86_64 Xen guest where this issue was seen and in enforcing mode on the i386 bare metal machine where it did not reproduce. So it does not seem to be a factor.
Comment 4 Karel Volný 2009-01-05 10:35:38 EST
well, on the machine I tried, I am unable to reproduce it with ping, but the command "host" is unable to resolve the name, it goes like this:

.qa.[root@i386-5s-m1 tps]# ping bz478816
PING bz478816 (10.34.33.227) 56(84) bytes of data.
64 bytes from bz478816 (10.34.33.227): icmp_seq=1 ttl=59 time=270 ms
64 bytes from bz478816 (10.34.33.227): icmp_seq=2 ttl=59 time=271 ms
64 bytes from bz478816 (10.34.33.227): icmp_seq=3 ttl=59 time=270 ms

--- bz478816 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 270.017/270.679/271.157/0.643 ms
.qa.[root@i386-5s-m1 tps]# host bz478816
Host bz478816 not found: 3(NXDOMAIN)
.qa.[root@i386-5s-m1 tps]# cat /etc/host.conf
order hosts,bind
.qa.[root@i386-5s-m1 tps]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
10.16.40.49             i386-5s-m1.lab.bos.redhat.com i386-5s-m1

10.34.33.227            bz478816
Comment 5 Karel Volný 2009-01-05 10:44:20 EST
ok, scratch that host command part, it does not use the local files

so i386 vs x86_64 or bare vs virtual is a factor?
Comment 6 Karel Volný 2009-01-05 10:51:19 EST
ok, so x86-64-5s-2-m1.lab.bos.redhat.com does not seem to reproduce the problem either ... so x86_64 is not a factor

note that "-m1" means that it is a physical machine
Comment 7 David Kovalsky 2009-01-05 10:56:39 EST
Daniel, have you tried only one FQDN? Or have you tried more? Can you post the hostname you've tried? And the contents of /etc/resolv.conf. Not sure it's the cause though it may help us in creating a similar environment for testing. 

Also, do you still have access to the machine where the issue appeared? 

Might NSCD possibly be the source of problems?
Comment 8 Karel Volný 2009-01-05 11:04:35 EST
note that I cannot reproduce this (using the same setup as in comment #4) on ppc, both real and virtualized (ppcp-5s-m1.lab.bos.redhat.com and ppcp-5s-2-v1.lab.bos.redhat.com)
Comment 10 David Kovalsky 2009-01-05 11:13:40 EST
Got it - it's an NSCD thing:

[root@kovy ~]# ping duckme
PING duckme (10.34.32.33) 56(84) bytes of data.
64 bytes from kovy.englab.brq.redhat.com (10.34.32.33): icmp_seq=1 ttl=64 time=0.047 ms
64 bytes from kovy.englab.brq.redhat.com (10.34.32.33): icmp_seq=2 ttl=64 time=0.042 ms

--- duckme ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.042/0.044/0.047/0.007 ms
[root@kovy ~]# service nscd start
Starting nscd:                                             [  OK  ]
[root@kovy ~]# ping duckme
ping: unknown host duckme
Comment 11 David Kovalsky 2009-01-05 11:16:36 EST
[root@kovy ~]# uname -a
Linux kovy.englab.brq.redhat.com 2.6.18-125.el5 #1 SMP Mon Dec 1 17:38:25 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

(bare metal)
Comment 12 Karel Volný 2009-01-05 11:27:14 EST
cannot confirm ...

.qa.[root@i386-5s-m1 tps]# ping bz478816
PING bz478816 (10.34.33.227) 56(84) bytes of data.
64 bytes from bz478816 (10.34.33.227): icmp_seq=1 ttl=59 time=261 ms
64 bytes from bz478816 (10.34.33.227): icmp_seq=2 ttl=59 time=270 ms

--- bz478816 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 261.740/266.261/270.782/4.521 ms
.qa.[root@i386-5s-m1 tps]# service nscd start
Starting nscd:                                             [  OK  ]
.qa.[root@i386-5s-m1 tps]# ping bz478816
PING bz478816 (10.34.33.227) 56(84) bytes of data.
64 bytes from bz478816 (10.34.33.227): icmp_seq=1 ttl=59 time=263 ms
64 bytes from bz478816 (10.34.33.227): icmp_seq=2 ttl=59 time=264 ms
64 bytes from bz478816 (10.34.33.227): icmp_seq=3 ttl=59 time=276 ms

--- bz478816 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 263.180/268.102/276.410/5.937 ms
.qa.[root@i386-5s-m1 tps]# uname -a
Linux i386-5s-m1.lab.bos.redhat.com 2.6.18-92.1.22.el5PAE #1 SMP Fri Dec 5 09:58:49 EST 2008 i686 i686 i386 GNU/Linux

is there any newer nscd than nscd-2.5-24.el5_2.2.i386 (I do not see any erratum for it)?

p.s. the same on x86-64-5s-2-m1.lab.bos.redhat.com
Comment 13 Daniel Riek 2009-01-05 11:27:39 EST
ACK. Stopping NSCD works around the issue.
Comment 14 David Kovalsky 2009-01-05 11:30:37 EST
I'm installing a 5.2 machine ATM to check if this is a regression. Will have results in a couple of minutes.
Comment 15 Daniel Riek 2009-01-05 11:37:36 EST
Moving to glibc as it appears to be an nscd issue.
Comment 16 Daniel Riek 2009-01-05 11:38:53 EST
*** Bug 478717 has been marked as a duplicate of this bug. ***
Comment 17 Karel Volný 2009-01-05 11:40:45 EST
ok, I answer my own question, there is a newer nscd, but still it does not reproduce the problem on my setup:

.qa.[root@i386-5s-m1 tps]# ping bz478816
PING bz478816 (10.34.33.227) 56(84) bytes of data.
64 bytes from bz478816 (10.34.33.227): icmp_seq=1 ttl=59 time=305 ms
64 bytes from bz478816 (10.34.33.227): icmp_seq=2 ttl=59 time=277 ms

--- bz478816 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 277.423/291.362/305.301/13.939 ms
.qa.[root@i386-5s-m1 tps]# service nscd restart
Stopping nscd:                                             [  OK  ]
Starting nscd:                                             [  OK  ]
.qa.[root@i386-5s-m1 tps]# ping bz478816
PING bz478816 (10.34.33.227) 56(84) bytes of data.
64 bytes from bz478816 (10.34.33.227): icmp_seq=1 ttl=59 time=267 ms
64 bytes from bz478816 (10.34.33.227): icmp_seq=2 ttl=59 time=276 ms
64 bytes from bz478816 (10.34.33.227): icmp_seq=3 ttl=59 time=294 ms

--- bz478816 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 267.321/279.565/294.837/11.451 ms
.qa.[root@i386-5s-m1 tps]# rpm -q nscd
nscd-2.5-31.i386
Comment 18 Jiri Skala 2009-01-05 11:44:44 EST
I tested it on virtual RHEL5 i386. I couldn't reproduce it with nscd-2.5-18 but I reached this issue after updating to nscd-2.5-33.
Comment 19 David Kovalsky 2009-01-05 11:51:42 EST
Karel, you're likely going to need a non-shared environment to be able to
reproduce this every time (to be able to reboot)

Test plan: 

Install fresh RHEL5 Server U2:
 - reboot
 - log in (root, ssh)
 - add 'duckme' to /etc/hosts, pointing to the machine's IP (10.34.32.33 in my case)
 - `ping duckme`
 - `service nscd start`
 - `ping duckme`
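
A minimal sketch of the last three steps as a script, not the procedure actually run for this report. The names can_ping and nscd_regression_check and the PING and SERVICE override variables are hypothetical; the real run needs root, and a failed ping is treated as a resolution failure even though it could also mean the host is unreachable.

```shell
#!/bin/sh
# Resolve a static name before and after nscd starts.
# PING and SERVICE may be overridden (for example for a dry run);
# the defaults are the commands used in this report.
PING=${PING:-ping}
SERVICE=${SERVICE:-service}

# Return 0 if the name resolves and answers one echo request.
can_ping() {
    "$PING" -c 1 "$1" >/dev/null 2>&1
}

# Print "before=<pass|fail> after=<pass|fail>" for the given name.
# Returns 1 only for the regression signature: works without nscd,
# fails once nscd has been started.
nscd_regression_check() {
    name=$1
    if can_ping "$name"; then before=pass; else before=fail; fi
    "$SERVICE" nscd start >/dev/null 2>&1
    if can_ping "$name"; then after=pass; else after=fail; fi
    echo "before=$before after=$after"
    if [ "$before" = pass ] && [ "$after" = fail ]; then
        return 1
    fi
    return 0
}
```

With 'duckme' in /etc/hosts and nscd stopped beforehand, `nscd_regression_check duckme` would be expected to print "before=pass after=pass" on an unaffected system.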

Passes on RHEL-5.2.0 Server, x86_64 bare metal. So that definitely is a
regression.

kernel-2.6.18-92.el5
nscd-2.5-24
iputils-20020927-43.el5

[root@kovy ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
10.34.32.33             kovy.englab.brq.redhat.com kovy
10.34.32.33             duckme
Comment 20 David Kovalsky 2009-01-05 11:53:58 EST
and glibc-2.5-24, if that matters
Comment 21 David Kovalsky 2009-01-05 11:59:06 EST
Failing packages on latest trees:
kernel-2.6.18-128.el5
glibc-2.5-33
nscd-2.5-33
iputils-20020927-45.el5

Jiri, send me your public ssh key and I'll add it to my machine so you can log in.
Comment 23 Karel Volný 2009-01-05 12:19:33 EST
After upgrading to 2.5-33, the problem still was not reproducible until the machine was rebooted. After the reboot it worked on the first try (i.e. ping resolved the hostname), but after I changed /etc/hosts (adding another alias for the same IP), resolution stopped working and ping kept reporting "ping: unknown host bz478816", even after /etc/hosts was restored to the state it had right after the reboot.

I'll try to figure the exact action which triggers the bug later
Comment 33 errata-xmlrpc 2009-01-20 15:50:00 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0080.html
