Bug 605003 - glibc/getent performs ipv6 host lookup when host is defined in /etc/hosts, invalidates nscd cache
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: glibc
Version: 5.5
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Andreas Schwab
QA Contact: qe-baseos-tools-bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-06-17 09:26 UTC by Dag Wieers
Modified: 2016-11-24 15:47 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-06-17 10:08:07 UTC
Target Upstream Version:
Embargoed:



Description Dag Wieers 2010-06-17 09:26:47 UTC
A discussion about this problem has been held here:

    https://www.redhat.com/archives/rhelv5-list/2010-June/msg00084.html

In short: when a host has an entry in /etc/hosts and /etc/nsswitch.conf is configured to prefer 'files' over 'dns', glibc still performs an AAAA (IPv6) host lookup against DNS.

We expect the same behavior as for IPv4, where a successful lookup in 'files' returns immediately and does not trigger an A (IPv4) host lookup.

The problem with this behaviour is that nscd has trouble caching those entries from /etc/hosts because the IPv6 lookup fails. And if DNS is not available (in disaster scenarios), services are dead slow due to DNS timeouts, which is exactly why we rely on /etc/hosts in those cases, e.g. for clusters.

This problem also affects RHEL4.

PS: This is not related to the IPv4/IPv6 resolving issue; we know that even when IPv6 is disabled, AAAA lookups are performed to comply with some RFC.

Here is a small walk-through to reproduce the behavior yourself:

 - Add a fake name to /etc/hosts

	1.2.3.4		testsys

 - Make sure /etc/nsswitch.conf has

	hosts:	files dns

 - Add a non-working DNS server at the top of /etc/resolv.conf

	nameserver 4.3.2.1

 - Perform a name lookup and see that it times out

	getent hosts testsys
	getent hosts 1.2.3.4

 - Verify that no name lookup is performed when doing

	getent hosts

Even when you configure /etc/nsswitch.conf to do:

	hosts:	files [SUCCESS=return] dns

it fails to work as we would expect.
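
One possible reason the explicit [SUCCESS=return] changes nothing: according to nsswitch.conf(5), "return" is already the default action for SUCCESS (and "continue" the default for NOTFOUND, UNAVAIL and TRYAGAIN). The sketch below is a simplified, hypothetical parser for a hosts line, not glibc's own code; the function name parse_hosts_line is made up for illustration, and negated statuses such as !UNAVAIL are not handled. It only shows that both configurations above describe the same lookup chain.

```python
import re

# Default NSS actions per status, as documented in nsswitch.conf(5).
DEFAULT_ACTIONS = {
    "SUCCESS": "return",
    "NOTFOUND": "continue",
    "UNAVAIL": "continue",
    "TRYAGAIN": "continue",
}

def parse_hosts_line(line):
    """Parse a 'hosts:' line into a list of (source, actions) pairs.

    Hypothetical helper for illustration; negated statuses are not supported.
    """
    database, _, spec = line.partition(":")
    if database.strip() != "hosts":
        raise ValueError("not a hosts line: %r" % line)
    chain = []
    # A token is either a bracketed action list or a bare source name.
    for token in re.findall(r"\[[^\]]*\]|[^\s\[\]]+", spec):
        if token.startswith("["):
            if not chain:
                raise ValueError("action list before any source")
            # Override the defaults of the source that precedes the brackets.
            for item in token[1:-1].split():
                status, _, action = item.partition("=")
                chain[-1][1][status.upper()] = action.lower()
        else:
            chain.append((token, dict(DEFAULT_ACTIONS)))
    return chain

plain = parse_hosts_line("hosts:\tfiles dns")
explicit = parse_hosts_line("hosts:\tfiles [SUCCESS=return] dns")
print(plain == explicit)  # True: the explicit action equals the default
```

In both cases a lookup that files cannot answer (NOTFOUND) continues to dns, which is where the timeout comes from.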

Comment 1 Dag Wieers 2010-06-17 09:35:40 UTC
We also opened a support service request with number #2032072.

Comment 2 Dag Wieers 2010-06-17 09:55:03 UTC
The following output shows the inconsistency: the IPv6 DNS request is made only for a hosts database lookup. With ahostsv4 no DNS request is made because the name is resolved from /etc/hosts, and I assume ahostsv6 fails because IPv6 is disabled.

[dag@moria ~]$ strace -e connect getent hosts localhost
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"...}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"...}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("212.224.255.252")}, 28) = 0
127.0.0.1       localhost.localdomain localhost

[dag@moria ~]$ strace -e connect getent ahostsv4 localhost
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"...}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"...}, 110) = -1 ENOENT (No such file or directory)
127.0.0.1       STREAM localhost.localdomain
127.0.0.1       DGRAM
127.0.0.1       RAW

[dag@moria ~]$ strace -e connect getent ahostsv6 localhost6

[dag@moria ~]$ strace -e connect getent -s files hosts localhost
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"...}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"...}, 110) = -1 ENOENT (No such file or directory)
127.0.0.1       moria localhost.localdomain localhost

[dag@moria ~]$ strace -e connect getent -s dns ahostsv4 localhost
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"...}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"...}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("212.224.255.252")}, 28) = 0

[dag@moria ~]$ strace -e connect getent -s dns ahostsv6 localhost6

Comment 3 Andreas Schwab 2010-06-17 10:07:23 UTC
"getent hosts" knows nothing about NSS, it exclusively looks in /etc/hosts and DNS, explicitly asking for IPv6 first.  If you want to test the glibc resolver (getaddrinfo) use the ahosts{,v[46]} databases.
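
For testing getaddrinfo from a script rather than through getent, a minimal sketch follows. It assumes that Python's socket.getaddrinfo is a thin wrapper around the C library's getaddrinfo on Linux, and that the three address families roughly correspond to the ahosts, ahostsv4 and ahostsv6 databases; getent may pass additional flags that this sketch does not reproduce. The helper name resolve is made up for illustration.

```python
import socket

def resolve(host, family=socket.AF_UNSPEC):
    """Return the sorted, de-duplicated addresses getaddrinfo reports for host.

    An empty list means the lookup failed for that address family.
    """
    try:
        infos = socket.getaddrinfo(host, None, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})

if __name__ == "__main__":
    for label, family in (
        ("ahosts", socket.AF_UNSPEC),
        ("ahostsv4", socket.AF_INET),
        ("ahostsv6", socket.AF_INET6),
    ):
        print(label, resolve("localhost", family))
```

Running this under strace -e connect, as in the sessions above, should show whether a given address family causes a connection to port 53.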

Comment 4 Dag Wieers 2010-06-17 10:22:51 UTC
Well, we are reporting this behavior because it is the same behavior nscd shows. Do you want me to open a new bug report for nscd?

Can you also confirm that if DNS is unavailable and a hostname/IP is in /etc/hosts, nscd should not time out on a DNS issue and then refuse to cache when it finally times out and returns the /etc/hosts entry?

Comment 5 Andreas Schwab 2010-06-17 10:40:28 UTC
"getent hosts" does not go through nscd at all.

Comment 6 Dag Wieers 2010-06-17 11:17:20 UTC
Thanks for your prompt replies.

We thought it did based on this:

[dag@moria ~]$ strace -e connect getent hosts localhost
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"...}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"...}, 110) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("212.224.255.252")}, 28) = 0
127.0.0.1       localhost.localdomain localhost

[root@moria dstat]# /etc/init.d/nscd restart
Stopping nscd:                                             [FAILED]
Starting nscd:                                             [  OK  ]

[root@moria dstat]# nscd -i hosts

[dag@moria ~]$ strace -e connect getent hosts localhost
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"...}, 110) = 0
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"...}, 110) = 0
connect(3, {sa_family=AF_FILE, path="/var/run/nscd/socket"...}, 110) = 0
127.0.0.1       localhost.localdomain localhost

Comment 7 Andreas Schwab 2010-06-18 08:52:13 UTC
You are right, I misread the code.  But this is still not a bug since getent hosts explicitly asks for the IPv6 address.
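
The effect of asking for IPv6 first can be modelled with a toy simulation. This is a sketch under stated assumptions, not glibc's code: it assumes the lookup is tried per address family through the "files dns" chain, IPv6 first, with a fallback to IPv4 (the fallback is inferred from the IPv4 address printed in the sessions above). FILES, lookup and getent_hosts are hypothetical names.

```python
# Hypothetical stand-in for /etc/hosts: testsys has an IPv4 entry only.
FILES = {("testsys", "inet"): "1.2.3.4"}

def lookup(name, family, log):
    """Walk 'files dns' for one address family; stop at the first success."""
    log.append(("files", family))
    if (name, family) in FILES:
        return FILES[(name, family)]
    # NOTFOUND in files: the default action is 'continue', so dns is asked.
    log.append(("dns", family))
    return None  # models the unreachable nameserver: no answer

def getent_hosts(name):
    """Model of the described behaviour: try IPv6 first, then IPv4."""
    log = []
    for family in ("inet6", "inet"):
        addr = lookup(name, family, log)
        if addr is not None:
            return addr, log
    return None, log

addr, log = getent_hosts("testsys")
print(addr)  # 1.2.3.4
print(log)   # [('files', 'inet6'), ('dns', 'inet6'), ('files', 'inet')]
```

In this model the only DNS query is the IPv6 one, which matches the single connect() to port 53 in the strace output, and it happens even though files can answer the IPv4 question.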

