Bug 901532 - getent ahosts gives no output when nscd started
Summary: getent ahosts gives no output when nscd started
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: glibc
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 8.0
Assignee: DJ Delorie
QA Contact: qe-baseos-tools-bugs
URL:
Whiteboard:
Depends On:
Blocks: 880347
Reported: 2013-01-18 12:32 UTC by Dagmar Prokopová
Modified: 2023-07-18 14:30 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-06 02:40:30 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Attachments
reproducer of the problem (1.66 KB, application/x-gzip)
2013-01-18 12:32 UTC, Dagmar Prokopová


Links
System ID | Private | Priority | Status | Summary | Last Updated
Sourceware 26630 | 0 | P2 | NEW | getaddrinfo drops ipv6 V4MAPPED addresses from ncsd results | 2021-02-08 14:10:47 UTC

Description Dagmar Prokopová 2013-01-18 12:32:48 UTC
Created attachment 682395
reproducer of the problem

Description of problem:
getent ahosts gives no output while nscd is running for domains whose IPv6 address begins with zeros (e.g. 0000:0000:0000:0000:0000:ffff:138.89.0.97). It works fine when nscd is stopped, or when the domain has an IPv6 address that does not start with zeros (e.g. 3b1c:e53b:52b4:4412:ce6f:7095:54a2:8371).

This occurs only for getent ahosts. Getent hosts works correctly.
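
The failing example address falls in the IPv4-mapped range (::ffff:0:0/96), while the working one does not, which may be the more relevant distinction than the leading zeros. A minimal sketch that checks this with Python's standard ipaddress module follows; the helper name is_v4mapped is made up for this illustration and is not part of glibc, nscd or getent.

```python
import ipaddress


def is_v4mapped(text):
    """Return True if `text` is an IPv6 address of the form ::ffff:a.b.c.d."""
    addr = ipaddress.ip_address(text)
    # ipv4_mapped is None unless the address lies in ::ffff:0:0/96.
    return addr.version == 6 and addr.ipv4_mapped is not None


# The two example addresses quoted in the description above.
failing = "0000:0000:0000:0000:0000:ffff:138.89.0.97"
working = "3b1c:e53b:52b4:4412:ce6f:7095:54a2:8371"

if __name__ == "__main__":
    for label, text in (("failing", failing), ("working", working)):
        print(label, text, "v4mapped" if is_v4mapped(text) else "not v4mapped")
```

This only classifies the addresses; it does not talk to nscd or reproduce the missing output.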

Version-Release number of selected component (if applicable):
glibc-2.12-1.80.el6.x86_64
bind-9.8.2-0.10.rc1.el6.x86_64
nscd-2.12-1.80.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. install bind and nscd
2. extract the tarball reproducer.tgz
3. run ./reproducer.sh
4. compare files ahosts_nscd_off and ahosts_nscd_on
  
Actual results:
The files differ because there is no output for getent ahosts dns1.705fb50377.asia in the file ahosts_nscd_on.

Expected results:
The files should be the same, or at least there should be output for getent ahosts dns1.705fb50377.asia in the file ahosts_nscd_on.

Additional info:

fails only on x86_64 and ppc64
works fine on s390x and i386

Comment 9 RHEL Program Management 2013-01-26 06:47:59 UTC
This request was not resolved in time for the current release.
Red Hat invites you to ask your support representative to
propose this request, if still desired, for consideration in
the next release of Red Hat Enterprise Linux.

Comment 10 Carlos O'Donell 2013-02-01 20:54:18 UTC
Dagmar,

Thanks for submitting this issue. I was able to reproduce this on RHEL 6.3.

We believe that this bug is likely related to getaddrinfo usage in IPv6 configurations. We are looking into this broader issue of the correctness of the resolver in glibc for IPv6.

I've added this issue to one of our tracker bugs and we'll be coming back to this as we make progress.

Comment 11 Dagmar Prokopová 2013-02-04 13:15:47 UTC
Carlos, thanks for the update.

Comment 14 Carlos O'Donell 2015-01-14 21:51:18 UTC
This issue needs in-depth review upstream before we make any decisions on it for RHEL. RHEL6 is likely to stay with the existing behaviour because changes to getaddrinfo are risky. I'm moving this to RHEL 7 to keep tracking the problem.

Comment 19 Carlos O'Donell 2020-04-28 17:00:55 UTC
We are going to review this in RHEL 8 to see if we need to work on this issue upstream.

Comment 23 DJ Delorie 2020-09-10 20:26:45 UTC
I've reproduced and debugged this using the upstream sources, where the bug also exists and where the fix should begin.  It looks like the root cause is the following:

* In sysdeps/posix/getaddrinfo.c around line 691, we are iterating on the list of addresses we got from nscd, and detect that we've gotten at least one valid IPv6 address, and set the "got_ipv6" flag[1].

* Around line 1035 we iterate over the list of addresses and discard any "V4MAPPED" addresses if we've seen any IPv6 addresses.

The problem is, the IPv6 address we saw - and set the flag for - is the same address we're later discarding.  If the test around line 691 adds a conditional on not being a V4-mapped address, the test case seems to work correctly, although the actual fix may be more complicated than that.  Testing should include all eight permutations of address family availability (v4, v4mapped, v6) with and without nscd running.

[1] Note this flag is also set around line 807, for non-nscd lookups, but does not seem to cause the same problem.  Further investigation is warranted here.
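
The two steps described above can be modelled roughly as follows. This is a toy Python model built only from the description in this comment, not a port of sysdeps/posix/getaddrinfo.c; the names filter_addresses, guard_v4mapped and lost_everything are hypothetical, and the nscd on/off dimension is not modelled at all. It only illustrates why a lone V4-mapped result can end up discarded, and how guarding the flag changes that across the eight address-family combinations.

```python
import ipaddress
from itertools import product

# Sample addresses; V4MAPPED and V6 reuse the values from the report.
V4 = "138.89.0.97"
V4MAPPED = "::ffff:138.89.0.97"
V6 = "3b1c:e53b:52b4:4412:ce6f:7095:54a2:8371"


def is_v4mapped(addr):
    """True for an IPv6 address in ::ffff:0:0/96."""
    return addr.version == 6 and addr.ipv4_mapped is not None


def filter_addresses(texts, guard_v4mapped=False):
    """Toy model of the two passes described in the comment (an assumption,
    not the real glibc logic)."""
    addrs = [ipaddress.ip_address(t) for t in texts]
    # Pass 1: note whether any IPv6 address was seen. With the guard
    # enabled, V4-mapped addresses do not count as "real" IPv6.
    got_ipv6 = any(
        a.version == 6 and not (guard_v4mapped and is_v4mapped(a))
        for a in addrs
    )
    # Pass 2: drop V4-mapped addresses if the flag was set in pass 1.
    return [t for t, a in zip(texts, addrs)
            if not (got_ipv6 and is_v4mapped(a))]


def lost_everything(guard_v4mapped):
    """List the (v4, v4mapped, v6) availability combinations whose
    non-empty input is filtered down to nothing."""
    bad = []
    for present in product((False, True), repeat=3):
        texts = [s for s, p in zip((V4, V4MAPPED, V6), present) if p]
        if texts and not filter_addresses(texts, guard_v4mapped):
            bad.append(present)
    return bad


if __name__ == "__main__":
    print("unguarded:", lost_everything(False))
    print("guarded:  ", lost_everything(True))
```

In this model the only combination that loses all of its results is the one where a V4-mapped address is the sole answer, which matches the symptom in the report. The real fix may need to be more involved, as noted above.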

Comment 26 Carlos O'Donell 2020-10-06 02:40:30 UTC
NSCD has certain caching behaviours that are problematic and this RCA shows that these cases have not been well tested. The only solution right now is to avoid the problematic case where the cache differs or test for it and mark it as an expected failure. Alternatives to nscd include a local cache like a local bind, dnsmasq or sssd.

We are going to be tracking this issue upstream with the following upstream bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=26630

I am closing this bug as CLOSED/UPSTREAM, and when the upstream bug is fixed we can consider a backport.

