From Bugzilla Helper:
User-Agent: Mozilla/5.0 (compatible; Konqueror/3; Linux; X11; , en)

Description of problem:
Glibc seems to be ignoring /etc/nsswitch.conf. Even when you place into /etc/nsswitch.conf:

    hosts: files [SUCCESS=return] dns

and you try the excess "." hack as described in:

    https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=77538#c13

things are not right. One has a lot of IPv6 DNS lookups going on. Observe:

    # cat /etc/resolv.conf
    search prv.test1.org
    nameserver 75.25.79.9
    nameserver 200.150.41.9

    # egrep '^hosts' /etc/nsswitch.conf
    hosts: files dns

    # egrep 'emma|lucas|gauss|localhost' /etc/hosts
    127.0.0.1 localhost.prv.test1.org localhost
    127.0.0.1 localhost.prv.test1.org. localhost.
    10.1.1.2 emma.prv.test1.org emma.test1.org emma
    10.1.1.2 emma.prv.test1.org. emma.test1.org. emma.
    10.2.2.2 lucas.dmz.test1.org lucas.test1.org lucas
    10.2.2.2 lucas.dmz.test1.org. lucas.test1.org. lucas.
    10.2.2.4 gauss.dmz.test2.net gauss.test2.net gauss
    10.2.2.4 gauss.dmz.test2.net. gauss.test2.net. gauss.

Now when I pull the machine off the network so that it cannot contact its name servers and do a:

    telnet localhost

then tcpdump shows that on RH8.0 with glibc-2.2.93-5, 8 IPv6 lookups are attempted. Since the name server is unreachable, a full 5*8 seconds of delay is introduced! It does find 127.0.0.1 (due to the .-hack-a-round for RH8.0), but why all of those IPv6 (AAAA?) lookups?!

When your name servers are down, those DNS timeouts can be very painful!

Q: How can one prevent these IPv6 DNS lookups?

The same thing happens on RH7.3 with glibc-2.2.5-42 systems. The command:

    telnet lucas.dmz.test1.org.

will result in 4 IPv6 lookups, followed by 4 IPv4 lookups, followed by 4 more IPv6 lookups! A whopping 12*5 seconds = 1 minute delay!

The resolver(5) man page suggests there is an inet6 option (query AAAA before A). I do not have this option set. One would think that without it, A would be queried before AAAA and, given success in the A lookups, the AAAA lookup would be skipped. But no ...

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Remove the host from the network (or block access to the name server(s)).
2. In another window, monitor DNS traffic:
       tcpdump -n -i eth0 udp port domain
3. telnet localhost
   or:
   telnet some.hostname.in.etc.hosts.com

Actual Results:
8 IPv6 DNS lookup attempts will be performed. When the DNS server(s) or the network to the DNS server(s) is down, this 5*8 second delay in IPv6 lookups can be very painful.

Expected Results:
The hostname (or the "hostname." entry in the case of the RH8.0 hack-a-round; see Bug #77538 comment #c13) should match. Once the IPv4 match occurs, no IPv6 lookups should be performed.

Additional info:
This excessive IPv6 problem has been found on RH8.0 with glibc-2.2.93-5 as well as on this RH7.3 with glibc-2.2.5-42.

This appears to be part of a larger pattern of not processing the /etc/nsswitch.conf file and /etc/hosts correctly. I will attach a detailed tcpdump of various combinations.

There are a number of related bugs:

    Bug 61391 telnet delay connecting to site not in DNS
    Bug 58568 glibc does not exactly follow nsswitch.conf settings
    Bug 66682 unexpected nsswitch behavior
    Bug 71546 ldap for user files always used, regardless of nsswitch.conf
    Bug 58568 nis for host files always used, regardless of nsswitch.conf
    Bug 76543 name to IP resolution issues
    Bug 77538 Konqueror will not resolve domain names entered in /etc/hosts file
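As a rough check of the delays quoted in the report above: if every query has to wait out the resolver timeout before the next one is sent, the total stall is simply the number of lookups times the timeout. The sketch below assumes the 5-second per-query timeout mentioned in the report and strictly sequential queries; the function name is made up for illustration and is not part of glibc.

```python
def worst_case_delay(lookups: int, timeout_s: float = 5.0) -> float:
    """Seconds spent waiting if every lookup times out, one after another."""
    if lookups < 0:
        raise ValueError("lookups must be non-negative")
    return lookups * timeout_s


if __name__ == "__main__":
    # Lookup counts taken from the report:
    #   8  -> "telnet localhost" on RH8.0
    #   12 -> 4 IPv6 + 4 IPv4 + 4 IPv6 on RH7.3
    for count in (8, 12):
        print(count, "lookups:", worst_case_delay(count), "seconds")
```

With the reported counts this gives 40 seconds for the RH8.0 case and 60 seconds for the RH7.3 case, matching the figures in the report.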
Created attachment 90022
tcpdumps of excessive IPv6 and some extra IPv4 DNS lookups

tcpdumps of excessive IPv6 and some extra IPv4 DNS lookups being performed under various conditions from both ssh and telnet on both RH8.0 and RH7.3 systems. Access to their name servers was blocked by disconnecting the external gateway/router to their networks.

The actual hostnames and IP addresses were changed. However, their relationships (network and domain-wise) are real.
Please try rawhide glibc.
Do you mean try glibc-2.3.1-6.i686.rpm on RH8.0? If not, which RPM (and location) would you suggest that I test? Which RPM would you suggest that I test under RH7.3, the same?
No, I mean try glibc-2.3.1-46 (or -48) from:

    ftp.redhat.com/pub/redhat/linux/rawhide/

You can try it on 7.3 too (though I'd try it first on some testbox in case it is a production 7.3 box).
Just to let you know, we have not forgotten your request. This weekend we plan to load the rawhide glibc* set onto a RH8.0 machine and rerun the tests as found in the attachment.
On an RH8.0 system with current/up2date RPMs I installed:

    binutils-2.13.90.0.18-6
    glibc-2.3.1-51
    glibc-common-2.3.1-51
    glibc-devel-2.3.1-51
    glibc-kernheaders-2.4-8.10
    glibc-profile-2.3.1-51
    glibc-utils-2.3.1-51
    memprof-0.5.1-3

from rawhide. After rebooting, many of the previously reported DNS problems were resolved. In particular:

    Bug 61391 telnet delay connecting to site not in DNS
        Fixed!
    Bug 58568 glibc does not exactly follow nsswitch.conf settings
        Fixed!
    Bug 66682 unexpected nsswitch behavior
        Fixed!
    Bug 76543 name to IP resolution issues
        Fixed!
    Bug 77538 Konqueror will not resolve domain names entered in /etc/hosts file
        Fixed!

I no longer need the "dots after duplicate hostname lines" hack-a-round as described in:

    https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=77538#c13

to resolve entries in /etc/hosts. Now the /etc/nsswitch.conf line:

    hosts: files dns

causes /etc/hosts to take priority over DNS.

=-=-=

All the above is good news. However, the excessive IPv6 issue still remains.

When the host is disconnected from the network and there is an attempt to resolve a hostname that is not otherwise found in /etc/hosts, a number of IPv6 lookups are performed ahead of the IPv4 lookups. A site with 2 DNS servers listed in /etc/resolv.conf will incur 16 DNS timeouts (8 IPv6 followed by 8 IPv4) of 5 seconds each, for a whopping 80 seconds of delay before the DNS resolution gives up.

For sites that do not carry IPv6 traffic, it would be very helpful if they could disable the IPv6 DNS lookups. Doing so would cut out half the timeout period when disconnected from the network.

Even when a host is connected to the network and the 1st DNS server responds, an external DNS resolution must perform 2 IPv6 lookups before the successful IPv4 lookup is performed. I saw the typical connection delay for a successful DNS lookup (again, when everything was working) jump from 9.4 msec (IPv4 only) to 139.0 msec because of the two extra IPv6 lookups. Those unneeded IPv6 lookups increased the DNS connection startup delay by a factor of about 14.8!

Your mileage may vary. But even if those 2 extra failed IPv6 lookups return results as fast as the successful IPv4 lookup, you will still be talking about a 3x increase in DNS-induced startup delay.

For sites that are doing IPv4 traffic only, I highly recommend some configuration parameter that allows them to say "do not even bother doing IPv6 lookups".

Maybe two new keywords could be added to the /etc/nsswitch.conf syntax:

    ipv4    perform a DNS IPv4 based lookup
    ipv6    perform a DNS IPv6 based lookup

The 'dns' keyword could still mean 'IPv6 then IPv4'. However, sites that only carry IPv4 traffic could do something like:

    hosts: files ipv4

and see a connection establishment performance increase over the current case. Or sites could perform both IPv4 and IPv6, but in a different order:

    hosts: files ipv4 ipv6

Successful IPv4 DNS lookups would not have any IPv6-induced penalty, and IPv6 lookups would still occur.
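The slowdown figures in the comment above can be reproduced from the two timings it quotes. This is only a sketch of the arithmetic: the function names are invented for illustration, and the equal-latency case assumes each extra IPv6 lookup costs exactly as much as the one useful IPv4 lookup.

```python
def slowdown(with_extra_lookups_ms: float, ipv4_only_ms: float) -> float:
    """Ratio of the observed startup delay to the IPv4-only delay."""
    if ipv4_only_ms <= 0:
        raise ValueError("ipv4_only_ms must be positive")
    return with_extra_lookups_ms / ipv4_only_ms


def equal_latency_slowdown(extra_lookups: int) -> float:
    """Slowdown if every extra lookup costs the same as the useful one."""
    return float(extra_lookups + 1)


if __name__ == "__main__":
    # Measured case from the comment: 9.4 msec IPv4-only vs 139.0 msec total.
    print(round(slowdown(139.0, 9.4), 1))
    # Best case: two failed IPv6 lookups that are as fast as the IPv4 one.
    print(equal_latency_slowdown(2))
```

The measured ratio works out to roughly 14.8, and the best case with two extra lookups is 3x, which is why an IPv4-first ordering would help even on a healthy network.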
Mr. Noll's suggestion is good, but would it not be simpler to have glibc not even do an IPv6 lookup if the local host's IP address is v4?
A clarification on:

    https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=84105#c6

You should NOT install those RPMs on a production system. Rawhide is raw bits. Those RPMs were tested only in relation to various DNS issues, and they have a number of non-DNS related problems. For example, they cause the rpm command to dump core.

They did resolve the DNS issues, with the possible exception of excessive IPv6 lookups.
See bug #86564 for comments related to IPv6 under RH9.0.
The problem is that ftp and telnet are compiled with an IPv6 patch that links the programs against libinet6.a, and that library turns on the resolver's "options inet6" at startup.
I'm interpreting this bug now solely in the glibc sense. There are other problems involved (my guess is bugs in programs using name resolving and bugs in PAM). In these cases specific bug reports should be filed for those programs.

As for glibc, the code in RHL9 should already solve most problems. The getaddrinfo() function does search the services from nsswitch.conf in the specified order. If files is listed first and the host info is contained in /etc/hosts, the search stops. The original RHL8 code (and previous releases) did not have this.

Problems occur if IPv6-enabled programs do not use getaddrinfo(). Some of them look up names like this:

    he = gethostbyname2("somehost", AF_INET6);
    if (he == NULL)
        he = gethostbyname2("somehost", AF_INET);

In this case the DNS server is contacted if /etc/hosts does not contain an IPv6 address for "somehost". An IPv4 address alone is not sufficient. This is the kind of problem I suspect PAM and various programs to have. Those programs have to change; glibc is just fine.

Now, there is one more case where getaddrinfo() does too much: if the system has no IPv6 interfaces, according to POSIX the programmer can use the AI_ADDRCONFIG option to prevent looking up IPv6 addresses. The same is true for IPv4 addresses if no IPv4 interfaces are present. This flag for getaddrinfo() is not implemented in RHL9, but it is now in the official glibc CVS archive. The next release (maybe the next glibc binary) will have the necessary changes.

Therefore I'm closing the bug with UPSTREAM.
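To make the difference between the two lookup styles concrete, here is a toy model in Python. It is not glibc code: HOSTS_FILE, dns_queries and the helper names are all hypothetical, the "DNS" never answers (as with an unreachable name server), and the address is only an example. It contrasts the per-family gethostbyname2 sequence with the service-ordered search described for getaddrinfo(), under a "hosts: files dns" line. The last function shows a real call: Python's socket.getaddrinfo wraps the C library function and exposes the AI_ADDRCONFIG flag.

```python
import socket

# Fake /etc/hosts that knows only an IPv4 address for "somehost".
HOSTS_FILE = {"somehost": {socket.AF_INET: ["10.2.2.2"]}}
dns_queries = []  # every (name, family) that would have been sent to DNS


def files_lookup(name, family):
    """Addresses of the given family from the fake /etc/hosts."""
    return HOSTS_FILE.get(name, {}).get(family, [])


def dns_lookup(name, family):
    """Pretend DNS: record the query, return nothing (server unreachable)."""
    dns_queries.append((name, family))
    return []


def one_family_lookup(name, family):
    """Model of a single gethostbyname2(name, family) under 'files dns'."""
    return files_lookup(name, family) or dns_lookup(name, family)


def legacy_pattern(name):
    """IPv6 first, then IPv4, as in the gethostbyname2 sequence."""
    return (one_family_lookup(name, socket.AF_INET6)
            or one_family_lookup(name, socket.AF_INET))


def service_order_pattern(name):
    """Model of the described getaddrinfo behaviour: ask each service for
    both families and stop at the first service that knows the host."""
    for lookup in (files_lookup, dns_lookup):
        found = lookup(name, socket.AF_INET6) + lookup(name, socket.AF_INET)
        if found:
            return found
    return []


def resolve_configured(host, port):
    """Real resolver call. AI_ADDRCONFIG asks getaddrinfo to skip address
    families for which the local system has no configured address."""
    return socket.getaddrinfo(host, port, type=socket.SOCK_STREAM,
                              flags=socket.AI_ADDRCONFIG)


if __name__ == "__main__":
    legacy_pattern("somehost")
    print("legacy pattern, DNS queries:", len(dns_queries))
    dns_queries.clear()
    service_order_pattern("somehost")
    print("service-order pattern, DNS queries:", len(dns_queries))
```

In the model, the legacy pattern sends one IPv6 query to DNS even though /etc/hosts already holds a usable IPv4 address, while the service-ordered search never leaves the files source. How resolve_configured behaves on a given machine depends on the C library version and the configured interfaces.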