Description of problem:
getnameinfo fails instead of returning the IP address when looking up a hostname by IP address.

Version-Release number of selected component (if applicable):
glibc-2.5-3

How reproducible:
This happens when
a) the IP is not listed in /etc/hosts
b) the hosts entry in /etc/nsswitch.conf is left at its default, i.e.
   hosts: files dns

Steps to Reproduce:
1. Compile the attached program dnstst.c using: gcc -odnstst dnstst.c
2. Run: ./dnstst 10.1.1.1   # i.e. any dummy IP address not listed in the /etc/hosts file

Actual results:
[root@eowen dnstst]# ./dnstst 10.1.1.1
gethostbyaddr host(10.1.1.1)
getnameinfo: localhost: Success
Temporary failure in name resolution: Illegal seek
error=-3

Expected results:
[jjf@jfaith ~/tmp/dnstst]$ dnstst 10.1.1.1
gethostbyaddr host(10.1.1.1)
getnameinfo namebuf(10.1.1.1)

Additional info:
On both FC2 and FC4 the getnameinfo function returns the IP address when it is unable to find the hostname, as in the expected results above.

I noticed this problem because, when telnetd (telnet-server-0.17-37.rpm) is installed and set up, any attempt to telnet into the machine fails if the client's IP address is not listed in /etc/hosts. The call to getnameinfo in the attached dnstst.c is based on the code used in telnetd. rlogind fails in a similar way to telnetd.

If the nsswitch.conf file is changed to:-
   hosts: files
getnameinfo works as expected. But nsswitch.conf on FC4 uses 'hosts: files dns' and getnameinfo works OK there. So this seems to be a change in the way getnameinfo works, and it causes various programs to fail.
Created attachment 144903 [details] test prog to call getnodeinfo
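The attachment itself is not reproduced in this thread. As a rough illustration of the kind of call being tested, here is a minimal sketch, assuming an IPv4 literal on the command line. The helper name `reverse_lookup` and the choice to return EAI_FAIL for an unparsable address are assumptions of this sketch, not taken from dnstst.c or telnetd.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/*
 * Hypothetical helper, NOT the attached dnstst.c: reverse-resolve a
 * dotted-quad IPv4 string with getnameinfo() and return its result code.
 * On success, host holds whatever getnameinfo() stored there.
 */
int reverse_lookup(const char *ip, int flags, char *host, socklen_t hostlen)
{
    struct sockaddr_in sin;

    assert(ip != NULL && host != NULL && hostlen > 0);

    memset(&sin, 0, sizeof sin);
    sin.sin_family = AF_INET;
    if (inet_pton(AF_INET, ip, &sin.sin_addr) != 1)
        return EAI_FAIL;    /* sketch's own choice for a bad literal */

    /* No service name is requested, only the host part. */
    return getnameinfo((const struct sockaddr *)&sin, sizeof sin,
                       host, hostlen, NULL, 0, flags);
}
```

Calling `reverse_lookup("10.1.1.1", 0, buf, sizeof buf)` from a small main() and printing `gai_strerror()` for a non-zero result should be enough to compare behaviour across glibc versions. The "Temporary failure in name resolution" text in the actual results is the message glibc gives for EAI_AGAIN.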
Sorry, can't reproduce this.

gcc -o /tmp/221583{,.c}; rpm -q glibc; grep 10.1 /etc/hosts; grep ^hosts /etc/nsswitch.conf; /tmp/221583 10.1.1.1
glibc-2.5-3
glibc-2.5-3
hosts: files dns
gethostbyaddr host(10.1.1.1)
getnameinfo namebuf(10.1.1.1)

Perhaps misconfigured DNS (/etc/resolv.conf wrong or broken DNS server) on your side?
Sorry, you are correct: if a nameserver is configured correctly, it returns the IP address as expected. But if /etc/resolv.conf does not exist, the problem occurs. This differs from glibc in FC2 and FC4, where the IP address is returned when resolv.conf does not exist.

Also, I think that if the nameserver is down for some reason, it may be more appropriate to use the IP address than to return the EAI_AGAIN error code, unless the NI_NAMEREQD flag is set in the getnameinfo call.
See http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=204122
I have examined bug 204122, but I'm sorry, I still think this is a bug. The manual page for getnameinfo states:-

NI_NUMERICHOST
    If set, then the numeric form of the hostname is returned. (When not set, this will still happen in case the node's name cannot be looked up.)

If there is no /etc/resolv.conf file, I think the numeric form should be returned. The only case where this would be inappropriate is if the NI_NAMEREQD flag is set, in which case it is appropriate for the EAI_AGAIN or maybe EAI_NONAME error to occur.
nsswitch.conf containing hosts: ... dns together with a missing resolv.conf is an admin error: either you should remove dns from nsswitch.conf, or supply a valid resolv.conf. A missing resolv.conf is the same as a resolv.conf containing no nameserver lines, and that causes libresolv to return TRY_AGAIN, the same as if a DNS server is unreachable. For that, EAI_AGAIN is the correct answer, instead of silently pretending DNS replied but said there is no DNS entry for that IP address.
I think that for the EAI_AGAIN or EAI_NONAME errors to be returned, the NI_NAMEREQD flag must be set in the call to getnameinfo.

Snippets from the getnameinfo manual page describing the 'flags' argument:-

NI_NUMERICHOST
    If set, then the numeric form of the hostname is returned. (When not set, this will still happen in case the node’s name cannot be looked up.)

NI_NAMEREQD
    If set, then an error is returned if the hostname cannot be looked up.

In the 'RETURN VALUE' section the manual page for getnameinfo states:-

EAI_AGAIN
    The name could not be resolved at this time. Try again later.

EAI_NONAME
    The name does not resolve for the supplied parameters. NI_NAMEREQD is set and the host’s name cannot be located, or neither hostname nor service name were requested.

Clearly the manual page is explicit that EAI_NONAME should only be returned when NI_NAMEREQD is set. But the section describing the NI_NUMERICHOST flag indicates that the numeric form of the hostname will be returned when the hostname cannot be looked up, so I think it is also the case that the EAI_AGAIN error should ONLY occur if the NI_NAMEREQD flag is set.

The bugfix for bug 204122, which changed getnameinfo to return EAI_AGAIN instead of EAI_NONAME under certain circumstances, failed to recognise that EAI_NONAME should only be returned if the NI_NAMEREQD flag is set. Given that bug 204122 related to postfix, I have checked the postfix code: in the src/util/myaddrinfo.c file there is a call to getnameinfo that uses the NI_NAMEREQD flag.

I think that the bugfix is correct to return EAI_AGAIN rather than EAI_NONAME under the conditions in question, but ONLY if the NI_NAMEREQD flag is set. In other words, if NI_NAMEREQD is NOT set and the hostname cannot be determined for ANY reason, the numeric form of the address should be returned.
We need to agree to disagree. http://www.opengroup.org/onlinepubs/009695399/functions/getnameinfo.html says quite clearly that EAI_NONAME shouldn't be returned if NI_NAMEREQD is not set and one of nodename and servname is not NULL. But EAI_AGAIN can be returned any time, it tells the caller that if he tries later, it might resolve to a name.
If you are certain that the EAI_AGAIN error should be returned even when NI_NAMEREQD is NOT set, then it will be necessary to fix every program that calls getnameinfo without the NI_NAMEREQD flag and that currently does not handle EAI_AGAIN. Many programs that call getnameinfo without NI_NAMEREQD seem to abort if an error is returned, as the only errors they expect are fairly terminal.

Probably the simplest fix is to add code that converts the IP address to a string and stores it in the host variable when EAI_AGAIN is returned, as this was the previous behaviour of these programs. It will be necessary to patch at least telnetd and rlogind, as I know both of these have the problem.

I do not have access to a fully expanded source code repository, so I cannot easily search all programs for getnameinfo calls without the NI_NAMEREQD flag. Are you able to do this? Should I raise bugs against telnetd and rlogind?
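The caller-side fallback suggested above could look roughly like the following. This is a minimal sketch, not a patch taken from telnetd or rlogind: the wrapper name `host_or_numeric` is hypothetical, and it assumes the caller only wants the host part. It retries with NI_NUMERICHOST when the first call returns EAI_AGAIN and the caller did not ask for NI_NAMEREQD.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* struct sockaddr_in, for callers */
#include <netdb.h>

/*
 * Hypothetical wrapper: behaves like getnameinfo() for the host part,
 * but falls back to the numeric address when the resolver reports a
 * temporary failure and the caller did not require a name.
 */
int host_or_numeric(const struct sockaddr *sa, socklen_t salen,
                    char *host, socklen_t hostlen, int flags)
{
    int rc;

    assert(sa != NULL && host != NULL && hostlen > 0);

    /* Start from an empty buffer so a failed call leaves no stale data. */
    memset(host, 0, hostlen);

    rc = getnameinfo(sa, salen, host, hostlen, NULL, 0, flags);
    if (rc == EAI_AGAIN && !(flags & NI_NAMEREQD)) {
        /* Resolver unavailable right now: store the numeric form instead. */
        rc = getnameinfo(sa, salen, host, hostlen, NULL, 0,
                         flags | NI_NUMERICHOST);
    }
    return rc;
}
```

A caller that passes NI_NAMEREQD still sees EAI_AGAIN unchanged, so code such as the postfix call mentioned earlier would not be affected by a wrapper like this.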