Bug 1044628

Summary: getaddrinfo returns EAI_NONAME instead of EAI_AGAIN when the DNS query times out
Product: Red Hat Enterprise Linux 6 Reporter: Patrik Kis <pkis>
Component: glibc Assignee: Siddhesh Poyarekar <spoyarek>
Status: CLOSED ERRATA QA Contact: Arjun Shankar <ashankar>
Severity: high Docs Contact:
Priority: high    
Version: 6.6 CC: ashankar, codonell, fweimer, jkurik, jtrowbri, ksrot, mcermak, mnewsome, nalayil, pfrankli, spoyarek
Target Milestone: rc Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glibc-2.12-1.140.el6 Doc Type: Bug Fix
Doc Text:
The getaddrinfo function returned the permanent error EAI_NONAME when the DNS server was unreachable or the DNS query timed out. This is now fixed so that the function returns EAI_AGAIN to indicate a temporary failure in name resolution.
Story Points: ---
Clone Of:
: 1098042 (view as bug list) Environment:
Last Closed: 2014-10-14 04:42:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 928849, 994246, 1023566, 1028635, 1042734, 1056252, 1098050    

Description Patrik Kis 2013-12-18 17:37:53 UTC
Description of problem:
getaddrinfo returns EAI_NONAME when it should return EAI_AGAIN, for example when the DNS query times out.
It is also an open question what it should return when no DNS server is configured in resolv.conf and the hostname is not in /etc/hosts either; currently it returns EAI_NONAME in that case as well.

Version-Release number of selected component (if applicable):
glibc-2.12-1.132.el6

How reproducible:
always

Steps to Reproduce:
[root@rhel6 audit]# cat getaddrinfo.c 
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>

int
lookup_host (const char *host)
{
  struct addrinfo hints, *res;
  int errcode;
  char addrstr[100];
  void *ptr;

  memset (&hints, 0, sizeof (hints));
  hints.ai_family = PF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  hints.ai_flags |= AI_CANONNAME;

  errcode = getaddrinfo (host, NULL, &hints, &res);
  if (errcode != 0)
    {
      printf ("Error num: %d\n", errcode);
      printf ("Error def: %s\n", gai_strerror(errcode));
      perror ("getaddrinfo");
      return -1;
    }

  printf ("Host: %s\n", host);
  while (res)
    {
      switch (res->ai_family)
        {
        case AF_INET:
          ptr = &((struct sockaddr_in *) res->ai_addr)->sin_addr;
          break;
        case AF_INET6:
          ptr = &((struct sockaddr_in6 *) res->ai_addr)->sin6_addr;
          break;
        default:
          /* Skip address families that inet_ntop cannot format.  */
          res = res->ai_next;
          continue;
        }
      inet_ntop (res->ai_family, ptr, addrstr, 100);
      printf ("IPv%d address: %s (%s)\n", res->ai_family == PF_INET6 ? 6 : 4,
              addrstr, res->ai_canonname);
      res = res->ai_next;
    }

  return 0;
}

int
main (int argc, char *argv[])
{
  if (argc < 2)
    exit (1);
  return lookup_host (argv[1]);
}

[root@rhel6 audit]# 
[root@rhel6 audit]# 
[root@rhel6 audit]# 
[root@rhel6 audit]# gcc -o getaddrinfo getaddrinfo.c
[root@rhel6 audit]# ./getaddr
getaddr      getaddrinfo  
[root@rhel6 audit]# cat /etc/resolv.conf
nameserver 127.0.0.1
[root@rhel6 audit]# cat /var/named/bbb 
$TTL    86400 ; 24 hours could have been written as 24h or 1d
$ORIGIN bbb.net.
@  1D  IN    SOA rhel6.bbb.net.   hostmaster.bbb.net. (
                  2002022401 ; serial
                  3H ; refresh
                  15 ; retry
                  1w ; expire
                  3h ; minimum
                 )
           IN  NS     rhel6.bbb.net. ; in the domain
           IN  A      192.168.100.60  ;name server definition
rhel6      IN  A      192.168.100.60  ;name server definition
rhel61     IN  A      192.168.100.61 ;name server definition

[root@rhel6 audit]# 
[root@rhel6 audit]# ./getaddrinfo rhel61.bbb.net
Host: rhel61.bbb.net
IPv4 address: 192.168.100.61 (rhel61.bbb.net)
[root@rhel6 audit]# 
[root@rhel6 audit]# iptables -A OUTPUT -m udp -p udp --sport 53 -j REJECT
[root@rhel6 audit]# ./getaddrinfo rhel61.bbb.net
Error num: -2
Error def: Name or service not known
getaddrinfo: Connection timed out
[root@rhel6 audit]# 
[root@rhel6 audit]# rpm -q glibc
glibc-2.12-1.132.el6.x86_64
glibc-2.12-1.132.el6.i686
[root@rhel6 audit]#

Actual results:
EAI_NONAME

Expected results:
EAI_AGAIN

Comment 7 Siddhesh Poyarekar 2014-04-16 09:34:38 UTC
*** Bug 758193 has been marked as a duplicate of this bug. ***

Comment 8 Siddhesh Poyarekar 2014-04-16 14:13:25 UTC
OK, so it is indeed the discrepancy that was pointed out in bug 758193 that was the problem.  For AF_INET and AF_INET6, the returned error code is (as it always has been) EAI_AGAIN.  For AF_UNSPEC though, the error returned is EAI_NONAME, which is wrong.

EAI_NONAME is an authoritative response saying that the result was not found.  This does not match the result of the network being down, because the latter is a transient failure and not necessarily a permanent one.  Likewise, the h_errno value (TRY_AGAIN) is a non-authoritative not-found response, so marking a permanent failure for it would be wrong.

As a result, the correct behaviour is to return EAI_AGAIN when the server is not reachable, as this bug requests.  I have posted a patch that fixes the broken behaviour for AF_UNSPEC:

https://sourceware.org/ml/libc-alpha/2014-04/msg00321.html

I would really appreciate testing feedback on this because the code is quite fragile and hence prone to regressions.

Comment 15 errata-xmlrpc 2014-10-14 04:42:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-1391.html