Bug 808147 - getaddrinfo("::1") now fails on an otherwise-ipv4-only system
Status: CLOSED DUPLICATE of bug 721350
Product: Fedora
Classification: Fedora
Component: glibc
Version: 17
Hardware: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Jeff Law
QA Contact: Fedora Extras Quality Assurance
Keywords: Reopened
Duplicates: 843051
Reported: 2012-03-29 13:18 EDT by Dan Winship
Modified: 2013-01-27 20:49 EST (History)
Doc Type: Bug Fix
Last Closed: 2012-12-16 08:53:58 EST
Attachments
getaddrinfo with HINT to return all addresses. This works fine. (1011 bytes, text/plain)
2012-07-03 09:42 EDT, Rok Papez
getaddrinfo without the HINT. Implicit AI_ADDRCONFIG. This works inconsistently. (913 bytes, text/plain)
2012-07-03 09:43 EDT, Rok Papez
ifdown mysteriously reactivates returning of IPv6 loopback (2.01 KB, text/plain)
2012-07-03 09:45 EDT, Rok Papez
ifconfig ethX down has no influence (740 bytes, text/plain)
2012-07-03 09:46 EDT, Rok Papez
TEMPORARY patch to ignore AI_ADDRCONFIG (2.09 KB, patch)
2012-09-22 11:11 EDT, Pavel Šimerda (pavlix)

Description Dan Winship 2012-03-29 13:18:37 EDT
On a system where the only IPv6 addresses are either loopback or link-local, getaddrinfo("::1") now fails. It used to work.

eg:

danw@laptop:~> ssh ::1
ssh: Could not resolve hostname ::1: Address family for hostname not supported
danw@laptop:~> sudo ifconfig wlan2 add 1234::5678
danw@laptop:~> ssh ::1
The authenticity of host '::1 (::1)' can't be established blah blah blah
danw@laptop:~> sudo ifconfig wlan2 del 1234::5678
danw@laptop:~> ssh ::1
ssh: Could not resolve hostname ::1: Address family for hostname not supported
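
The same behaviour can be checked without ssh. The following is a minimal sketch, not taken from the report: try_lookup is a hypothetical helper name, and it assumes only the standard getaddrinfo() interface. It resolves a host with a caller-supplied ai_flags value and prints the outcome.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve `host` with the given ai_flags and report
 * the outcome. Returns the getaddrinfo() result code (0 on success). */
int try_lookup(const char *host, int flags)
{
    struct addrinfo hints, *res = NULL;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = flags;           /* e.g. 0 or AI_ADDRCONFIG */

    int rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0) {
        printf("%s (flags=0x%x): %s\n", host, (unsigned)flags, gai_strerror(rc));
        return rc;
    }
    printf("%s (flags=0x%x): ok, family=%d\n", host, (unsigned)flags,
           res->ai_family);
    freeaddrinfo(res);
    return 0;
}
```

Calling try_lookup("::1", 0) and then try_lookup("::1", AI_ADDRCONFIG) on an affected system should show the first call succeeding and the second failing with the error quoted in the ssh output above. On a host with a global IPv6 address, both calls are expected to succeed.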
Comment 1 Jeff Law 2012-03-30 16:18:32 EDT
This is intentional.  getaddrinfo is supposed to ignore link-local & loopback IPv6 addresses for AI_ADDRCONFIG according to RFC 2553.

Failure to properly ignore the link-local & loopback addresses results in erroneous AAAA DNS lookups from hosts with no IPv6 connectivity.  This in turn results in major headaches while the resolver code waits for a timeout on the AAAA request (as many DNS servers don't handle AAAA lookups).
Comment 2 Dan Winship 2012-03-30 17:39:28 EDT
I didn't mean it should do IPv6 lookups in general. I'm familiar with the lossage modes there.

I'm talking specifically about "::1". Parsing an IP address doesn't require making any DNS queries, and that particular IPv6 address is reachable regardless of whether or not you have global IPv6 connectivity, so returning it doesn't violate the spirit of AI_ADDRCONFIG.

Also, it always used to work in the past, and the new behavior breaks things. Eg, in F17, an apache config with "Listen [::1]:8000" is only valid if you have a global IPv6 address; if you don't, then there's no way to have apache listen on ::1. And that in turn breaks the WebKit test suite, which wants to be able to test certain things involving IPv6-addresses-in-URLs, etc.
Comment 3 Asko Tontti 2012-05-24 04:48:36 EDT
Also link local FE80::/10 addresses don't work anymore for commands in Fedora 17. For example "ssh fe80::1234:1234:1234:1234%eth1".
Comment 4 Jeff Law 2012-06-29 15:48:15 EDT
Dan, sorry this has taken so long to get back to.

It's not a question of parsing the address, but what to do with it once it's parsed.  ie, we parse the address just fine via inet_pton.  But since we're ignoring link-local and loopback addresses for AI_ADDRCONFIG calls and the returned address isn't a V4-mapped address, the result is considered invalid and dropped on the floor.

I don't see a good way to fix this.   Rather than passing AI_ADDRCONFIG for this special case, could you pass AI_ALL?
Comment 5 Rok Papez 2012-07-03 09:40:22 EDT
This is now a big mess. When trying to resolve localhost, AI_ADDRCONFIG works inconsistently for IPv6, but for IPv4 it will always happily include the loopback address.

Reading the new manpages from: http://man7.org/linux/man-pages/man3/getaddrinfo.3.html
the loopback should never be returned if it's the only interface in the system.

Here is the relevant portion:
=============================
If hints.ai_flags includes the AI_ADDRCONFIG flag, then IPv4 addresses are
returned in the list pointed to by res only if the local system has at
least one IPv4 address configured, and IPv6 addresses are only returned if
the local system has at least one IPv6 address configured. The loopback
address is not considered for this case as valid as a configured address.

Now for IPv6, getaddrinfo becomes inconsistent:
1. If I bring down all the eth* interfaces using ifdown, it will start returning the IPv6 loopback address.
2. If I bring down all the eth* interfaces using ifconfig, it will NOT return the IPv6 loopback address.

Please find the source files test.c, test2.c and outputs attached.
Comment 6 Rok Papez 2012-07-03 09:42:10 EDT
Created attachment 595983 [details]
getaddrinfo with HINT to return all addresses. This works fine.
Comment 7 Rok Papez 2012-07-03 09:43:06 EDT
Created attachment 595984 [details]
getaddrinfo without the HINT. Implicit AI_ADDRCONFIG. This works inconsistently.
Comment 8 Rok Papez 2012-07-03 09:45:12 EDT
Created attachment 595985 [details]
ifdown mysteriously reactivates returning of IPv6 loopback

ifdown mysteriously reactivates returning of the IPv6 loopback address even when AI_ADDRCONFIG is implicitly set.
Comment 9 Rok Papez 2012-07-03 09:46:28 EDT
Created attachment 595986 [details]
ifconfig ethX down has no influence
Comment 10 Dan Winship 2012-07-03 12:12:40 EDT
(In reply to comment #4)
> I don't see a good way to fix this.

Well, I think most people interpret AI_ADDRCONFIG to mean "don't return addresses I can't connect to" (regardless of whether or not that's precisely what the docs say) so I feel like the fix is to just return ::1 here.

> Rather than passing AI_ADDRCONFIG for
> this special case, could you pass AI_ALL?

I'm not hitting the problem directly, just indirectly via apache (see comment 2). And apache is hitting it via some portability function in apr which presumably doesn't want AI_ALL behavior in some cases. I'm sure there's some way they could rewrite things to work around this (and this bug could be refiled to apache if you're not going to change the behavior in glibc; I originally filed this bug assuming that it would cause problems in lots of packages, and lots of other people would eventually notice it, but I guess that hasn't happened..)
Comment 11 Rok Papez 2012-07-04 06:52:29 EDT
(In reply to comment #10)
> (In reply to comment #4)
> > I don't see a good way to fix this.
> 
> Well, I think most people interpret AI_ADDRCONFIG to mean "don't return
> addresses I can't connect to" (regardless of whether or not that's precisely
> what the docs say) so I feel like the fix is to just return ::1 here.

Actually I fail to see any meaning for AI_ADDRCONFIG. A globally defined address is a poor substitute for IPv6 connectivity test.

> > Rather than passing AI_ADDRCONFIG for
> > this special case, could you pass AI_ALL?
> 
> I'm not hitting the problem directly, just indirectly via apache (see
> comment 2). And apache is hitting it via some portability function in apr
> which presumably doesn't want AI_ALL behavior in some cases. I'm sure
> there's some way they could rewrite things to work around this (and this bug
> could be refiled to apache if you're not going to change the behavior in
> glibc; I originally filed this bug assuming that it would cause problems in
> lots of packages, and lots of other people would eventually notice it, but I
> guess that hasn't happened..)

AI_ALL by itself is ignored. It needs to be used with AI_V4MAPPED, and only specifies that both IPv4-mapped IPv6 addresses *and* IPv6 addresses are returned. Most likely the _proper_ way to handle lookups would be like this:

memset(&hint, 0x00, sizeof(hint));
hint.ai_family = AF_UNSPEC;
rc = getaddrinfo ("localhost", NULL, &hint, &addx);
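
For reference, the fragment above can be expanded into a complete function. This is only a sketch: list_addresses is a hypothetical name, and setting ai_socktype to SOCK_STREAM is an addition made here so that each address is listed once. The key point is the explicitly zeroed hints structure, which leaves ai_flags at 0 so that AI_ADDRCONFIG is not in effect.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: print every address getaddrinfo() returns for
 * `name` when ai_flags is 0. Returns the number of addresses printed,
 * or -1 if the lookup fails. */
int list_addresses(const char *name)
{
    struct addrinfo hint, *addx = NULL, *ai;
    char text[NI_MAXHOST];
    int count = 0;

    memset(&hint, 0x00, sizeof(hint));  /* ai_flags = 0: no AI_ADDRCONFIG */
    hint.ai_family = AF_UNSPEC;         /* both IPv4 and IPv6 */
    hint.ai_socktype = SOCK_STREAM;     /* assumption: one entry per address */

    int rc = getaddrinfo(name, NULL, &hint, &addx);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo(%s): %s\n", name, gai_strerror(rc));
        return -1;
    }
    for (ai = addx; ai != NULL; ai = ai->ai_next) {
        /* Convert the binary address back to numeric text form. */
        if (getnameinfo(ai->ai_addr, ai->ai_addrlen, text, sizeof(text),
                        NULL, 0, NI_NUMERICHOST) == 0) {
            printf("%s -> %s\n", name, text);
            count++;
        }
    }
    freeaddrinfo(addx);
    return count;
}
```

What list_addresses("localhost") prints depends on the contents of /etc/hosts on the machine in question, so no particular output is claimed here.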
Comment 12 Jeff Law 2012-07-06 02:11:25 EDT
Dan WRT c#10.  It really depends -- the prior behaviour of AI_ADDRCONFIG caused a huge number of problems.  And unfortunately, as I mentioned, there's really no good way to return ::1 given the way this mess of spaghetti code is "structured" (and I use that term very loosely).  To handle the ::1 case we'd effectively have to back out the patch.

Ignoring the link-local addresses isn't something that's been accepted upstream yet (it's been proposed and asked for repeatedly and totally ignored); I haven't really pushed the issue as the upstream maintainers are really just getting on their feet after the change in maintainership.  So I wouldn't totally rule out reversion to prior behavior.

Rok, I'll have to look more closely at your tests; there have been numerous problems with the check_pf code, so it'd be helpful to know exactly what glibc build you're using so I don't go off on a wild goose chase.
Comment 13 Rok Papez 2012-07-06 08:56:00 EDT
I might add that Heimdal maintainers weren't too happy with the changes to getaddrinfo, let me quote them:

"This differs from the definition of getaddrinfo(3) in POSIX.1-2001,
so it seems that the maintainers of glibc have decided to create
a gratuitous incompatibility with libc's that adhere to the relevant
standards."

Ignoring link-local might not be the best way; if IPv6 designers didn't want link-local communication they wouldn't come up with the link-local addressing.
There is more to IPv6 than only Internet, some applications (windows file sharing?) might want to work even if there is no IPv4 and global IPv6 connectivity.

The glibc is:

# rpm -q glibc -i
Name        : glibc
Version     : 2.15
Release     : 37.fc17
Architecture: i686
Install Date: Tue 29 May 2012 06:18:13 PM CEST
Group       : System Environment/Libraries
Size        : 14750287
License     : LGPLv2+ and LGPLv2+ with exceptions and GPLv2+
Signature   : RSA/SHA256, Fri 11 May 2012 08:10:31 PM CEST, Key ID 50e94c991aca3465
Source RPM  : glibc-2.15-37.fc17.src.rpm
Build Date  : Fri 11 May 2012 05:37:27 AM CEST
Build Host  : x86-18.phx2.fedoraproject.org
Relocations : (not relocatable)
Packager    : Fedora Project
Vendor      : Fedora Project
URL         : http://www.gnu.org/software/glibc/
[...]
Comment 14 Jeff Law 2012-07-25 10:00:49 EDT
*** Bug 843051 has been marked as a duplicate of this bug. ***
Comment 15 Jeff Law 2012-07-26 02:01:33 EDT
The problematical code to ignore link-local IPv6 addresses for the purposes of AI_ADDRCONFIG has been removed.  The problems folks were trying to solve with that change will have to be addressed in some other way in the upstream glibc sources.
Comment 16 Pavel Šimerda (pavlix) 2012-07-26 04:12:34 EDT
Any chance this is getting to Fedora 17 (the target of the bug report)?
Comment 17 Jeff Law 2012-07-26 14:43:42 EDT
It should be in the next f17 update; I installed the change last night, but haven't spun builds or issued an update request via koji yet.
Comment 18 Pavel Šimerda (pavlix) 2012-07-27 10:23:04 EDT
See upstream bug reports:

http://sourceware.org/bugzilla/show_bug.cgi?id=12377
http://sourceware.org/bugzilla/show_bug.cgi?id=12398
Comment 19 Tore Anderson 2012-07-31 09:02:12 EDT
(In reply to comment #15)
> The problematical code to ignore link-local IPv6 addresses for the purposes
> of AI_ADDRCONFIG has been removed.  The problems folks were trying to solve
> with that change will have to be addressed in some other way in the upstream
> glibc sources.

With all due respect, I think this is a mistake.

It will not really solve the issue reported in comment #0 - namely that getaddrinfo(::1) w/AI_ADDRCONFIG will fail on an IPv4-only system. To be specific, after reverting the patch from bug #697149, you will still be able to reproduce the issue by disabling IPv6 on the external interfaces, like so:

> [root@laptop ~]# grep localhost /etc/hosts
> 127.0.0.1               localhost.localdomain localhost
> ::1             localhost6.localdomain6 localhost6
> [root@laptop ~]# sysctl -w net/ipv6/conf/wlan0/disable_ipv6=1
> net.ipv6.conf.wlan0.disable_ipv6 = 1
> [root@laptop ~]# ip address list
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>     inet6 ::1/128 scope host
>        valid_lft forever preferred_lft forever
> 4: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
>     link/ether 00:16:ea:c2:ce:26 brd ff:ff:ff:ff:ff:ff
>     inet 87.238.41.161/24 brd 87.238.41.255 scope global wlan0
> [root@laptop ~]# gai ::1
> [         0us] begin gai_and_connect(::1)
> [+     1242us] getaddinfo(::1) done
> [+     1434us] dest = ::1 (AF_INET6)
> [root@laptop ~]# gai -ac ::1
> [         0us] -ac seen, using AI_ADDRCONFIG from now on
> [         0us] begin gai_and_connect(::1)
> [+     2696us] getaddrinfo(::1) failed: Address family for hostname not supported

This has been the way getaddrinfo() and AI_ADDRCONFIG have worked for a very long time.

Reverting the patch will in other words only fix one out of several ways the issue may occur, and it does not address the underlying issue itself. Closing this report with "fixed in rawhide" is therefore inaccurate. In addition, you will re-open the can of worms that the patch was designed to fix in the first place, see these bug reports for reference - judging by the sheer number of subscribers and commenters, this is a very common issue that is experienced as very problematic by the affected users:

https://bugs.launchpad.net/ubuntu/+source/eglibc/+bug/417757
https://bugzilla.redhat.com/show_bug.cgi?id=505105

Furthermore, the exact same issue exists in the "opposite direction", too:

> [root@laptop ~]# grep localhost /etc/hosts
> 127.0.0.1               localhost.localdomain localhost
> ::1             localhost6.localdomain6 localhost6
> [root@laptop ~]# ip address list
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>     inet6 ::1/128 scope host
>        valid_lft forever preferred_lft forever
> 4: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
>     link/ether 00:16:ea:c2:ce:26 brd ff:ff:ff:ff:ff:ff
>     inet6 2a02:c0:1002:101:5445:68e8:6c02:5e5a/64 scope global temporary dynamic
>        valid_lft 604564sec preferred_lft 85564sec
>     inet6 2a02:c0:1002:101:216:eaff:fec2:ce26/64 scope global dynamic
>        valid_lft 2591980sec preferred_lft 604780sec
>     inet6 fe80::216:eaff:fec2:ce26/64 scope link
>        valid_lft forever preferred_lft forever
> [root@laptop ~]# gai 127.0.0.1
> [         0us] begin gai_and_connect(127.0.0.1)
> [+     1239us] getaddinfo(127.0.0.1) done
> [+      230us] dest = 127.0.0.1 (AF_INET)
> [root@laptop ~]# gai -ac 127.0.0.1
> [         0us] -ac seen, using AI_ADDRCONFIG from now on
> [         0us] begin gai_and_connect(127.0.0.1)
> [+     1551us] getaddrinfo(127.0.0.1) failed: Address family for hostname not supported

This will not change by backing out the patch from bug #697149, either.

I have thought a bit on how to handle the problem in a better way, one that will not cancel out the primary function of AI_ADDRCONFIG (namely to suppress pointless and potentially harmful DNS lookups). My suggestion is to make getaddrinfo() ignore AI_ADDRCONFIG in the following circumstances:

1) when looking up literal IP addresses, and
2) when returning answers from /etc/hosts

#1 will fix this bug, while #2 will make it so that a hypothetical service configured to listen on "localhost" listens on both ::1 and 127.0.0.1, regardless of external connectivity (disregarding the apparent bug that in a default Fedora install, "localhost" does not resolve to ::1). Thoughts?
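
Suggestion #1 can be approximated in application code today. The following is a sketch under stated assumptions, not glibc's implementation: is_ip_literal and getaddrinfo_literal_aware are hypothetical names, and the wrapper simply clears AI_ADDRCONFIG (and sets AI_NUMERICHOST) when the node parses as an address literal.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: return 1 if `node` parses as an IPv4 or IPv6
 * literal (an optional "%scope" suffix is allowed for IPv6), else 0. */
int is_ip_literal(const char *node)
{
    struct in_addr a4;
    struct in6_addr a6;
    char buf[INET6_ADDRSTRLEN];
    const char *pct = strchr(node, '%');
    size_t len = pct ? (size_t)(pct - node) : strlen(node);

    if (len == 0 || len >= sizeof(buf))
        return 0;
    memcpy(buf, node, len);
    buf[len] = '\0';

    /* A scope suffix only makes sense on an IPv6 address. */
    if (pct == NULL && inet_pton(AF_INET, buf, &a4) == 1)
        return 1;
    return inet_pton(AF_INET6, buf, &a6) == 1;
}

/* Hypothetical wrapper: behaves like getaddrinfo(), except that
 * AI_ADDRCONFIG is dropped when `node` is an address literal, since no
 * DNS query is needed to parse it. */
int getaddrinfo_literal_aware(const char *node, const char *service,
                              const struct addrinfo *hints,
                              struct addrinfo **res)
{
    struct addrinfo local;

    if (node != NULL && hints != NULL &&
        (hints->ai_flags & AI_ADDRCONFIG) && is_ip_literal(node)) {
        local = *hints;
        local.ai_flags &= ~AI_ADDRCONFIG;
        local.ai_flags |= AI_NUMERICHOST;   /* never consult a name service */
        hints = &local;
    }
    return getaddrinfo(node, service, hints, res);
}
```

Note that a caller passing NULL hints is left untouched by this wrapper, so any library default flags still apply in that case.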

Tore
Comment 20 Pavel Šimerda (pavlix) 2012-07-31 09:27:53 EDT
> I have thought a bit on how to handle the problem in a better way, one that
> will not cancel out the primary function of AI_ADDRCONFIG (namely to
> suppress pointless and potentially harmful DNS lookups). My suggestion is to
> make getaddrinfo() ignore AI_ADDRCONFIG in the following circumstances:
> 
> 1) when looking up literal IP addresses, and
> 2) when returning answers from /etc/hosts

3) when returning answers from any future nss plugin that may return link-local
addresses (and node-local, if it's that case).

So this is an unlimited number of special cases. Did you try thinking about it the other way round, special-casing just DNS to begin with? As far as I
understand, this is meant as an 'ugly hack' to work around particular DNS
misconfigurations and maybe an optimization technique for DNS.

This would work as a quick fix. The proper way would be to fix:

http://sourceware.org/bugzilla/show_bug.cgi?id=14413

Then every plugin would be able to provide full getaddrinfo() semantics including
AI_ADDRCONFIG and this could be done by nss-dns.
Comment 21 Rok Papez 2012-07-31 11:13:50 EDT
> I have thought a bit on how to handle the problem in a better way, one that
> will not cancel out the primary function of AI_ADDRCONFIG (namely to
> suppress pointless and potentially harmful DNS lookups).

- If this is AI_ADDRCONFIG's primary role, why is it not documented as such in the man pages?
- And why is it enabled by default in glibc, contrary to standard libc?

Quoting from man page:
"According to POSIX.1-2001, specifying hints as NULL should cause ai_flags to be assumed as 0. The GNU C library instead assumes a value of (AI_V4MAPPED | AI_ADDRCONFIG) for this case, since this value is considered an improvement on the specification."

POSIX.1-2001 compliant boxes will operate via link-local and loopback and GNU libc boxes will not?!

Also, RFC 2553 (http://www.ietf.org/rfc/rfc2553.txt) specifies:

"The AI_ADDRCONFIG flag specifies that a query for AAAA records
should occur only if the node has at least one IPv6 source
address configured and a query for A records should occur only
if the node has at least one IPv4 source address configured."

::1 *is* a configured source address.

Maybe the main problem is that different parties interpret getaddrinfo differently?

> My suggestion is to
> make getaddrinfo() ignore AI_ADDRCONFIG in the following circumstances:
> 
> 1) when looking up literal IP addresses, and
> 2) when returning answers from /etc/hosts
> 
> #1 will fix this bug, while #2 will make it so that a hypothetical service
> configured to listen on "localhost" listens on both ::1 and 127.0.0.1,
> regardless of external connectivity (disregarding the apparant bug that in a
> default Fedora install, "localhost" does not resolve to ::1). Thoughts?

I don't think adding a hack to bypass a problem an ugly hack created is a good solution.

======================================
The original problem is that some resolvers choke on AAAA
Comment 22 Rok Papez 2012-07-31 11:19:49 EDT
<previous post was truncated>
======================================
The original problem is that some resolvers choke on AAAA query and wait for a long time before another A query is sent and responded.

I would recommend the following:
1. back out the glibc extension and comply with POSIX.1-2001
2. handle resolving like smart resolvers from other vendors ;)
   Fire off two queries, one A and one AAAA. If A arrives and AAAA takes much longer, just return the results from query A. Add a setting to gai.conf to disable this behaviour. It's a hack after all :-/.
3. whatever you do, *document* it in the man pages.
Comment 23 Pavel Šimerda (pavlix) 2012-07-31 13:05:17 EDT
> Quoting from man page:
> "According to POSIX.1-2001, specifying hints as NULL should cause ai_flags
> to be assumed as 0. The GNU C library instead assumes a value of
> (AI_V4MAPPED | AI_ADDRCONFIG) for this case, since this value is considered
> an improvement on the specification."

AI_V4MAPPED is void when used with AF_UNSPEC according to getaddrinfo(3). There
is no point in making it default. It doesn't do anything.

http://sourceware.org/bugzilla/show_bug.cgi?id=14415

AI_ADDRCONFIG does nothing when you only have addresses on *lo*. It only does something with IPv6 when you actually *have* IPv4. I consider this a bug.

> POSIX.1-2001 compliant boxes will operate via link-local and loopback and
> GNU libc boxes will not?!

We already agreed that this should *only* be applied to DNS.

http://sourceware.org/bugzilla/show_bug.cgi?id=12377#c18

> Also RFC2553 specifies http://www.ietf.org/rfc/rfc2553.txt:
> 
> "The AI_ADDRCONFIG flag specifies that a query for AAAA records
> should occur only if the node has at least one IPv6 source
> address configured and a query for A records should occur only
> if the node has at least one IPv4 source address configured."
> 
> ::1 *is* a configured source address.
> 
> Maybe the main problem is that different parties interpret getaddrinfo
> differently?

RFC 2553 is obsolete.

See http://tools.ietf.org/html/rfc3493#section-6.1

There is an exception for *lo*:

   If the AI_ADDRCONFIG flag is specified, IPv4 addresses shall be
   returned only if an IPv4 address is configured on the local system,
   and IPv6 addresses shall be returned only if an IPv6 address is
   configured on the local system.  The loopback address is not
   considered for this case as valid as a configured address.

> I don't think adding a hack to bypass a problem an ugly hack created is a
> good solution.

It's pretty clear that the solution is performing AI_ADDRCONFIG processing
only for DNS-originated addresses (ideally in the DNS plugin) as was specified by the obsolete RFC 2553. The new RFC 3493 is *wrong* on this matter.
Comment 24 Fedora Update System 2012-08-03 09:58:57 EDT
glibc-2.15-54.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/glibc-2.15-54.fc17
Comment 25 Fedora Update System 2012-08-15 18:53:48 EDT
glibc-2.15-54.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 26 Tore Anderson 2012-08-16 07:43:47 EDT
(In reply to comment #25)
> glibc-2.15-54.fc17 has been pushed to the Fedora 17 stable repository.  If
> problems still persist, please make note of it in this bug report.

Yes, this issue still persists. Please re-open this bug report.

This console log proves it. It is from my F17 workstation, connected using an IPv4-only mobile broadband connection. I have not done any special tweaks to provoke the issue to appear, everything here is done in a «plug and play» manner as any ordinary user would.

> $ rpm -q glibc
> glibc-2.15-54.fc17.x86_64
> glibc-2.15-54.fc17.i686
> $ ip address list
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>     inet 127.0.0.1/8 scope host lo
>     inet6 ::1/128 scope host 
>        valid_lft forever preferred_lft forever
> 2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
>     link/ether 00:1d:60:48:f5:9e brd ff:ff:ff:ff:ff:ff
> 4: usb0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
>     link/ether 02:10:35:c6:da:36 brd ff:ff:ff:ff:ff:ff
> 8: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 3
>     link/ppp 
>     inet 188.148.184.20 peer 10.0.0.1/32 brd 188.148.184.20 scope global ppp0
> $ ./gai ::1
> [         0us] begin gai_and_connect(::1)
> [+      485us] getaddinfo(::1) done
> [+       83us] dest = ::1 (AF_INET6)
> [+        9us] about to connect()
> [+       75us] connect() suceeds
> 
> $ ./gai -ac ::1
> [         0us] -ac seen, using AI_ADDRCONFIG from now on
> 
> [         0us] begin gai_and_connect(::1)
> [+      550us] getaddrinfo(::1) failed: Address family for hostname not supported
> $ ssh ::1
> ssh: Could not resolve hostname ::1: Address family for hostname not supported

As I predicted, backing out the patch from bug #697149 did *not* solve the underlying issue here, which is exactly the same as the one reported in bug #721350.

I see two ways to handle this issue in a uniform way:

Option 1) Do the same with this bug as was done with #721350, in other words, consider it not a bug at all, but the intended behaviour.

Option 2) Re-open both this and bug #721350 and implement a solution that fixes both of them at the same time. One way would be to make getaddrinfo() disregard AI_ADDRCONFIG completely when working on IPv4/IPv6 literals, another could be to only apply AI_ADDRCONFIG filtering when doing external DNS lookups (RFC 2553-style AI_ADDRCONFIG). Both Pavel and I favour this latter approach - cf. http://sourceware.org/bugzilla/show_bug.cgi?id=12377.

Regardless of which approach is chosen, the patch from #697149 should be re-applied. As proven above, it is not the cause of this issue, and it provides a valuable improvement for users who happen to be behind defective IPv4-only DNS resolvers/forwarders.

Tore
Comment 27 Pavel Šimerda (pavlix) 2012-08-16 10:18:06 EDT
> As I predicted, backing out the patch from bug #697149 did *not* solve the
> underlying issue here, which is exactly the same as the one reported in bug
> #721350.

Correct. The patch should only affect link-local.

> 
> I see two ways to handle this issue in a uniform way:
> 
> Option 1) Do the same with this bug as was done with #721350,

That means mark it UPSTREAM and ignore it in Fedora. I would prefer tracking
potentially critical bugs like this also in rh bugzilla.

> in other
> words, consider it not a bug at all, but the intended behaviour.

^^^

> Option 2) Re-open both this and bug #721350 and implement a solution that
> fixes both of them at the same time.

Yes. They are very related or could be even viewed as duplicate.

> One way would be to make getaddrinfo()
> disregard AI_ADDRCONFIG completely when working on IPv4/IPv6 literals,
> another could be to only apply AI_ADDRCONFIG filtering when doing external
> DNS lookups (RFC 2553-style AI_ADDRCONFIG). Both Pavel and me favours this
> latter approach - cf. http://sourceware.org/bugzilla/show_bug.cgi?id=12377.

Acked.

> Regardless of which approach is chosen, the patch from #697149 should be
> re-applied.

In my opinion, it should only be re-applied after fixing the bug, as the patch extends the problem from node-local addresses to link-local addresses and we should not assume that nobody uses them.

> As proven above, it is not the cause of this issue, and it
> provides a valuable improvement for users who happens to be behind defective
> IPv4-only DNS resolvers/forwarders.

If used correctly, i.e. as you specified above. Currently the patch also causes problems to any users that need to work with LL addresses without IPv6 connectivity (not only those connected to defective networks).

The problem is that I filed bug 843051, which was marked as a duplicate but also covers LL addresses, so I am treating this bug report as also being about LL addresses.

I'm definitely for reopening at least one bug report (even if it covers the whole issue for all types of addresses, from node-local IPv4 through node-local IPv6 and link-local IPv6 to pseudo-global addresses, e.g. a global address assigned to the lo interface) so that we have at least a way to track that.

Problems with other packages may (and already do) depend on this bug.
Comment 28 Gaofeng 2012-09-08 03:11:22 EDT
(In reply to comment #5)
> This now is a big mess. When trying to resolve localhost, AI_ADDRCONFIG
> works inconsistently for IPv6 but for IPv4 it will always happily include
> loopback address.
> 
> Reading the new manpages from:
> http://man7.org/linux/man-pages/man3/getaddrinfo.3.html
> the loopback should never be returned if it's the only interface in the
> system.
> 
> Here is the relevant portion:
> =============================
> If hints.ai_flags includes the AI_ADDRCONFIG flag, then IPv4 addresses are
> returned in the list pointed to by res only if the local system has at
> least one IPv4 address configured, and IPv6 addresses are only returned if
> the local system has at least one IPv6 address configured. The loopback
> address is not considered for this case as valid as a configured address.
> 
> Now for IPv6 getaddrinfo becomes inconsistent:
> 1. If I bring down all the eth* interfaces using ifdown it will start
> returning IPv6 loopback address.

I think this is incorrect. With the AI_ADDRCONFIG flag, we should never return the IPv6
loopback address, as the man page says.
We need to add a check in getaddrinfo: with AI_ADDRCONFIG set, if there are no IP
addresses (IPv4 or IPv6) other than the IPv4 and IPv6 loopback addresses, we should
return an error.

> 2. If I bring down all the eth* interfaces using ifconfig it will NOT return
> IPv6 loopback address.
> 

ifconfig doesn't delete the eth* interfaces' IPv4 addresses, so __check_pf will set
seen_ipv4 to true and seen_ipv6 to false, and getaddrinfo will then change
hints->ai_family to PF_INET. But we are using getaddrinfo to get the address info
of "::1", which is an IPv6 address, so the error (-EAI_ADDRFAMILY) will be returned in
check_pf.c:519.

If you delete the IPv4 addresses of the eth* interfaces, the IPv6 loopback
address will be returned.
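
The mechanism described in this comment can be illustrated roughly as follows. This is a sketch only: it uses getifaddrs() for brevity, which is an assumption made here and not necessarily how glibc's __check_pf gathers addresses, and the names pf_seen, counts_as_configured, scan_families and narrowed_family are hypothetical. Addresses on interfaces that are merely down are still listed, which matches the ifconfig observation above.

```c
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

struct pf_seen {
    bool ipv4;   /* a non-loopback IPv4 address exists */
    bool ipv6;   /* a non-loopback IPv6 address exists */
};

/* Loopback addresses do not count as "configured"; everything else does. */
bool counts_as_configured(const struct sockaddr *sa)
{
    if (sa == NULL)
        return false;
    if (sa->sa_family == AF_INET) {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)sa;
        return sin->sin_addr.s_addr != htonl(INADDR_LOOPBACK);
    }
    if (sa->sa_family == AF_INET6) {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sa;
        return !IN6_IS_ADDR_LOOPBACK(&sin6->sin6_addr);
    }
    return false;
}

/* Walk all local addresses and record which families were seen.
 * Returns 0 on success, -1 if the address list cannot be read. */
int scan_families(struct pf_seen *seen)
{
    struct ifaddrs *list, *ifa;

    seen->ipv4 = seen->ipv6 = false;
    if (getifaddrs(&list) != 0)
        return -1;
    for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        if (!counts_as_configured(ifa->ifa_addr))
            continue;
        if (ifa->ifa_addr->sa_family == AF_INET)
            seen->ipv4 = true;
        else
            seen->ipv6 = true;
    }
    freeifaddrs(list);
    return 0;
}

/* Family that an AF_UNSPEC lookup is narrowed to under AI_ADDRCONFIG.
 * If only one family was seen, the other is excluded; if neither or
 * both were seen, nothing is narrowed. */
int narrowed_family(const struct pf_seen *seen)
{
    if (seen->ipv4 && !seen->ipv6)
        return AF_INET;
    if (seen->ipv6 && !seen->ipv4)
        return AF_INET6;
    return AF_UNSPEC;
}
```

Under this model, a host with an IPv4 address on eth0 and only ::1 for IPv6 is narrowed to AF_INET, so the literal "::1" is rejected. Once the IPv4 address is removed, nothing is narrowed and "::1" is returned again.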
Comment 29 Pavel Šimerda (pavlix) 2012-09-22 11:11:26 EDT
Created attachment 615840 [details]
TEMPORARY patch to ignore AI_ADDRCONFIG

As I don't know of anybody working on the proper solution of handling AI_ADDRCONFIG only for DNS, I've created a temporary patch to just remove
it. It greatly improves my daily workflow, so I'm also posting a link
to a scratch build I'm using:

http://koji.fedoraproject.org/koji/taskinfo?taskID=4513546

Hope it helps.
Comment 30 Gaofeng 2012-11-05 01:21:16 EST
Last week I got in contact with Jack McCann, one of the authors of RFC 3493.
He said:
"As for updating the RFC, that could be done, but as stated in the introduction,
the RFC is informational only, the actual API standard jointly owned by the
Open Group, IEEE, and ISO.  It looks like that standard *does not* include
the sentence "The loopback address is not considered for this case as
valid as a configured address."

int getaddrinfo(const char *node, const char *service,
                       const struct addrinfo *hints,
                       struct addrinfo **res);

So when the node actually is the loopback address, we should regard the system's loopback address as a valid address.

Maybe we should add some extra handling for the case where the node is the loopback address.
Comment 31 Pavel Šimerda (pavlix) 2012-11-05 07:40:38 EST
Thanks, great news someone is working on this.

The problem is, that the whole AI_ADDRCONFIG is about computers connected to non-IPv6 networks. This is why it was created. Because of broken DNS servers
(freely quoting Tore Anderson).

That means treating 127.0.0.1 and ::1 as IP addresses in the AI_ADDRCONFIG sense
is roughly the same as ignoring AI_ADDRCONFIG entirely (which is my own preference
until the problem is resolved).

The most helpful way to treat AI_ADDRCONFIG is to only use it for DNS and other global-only name resolution services. With glibc, that unfortunately requires
more code, but it wouldn't be wasted as it's needed for other useful stuff
too (e.g. link-local networking with multicast DNS). Then it should actually
disregard *all* IP addresses that don't constitute global connectivity (localhost
and link-local IPv6).

I can try to write down an RFC errata or review someone else's one.
Comment 32 Tore Anderson 2012-11-05 08:13:12 EST
(In reply to comment #31)
> Thanks, great news someone is working on this.
> 
> The problem is that AI_ADDRCONFIG as a whole is about computers connected to
> non-IPv6 networks. That is why it was created: because of broken DNS servers
> (freely quoting Tore Anderson).

Yes, although I suspect that reducing bandwidth consumption and query load on the resolvers may have also been design goals. Whether or not that's relevant in 2012 is another question...

> The most helpful way to treat AI_ADDRCONFIG is to only use it for DNS and
> other global-only name resolution services.

100% agreed. AI_ADDRCONFIG makes plenty of sense for DNS lookups, but very little sense for e.g. /etc/hosts lookups. So ideally, the interpretation of AI_ADDRCONFIG should be left up to the NSS backend plugins. DNS should ignore loopbacks and link-locals when determining whether to apply AI_ADDRCONFIG, /etc/hosts should just ignore AI_ADDRCONFIG entirely, mdns should probably ignore only loopback addresses, and so on.
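
Expressed as a small table-like sketch (the enum and function names are hypothetical and do not correspond to any existing NSS interface), the per-backend policy described above could look like this:

```c
#include <stdbool.h>

/* Hypothetical lookup backends and local address scopes. */
enum lookup_source { SRC_DNS, SRC_FILES, SRC_MDNS };
enum addr_scope { SCOPE_LOOPBACK, SCOPE_LINK_LOCAL, SCOPE_GLOBAL };

/* Does this backend honour AI_ADDRCONFIG at all?
   /etc/hosts (SRC_FILES) ignores the flag entirely. */
static bool source_applies_addrconfig(enum lookup_source src)
{
    return src != SRC_FILES;
}

/* Does a local address of this scope count as "configured" when the
   backend decides whether an address family is usable? */
static bool scope_counts_as_configured(enum lookup_source src,
                                       enum addr_scope scope)
{
    switch (src) {
    case SRC_DNS:   return scope == SCOPE_GLOBAL;    /* ignore lo and link-local */
    case SRC_MDNS:  return scope != SCOPE_LOOPBACK;  /* ignore lo only */
    case SRC_FILES: return true;                     /* flag is ignored anyway */
    }
    return true;
}
```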

> I can try to write down an RFC errata or review someone else's one.

I have another draft (on something completely unrelated) in the works, so I don't think I have time to write this one myself as well, but I would be happy to review yours.

Tore
Comment 33 Tore Anderson 2012-11-05 08:21:22 EST
(In reply to comment #26)
> (In reply to comment #25)
> > glibc-2.15-54.fc17 has been pushed to the Fedora 17 stable repository.  If
> > problems still persist, please make note of it in this bug report.
> 
> Yes, this issue still persists. Please re-open this bug report.

This is still the case, by the way. The reported issue still persists. That this bug report has the status "CLOSED ERRATA" is quite simply wrong, it should be reopened to ensure it won't be forgotten about.

(See comment #26 for details.)

Tore
Comment 34 Pavel Šimerda (pavlix) 2012-11-05 08:45:25 EST
> and mdns should probably ignore only loopback addresses, and so on.

Possibly. Good point.

> I have another draft (on something completely unrelated) in the works, so I
> don't think I have time to write this one myself as well, but I would be
> happy to review yours.

Thanks. I have another draft out there too, together with Fernando Gont, but
we didn't get many answers, I'm afraid. It's about broken RDNSS/DNSSL standards:

http://tools.ietf.org/html/draft-gont-6man-slaac-dns-config-issues-00

> This is still the case, by the way. The reported issue still persists. That
> this bug report has the status "CLOSED ERRATA" is quite simply wrong, it
> should be reopened to ensure it won't be forgotten about.

+1
Comment 35 Gaofeng 2012-11-13 21:16:10 EST
(In reply to comment #31)
> Thanks, great news someone is working on this.
> 
> The problem is that AI_ADDRCONFIG as a whole is about computers connected to
> non-IPv6 networks. That is why it was created: because of broken DNS servers
> (freely quoting Tore Anderson).
> 
> That means treating 127.0.0.1 and ::1 as IP addresses in the AI_ADDRCONFIG
> sense is roughly the same as ignoring AI_ADDRCONFIG entirely (which is my own
> preference until the problem is resolved).
> 
> The most helpful way to treat AI_ADDRCONFIG is to only use it for DNS and
> other global-only name resolution services. With glibc, that unfortunately
> requires more code, but it wouldn't be wasted, as it's needed for other useful
> things too (e.g. link-local networking with multicast DNS). Then it should
> actually disregard *all* IP addresses that don't constitute global
> connectivity (localhost and link-local IPv6).
> 

I have one question: if the system has only link-local and loopback addresses,
what is the return value of getaddrinfo("www.kame.net", "http", NULL, &res)?

-9 (Address family for hostname not supported)
or
-2 (Name or service not known)?

I think it should be -9, because getaddrinfo failed for lack of a proper IP address on the system; -2 means getaddrinfo did the DNS lookup, but the lookup failed.

Am I right?
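
To see which value a given system actually returns, a small probe along these lines may help. It is only a sketch; `probe_lookup` is a made-up name, and the flags are passed explicitly through a hints structure instead of relying on what a NULL hints pointer implies:

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: run one lookup with the given ai_flags, print the
   numeric result together with its message, and return the result. */
static int probe_lookup(const char *node, const char *service, int flags)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = flags;

    rc = getaddrinfo(node, service, &hints, &res);
    printf("getaddrinfo(\"%s\", \"%s\", flags=0x%x) = %d (%s)\n",
           node, service, (unsigned)flags, rc,
           rc == 0 ? "success" : gai_strerror(rc));
    if (rc == 0)
        freeaddrinfo(res);
    return rc;
}
```

Calling `probe_lookup("www.kame.net", "http", AI_ADDRCONFIG)` on a host with only loopback and link-local addresses would print whichever code the installed libc chooses.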
Comment 36 Rok Papez 2012-11-14 03:48:10 EST
(In reply to comment #35)

> I have one question: if the system has only link-local and loopback addresses,
> what is the return value of getaddrinfo("www.kame.net", "http", NULL, &res)?
> 
> -9 (Address family for hostname not supported)
> or
> -2 (Name or service not known)?
> 
> I think it should be -9, because getaddrinfo failed for lack of a proper IP
> address on the system; -2 means getaddrinfo did the DNS lookup, but the
> lookup failed.
> 
> Am I right?

I disagree. EAI_NONAME (-2) seems to me to be the better choice.

Systems with only link-local and loopback addresses can still communicate over IPv6. That's the whole point of link-local. Think of appliances and smart-sensors. Why configure a network at all? Just plug them into an unrouted Ethernet and they can communicate over IPv6 link-local addresses with the display unit.

Thinking "IPv6 == Internet" is just *wrong*. Internet is (just) one of the
IPv6 networks.

It also seems that EAI_ADDRFAMILY (-9) is a GNU extension, which is also bad :-(. Extending the return values of a standardised API and then not even documenting the extension as such is something I wouldn't do.

# ifdef __USE_GNU
[...]
#  define EAI_ADDRFAMILY  -9    /* Address family for NAME not supported.  */
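
Since the header excerpt above only defines EAI_ADDRFAMILY under __USE_GNU, code that wants to recognise it without depending on the extension can guard the case with #ifdef. A minimal sketch (the function name `gai_error_name` is made up):

```c
#include <netdb.h>

/* Hypothetical helper: map a getaddrinfo() result to a symbolic name.
   EAI_ADDRFAMILY is only handled when the headers expose it. */
static const char *gai_error_name(int rc)
{
    switch (rc) {
    case 0:              return "success";
    case EAI_NONAME:     return "EAI_NONAME";
    case EAI_FAMILY:     return "EAI_FAMILY";
    case EAI_AGAIN:      return "EAI_AGAIN";
    case EAI_FAIL:       return "EAI_FAIL";
#ifdef EAI_ADDRFAMILY
    case EAI_ADDRFAMILY: return "EAI_ADDRFAMILY (extension)";
#endif
    default:             return "other";
    }
}
```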
Comment 37 Pavel Šimerda (pavlix) 2012-11-14 11:15:03 EST
(In reply to comment #36)
> (In reply to comment #35)
> > I have one question: if the system has only link-local and loopback addresses,
> > what is the return value of getaddrinfo("www.kame.net", "http", NULL, &res)?
> > 
> > -9 (Address family for hostname not supported)
> > or
> > -2 (Name or service not known)?
> > 
> > I think it should be -9, because getaddrinfo failed for lack of a proper IP
> > address on the system; -2 means getaddrinfo did the DNS lookup, but the
> > lookup failed.
> > 
> > Am I right?
> 
> I disagree. EAI_NONAME (-2) seems to me to be the better choice.

I must admit that I don't really care about the error returned.

But from a logical point of view, not supporting an address family in getaddrinfo() is just stupid, unless there is a global compile-time
or runtime switch to disable *any* IPv6 processing. But that wouldn't be very
useful.

> Systems with only link-local and loopback addresses can still communicate
> over IPv6.

The same also applies to IPv4 with its loopback address and optional
link-local addresses. So the whole AI_ADDRCONFIG thing in its current
implementations is breaking specific subsets of both IPv4 and IPv6
communication for the sake of saving IPv4 from IPv6.

> That's the whole point of link-local. Think of appliances and
> smart-sensors. Why configure a network at all? Just plug them into an
> unrouted Ethernet and they can communicate over IPv6 link-local addresses
> with the display unit.

Exactly. This is how I found out about all those problems with getaddrinfo()
including the fact that it is not supported by GLIBC's nsswitch.

> Thinking "IPv6 == Internet" is just *wrong*. Internet is (just) one of the
> IPv6 networks.
> 
> It also seems that EAI_ADDRFAMILY (-9) is a GNU extension, which is also bad
> :-(. Extending the return values of a standardised API and then not even
> documenting the extension as such is something I wouldn't do.

Especially when it's useless.

The CLOSED/ERRATA status is ridiculous.
Comment 38 Gaofeng 2012-11-15 01:57:20 EST
(In reply to comment #36)
> (In reply to comment #35)
> 
> > I have one question: if the system has only link-local and loopback addresses,
> > what is the return value of getaddrinfo("www.kame.net", "http", NULL, &res)?
> > 
> > -9 (Address family for hostname not supported)
> > or
> > -2 (Name or service not known)?
> > 
> > I think it should be -9, because getaddrinfo failed for lack of a proper IP
> > address on the system; -2 means getaddrinfo did the DNS lookup, but the
> > lookup failed.
> > 
> > Am I right?
> 
> I disagree. EAI_NONAME (-2) seems to me to be the better choice.
> 
> Systems with only link-local and loopback addresses can still communicate
> over IPv6. That's the whole point of link-local. Think of appliances and
> smart-sensors. Why configure a network at all? Just plug them into an
> unrouted Ethernet and they can communicate over IPv6 link-local addresses
> with the display unit.
> 

The reason I expect the return value to be EAI_ADDRFAMILY (-9) is that
I think EAI_ADDRFAMILY (-9) accurately tells the user why the DNS lookup
failed: not because the domain name is incorrect, but because we can't do
the DNS lookup.

> Thinking "IPv6 == Internet" is just *wrong*. Internet is (just) one of the
> IPv6 networks.

Yes, you are right. Actually, I want to write a test case for AI_ADDRCONFIG, so I need to know whether AI_ADDRCONFIG takes effect. I cannot tell that when the return value is -2. Do you have a good idea?

Thanks!

Another problem I think we should fix is the one I mentioned in comment 28:
glibc seems to regard an already disabled address as a valid IP address.
Such an address can't be used to communicate with the outside either.
Comment 39 Pavel Šimerda (pavlix) 2012-11-18 11:52:39 EST
> The reason I expect the return value to be EAI_ADDRFAMILY (-9) is that
> I think EAI_ADDRFAMILY (-9) accurately tells the user why the DNS lookup
> failed: not because the domain name is incorrect, but because we can't do
> the DNS lookup.

This is wrong. We *can* do the DNS lookup for the other (supported) protocol and we *don't* know whether there's no answer because the domain name doesn't exist at all or it doesn't exist only in the supported protocol.

Therefore the information you want to get from the error code is impossible
to get if AI_ADDRCONFIG works properly, because we simply don't ask.
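
For a test case, one alternative to reading the error code would be to compare which address families come back with and without the flag. The following is only a sketch under that assumption; the helper names and the FAM_* bits are invented for illustration:

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>

#define FAM_V4 0x1u
#define FAM_V6 0x2u

/* Hypothetical helper: resolve `node` with the given ai_flags and record
   in *mask which address families were returned. Returns the
   getaddrinfo() result; *mask is 0 on failure. */
static int lookup_families(const char *node, int flags, unsigned *mask)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    *mask = 0;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = flags;

    rc = getaddrinfo(node, NULL, &hints, &res);
    if (rc != 0)
        return rc;
    for (const struct addrinfo *ai = res; ai != NULL; ai = ai->ai_next) {
        if (ai->ai_family == AF_INET)
            *mask |= FAM_V4;
        else if (ai->ai_family == AF_INET6)
            *mask |= FAM_V6;
    }
    freeaddrinfo(res);
    return 0;
}

/* Families that appear without AI_ADDRCONFIG but vanish with it. */
static unsigned families_suppressed(const char *node)
{
    unsigned plain = 0, filtered = 0;

    if (lookup_families(node, 0, &plain) != 0)
        return 0;
    (void)lookup_families(node, AI_ADDRCONFIG, &filtered);
    return plain & ~filtered;
}
```

A non-zero result from `families_suppressed()` for a dual-stack name would indicate that AI_ADDRCONFIG took effect on that system, regardless of which error code a failing lookup reports.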
Comment 40 Pavel Šimerda (pavlix) 2012-11-19 09:22:42 EST
As this bug hasn't been fixed (neither in Fedora master/branches nor upstream), I feel obliged to remove CLOSED ERRATA, since the maintainer hasn't done that himself.
Comment 41 Pavel Šimerda (pavlix) 2012-12-16 08:53:58 EST
Merging the bug reports to simplify the workflow.

*** This bug has been marked as a duplicate of bug 721350 ***
