Bug 844921

Summary:	AI_ADDRCONFIG does not suppress IN A lookups from IPv6-only hosts
Product:	[Fedora] Fedora	Reporter:	Tore Anderson <tore>
Component:	glibc	Assignee:	Jeff Law <law>
Status:	CLOSED UPSTREAM	QA Contact:	Fedora Extras Quality Assurance <extras-qa>
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	17	CC:	fweimer, jakub, law, orion, pfrankli, psimerda, schwab
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2012-08-21 22:01:40 UTC	Type:	Bug
Bug Blocks:	2182745

Description Tore Anderson 2012-08-01 09:20:16 UTC

Description of problem:

When looking up a host name using getaddrinfo() with AI_ADDRCONFIG from an IPv6-only host, IN A queries are sent to the DNS server. According to RFC 3493, they should not be.

Version-Release number of selected component (if applicable):

glibc-2.15-51.fc17.i686

How reproducible:

100%

Steps to Reproduce:
1. Ensure the system is IPv6-only. (Removing 127.0.0.1/8 from the loopback interface is optional, as getaddrinfo() ignores it when determining whether the system has IPv4 connectivity.) My test system gives the following output:

[tore@laptop ~]$ ip -4 address list; ip -4 route list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    inet 127.0.0.1/8 scope host lo
[tore@laptop ~]$ 

2. Ensure the system has a working IPv6 DNS server configured in /etc/resolv.conf; "echo nameserver 2001:4860:4860::8888 > /etc/resolv.conf" should do the trick.

3. Download and compile my getaddrinfo() test program from http://fud.no/gai.c (or use any other getaddrinfo() test program that can be made to use AI_ADDRCONFIG): "wget -O - http://fud.no/gai.c | gcc -x c -o gai -"

4. Start a tcpdump to inspect DNS server traffic: "tcpdump -i any -n port 53"

5. Resolve a host name using getaddrinfo() w/AI_ADDRCONFIG, e.g. "./gai -ac www.ripe.net"
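
For readers without a getaddrinfo() test program at hand, the lookup in step 5 can be approximated as below. This is a minimal sketch in Python, whose socket.getaddrinfo() wraps the C library call on glibc systems; it is not the gai.c program from step 3, and the helper names resolve() and summarize() are made up for illustration.

```python
import socket


def summarize(infos):
    """Collapse getaddrinfo() result tuples into sorted (family, address) pairs."""
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    pairs = set()
    for family, _type, _proto, _canonname, sockaddr in infos:
        # sockaddr[0] is the textual address for both AF_INET and AF_INET6.
        pairs.add((names.get(family, str(family)), sockaddr[0]))
    return sorted(pairs)


def resolve(host, use_addrconfig=True):
    """Resolve host for any address family, optionally with AI_ADDRCONFIG."""
    flags = socket.AI_ADDRCONFIG if use_addrconfig else 0
    infos = socket.getaddrinfo(
        host, None,
        family=socket.AF_UNSPEC,   # ask for both IPv4 and IPv6 results
        type=socket.SOCK_STREAM,   # avoid one duplicate entry per socket type
        flags=flags,
    )
    return summarize(infos)
```

Calling resolve("www.ripe.net") while the tcpdump from step 4 is running should show which query types the resolver actually sends.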
  
Actual results:

The tcpdump process reports queries being made for both A and AAAA resource records:

11:16:00.685198 IP6 2a02:c0:1002:101:c449:7c3d:76b6:bbd1.48007 > 2001:4860:4860::8888.domain: 44105+ A? www.ripe.net. (30)
11:16:00.685235 IP6 2a02:c0:1002:101:c449:7c3d:76b6:bbd1.48007 > 2001:4860:4860::8888.domain: 35929+ AAAA? www.ripe.net. (30)
11:16:00.727253 IP6 2001:4860:4860::8888.domain > 2a02:c0:1002:101:c449:7c3d:76b6:bbd1.48007: 44105 1/0/0 A 193.0.6.139 (46)
11:16:00.727278 IP6 2001:4860:4860::8888.domain > 2a02:c0:1002:101:c449:7c3d:76b6:bbd1.48007: 35929 1/0/0 AAAA 2001:67c:2e8:22::c100:68b (58)

Expected results:

Only an AAAA record query should have been made; AI_ADDRCONFIG should have suppressed the A queries.
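
The check described here can be automated against captured tcpdump output. The following is a rough sketch that only understands the one-line query format shown under "Actual results"; the function names and the regular expression are illustrative assumptions, not part of any existing tool.

```python
import re

# Matches the query part of a tcpdump line, e.g. "44105+ A? www.ripe.net."
# Responses ("44105 1/0/0 A 193.0.6.139") lack the "?" and are skipped.
QUERY_RE = re.compile(r"\s\d+\+?\s+(A|AAAA)\?\s+(\S+)")


def query_types(tcpdump_lines):
    """Return (qtype, name) for every outgoing A/AAAA query in the capture."""
    found = []
    for line in tcpdump_lines:
        match = QUERY_RE.search(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


def sent_a_query(tcpdump_lines):
    """True if at least one IN A query was sent, i.e. the bug is visible."""
    return any(qtype == "A" for qtype, _name in query_types(tcpdump_lines))
```

On an IPv6-only host with correct AI_ADDRCONFIG handling, sent_a_query() would be expected to return False for the capture.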

Additional info:

This works correctly in the "opposite direction". An IPv4-only host (disregarding the ::1/128 loopback address and/or any link-local addresses from fe80::/10, which are ignored in the same way as 127.0.0.1/8 is) does *not* query for AAAA records when getaddrinfo() is called with AI_ADDRCONFIG.
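
The behaviour expected in this report can be written down as a small model. The sketch below follows the rules as stated above (loopback addresses and IPv6 link-local addresses do not count as configured); it is not glibc's implementation, and the fallback to both query types when no usable address exists is an assumption made only to keep the function total.

```python
import ipaddress


def usable_families(addresses):
    """Return the IP versions (4 and/or 6) that count as configured."""
    families = set()
    for text in addresses:
        ip = ipaddress.ip_interface(text).ip  # accepts "addr" or "addr/prefix"
        if ip.is_loopback:
            continue  # 127.0.0.1/8 and ::1/128 are ignored
        if ip.version == 6 and ip.is_link_local:
            continue  # fe80::/10 is ignored
        families.add(ip.version)
    return families


def expected_queries(addresses):
    """DNS query types this report expects under AI_ADDRCONFIG."""
    families = usable_families(addresses)
    queries = []
    if 4 in families:
        queries.append("A")
    if 6 in families:
        queries.append("AAAA")
    # Assumption: with no usable address at all, nothing is suppressed.
    return queries or ["A", "AAAA"]
```

For the reporter's host, which has only 127.0.0.1/8 on the IPv4 side and a global IPv6 address, this model yields ["AAAA"], which is the result the report asks for.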

Also, this works correctly on Ubuntu Precise, running libc6 version 2.15-0ubuntu10. I am therefore submitting this bug in the Fedora bug tracker rather than the upstream one, as it looks possible that this is caused by a Fedora-specific patch.

Comment 1 Pavel Šimerda (pavlix) 2012-08-01 09:29:02 UTC

Confirming. Just adding that a fix to this one could break things unless AI_ADDRCONFIG is done only for DNS as discussed in:

http://sourceware.org/bugzilla/show_bug.cgi?id=12377

Comment 2 Jeff Law 2012-08-15 19:24:40 UTC

Unfortunately, I don't have any way to set up an IPV6 only system -- the only way I can do IPV6 right now is via tunneling.  If someone can get me access to an IPV6 only system where I could muck around (preferably inside a throw-away VM), it'd be greatly appreciated

The current F18 sources only have a few twiddles to getaddrinfo most of which are the same as what you'd find in Ubuntu.  There's one change from Andreas which isn't well documented that might (or might not) be related to the undesired behaviour.

Comment 3 Tore Anderson 2012-08-16 10:02:43 UTC

(In reply to comment #2)
> Unfortunately, I don't have any way to set up an IPV6 only system -- the
> only way I can do IPV6 right now is via tunneling.  If someone can get me
> access to an IPV6 only system where I could muck around (preferably inside a
> throw-away VM), it'd be greatly appreciated

If you send me an ssh pubkey, I can see if I can get this set up for you at work next week.

That said, if you already have IPv6 via tunneling, wouldn't it be easier for you to spin up an IPv6-only VM on your own workstation? Tunneled connectivity is more than sufficient in order to reproduce this bug.

Come to think of it, you don't need connectivity at all. If you configure a disconnected host with a fake IPv6 address like 2001:db8::1 on the loopback interface and set the resolv.conf nameserver entry to point to it, you'll see the unexpected IN A DNS requests in a tcpdump on the loopback interface:

$ ip -6 address add 2001:db8::1/128 dev lo
$ echo nameserver 2001:db8::1 > /etc/resolv.conf
$ ip -4 address list
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    inet 127.0.0.1/8 scope host lo
$ tcpdump -i lo -n port 53
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes
11:58:55.610365 IP6 2001:db8::1.44948 > 2001:db8::1.domain: 8731+ A? foobar.com. (28)

You won't get to see any processing of results obviously (unless you set up a local DNS server that's authoritative for foobar.com), but at least you see the unwanted IN A queries being made.
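
If tcpdump is not available on the test host, the query type can also be read straight from the raw DNS packet. The following sketch decodes the first question of a query; parse_question() and build_query() are hypothetical helpers written for this note, name compression is not handled (plain queries normally do not use it), and receiving the packets (for example from a UDP socket bound to the fake nameserver address) is left out.

```python
import struct

QTYPE_NAMES = {1: "A", 28: "AAAA"}


def parse_question(packet):
    """Return (name, qtype) for the first question in a DNS query packet."""
    offset = 12  # the DNS header is always 12 bytes
    labels = []
    while True:
        length = packet[offset]
        offset += 1
        if length == 0:  # zero-length label terminates the name
            break
        labels.append(packet[offset:offset + length].decode("ascii"))
        offset += length
    qtype, _qclass = struct.unpack_from("!HH", packet, offset)
    return ".".join(labels) + ".", QTYPE_NAMES.get(qtype, str(qtype))


def build_query(query_id, name, qtype):
    """Build a minimal recursive query, used here only to exercise the parser."""
    # Flags 0x0100 sets "recursion desired"; one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1 = IN
```

A query for foobar.com built this way is 28 bytes long, matching the "(28)" that tcpdump prints in the capture above.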

Tore

Comment 4 Jeff Law 2012-08-16 15:55:06 UTC

I got the impression that we're issuing IN A lookups on IPV6-only hosts.  Tunneling IPV6 requires IPV4, so that would seem to be an unsuitable environment to track this down unless I'm missing something.

Setting up a disconnected VM with a fake IPV6 address on the loopback interface is a good idea.  As long as we can either see the queries on the loopback or capture them with a breakpoint in {send,sendto,sendmsg}, that should be sufficient to track down what patch is causing this behaviour.  If that's insufficient for some reason or another, I'll contact you with ssh keys.

In general, this stuff is well out of my area of expertise.  Thus, I'm going to be pushing to minimize the changes between Fedora & the upstream bits and encouraging resolution of as many issues as possible upstream.

Comment 5 Tore Anderson 2012-08-16 16:43:42 UTC

(In reply to comment #4)
> I got the impression that we're issuing IN A lookups on IPV6-only hosts. 
> Tunneling IPV6 requires IPV4, so that would seem to be an unsuitable
> environment to track this down unless I'm missing something.

Assuming your workstation has native IPv4 and tunneled IPv6, I was thinking that you could create a VM on it, using virt-manager or something like that, and avoid configuring any IPv4 addresses on it, only IPv6. So the hypervisor/VM host would be dual-stacked, but the VM would be single-stacked with IPv6 only.

> Setting up a disconnected VM with a fake IPV6 address on the loopback
> interface is a good idea.  As long as we can either see the queries on the
> loopback or capture them with a breakpoint in {send,sendto,sendmsg}, that
> should be sufficient to track down what patch is causing this behaviour.  If
> that's insufficient for some reason or another, I'll contact you with ssh
> keys.

Okay! I noticed that I don't see any AAAA queries, probably because the A query doesn't give any response, so in order to get the best testing, you should also install named and set it up to listen on the loopback address, and add a fake zone with A/AAAA records for a host name you can query for.

> In general, this stuff is well out of my area of expertise.  Thus, I'm going
> to be pushing to minimize the changes between Fedora & the upstream bits and
> encouraging resolution of as many issues as possible upstream.

Sounds great - there's been very little response upstream on the bugs I've filed unfortunately. But I guess that the more people taking an interest, the better the chances of upstream changes being made. :-)

Tore

Comment 6 Pavel Šimerda (pavlix) 2012-08-16 21:07:57 UTC

> Assuming your workstation has native IPv4 and tunneled IPv6, I was thinking
> that you could create a VM on it, using virt-manager or something like that,
> and avoid configuring any IPv4 addresses on it, only IPv6. So the
> hypervisor/VM host would be dual-stacked, but the VM would be single-stacked
> with IPv6 only.

I'd just like to add that virtualization is just a way to do the same things with a smaller number of physical hosts. Tunneling IPv6 in IPv4 on the router for dualstack and IPv6-only hosts is exactly the same case.

> > In general, this stuff is well out of my area of expertise.  Thus, I'm going
> > to be pushing to minimize the changes between Fedora & the upstream bits and
> > encouraging resolution of as many issues as possible upstream.

I would definitely not oppose patching Fedora with the clear intention to get the patches upstream. But all this should be done more carefully. Unfortunately I'm not currently able to deliver any patches as I have enough work continually learning my own project.

But I can test and do small stuff.

> Sounds great - there's been very little response upstream on the bugs I've
> filed unfortunately.

We would most probably get a better response if we had patches. And getting them working in Fedora could also make our case better.

> But I guess that the more people taking an interest,
> the better the chances of upstream changes being made. :-)

I'd be definitely spreading the word (I've already started) but the next step would be to have a working implementation at hand. AI_ADDRCONFIG may actually be a great feature.

Comment 7 Jeff Law 2012-08-17 03:17:04 UTC

Haha, this works in Ubuntu because they blindly disable the gethostbyname4 capabilities from upstream.  The gethostbyname4 interface does the A & AAAA queries in parallel over the same socket.  It's one of the many kludges folks have tried to deal with braindead DNS servers, particularly in consumer products.

ISTM the better thing to do would be to avoid gethostbyname4 when we're suppressing the A lookup.  [ When we're suppressing the AAAA lookup, we use gethostbyname2, hence the asymmetry noted in the original report. ]
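
To make the suggestion concrete, here is a toy model of the proposed dispatch. It is only a reading of this comment, not glibc source: choose_lookup() is a made-up name, and the returned strings merely label which interface would be used for each combination.

```python
def choose_lookup(want_a, want_aaaa):
    """Pick the lookup interface for the address families still wanted.

    want_a / want_aaaa are False when AI_ADDRCONFIG (or the caller's
    requested family) suppresses that record type.
    """
    if want_a and want_aaaa:
        # Both families wanted: parallel A and AAAA queries on one socket.
        return "gethostbyname4"
    if want_aaaa:
        # A lookup suppressed: the proposal is to avoid gethostbyname4 here.
        return "gethostbyname2(AF_INET6)"
    if want_a:
        # AAAA lookup suppressed: already handled via gethostbyname2.
        return "gethostbyname2(AF_INET)"
    raise ValueError("at least one address family must be requested")
```

In this model the reported bug corresponds to the second case falling through to the first one, so that an IPv6-only host still issues the A query.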



Pavel -- this is beyond patching fedora with the intention of upstreaming.  Nearly every patch in fedora glibc already has the intention to upstream.  What I'm saying is that, the policy needs to return to "upstream first" with minimal local hackery -- doubly so for areas like this.

In the past there were some significant issues with the "upstream first" policy leaving too many issues unresolved due to issues with the upstream maintainers.  I see significant improvement in the upstream responsiveness.

I also relaxed the "upstream first" policy for a period of time when I took over glibc for Fedora & RHEL to get the insane backlog of problems under control.  But I think it's time to return to the upstream first policy to the fullest extent possible.

Comment 8 Tore Anderson 2012-08-17 12:41:15 UTC

(In reply to comment #7)
> Haha, this works in Ubuntu because they blindly disable the gethostbyname4
> capabilities from upstream.  The gethostbyname4 interface does the A & AAAA
> queries in parallel over the same socket.  It's one of the many kludges
> folks have tried to deal with braindead DNS servers, particularly in
> consumer products.

Okay. This doesn't quite match with my test results for the disconnected host scenario I described in comment #3, though. If there was no available local nameserver, all I could see in the tcpdump was the IN A lookups. It appeared that glibc saw that the IN A lookup failed, and therefore didn't bother to kick off the IN AAAA query. That does not seem like parallel queries to me? (If I set up a local name server, both the IN A and IN AAAA queries were seen.)

> ISTM the better thing to do would be to avoid gethostbyname4 when we're
> suppressing the A lookup.  [ When we're suppressing the AAAA lookup, we use
> gethostbyname2, hence the asymmetry noted in the original report. ]

I'm not familiar enough with the glibc internals to comment on this. I'm a networking guy, really, not a coder. I can do trivial patches, but that's about it I'm afraid.

> Pavel -- this is beyond patching fedora with the intention of upstreaming. 
> Nearly every patch in fedora glibc already has the intention to upstream. 
> What I'm saying is that, the policy needs to return to "upstream first" with
> minimal local hackery -- doubly so for areas like this.
> 
> In the past there were some significant issues with the "upstream first"
> policy leaving too many issues unresolved due to issues with the upstream
> maintainers.  I see significant improvement in the upstream responsiveness.
> 
> I also relaxed the "upstream first" policy for a period of time when I took
> over glibc for Fedora & RHEL to get the insane backlog of problems under
> control.  But I think it's time to return to the upstream first policy to
> the fullest extent possible.

Fair enough. So - do you want me to re-submit this bug in the sourceware.org bugzilla, or will you take it upstream?

Tore

Comment 9 Jeff Law 2012-08-17 15:51:13 UTC

I'll take it upstream since I've got some context.  If there are questions from the maintainers that I can't answer, I'll probably have to defer to you or someone else with more knowledge of the networking standards.

Comment 10 Jeff Law 2012-08-21 22:01:40 UTC

We'll pick this up via upstream if/when they fix it.  A potential patch has been posted for review.

Comment 11 Tore Anderson 2012-08-22 04:24:10 UTC

Okay. I'd be happy to test that patch - where has it been posted? It's not attached in the sourceware.org bug report as far as I can tell.

Comment 12 Jeff Law 2012-08-22 16:37:10 UTC

It was posted to the glibc development list.  I've just attached it to the upstream BZ (14505) as well.