Bug 188364

Summary:	glibc RFC3484 code favours site-local IPv6 connection to global IPv6 address with new kernels
Product:	[Fedora] Fedora	Reporter:	David Woodhouse <dwmw2>
Component:	glibc	Assignee:	Jakub Jelinek <jakub>
Status:	CLOSED CURRENTRELEASE	QA Contact:	Brian Brock <bbrock>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	5	CC:	davem, drepper, pekkas, redhat-bugzilla, wtogami
Target Milestone:	---	Keywords:	Reopened
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:	2.4-6	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2006-05-29 05:24:14 UTC	Type:	---

Description David Woodhouse 2006-04-08 14:48:01 UTC

There's a three-minute delay in loading some web pages, because Firefox is
attempting to connect to a Global IPv6 address from a Site-local IPv6 address.
It ought to be trying IPv4 before it tries that.

I don't _think_ this is a glibc getaddrinfo() bug, because other apps seem to be
working correctly.

15:42:02.260474 IP hades.cambridge.redhat.com.32954 >
ns1.cambridge.redhat.com.domain:  20094+ AAAA? www.kame.net. (30)
15:42:02.699718 IP ns1.cambridge.redhat.com.domain >
hades.cambridge.redhat.com.32954:  20094 1/2/1 AAAA orange.kame.net (123)
15:42:02.705187 IP hades.cambridge.redhat.com.32955 >
ns1.cambridge.redhat.com.domain:  41459+ PTR?
5.8.0.3.5.a.e.f.f.f.7.4.3.0.2.0.2.0.0.8.0.0.0.0.0.0.2.0.1.0.0.2.ip6.arpa. (90)
15:42:02.705346 IP hades.cambridge.redhat.com.32956 >
ns1.cambridge.redhat.com.domain:  37239+ A? www.kame.net. (30)
15:42:03.000418 IP ns1.cambridge.redhat.com.domain >
hades.cambridge.redhat.com.32956:  37239 1/2/1 A orange.kame.net (111)
15:42:03.008973 IP6 fec0::1:202:b3ff:fe03:45c1.60408 > orange.kame.net.http: S
1916570260:1916570260(0) win 5760 <mss 1440,sackOK,timestamp 865226624
0,nop,wscale 2>
15:42:03.734777 IP ns1.cambridge.redhat.com.domain >
hades.cambridge.redhat.com.32955:  41459 1/1/0 PTR orange.kame.net. (133)
15:42:03.745118 IP hades.cambridge.redhat.com.32956 >
ns1.cambridge.redhat.com.domain:  6692+ PTR? 194.141.178.203.in-addr.arpa. (46)
15:42:06.009052 IP6 fec0::1:202:b3ff:fe03:45c1.60408 > orange.kame.net.http: S
1916570260:1916570260(0) win 5760 <mss 1440,sackOK,timestamp 865227374
0,nop,wscale 2>
15:42:06.322587 IP ns1.cambridge.redhat.com.domain >
hades.cambridge.redhat.com.32956:  6692 1/3/5 PTR orange.kame.net. (264)
15:42:06.350901 IP hades.cambridge.redhat.com.32956 >
ns1.cambridge.redhat.com.domain:  49+ PTR?
1.c.5.4.3.0.e.f.f.f.3.b.2.0.2.0.1.0.0.0.0.0.0.0.0.0.0.0.0.c.e.f.ip6.arpa. (90)
15:42:06.351413 IP ns1.cambridge.redhat.com.domain >
hades.cambridge.redhat.com.32956:  49 NXDomain 0/1/0 (151)
15:45:12.041167 IP hades.cambridge.redhat.com.40981 > orange.kame.net.http: S
2103899813:2103899813(0) win 5840 <mss 1460,sackOK,timestamp 865273876
0,nop,wscale 2>
15:45:12.461298 IP orange.kame.net.http > hades.cambridge.redhat.com.40981: S
1070976691:1070976691(0) ack 2103899814 win 57344 <mss 1460,nop,wscale
0,nop,nop,timestamp 322004325 865273876>
15:45:12.461395 IP hades.cambridge.redhat.com.40981 > orange.kame.net.http: .
ack 1 win 1460 <nop,nop,timestamp 865273981 322004325>
15:45:12.462403 IP hades.cambridge.redhat.com.40981 > orange.kame.net.http: P
1:425(424) ack 1 win 1460 <nop,nop,timestamp 865273982 322004325>

Comment 1 David Woodhouse 2006-04-08 14:52:26 UTC

I lie... if I use 'telnet' to an address which hasn't already been attempted,
then it fails similarly....

hades /home/dwmw2 $ telnet www.sixxs.net 80
Trying 2001:838:1:1:210:dcff:fe20:7c7c...
telnet: connect to address 2001:838:1:1:210:dcff:fe20:7c7c: Connection timed out
Trying 213.197.29.32...
Connected to www.sixxs.net (213.197.29.32).
Escape character is '^]'.

I know that site-local addresses are (foolishly) being deprecated -- but they do
still exist so hopefully this is just an oversight, and glibc hasn't
intentionally stopped working in this case? It's not as if the fec0::/10 range
is being used for anything _else_ yet (or will be in the foreseeable future, if
ever).

Reassigning to glibc.

Comment 4 David Woodhouse 2006-04-16 12:04:28 UTC

Thanks. I know that RFC3879 seems to say that you MUST break existing site-local
deployment with immediate effect, but that doesn't count because they're on
crack, right?

Or am I missing something -- is there something else I could do to deploy IPv6
internally, without having to have real global addresses and routing?

As far as I can tell, the deprecation of site-local addresses would be a massive
barrier to internal deployment and testing of IPv6.

Comment 5 Ulrich Drepper 2006-04-17 15:52:57 UTC

Well, RFC 3879 cannot retroactively change implementations.  I'll keep the
site-local sorting rules in unless somebody gives a really good reason.  RFC
3879 declares that the address range cannot be reused, so that's not easy.  The
problem here was not related to that, though.  It was exposed because of the
IPv6 site-local address being used but that was not the bug.

Anyway, I think we should go on and finally implement and enable RFC 4193 fully.
That's the replacement for site-local addresses and I can see that it is
simpler.  Isn't this what the router advertisement daemon is about?

Comment 6 David Woodhouse 2006-04-17 16:04:57 UTC

I don't see RFC4193 adding anything back to RFC3484, so I don't understand how
it can work as a replacement for site-local addresses. If we just replace our
existing site-local addresses with these 'Unique Local IPv6 Unicast Addresses',
won't we end up trying to use those addresses to talk to Global IPv6 addresses
outside our network, instead of using IPv4?

We'd end up with precisely the same behaviour which led to this bug being filed
-- a three-minute delay before each connection, while IPv6 is (wrongly) tried
and fails.

We _need_ the separate scopes, so that a machine can have a site-local address
and know that it's _not_ suitable for communication with global addresses.
Without that functionality, we can't deploy IPv6 internally unless we get the IS
department to give us full global IPv6 routing... which just isn't going to
happen any time soon.

Unless, as I said, I'm missing something.

Comment 7 Ulrich Drepper 2006-04-17 16:18:32 UTC

All gateway machines/routers should block all RFC 4193 addresses from leaving
the local network.  This is a change which must be implemented by the IT
department.  Ideally all these machines have a simple switch to enable it.

In this respect the new address range isn't different from the site-local
addresses.  The big difference is that the built-in structure of the new address
range makes it unlikely that leaked addresses are causing problems because there
is a good chance that the addresses are globally unique.  This also helps, as
the RFC says, when merging sites (after an acquisition or so).

With these changes, RFC 4193 has some advantages.  Whether all the
implementations in routers etc. are up to date is another question.

Comment 8 David Woodhouse 2006-04-17 16:28:01 UTC

All IPv6 packets are _already_ prevented from leaving the local network (by
which I mean the 'intranet'). We just don't have a route to the outside world.
To do _otherwise_ would require assistance (or at least _approval_) from the IT
department, which isn't going to happen.

The problem with RFC4193 is that the hosts do actually _generate_ such packets,
which then fail to get out. With the current scheme using site-local addressing,
each machine _knows_ that it can't contact a global IPv6 address from its
site-local address, so it just uses IPv4 instead.

From my point of view, the biggest difference between the two schemes is that
the 'new' scheme lacks the useful RFC3484 address-selection magic, so can't be
used for an internal-only IPv6 deployment without breaking external connectivity.

The 'global unique' part is just mutual masturbation on the part of the IPv6
folks -- these addresses were site-local anyway, so had no more need to be
globally unique than link-local addresses do.

Comment 9 Ulrich Drepper 2006-04-17 20:33:24 UTC

It's true, there is a need to update RFC 3484 for this.  The problem is, I
think, the scope value used.  If a machine has only an IPv4 site-local address
(10.x, 172.16.x, 192.168.x) and an RFC4193-style IPv6 address, then the scope
value for an external machine with both IPv4 and IPv6 address would be 14 and
the scopes for the local addresses would be 5 and 14 respectively.  This would
cause the IPv6 address to be preferred.

What likely should happen is that the RFC 4193 addresses either get their own
scope values or they are treated like the old site-local addresses (i.e., the
IN6_IS_ADDR_SITELOCAL macro would be extended).

But RFC 3484 adds one thing which we haven't implemented so far: extensions by
the user.  With a site config file the implementation could be modified
accordingly.  This is how the authors of RFC 3484 will likely respond when asked
to deal with RFC 4193.
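
To make the scope arithmetic above concrete, here is a rough sketch in Python
(not glibc code) of the RFC 3484 scope comparison. The function names and the
ULA address fd00::1 are made up for illustration; the other addresses are taken
from earlier in this bug.

```python
import ipaddress

# Scope values used by RFC 3484 (taken from the IPv6 multicast scope field).
LINK_LOCAL, SITE_LOCAL, GLOBAL = 2, 5, 14

_V4_LINK = [ipaddress.ip_network(n) for n in ("169.254.0.0/16", "127.0.0.0/8")]
_V4_SITE = [ipaddress.ip_network(n)
            for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]
_V6_LINK = ipaddress.ip_network("fe80::/10")
_V6_SITE = ipaddress.ip_network("fec0::/10")


def scope(addr):
    """Return the RFC 3484 scope of a numeric unicast address string."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        if any(ip in net for net in _V4_LINK):
            return LINK_LOCAL
        if any(ip in net for net in _V4_SITE):
            return SITE_LOCAL
        return GLOBAL
    if ip in _V6_LINK or ip == ipaddress.ip_address("::1"):
        return LINK_LOCAL
    if ip in _V6_SITE:
        return SITE_LOCAL
    # RFC 4193 addresses (fc00::/7) fall through to here: RFC 3484 has no
    # special case for them, so they count as global.
    return GLOBAL


def scope_matches(src, dst):
    """Rule 2 of destination address selection: prefer matching scope."""
    return scope(src) == scope(dst)


# Private IPv4 source (scope 5) against a global IPv4 destination (scope 14).
print(scope_matches("172.16.18.126", "213.197.29.32"))               # False
# Hypothetical ULA source (scope 14) against a global IPv6 destination.
print(scope_matches("fd00::1", "2001:838:1:1:210:dcff:fe20:7c7c"))   # True
```

Only the IPv6 pair has matching scope, so rule 2 puts the IPv6 destination
first, which is the unwanted preference described above.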

Comment 10 Jakub Jelinek 2006-04-25 15:28:36 UTC

Should be fixed in glibc-2.4.90-1.

Comment 11 David Woodhouse 2006-05-05 20:28:23 UTC

Reopening. The 'fix' in the new glibc isn't really the right answer -- it's just
making us prefer IPv4 _all_ the time, even when we should be using IPv6.

The real problem here is that glibc's trick of UDP connect() and then
getsockname() (see http://people.redhat.com/drepper/linux-rfc3484.html) is no
longer working in the FC5 kernel.

When we have no Global IPv6 address, this is what used to happen when glibc was
asked to look up 'pmac.infradead.org', for example...

socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET6, sin6_port=htons(80), inet_pton(AF_INET6,
"2001:8b0:10b:1:20d:93ff:fe7a:3f2c", &sin6_addr), sin6_flowinfo=0,
sin6_scope_id=0}, 28) = -1 EADDRNOTAVAIL (Cannot assign requested address)
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(80),
sin_addr=inet_addr("81.187.2.168")}, 16) = 0
getsockname(3, {sa_family=AF_INET, sin_port=htons(33450),
sin_addr=inet_addr("172.16.18.126")}, [16]) = 0

At this point, glibc quite sanely returns the IPv4 address first.

On the FC5 kernel, however, the IPv6 connect() appears to succeed even though
there is no IPv6 address with Global scope. Should we be allowing a connect()
which has mismatching scope?
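
For anyone who wants to check what their own kernel does, the probe in the
strace above can be approximated from Python. This is only a sketch of the
technique, not glibc's code; probe_source is a name I made up, and it expects a
numeric address so that no DNS lookup is involved.

```python
import socket


def probe_source(dest, port=80):
    """Return the source address the kernel picks for dest, or None."""
    family = socket.AF_INET6 if ":" in dest else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_DGRAM) as s:
            # connect() on a UDP socket sends no packets. It only asks the
            # kernel to choose a route and a source address for dest.
            s.connect((dest, port))
            # getsockname() then reports the source address that was chosen.
            return s.getsockname()[0]
    except OSError:
        # For example EADDRNOTAVAIL or ENETUNREACH: no usable source address.
        return None


if __name__ == "__main__":
    # The two addresses of pmac.infradead.org from the strace above.
    for dest in ("2001:8b0:10b:1:20d:93ff:fe7a:3f2c", "81.187.2.168"):
        print(dest, "->", probe_source(dest))
```

On a host with no Global IPv6 address, the old behaviour corresponds to None
for the first address. A kernel showing the FC5 behaviour would presumably
report a site-local or link-local source instead.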

Comment 12 David Woodhouse 2006-05-06 00:17:52 UTC

Adding DaveM to Cc. If this isn't a kernel bug, then it's just an extremely
unfortunate 'feature' -- and we need agreement on the way forward for glibc
since we broke what we _used_ to tell Uli to do.

Comment 13 David Woodhouse 2006-05-17 12:08:55 UTC

Reassigning to glibc to make sure this doesn't get missed -- please could we
make sure it's in the first glibc update for FC5, since we do have a simple fix
which restores the functionality we always used to have?

I believe that Uli wants to do a label for ULA addresses as well as site-local
addresses, at the same time. I'm happy enough with that.
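
As a rough illustration of the label idea (not the actual patch), here is a
Python sketch of the RFC 3484 policy-table lookup with two extra rows for
site-local and ULA prefixes. The first five rows are the RFC 3484 default
labels; the label values 5 and 6 for the extra rows are my assumption, as are
the function names.

```python
import ipaddress

POLICY = [
    ("::1/128", 0),        # loopback
    ("::/0", 1),           # everything else, including global IPv6
    ("2002::/16", 2),      # 6to4
    ("::/96", 3),          # IPv4-compatible
    ("::ffff:0:0/96", 4),  # IPv4-mapped, i.e. all IPv4 addresses
    ("fec0::/10", 5),      # assumed extra row: site-local
    ("fc00::/7", 6),       # assumed extra row: RFC 4193 unique local
]
TABLE = [(ipaddress.ip_network(prefix), lbl) for prefix, lbl in POLICY]


def label(addr):
    """Return the label of the longest matching prefix for addr."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        # RFC 3484 looks up IPv4 addresses in their IPv4-mapped form.
        ip = ipaddress.IPv6Address(f"::ffff:{ip}")
    matches = [(net, lbl) for net, lbl in TABLE if ip in net]
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]


def labels_match(src, dst):
    """Rule 5 of destination address selection: prefer matching label."""
    return label(src) == label(dst)


# Site-local source (label 5) against a global IPv6 destination (label 1).
print(labels_match("fec0::1:202:b3ff:fe03:45c1",
                   "2001:838:1:1:210:dcff:fe20:7c7c"))    # False
# Private IPv4 source against a global IPv4 destination (both label 4).
print(labels_match("172.16.18.126", "213.197.29.32"))     # True
```

With rows like these, the IPv4 pair matches on label and the site-local or ULA
to global pair does not, so the IPv4 destination is tried first without
relying on connect() failing.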

Comment 14 David Woodhouse 2006-05-17 13:31:20 UTC

Patch is attached to bug #190495
https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=129314

Comment 15 Ulrich Drepper 2006-05-29 05:24:14 UTC

Fixed in the 2.4-6 update for FC5.