Bug 188364
Summary: | glibc RFC3484 code favours site-local IPv6 connection to global IPv6 address with new kernels | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | David Woodhouse <dwmw2> |
Component: | glibc | Assignee: | Jakub Jelinek <jakub> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 5 | CC: | davem, drepper, pekkas, redhat-bugzilla, wtogami |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Fixed In Version: | 2.4-6 | Doc Type: | Bug Fix |
Last Closed: | 2006-05-29 05:24:14 UTC | Type: | --- |
Description
David Woodhouse
2006-04-08 14:48:01 UTC
I lie... if I use 'telnet' to an address which hasn't already been attempted, then it fails similarly...

    hades /home/dwmw2 $ telnet www.sixxs.net 80
    Trying 2001:838:1:1:210:dcff:fe20:7c7c...
    telnet: connect to address 2001:838:1:1:210:dcff:fe20:7c7c: Connection timed out
    Trying 213.197.29.32...
    Connected to www.sixxs.net (213.197.29.32).
    Escape character is '^]'.

I know that site-local addresses are (foolishly) being deprecated -- but they do still exist so hopefully this is just an oversight, and glibc hasn't intentionally stopped working in this case? It's not as if the fec0::/48 range is being used for anything _else_ yet (and in the foreseeable future, if ever).

Reassigning to glibc. Thanks.

I know that RFC3879 seems to say that you MUST break existing site-local deployment with immediate effect, but that doesn't count because they're on crack, right? Or am I missing something -- is there something else I could do to deploy IPv6 internally, without having to have real global addresses and routing? As far as I can tell, the deprecation of site-local addresses would be a massive barrier to internal deployment and testing of IPv6.

Well, RFC 3879 cannot retroactively change implementations. I'll keep the site-local sorting rules in unless somebody gives a really good reason. RFC 3879 declares that the address range cannot be reused so that's not easy. The problem here was not related to that, though. It was exposed because of the IPv6 site-local address being used but that was not the bug. Anyway, I think we should go on and finally implement and enable RFC 4193 fully. That's the replacement for site-local addresses and I can see that it is simpler. Isn't this what the router advertisement daemon is about?

I don't see RFC4193 adding anything back to RFC3484, so I don't understand how it can work as a replacement for site-local addresses.
If we just replace our existing site-local addresses with these 'Unique Local IPv6 Unicast Addresses', won't we end up trying to use those addresses to talk to Global IPv6 addresses outside our network, instead of using IPv4? We'd end up with precisely the same behaviour which led to this bug being filed -- a three-minute delay before each connection, while IPv6 is (wrongly) tried and fails. We _need_ the separate scopes, so that a machine can have a site-local address and know that it's _not_ suitable for communication with global addresses. Without that functionality, we can't deploy IPv6 internally unless we get the IS department to give us full global IPv6 routing... which just isn't going to happen any time soon. Unless, as I said, I'm missing something.

All gateway machines/routers should block all RFC 4193 addresses from leaving the local network. This is a change which must be implemented by the IT department. Ideally all these machines have a simple switch to enable it. In this respect the new address range isn't different from the site-local addresses. The big difference is that the built-in structure of the new address range makes it unlikely that leaked addresses cause problems, because there is a good chance that the addresses are globally unique. This also helps, as the RFC says, when merging sites (after an acquisition or so). With these changes, RFC 4193 has some advantages. Whether all the implementations in routers etc. are up to date is another question.

All IPv6 packets are _already_ prevented from leaving the local network (by which I mean the 'intranet'). We just don't have a route to the outside world. To do _otherwise_ would require assistance (or at least _approval_) from the IT department which isn't going to happen. The problem with RFC4193 is that the hosts do actually _generate_ such packets, which then fail to get out.
With the current scheme using site-local addressing, each machine _knows_ that it can't contact a global IPv6 address from its site-local address, so it just uses IPv4 instead. From my point of view, the biggest difference between the two schemes is that the 'new' scheme lacks the useful RFC3484 address-selection magic, so can't be used for an internal-only IPv6 deployment without breaking external connectivity. The 'global unique' part is just mutual masturbation on the part of the IPv6 folks -- these addresses were site-local anyway, so had no more need to be globally unique than link-local addresses do.

It's true, there is a need to update RFC 3484 for this. The problem is, I think, the scope value used. If a machine has only an IPv4 site-local address (10.x, 172.16.x, 192.168.x) and an RFC4193-style IPv6 address, then the scope value for an external machine with both IPv4 and IPv6 addresses would be 14 and the scopes for the local addresses would be 5 and 14 respectively. This would cause the IPv6 address to be preferred. What likely should happen is that the RFC 4193 addresses either get their own scope values or they are treated like the old site-local addresses (i.e., the IN6_IS_ADDR_SITELOCAL macro would be extended). But RFC 3484 adds one thing which we haven't implemented so far: extensions by the user. With a site config file the implementation could be modified accordingly. This is how the authors of RFC 3484 will likely respond when asked to deal with RFC 4193.

Should be fixed in glibc-2.4.90-1.

Reopening. The 'fix' in the new glibc isn't really the right answer -- it's just making us prefer IPv4 _all_ the time, even when we should be using IPv6. The real problem here is that glibc's trick of UDP connect() and then getsockname() (see http://people.redhat.com/drepper/linux-rfc3484.html) is no longer working in the FC5 kernel.
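The UDP connect() and getsockname() trick mentioned above can be sketched roughly as follows. This is a minimal Python illustration of the idea, not glibc's actual code; the function name `probe_source_address` is hypothetical. Connecting a datagram socket sends no packets; it only asks the kernel to pick a route and a source address, which getsockname() then reports.

```python
import socket


def probe_source_address(dest_ip, port=80):
    """Return the source address the kernel would use to reach dest_ip,
    or None if the kernel refuses the connect() (for example with
    EADDRNOTAVAIL or because there is no route)."""
    family = socket.AF_INET6 if ":" in dest_ip else socket.AF_INET
    try:
        sock = socket.socket(family, socket.SOCK_DGRAM)
    except OSError:
        return None  # address family not available on this host
    try:
        # No datagram is sent here: connect() on a UDP socket only makes
        # the kernel perform route lookup and source-address selection.
        sock.connect((dest_ip, port))
        return sock.getsockname()[0]
    except OSError:
        return None
    finally:
        sock.close()
```

A caller would treat a `None` result as "this destination is unusable" and sort it last. The complaint in this report is that the newer kernel no longer fails the IPv6 connect() when only a site-local source address is available, so the probe alone stops separating the two cases.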
When we have no Global IPv6 address, this is what used to happen when glibc was asked to look up 'pmac.infradead.org', for example...

    socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 3
    connect(3, {sa_family=AF_INET6, sin6_port=htons(80), inet_pton(AF_INET6, "2001:8b0:10b:1:20d:93ff:fe7a:3f2c", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EADDRNOTAVAIL (Cannot assign requested address)
    socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
    connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("81.187.2.168")}, 16) = 0
    getsockname(3, {sa_family=AF_INET, sin_port=htons(33450), sin_addr=inet_addr("172.16.18.126")}, [16]) = 0

At this point, glibc quite sanely returns the IPv4 address first. On the FC5 kernel, however, the IPv6 connect() appears to succeed even though there is no IPv6 address with Global scope. Should we be allowing a connect() which has mismatching scope? Adding DaveM to Cc. If this isn't a kernel bug, then it's just an extremely unfortunate 'feature' -- and we need agreement on the way forward for glibc since we broke what we _used_ to tell Uli to do.

Reassigning to glibc to make sure this doesn't get missed -- please could we make sure it's in the first glibc update for FC5, since we do have a simple fix which restores the functionality we always used to have? I believe that Uli wants to do a label for ULA addresses as well as site-local addresses, at the same time. I'm happy enough with that. Patch is attached to bug #190495 https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=129314

Fixed in 2.4-6 update from FC5.
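To see why a label for site-local addresses changes the outcome, here is a rough Python sketch of three of the RFC 3484 destination-sorting rules (Rule 2 matching scope, Rule 5 matching label, Rule 6 precedence). It is an illustration under stated assumptions, not the attached patch: the other rules are skipped, the policy table is split into two lists for brevity, the extra label values (5 and 6) and the source address `fec0::1` are made up for the example, and all function names are hypothetical.

```python
import ipaddress

# Precedence and label values from the default policy table in RFC 3484.
PRECEDENCE = [("::1/128", 50), ("::/0", 40), ("2002::/16", 30),
              ("::/96", 20), ("::ffff:0:0/96", 10)]
DEFAULT_LABELS = [("::1/128", 0), ("::/0", 1), ("2002::/16", 2),
                  ("::/96", 3), ("::ffff:0:0/96", 4)]
# Assumed extra rows; the label values are illustrative only.
LOCAL_LABELS = [("fec0::/10", 5), ("fc00::/7", 6)]

LINK_LOCAL, SITE_LOCAL, GLOBAL = 2, 5, 14


def to_v6(text):
    """Parse an address, turning IPv4 into its IPv4-mapped IPv6 form."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        return ipaddress.IPv6Address((0xFFFF << 32) | int(addr))
    return addr


def lookup(text, table):
    """Return the value attached to the longest matching prefix."""
    addr = to_v6(text)
    matches = [(ipaddress.ip_network(prefix), value)
               for prefix, value in table
               if addr in ipaddress.ip_network(prefix)]
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def scope(text):
    """Unicast scope as described in RFC 3484 section 3.2."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        link = ("169.254.0.0/16", "127.0.0.0/8")
        site = ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
    else:
        link = ("fe80::/10", "::1/128")
        site = ("fec0::/10",)
    if any(addr in ipaddress.ip_network(n) for n in link):
        return LINK_LOCAL
    if any(addr in ipaddress.ip_network(n) for n in site):
        return SITE_LOCAL
    return GLOBAL


def order_destinations(candidates, labels):
    """Sort (destination, source) pairs, most preferred destination first."""
    def key(pair):
        dst, src = pair
        return (scope(dst) != scope(src),                          # Rule 2
                lookup(dst, labels) != lookup(src, labels),        # Rule 5
                -lookup(dst, PRECEDENCE))                          # Rule 6
    return [dst for dst, _ in sorted(candidates, key=key)]


candidates = [
    ("2001:8b0:10b:1:20d:93ff:fe7a:3f2c", "fec0::1"),  # hypothetical source
    ("81.187.2.168", "172.16.18.126"),
]
print(order_destinations(candidates, DEFAULT_LABELS))
print(order_destinations(candidates, DEFAULT_LABELS + LOCAL_LABELS))
```

With the default labels, both pairs fail the scope match and both pass the label match, so precedence decides and the IPv6 destination sorts first. With the extra rows, the site-local source no longer shares a label with the global IPv6 destination, so the IPv4 destination sorts first. A ULA source such as `fd00::1` would still match the destination's global scope under Rule 2, which is the scope-value concern raised earlier in this thread.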