Bug 1130328
Field | Value
---|---
Summary: | subversion won't fall back to IPv4 on IPv6 failure
Product: | [Fedora] Fedora
Component: | libserf
Version: | rawhide
Status: | CLOSED ERRATA
Severity: | high
Priority: | unspecified
Hardware: | Unspecified
OS: | Unspecified
Reporter: | Pavel Šimerda (pavlix) <psimerda>
Assignee: | Joe Orton <jorton>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | code, i, jorton, msehnout, pemensik, ppisar, tkorbar, vanmeeuwen+fedora
Keywords: | Reopened, Tracking
Fixed In Version: | libserf-1.3.9-12.fc31 libserf-1.3.9-26.fc38
Doc Type: | Bug Fix
Type: | Bug
Last Closed: | 2023-01-31 14:48:20 UTC
Bug Blocks: | 883152
Description
Pavel Šimerda (pavlix)
2014-08-14 21:33:39 UTC
I saw the same issue on my system:

apr-1.5.1-2.fc21.x86_64
subversion-1.8.9-2.fc21.x86_64

hth,
Michele

From the strace (empty lines are just for easier reading, this is a continuous log):

19766 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 8
19766 connect(8, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("185.49.140.10")}, 16) = 0
19766 getsockname(8, {sa_family=AF_INET, sin_port=htons(58343), sin_addr=inet_addr("84.246.161.86")}, [16]) = 0
19766 close(8) = 0

Attempt to connect to the IPv4 address but see the IPPROTO_IP and getsockname. Did it just check which source address would be used and then closed the socket?

19766 socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 8
19766 connect(8, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "2a04:b900::1:0:0:10", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
19766 getsockname(8, {sa_family=AF_INET6, sin6_port=htons(45861), inet_pton(AF_INET6, "2a00:1268:1ff:f001:21f:3cff:fe1b:9e5e", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
19766 close(8) = 0

Same for IPv6.

19766 socket(PF_INET6, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_TCP) = 8
19766 fcntl(8, F_GETFL) = 0x2 (flags O_RDWR)
19766 fcntl(8, F_SETFL, O_RDWR|O_NONBLOCK) = 0
19766 setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0
19766 connect(8, {sa_family=AF_INET6, sin6_port=htons(80), inet_pton(AF_INET6, "2a04:b900::1:0:0:10", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)

Now the real connect but with a non-blocking socket, so no final result.

19766 epoll_ctl(7, EPOLL_CTL_DEL, 8, {0, {u32=0, u64=0}}) = -1 ENOENT (No such file or directory)
19766 epoll_ctl(7, EPOLL_CTL_ADD, 8, {EPOLLIN|EPOLLOUT, {u32=23311088, u64=23311088}}) = 0
19766 epoll_wait(7, {{EPOLLIN|EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=23311088, u64=23311088}}}, 16, 500) = 1
19766 read(8, 0x16460c4, 8000) = -1 ECONNREFUSED (Connection refused)

And here it is, ECONNREFUSED upon read but no attempt to retry with IPv4.
19766 epoll_ctl(7, EPOLL_CTL_DEL, 8, {0, {u32=0, u64=0}}) = 0
19766 brk(0x166e000) = 0x166e000
19766 close(8) = 0
19766 close(-1) = -1 EBADF (Bad file descriptor)
19766 close(7) = 0
19766 write(2, "Connection refused: Unable to co"..., 216) = 216

The error message.

The calls found in the above strace seem to be done by libserf.

There is code in serf which tries to iterate through the address list on failure. There should be a getsockopt() call in there to retrieve the "real" connect() error if the code is working correctly - looks like that is not showing up in your strace? Might be worth checking with upstream.

(In reply to Joe Orton from comment #4)
> There is code in serf which tries to iterate through the address list on
> failure.

+1 for the switched component, I still wasn't sure whether the problem is in serf or in apr.

> There should be a getsockopt() call in there to retrieve the
> "real" connect() error if the code is working correctly

I'm curious. What getsockopt() are you talking about and how can you retrieve a state of a non-blocking socket before trying to use it?

> looks like that is not showing up in your strace?

Nope.

> Might be worth checking with upstream.

Definitely.

On getsockopt() I meant this stuff:

https://code.google.com/p/serf/source/browse/trunk/outgoing.c#1381

When the connect fails the epoll_wait should return the error then serf should use getsockopt/SO_ERROR to retrieve the error for the failure of the non-blocking connect, rather than attempting I/O on the connection and *then* seeing the error.

Do you have time to chase this upstream?

One other thing: there was a rebase of libserf just this week in Fedora so make sure you have 1.3.7.

(In reply to Joe Orton from comment #7)
> One other thing: there was a rebase of libserf just this week in Fedora so
> make sure you have 1.3.7.

I originally found the bug with Gentoo and libserf 1.3.7.
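The behavior described above (wait for the non-blocking connect to finish, read the real result with getsockopt/SO_ERROR, then move on to the next address on failure) can be sketched roughly as follows. This is a minimal Python illustration, not serf's actual code: `connect_with_fallback` is a hypothetical helper, it uses `select` rather than epoll for brevity, and it assumes Linux-style semantics where a failed connect marks the socket writable.

```python
import errno
import os
import select
import socket


def connect_with_fallback(addrinfos, timeout=5.0):
    """Hypothetical helper: try each getaddrinfo()-style entry in order.

    Returns a connected socket, or raises the last OSError seen.
    """
    last_error = OSError(errno.EHOSTUNREACH, "no addresses to try")
    for family, socktype, proto, _canonname, sockaddr in addrinfos:
        try:
            sock = socket.socket(family, socktype, proto)
        except OSError as exc:
            # For example, the address family is not available on this host.
            last_error = exc
            continue
        try:
            sock.setblocking(False)
            err = sock.connect_ex(sockaddr)
            if err in (errno.EINPROGRESS, errno.EWOULDBLOCK):
                # Wait until the connect attempt has finished either way.
                _, writable, errored = select.select([], [sock], [sock], timeout)
                if writable or errored:
                    # Ask the kernel for the real connect() result instead of
                    # discovering it later through a failing read().
                    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
                else:
                    err = errno.ETIMEDOUT
            if err == 0:
                sock.setblocking(True)
                return sock
            last_error = OSError(err, os.strerror(err))
        except OSError as exc:
            last_error = exc
        # This address failed: close the socket and try the next one.
        sock.close()
    raise last_error
```

A caller would pass the resolver output directly, for example `connect_with_fallback(socket.getaddrinfo(host, 80, type=socket.SOCK_STREAM))`, so that an unreachable IPv6 address is followed by an attempt on the IPv4 address.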
(In reply to Joe Orton from comment #6)
> On getsockopt() I meant this stuff:
>
> https://code.google.com/p/serf/source/browse/trunk/outgoing.c#1381
>
> When the connect fails the epoll_wait should return the error then serf
> should use getsockopt/SO_ERROR to retrieve the error for the failure of the
> non-blocking connect, rather than attempting I/O on the connection and
> *then* seeing the error.

So the expected behavior is to call getsockopt instead of read/write.

> Do you have time to chase this upstream?

Yep, I will find some, should I assign the bug to myself for now?

*** Bug 1238745 has been marked as a duplicate of this bug. ***

It doesn't make sense to keep it with F21, moving to rawhide for now but we can change it later if needed.

This bug appears to have been reported against 'rawhide' during the Fedora 24 development cycle. Changing version to '24'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora24#Rawhide_Rebase

This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.

This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.

Please, can you report this to upstream? RHBZ is not an appropriate place for upstream bugs.

I have no plan to report upstream myself right now. This bug was created as part of a project to improve Fedora IPv6 support. Just for your information, I'm not taking any steps right now.

Igor, I would like to ask you as the package maintainer to report this to upstream, as Fedora users are usually not interacting with upstream directly.

This is an issue in the Fedora version. It is not up to the reporter, but up to the maintainer, to work with upstream and forward them the bug report. Also, CLOSED UPSTREAM is used for closing bugs which were reported to upstream and are tracked there.
Please provide a pointer to the upstream bug; until then I'm reopening this bug. Thanks.

This bug appears to have been reported against 'rawhide' during the Fedora 27 development cycle. Changing version to '27'.

Any update on this bug? I did not find any upstream issue.

This message is a reminder that Fedora 27 is nearing its end of life. On 2018-Nov-30 Fedora will stop maintaining and issuing updates for Fedora 27. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '27'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 27 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Any update on this bug? It is annoying when the upstream server has IPv6 addresses but the network does not allow IPv6. This cannot be overridden even by svn parameters. nlnetlabs still uses it for the unbound and ldns projects, but we have to use netresolve to work around it.

A related upstream issue might be [1], but that should already be fixed in 1.3.3.

1. https://issues.apache.org/jira/browse/SERF-129?jql=text%20~%20%22ipv6%20serf%22

Unbound error:

svn co https://nlnetlabs.nl/svn/unbound/trunk
svn: E170013: Unable to connect to a repository at URL 'https://nlnetlabs.nl/svn/unbound/trunk'
svn: E000113: Error running context: No route to host

But it does not try the IPv4 address at all.

Filed a new issue on the upstream tracker, https://issues.apache.org/jira/browse/SERF-190

Package: libserf-1.3.9-12.fc31

Petr, any chance you can try -31 with SVN?

https://koji.fedoraproject.org/koji/buildinfo?buildID=1250807

Unfortunately, I do not have any virtual machine with a working IPv6 connection to reproduce this issue. My virtual rawhide works even without this upgrade, but it has a different network configuration. Sorry, not able to test it yet.

I have created a copr build [1] for the fixed version, without a subversion rebuild. Unfortunately, the issue is still the same.

$ rpm -q subversion libserf
subversion-1.11.1-1.fc29.x86_64
libserf-1.3.9-12.fc29.x86_64
$ svn co https://nlnetlabs.nl/svn/unbound/trunk
svn: E170013: Unable to connect to a repository at URL 'https://nlnetlabs.nl/svn/unbound/trunk'
svn: E000113: Error running context: No route to host

1. https://copr.fedorainfracloud.org/coprs/pemensik/subversion/

Can you capture strace for that?

Thanks Petr, I can see the problem - it is catching POLLIN as well, and the code that distinguishes this case on trunk (where my patch works) is significantly different from 1.3.9. I'm going to have to wait for upstream to chime in, it is not trivial to backport the trunk code to 1.3.9.

epoll_ctl(3, EPOLL_CTL_ADD, 4, {EPOLLIN|EPOLLOUT, {u32=2678110760, u64=93877778299432}}) = 0
epoll_wait(3, [{EPOLLIN|EPOLLOUT|EPOLLERR|EPOLLHUP, {u32=2678110760, u64=93877778299432}}], 16, 500) = 1
read(4, 0x55619fa16814, 8000) = -1 EHOSTUNREACH (No route to host)

I'll revert my patch since it doesn't help and might have other regressions.

Created attachment 1941271 [details]
trunk-multihome
I am waiting for an Apache JIRA account to submit this improved patch fixing the issue. In the meantime I will backport the fix to Fedora rawhide.
Created attachment 1941296 [details]
trunk-multihome
Fixed typo.
FEDORA-2023-5548a5f8b5 has been submitted as an update to Fedora 38. https://bodhi.fedoraproject.org/updates/FEDORA-2023-5548a5f8b5

FEDORA-2023-5548a5f8b5 has been pushed to the Fedora 38 stable repository. If the problem still persists, please make note of it in this bug report.