Bug 2291017 - wget2: Download from download.copr.fedorainfracloud.org sporadically hangs
Summary: wget2: Download from download.copr.fedorainfracloud.org sporadically hangs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: wget2
Version: 40
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Neal Gompa
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-06-08 11:58 UTC by Florian Weimer
Modified: 2024-07-13 01:13 UTC (History)

Fixed In Version: wget2-2.1.0-11.fc40 wget2-2.1.0-11.el9
Clone Of:
Environment:
Last Closed: 2024-07-05 06:21:49 UTC
Type: ---
Embargoed:



Description Florian Weimer 2024-06-08 11:58:10 UTC
A noticeable fraction of invocations like this

wget -O- https://download.copr.fedorainfracloud.org/results/fweimer/aarch64-relr-f41/fedora-rawhide-aarch64/07566457-bind-dyndb-ldap/builder-live.log.gz | wc -c

hangs while waiting for data on a TCP connection to one of the IP addresses for download.copr.fedorainfracloud.org. I don't think it is the OCSP handling that is blocking:

#0  0x00007ffff7d827ed in __GI___poll (fds=fds@entry=0x7fffe85ffa80, 
    nfds=nfds@entry=1, timeout=900000) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007ffff7f6ace5 in poll (__fds=0x7fffe85ffa80, __nfds=1, 
    __timeout=<optimized out>) at /usr/include/bits/poll2.h:39
#2  wget_ready_2_transfer (fd=<optimized out>, timeout=<optimized out>, 
    mode=mode@entry=1)
    at /usr/src/debug/wget2-2.1.0-9.fc40.x86_64/libwget/io.c:251
#3  0x00007ffff7f6ad32 in wget_ready_2_read (fd=<optimized out>, 
    timeout=<optimized out>)
    at /usr/src/debug/wget2-2.1.0-9.fc40.x86_64/libwget/io.c:278
#4  0x00007ffff7f8b5c0 in wget_ssl_read_timeout (session=0x7fffe0083120, 
    buf=0x7fffe00915a0 "", count=102400, timeout=900000)
    at /usr/src/debug/wget2-2.1.0-9.fc40.x86_64/libwget/ssl_gnutls.c:1984
#5  0x00007ffff7f7d250 in wget_tcp_read (tcp=0x7fffe0000bc0, 
    buf=buf@entry=0x7fffe00915a0 "", count=count@entry=102400)
    at /usr/src/debug/wget2-2.1.0-9.fc40.x86_64/libwget/net.c:889
#6  0x00007ffff7f7d4b0 in wget_http_get_response_cb (conn=0x7fffe0000b70)
    at /usr/src/debug/wget2-2.1.0-9.fc40.x86_64/libwget/http.c:1036
#7  0x0000555555569261 in http_receive_response (conn=<optimized out>)
    at /usr/src/debug/wget2-2.1.0-9.fc40.x86_64/src/wget.c:4074
#8  downloader_thread (p=<optimized out>)
    at /usr/src/debug/wget2-2.1.0-9.fc40.x86_64/src/wget.c:2384
#9  0x00007ffff7d0e1e7 in start_thread (arg=<optimized out>)
    at pthread_create.c:447
#10 0x00007ffff7d9042c in clone3 ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

It does not happen every time, but maybe one in four invocations.
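
To put a number on "maybe one in four", a small harness along these lines could be used. This is only a sketch: the helper name count_hangs, the run count, and the 30-second cutoff are arbitrary choices of mine, picked so that a run does not have to wait out the 900000 ms poll timeout visible in the backtrace.

```python
import subprocess


def count_hangs(cmd, runs=20, timeout_s=30.0):
    """Run cmd `runs` times and return how many runs exceeded timeout_s.

    A run that does not finish within timeout_s is killed and counted as a
    hang. The cutoff is an assumption, not a value taken from wget2.
    """
    hangs = 0
    for _ in range(runs):
        try:
            # Discard the downloaded bytes; only completion time matters here.
            subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            hangs += 1
    return hangs
```

For example, count_hangs(["wget", "-O-", url]) with the URL from the command above returns the number of stuck runs out of 20.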

Curl does not have this issue (with either HTTP/1.1 or HTTP/2). Invoking wget2 with --no-http2 does not fix it.

Seen with:

wget2-2.1.0-9.fc40.x86_64
openssl-3.2.1-2.fc40.x86_64


Reproducible: Sometimes

Comment 1 Florian Weimer 2024-06-08 20:39:23 UTC
The --no-tcp-fastopen option fixes it. This may be a compatibility issue with connection tracking in the Linux 6.1.90 kernel.
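
Independently of wget2, the kernel's own TFO switch can be inspected. The sketch below decodes the net.ipv4.tcp_fastopen sysctl, whose two lowest bits enable the client and server side respectively. The function names are made up for illustration, and only those two bits are decoded.

```python
from pathlib import Path

TFO_CLIENT = 0x1  # bit 0: TFO allowed on outgoing connections
TFO_SERVER = 0x2  # bit 1: TFO allowed on listening sockets


def decode_tfo_sysctl(value):
    """Decode the two lowest bits of the net.ipv4.tcp_fastopen bitmask."""
    return {
        "client": bool(value & TFO_CLIENT),
        "server": bool(value & TFO_SERVER),
    }


def read_tfo_sysctl(path="/proc/sys/net/ipv4/tcp_fastopen"):
    """Read and decode the sysctl on a Linux host."""
    return decode_tfo_sysctl(int(Path(path).read_text().strip()))
```

If "client" is True, an application that requests TFO (as wget2 does here unless --no-tcp-fastopen is given) can actually put data into the SYN.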

Comment 2 Florian Weimer 2024-06-08 21:35:14 UTC
Or it's a middlebox problem. I disabled receive offloading on the last box I have access to, and the packet captures are now more sensible. But the issue still occurs sporadically.

Comment 3 Florian Weimer 2024-06-08 21:43:44 UTC
It could be an MTU issue. The MSS advertisement does not take into account the size of the TCP Fast Open cookie, which is not included in the maximum segment size (the MSS does not include TCP header bytes). This is not discussed in RFC 7413 as far as I can see. Linux should probably reduce the MSS to 1442 bytes when using TCP Fast Open, to account for the size of the cookie.
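
One way to arrive at 1442 is sketched below. The assumptions are mine, since the derivation is not spelled out above: IPv4, a 1500-byte Ethernet MTU, no other TCP options, and the largest cookie RFC 7413 allows (16 bytes) plus the 2-byte option kind/length prefix.

```python
ETHERNET_MTU = 1500      # standard Ethernet payload size, in bytes
IPV4_HEADER = 20         # IPv4 header without options
TCP_HEADER = 20          # TCP header without options
TFO_OPTION_MAX = 2 + 16  # option kind + length bytes, plus a 16-byte cookie


def mss_for(mtu, reserve_for_options=0):
    """Payload bytes per segment after the headers and any reserved option space."""
    return mtu - IPV4_HEADER - TCP_HEADER - reserve_for_options


print(mss_for(ETHERNET_MTU))                  # 1460
print(mss_for(ETHERNET_MTU, TFO_OPTION_MAX))  # 1442
```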

Comment 4 Neal Gompa 2024-06-09 13:59:19 UTC
I'm not sure then if this is a wget2 issue or a Linux issue. If it's a wget2 issue, please file an issue here: https://gitlab.com/gnuwget/wget2/-/issues

If it's a Linux issue, should we move this to the kernel package then?

Comment 5 Florian Weimer 2024-06-10 06:11:27 UTC
(In reply to Neal Gompa from comment #4)
> I'm not sure then if this is a wget2 issue or a Linux issue. If it's a wget2
> issue, please file an issue here: https://gitlab.com/gnuwget/wget2/-/issues
> 
> If it's a Linux issue, should we move this to the kernel package then?

It could also be an Amazon CloudFront issue. For example, some instances may not have working path MTU discovery and may have an interface MTU that is larger than standard Ethernet.

To be honest, I think it's a wget2 issue because I think it's wrong or inconvenient to enable TCP Fast Open by default. I don't think it's a good use of our time to figure out how to make it work in more cases.

Comment 6 Tim Rühsen 2024-06-14 14:59:23 UTC
Not a single standard is fully supported by all clients or servers. E.g. there are still servers talking HTTP 1.0 only. Should we change the default in wget2 because of that? I hope we can agree on a NO.

And before we start endless discussions for every incompatibility and whose fault it is and how to best fix it, a fast way forward for Fedora is to put this and other options into a system-wide configuration file like /etc/wget2rc. So by default, Fedora can keep a conservative wget2 behavior which is close to wget1.

Comment 7 Romain Geissler 2024-06-14 15:08:38 UTC
Hi Neal/Florian,

If we decide to disable this in the Fedora package (Tim, as wget/wget2 maintainer, was made aware of this; see his answer above), what do you think is the best implementation? Do we ship an /etc/wget2rc file inside the package, which may conflict with users replacing it and cause issues during package upgrades, or do we prefer applying a patch to the C code to invert the TCP Fast Open default? Since I am affected by a TCP Fast Open issue myself inside my organization, which uses a Palo Alto firewall, I have the time and the will to submit the change to the wget2 package on the Fedora side.

Cheers,
Romain

Comment 8 François Rigault 2024-06-15 05:50:13 UTC
I think there is an issue with the Copr infrastructure, and I raised the concern in https://app.element.io/#/room/#buildsys:fedoraproject.org
but maybe it's not the right channel for that.

> hi! there is an issue reported here https://bugzilla.redhat.com/show_bug.cgi?id=2291017 related to tcp-fastopen support on copr servers. I can reproduce it running twice:
curl --resolve download.copr.fedorainfracloud.org:443:13.249.9.96  --tcp-fastopen https://download.copr.fedorainfracloud.org/
tcpdump shows the server returning an ACK instead of a SYN-ACK.
The bz mentions wget2 (which is replacing wget in fedora 40, which I think should be reverted) but it seems something is wrong on copr infrastructure side (since tcp-fastopen works generally fine, at least when no palo alto fw is involved).

Comment 9 Romain Geissler 2024-06-15 10:13:01 UTC
@Francois maybe you should open a ticket here: https://pagure.io/fedora-infrastructure/issues

@Neal/Florian: For the disabling of tcp fastopen by default in the fedora package, I am proposing this: https://src.fedoraproject.org/rpms/wget2/pull-request/10

Comment 10 Tim Rühsen 2024-06-30 17:48:17 UTC
Just fyi, I decided to disable TFO by default (commit 7a945d31aeb34fc73cf86a494673ae97e069d84d).

Your comments here made me read more about the current status of TFO. The middlebox issues, the possible tracking issue (TFO cookies), and the general decision of browser vendors not to use it by default made me rethink my position. Thank you for your tenacity :)

Comment 11 Neal Gompa 2024-07-01 08:14:18 UTC
(In reply to Tim Rühsen from comment #6)
> Not a single standard is fully supported by all clients or servers. E.g.
> there are still servers talking HTTP 1.0 only. Should we change the default
> in wget2 because of that? I hope we can agree on a NO.
> 
> And before we start endless discussions for every incompatibility and whose
> fault it is and how to best fix it, a fast way forward for Fedora is to put
> this and other options into a system-wide configuration file like
> /etc/wget2rc. So by default, Fedora can keep a conservative wget2 behavior
> which is close to wget1.

This is not a bad idea, but it would be ideal if we could have a config location in /usr for this (e.g. /usr/share/wget/wget2rc and /usr/share/wget/wget2rc.d/*.conf) that admins can override with either drop-in files in /etc/wget/wget2rc.d/ or /etc/wget2rc. If you are not opposed to it, I can file a ticket upstream about this.
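
To make the proposed layering concrete, here is a rough sketch of a lookup order it could imply. None of this exists in wget2 today. The function name is hypothetical, and the masking rule (a drop-in in /etc replaces a vendor drop-in with the same file name, and /etc/wget2rc is read last) is my assumption, borrowed from how systemd-style drop-ins usually behave.

```python
from pathlib import Path


def config_load_order(usr_dir, etc_dir, etc_main):
    """Return config files in the order they would be read (later files win).

    usr_dir  -- vendor directory, e.g. /usr/share/wget
    etc_dir  -- admin directory, e.g. /etc/wget
    etc_main -- admin main file, e.g. /etc/wget2rc
    """
    usr_dir, etc_dir, etc_main = Path(usr_dir), Path(etc_dir), Path(etc_main)
    files = []

    vendor_main = usr_dir / "wget2rc"
    if vendor_main.is_file():
        files.append(vendor_main)

    # Merge drop-ins by file name; the admin copy masks the vendor copy.
    dropins = {}
    for directory in (usr_dir / "wget2rc.d", etc_dir / "wget2rc.d"):
        if directory.is_dir():
            for conf in directory.glob("*.conf"):
                dropins[conf.name] = conf
    files.extend(dropins[name] for name in sorted(dropins))

    if etc_main.is_file():
        files.append(etc_main)
    return files
```

With this ordering, the distribution could ship its defaults under /usr while an admin overrides a single setting with one small file under /etc, without touching packaged files.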

Comment 12 Romain Geissler 2024-07-01 22:30:03 UTC
@Tim: thanks for reconsidering this!

@Neal: I have updated https://src.fedoraproject.org/rpms/wget2/pull-request/10; it now uses a backport of the upstream patch instead of introducing a global wget2rc file. Indeed, as a packager myself for my organization, I find global config files inconvenient for sharing global settings while still allowing users to customize them for their needs; a conf.d directory usually solves the issue of sharing the packaging of global configurations between multiple actors.

Comment 13 Fedora Update System 2024-07-03 11:15:51 UTC
FEDORA-EPEL-2024-b3a3477475 (wget2-2.1.0-11.el9) has been submitted as an update to Fedora EPEL 9.
https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2024-b3a3477475

Comment 14 Fedora Update System 2024-07-03 11:15:52 UTC
FEDORA-2024-9cb3210b56 (wget2-2.1.0-11.fc40) has been submitted as an update to Fedora 40.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-9cb3210b56

Comment 15 Fedora Update System 2024-07-04 01:58:08 UTC
FEDORA-2024-9cb3210b56 has been pushed to the Fedora 40 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-9cb3210b56`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-9cb3210b56

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 16 Fedora Update System 2024-07-04 02:10:41 UTC
FEDORA-EPEL-2024-b3a3477475 has been pushed to the Fedora EPEL 9 testing repository.

You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2024-b3a3477475

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 17 Fedora Update System 2024-07-05 06:21:49 UTC
FEDORA-2024-9cb3210b56 (wget2-2.1.0-11.fc40) has been pushed to the Fedora 40 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 18 Fedora Update System 2024-07-13 01:13:43 UTC
FEDORA-EPEL-2024-b3a3477475 (wget2-2.1.0-11.el9) has been pushed to the Fedora EPEL 9 stable repository.
If the problem still persists, please make note of it in this bug report.

