Bug 1175466 - add timeout option to repo conf
Product: Fedora
Classification: Fedora
Component: dnf (Show other bugs)
Assigned To: packaging-team-maint
QA Contact: Fedora Extras Quality Assurance
Duplicates: 1185553
Reported: 2014-12-17 14:15 EST by Wolfgang Rupprecht
Modified: 2015-02-20 03:32 EST (History)
Fixed In Version: hawkey-0.5.3-2.fc21
Doc Type: Bug Fix
Last Closed: 2015-02-20 03:32:24 EST
Type: Bug

Description Wolfgang Rupprecht 2014-12-17 14:15:48 EST
Description of problem:
dnf hangs for a very long time banging on non-responsive mirrors.

Downloading Packages:
[SKIPPED] PackageKit-1.0.3-4.fc21.x86_64.rpm: Already downloaded
[SKIPPED] PackageKit-glib-1.0.3-4.fc21.x86_64.rpm: Already downloaded
[SKIPPED] PackageKit-cached-metadata-1.0.3-4.fc21.x86_64.rpm: Already downloaded
[MIRROR] PackageKit-gtk3-module-1.0.3-4.fc21.x86_64.rpm: Curl error: Timeout was reached for ftp://mirror.cs.pitt.edu/fedora/linux/updates/testing/21/x86_64/p/PackageKit-gtk3-module-1.0.3-4.fc21.x86_64.rpm [Connection timed out after 120002 milliseconds]
[MIRROR] PackageKit-gstreamer-plugin-1.0.3-4.fc21.x86_64.rpm: Curl error: Timeout was reached for ftp://mirror.cs.pitt.edu/fedora/linux/updates/testing/21/x86_64/p/PackageKit-gstreamer-plugin-1.0.3-4.fc21.x86_64.rpm [Connection timed out after 120002 milliseconds]
[MIRROR] PackageKit-command-not-found-1.0.3-4.fc21.x86_64.rpm: Curl error: Timeout was reached for ftp://mirror.cs.pitt.edu/fedora/linux/updates/testing/21/x86_64/p/PackageKit-command-not-found-1.0.3-4.fc21.x86_64.rpm [Connection timed out after 120001 milliseconds]
(4-6/59): PackageKit-gtk3-module-1.0.3-4.fc21.x86_64.rpm  57% [==========-        ] ---  B/s |  66 MB  --:-- ETA

Version-Release number of selected component (if applicable):
dnf.noarch                         0.6.3-2.fc21                          @System

How reproducible:

Steps to Reproduce:
1. dnf upgrade -y

Actual results:
dnf hangs for a very long time, eventually moves on to another mirror for the first 3 downloads, and then hangs again on the 4th to 6th downloads as it returns to the dead mirrors.

Expected results:
1) dnf has more reasonable timeouts. 10 seconds should do it. If a mirror takes longer than that to respond, we probably shouldn't be using it.
2) Past failures should be remembered and those mirrors blacklisted for a certain length of time, certainly at least for this session, perhaps with the same timeout as the metadata.

Additional info:
Comment 1 Honza Silhan 2015-01-06 13:31:49 EST
Thanks for the report.

1) Setting constants for non-responsiveness is always subjective. I personally think that the current behavior is good enough. Maybe it could retry failed downloads at the end of the queue instead.
2) Is this possible, Tomas? Or better could it be marked as the least preferred mirror?
Comment 2 Petr Spacek 2015-01-07 03:59:24 EST
(In reply to Jan Silhan from comment #1)
> 2) Is this possible, Tomas? Or better could it be marked as the least
> preferred mirror?

Personally I would be in favor of moving a timing-out mirror to the last position in the priority list. It could be just an intermittent failure or just one package missing on that particular mirror.

Maybe DNF could be clever and remove a mirror completely after some large number X of failures (like 50)?
Comment 3 Tomas Mlcoch 2015-01-19 04:57:57 EST
Hi all,
Librepo has several options that can be used to fine-tune such behavior.

LRO_CONNECTTIMEOUT - Max time in sec for connection phase. (Default: 300 seconds)

LRO_LOWSPEEDLIMIT - The transfer speed in bytes per second that the transfer should be below during LRO_LOWSPEEDTIME seconds for the library to consider it too slow and abort. (Default: 0)

LRO_LOWSPEEDTIME - The time in seconds that the transfer should be below the LRO_LOWSPEEDLIMIT for the library to consider it too slow and abort. (Default: 120 seconds)

LRO_ALLOWEDMIRRORFAILURES - Max number of allowed failures per mirror. If a mirror exceeds this number and there was no successful download, the mirror is ignored for the rest of the session. (Default: 4)

LRO_ADAPTIVEMIRRORSORTING - After each finished transfer, the mirrors are re-sorted. A mirror is moved forward or backward by one position depending on its rank (calculated as the ratio between successful and failed downloads) and the ranks of its neighbors. (Default: True)

JFYI, as you can see, in Wolfgang's case it's the combination of lowspeedlimit and lowspeedtime that kills the transfer after 120 sec (because the default connection timeout is far higher, 300 sec). So maybe it could be useful to also add these two options to the repo conf.
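
The low-speed rule described above can be sketched in a few lines of Python. This is only an illustration of the stated semantics, not librepo's or curl's actual code; the function name `should_abort` and the one-sample-per-second input are assumptions made for the example.

```python
from typing import Iterable


def should_abort(speeds_bps: Iterable[float],
                 low_speed_limit: float,
                 low_speed_time: int) -> bool:
    """Hypothetical helper: True once the speed has stayed below
    low_speed_limit for low_speed_time consecutive one-second samples."""
    slow_seconds = 0
    for speed in speeds_bps:
        if speed < low_speed_limit:
            slow_seconds += 1
            if slow_seconds >= low_speed_time:
                return True
        else:
            # Any sample at or above the limit resets the window.
            slow_seconds = 0
    return False
```

In this sketch a stalled transfer (0 B/s) is aborted only after the full `low_speed_time` window has elapsed, and a limit of 0 never triggers because no sample can be below it.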

Moving a non-responsive mirror to the end of the queue, as suggested by Petr, is possible and could work.
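
A rough Python sketch of the two ideas discussed here, demoting a failing mirror to the end of the queue and ignoring it once it has too many failures with no success, might look like this. The `Mirror` and `MirrorQueue` names are hypothetical, and this is not how librepo implements LRO_ALLOWEDMIRRORFAILURES or its adaptive sorting.

```python
from dataclasses import dataclass


@dataclass
class Mirror:
    url: str
    successes: int = 0
    failures: int = 0


class MirrorQueue:
    """Hypothetical mirror list; assumes mirror URLs are unique."""

    def __init__(self, urls, allowed_failures=4):
        self.mirrors = [Mirror(u) for u in urls]
        self.allowed_failures = allowed_failures

    def usable(self):
        # Ignore mirrors that exceeded the failure budget without
        # ever delivering a successful download.
        return [m for m in self.mirrors
                if not (m.failures > self.allowed_failures
                        and m.successes == 0)]

    def record(self, url, ok):
        index = next(i for i, m in enumerate(self.mirrors) if m.url == url)
        mirror = self.mirrors[index]
        if ok:
            mirror.successes += 1
        else:
            mirror.failures += 1
            # Demote the failing mirror to the end of the queue.
            self.mirrors.append(self.mirrors.pop(index))
```

For example, after `q = MirrorQueue(["a", "b", "c"])` and `q.record("a", False)`, the order returned by `q.usable()` is b, c, a, so the next download tries the other mirrors first.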

Petr or Jan, could one of you open an RFE in Bugzilla for me to get this tracked? Thanks

Comment 4 Honza Silhan 2015-01-26 06:20:59 EST
PR: https://github.com/rpm-software-management/dnf/pull/199
Comment 5 Honza Silhan 2015-01-26 06:21:07 EST
*** Bug 1185553 has been marked as a duplicate of this bug. ***
Comment 6 Zbigniew Jędrzejewski-Szmek 2015-01-26 17:57:51 EST
Making it configurable is a nice step. But c'mon, a 300s timeout (or 120s, as it currently seems to be)? This should be changed to some value that "just works" for most common cases, rather than having people discover this configuration option on their own. Long connection timeouts make sense for random pages on the web, but not for accessing mirrors which are supposed to be fast.
Comment 7 Tomas Mlcoch 2015-01-27 04:11:42 EST
It depends. Yes, mirrors are supposed to be fast but they are also supposed to be available most of the time.

The world is not perfect and there are still people with dial-up, GPRS and similar types of connection. Such connections are slow, lossy and have high latency. We need to use values that work for the majority of people, and 120s looks like such a value. It works for them (people with a slow connection with a high loss rate and high latency) but also for others with reliable high-speed connections. The only drawback is that the second group can sometimes hit a two-minute delay. But I guess we could make some changes and use a shorter timeout as the default (maybe something like 30s).
Comment 8 Zbigniew Jędrzejewski-Szmek 2015-01-27 07:43:25 EST
> can sometimes hit two minutes delay
If one mirror is nonresponsive. Sometimes more than one fails.

People who are on "bad" connections usually have slow transfers and/or unreliable packet delivery, but they usually do not have extreme latency. Even for countries connected through satellite networks, round-trip latencies are usually below half a second. Let's say that determining whether a connection is up or down might take 10 roundtrips, so 10s should be enough.
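
The estimate above is simple arithmetic and can be written out as a small Python sketch. The function names are hypothetical, and the figure of 10 roundtrips is the assumption stated in this comment, not a measured value.

```python
def probe_time_s(rtt_s: float, roundtrips: int = 10) -> float:
    """Time needed to decide whether a connection is up,
    assuming it takes `roundtrips` full round trips."""
    return rtt_s * roundtrips


def fits_timeout(rtt_s: float, timeout_s: float, roundtrips: int = 10) -> bool:
    """True if the probe would finish within the given timeout."""
    return probe_time_s(rtt_s, roundtrips) <= timeout_s
```

With a round-trip time of 0.5 s, `probe_time_s(0.5)` gives 5 s, which fits inside the proposed 10 s timeout. A link with a multi-second round-trip time would not.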

> 30s
Still rather high, but certainly better than 120s.
Comment 9 Rodrigo de Farias Gomes 2015-01-29 15:30:40 EST
I believe that I am just unlucky :-)

[root@localhost ~]# ping
PING ( 56(84) bytes of data.
64 bytes from icmp_seq=13 ttl=42 time=947 ms
64 bytes from icmp_seq=14 ttl=41 time=1606 ms
64 bytes from icmp_seq=15 ttl=40 time=745 ms
64 bytes from icmp_seq=16 ttl=41 time=7849 ms
64 bytes from icmp_seq=17 ttl=41 time=6849 ms
64 bytes from icmp_seq=18 ttl=41 time=6027 ms
64 bytes from icmp_seq=19 ttl=41 time=7206 ms
64 bytes from icmp_seq=20 ttl=41 time=6386 ms
64 bytes from icmp_seq=21 ttl=41 time=6087 ms
64 bytes from icmp_seq=22 ttl=42 time=5105 ms
64 bytes from icmp_seq=23 ttl=42 time=4926 ms
64 bytes from icmp_seq=24 ttl=41 time=5506 ms
64 bytes from icmp_seq=25 ttl=41 time=5705 ms
64 bytes from icmp_seq=26 ttl=40 time=5466 ms
64 bytes from icmp_seq=27 ttl=41 time=10063 ms
64 bytes from icmp_seq=28 ttl=41 time=9105 ms
64 bytes from icmp_seq=29 ttl=41 time=8766 ms
--- ping statistics ---
37 packets transmitted, 17 received, 54% packet loss, time 36001ms
rtt min/avg/max/mdev = 745.763/5785.517/10063.948/2587.143 ms, pipe 11

I live in Brazil. I am using a 3g connection...

English is not my native language, sorry...
Comment 10 Honza Silhan 2015-02-03 04:26:06 EST
Fixed upstream. The default timeout is 30s, the same as in yum.
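
As a rough illustration of what a per-repo `timeout` setting with a 30 s default could look like, here is a Python sketch that reads such a value with the standard `configparser` module. The repo id `example-repo`, its `name` and `baseurl`, and the helper `repo_timeout` are hypothetical, and this is not how dnf itself parses its configuration.

```python
import configparser

# Hypothetical repo file content; only the `timeout` key matters here.
REPO_CONF = """
[example-repo]
name=Example repository
baseurl=https://mirror.example.org/fedora/
timeout=10
"""

DEFAULT_TIMEOUT_S = 30  # default mentioned in this comment


def repo_timeout(conf_text: str, repo_id: str) -> int:
    """Return the repo's timeout in seconds, or the default if unset."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.getint(repo_id, "timeout", fallback=DEFAULT_TIMEOUT_S)
```

A repo section without a `timeout` line falls back to 30 in this sketch, while the example section above yields 10.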
Comment 11 Fedora Update System 2015-02-15 19:03:11 EST
dnf-plugins-core-0.1.5-1.fc21,hawkey-0.5.3-2.fc21,dnf-0.6.4-1.fc21 has been submitted as an update for Fedora 21.
Comment 12 Fedora Update System 2015-02-17 03:04:06 EST
Package hawkey-0.5.3-2.fc21, dnf-plugins-core-0.1.5-1.fc21, dnf-0.6.4-1.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing hawkey-0.5.3-2.fc21 dnf-plugins-core-0.1.5-1.fc21 dnf-0.6.4-1.fc21'
as soon as you are able to.
Please go to the following url:
then log in and leave karma (feedback).
Comment 13 Fedora Update System 2015-02-20 03:32:24 EST
hawkey-0.5.3-2.fc21, dnf-plugins-core-0.1.5-1.fc21, dnf-0.6.4-1.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.
