Bug 1678588 - Net install should retry package download if it fails, and not just give up
Summary: Net install should retry package download if it fails, and not just give up
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: dnf
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: rc
Target Release: 8.0
Assignee: Jaroslav Mracek
QA Contact: Radek Bíba
URL:
Whiteboard:
Depends On: 1681084
Blocks:
Reported: 2019-02-19 07:13 UTC by Jens Petersen
Modified: 2020-04-28 16:47 UTC (History)

Fixed In Version: librepo-1.10.6-1.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-28 16:47:40 UTC
Type: Bug
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:1823 None None None 2020-04-28 16:47:54 UTC

Description Jens Petersen 2019-02-19 07:13:00 UTC
Description of problem:
It sometimes happens during a netinstall that an rpm fails to download
(due to some temporary network glitch or http timeout).
This appears to cause Anaconda to stop the installation completely
without even offering the option to retry downloading the package,
which is rather painful and frustrating.  I am pretty sure
that anaconda used to retry in the past, but I am not exactly sure
when it stopped doing so: I don't remember whether RHEL7 anaconda still retries.
(If it does, I consider this a regression, and in any case I feel it should
be addressed: giving up an installation like this in 2019 is
really not acceptable.)

Version-Release number of selected component (if applicable):
anaconda-29.19.0.34-1.el8

How reproducible:
easily

Steps to Reproduce:
1. Do a netinstall of RHEL8
2. Sometimes an rpm download error occurs
3. If no error, goto 1

Actual results:
2. Anaconda gives up and pops up a dialog saying a fatal error occurred
with for example:

"Failed to download the following packages: Cannot download Packages/libepoxy-1.5.2-1.el8.x86_64.rpm: All mirrors were tried"

Expected results:
2. Anaconda should retry downloading the package at least a couple of times
or show a dialog allowing the user to trigger a download retry.

Additional info:
I think this affects all RHEL8 images (and recent Fedora releases too probably).
I have already hit this half a dozen times during RHEL 8 development,
it is annoying and a time waster.

Comment 1 Jiri Konecny 2019-02-19 09:32:39 UTC
There should be retry for the download for a few times but download and installation are done by the DNF :).

Switching to DNF. Feel free to switch it back if we should just adjust some configuration.

Comment 2 Martin Kolman 2019-02-19 12:03:37 UTC
(In reply to Jiri Konecny from comment #1)
> There should be retry for the download for a few times but download and
> installation are done by the DNF :).
> 
> Switching to DNF. Feel free to switch it back if we should just adjust some
> configuration.

Yeah, I also think DNF does some retries, though unlike the Anaconda+YUM based solution on RHEL7 it might not be visible in the UI. Also the retry logic might be different & less crazy than in RHEL7 (IIRC theoretically up to 10 retries with exponential back-off, waiting up to 30 minutes per package).

Comment 3 Martin Kolman 2019-02-19 17:15:15 UTC
I looked into the code in more detail and this is how the retry logic works & is used in RHEL7 Anaconda:

- a progressive delay function is used to implement the delay computation before trying again:
-> implementation: https://github.com/rhinstaller/anaconda/blob/rhel7-branch/pyanaconda/iutil.py#L1125
-> for 10 retries the delay starts at 0.5 s and progressively goes up to 256 seconds

- the retry logic is used for individual package downloads:
https://github.com/rhinstaller/anaconda/blob/rhel7-branch/scripts/anaconda-yum#L296

- when populating the transaction set:
https://github.com/rhinstaller/anaconda/blob/rhel7-branch/scripts/anaconda-yum#L131

- when gathering repository metadata:
https://github.com/rhinstaller/anaconda/blob/rhel7-branch/pyanaconda/packaging/yumpayload.py#L704

- and even when downloading the .treeinfo:
https://github.com/rhinstaller/anaconda/blob/rhel7-branch/pyanaconda/packaging/__init__.py#L496
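
A minimal sketch of the progressive delay described above, assuming the delay doubles on each retry (0.5 s doubled nine times gives 256 s, which matches the figures quoted for 10 retries). The names progressive_delays and retry are hypothetical and are not taken from the Anaconda source; catching OSError is also an assumption made for illustration.

```python
import time


def progressive_delays(initial=0.5, factor=2.0):
    """Yield an endless series of delays: 0.5, 1, 2, 4, ... seconds."""
    delay = initial
    while True:
        yield delay
        delay *= factor


def retry(action, retries=10, sleep=time.sleep):
    """Run action(); on OSError wait and try again, up to `retries` retries.

    Hypothetical helper: one initial attempt plus `retries` further attempts,
    with a progressively longer pause before each retry.
    """
    delays = progressive_delays()
    for attempt in range(retries + 1):
        try:
            return action()
        except OSError:
            if attempt == retries:
                raise  # out of retries, report the last failure
            sleep(next(delays))
```

With the defaults, a permanently failing download would wait a bit over eight minutes in total before the error is reported, which illustrates why this kind of retry can look like a hang if it is not shown in the UI.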


Also, the .treeinfo download is apparently the only place in the Anaconda source code itself where the retry logic is currently used in RHEL8 Anaconda:
https://github.com/rhinstaller/anaconda/blob/rhel-devel/pyanaconda/payload/install_tree_metadata.py#L84

Comment 4 Jens Petersen 2019-02-20 08:06:03 UTC
Okay, it seems a single package failing to download due to a temporary network glitch will prevent anaconda from completing the install, and the user has to start all over again.

If the failure dialog had a retry button this could be averted.  With net installs over the internet, the probability of network issues occurring is not that small.
Maybe our customer CDN is more reliable, I don't know...

Comment 9 Jaroslav Mracek 2019-07-07 18:24:31 UTC
I created a pull request (https://github.com/rpm-software-management/librepo/pull/158) that allows targets to be downloaded multiple times even when only one option is available. By default it will try to download each target at least 4x.
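
The behaviour described in this comment could look roughly like the following. This is only an illustrative Python sketch, not librepo's actual code (librepo is written in C): the function name download_with_retries, the fetch callback, and the way attempts cycle over mirrors are all assumptions. It makes at least 4 attempts per target even when only a single mirror is available, with no sleep between attempts.

```python
def download_with_retries(mirrors, fetch, min_attempts=4):
    """Try to fetch a target, cycling over mirrors without sleeping.

    Hypothetical sketch: `fetch(mirror)` returns the downloaded data or
    raises OSError. At least `min_attempts` attempts are made, so a single
    mirror is tried several times before giving up.
    """
    if not mirrors:
        raise ValueError("no mirrors configured")
    attempts = max(min_attempts, len(mirrors))
    errors = []
    for i in range(attempts):
        mirror = mirrors[i % len(mirrors)]  # wrap around when mirrors run out
        try:
            return fetch(mirror)
        except OSError as error:
            errors.append((mirror, str(error)))
    raise OSError(f"all {attempts} attempts failed: {errors}")
```

Because there is no pause between attempts, a sketch like this only helps with short glitches; if the network stays down, all attempts fail almost immediately.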

Comment 18 Jaroslav Mracek 2020-03-04 17:37:25 UTC
librepo cannot add any sleep step because it would negatively affect all users (already reported on Fedora). When the network is down, this is not an issue of DNF. DNF at least gives it a try.

Comment 19 Radek Bíba 2020-03-05 10:59:21 UTC
Fair enough. It should be noted though that there will likely be cases where even repeated attempts won't make netinstall happy as they'll all fail.

Comment 22 errata-xmlrpc 2020-04-28 16:47:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1823

