Bug 886021

Summary: instrepo redirect is evaluated every time
Product: [Fedora] Fedora Reporter: Kamil Páral <kparal>
Component: fedup Assignee: Will Woods <wwoods>
Status: CLOSED CANTFIX QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 18 CC: wwoods
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-12-11 20:30:27 UTC Type: Bug

Description Kamil Páral 2012-12-11 10:34:16 UTC
Description of problem:
If I use this fedup option:

--instrepo http://download.fedoraproject.org/pub/fedora/linux/releases/test/18-Beta/Fedora/x86_64/os

it should redirect me to the nearest mirror, which is our private Red Hat mirror, and the download should be very fast. Instead I see a one-second delay for every package downloaded. The delay is caused by fedup asking the download.fedoraproject.org server every time and then following the redirect to our private mirror.

That is very wrong. fedup should first evaluate the URL and find out whether it redirects to a mirror. Only then should that mirror be used to download the metadata and all packages. This is the approach yum uses.

There are several strong reasons to do that:
1. it is much faster
2. it doesn't overload download.fedoraproject.org needlessly
3. (most importantly) repository consistency is ensured. If you don't do that, you access different mirrors for each of your requests. The mirrors are not synchronized all the time. That means:
 a) the metadata of one mirror might not match package contents of a different mirror
 b) the requested package might not be available on one mirror, but might be available on a second mirror

Issue 3) might cause seemingly random issues in the download process or even corrupted upgrades (how exactly does fedup behave when a single package, let's say Xserver, is missing on the mirror?). That might be a borderline blocker, I think.
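
A minimal sketch of the "resolve once, then reuse" approach requested above. This is not fedup's or yum's actual code; the names resolve_redirect and PinnedRepo are hypothetical and exist only for illustration. It assumes the redirector answers a plain request for the repo URL with a redirect to a mirror.

```python
import urllib.request
from urllib.parse import urljoin


def resolve_redirect(url, timeout=10):
    """Request `url` once and return the URL that was finally served.

    urllib follows HTTP redirects automatically, so geturl() reports
    the mirror URL rather than the redirector URL.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.geturl()


class PinnedRepo:
    """Hypothetical helper: resolve the repo URL once, then reuse the result."""

    def __init__(self, repo_url, resolver=resolve_redirect):
        # The redirector is contacted exactly once, here.
        # The trailing slash makes urljoin treat the base as a directory.
        self.base = resolver(repo_url).rstrip("/") + "/"

    def url_for(self, relative_path):
        # Metadata and packages all come from the same pinned mirror.
        return urljoin(self.base, relative_path)
```

With this shape, repodata/repomd.xml and every Packages/... path are fetched from the same mirror, which is what reasons 1 to 3 ask for.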


Version-Release number of selected component (if applicable):
fedup-0.7.2-0.git20121206


How reproducible:
always

Steps to Reproduce:
1. If you are a Red Hatter, provide a direct mirror address in the --instrepo option. You'll see that all packages are downloaded very fast (200 packages in 10 seconds?).
2. Now use download.fedoraproject.org in --instrepo. That still redirects to our private mirror (I can tell because large packages are downloaded fast), but there is a one-second delay between package downloads. That's the redirect request.

Comment 1 Kamil Páral 2012-12-11 13:22:26 UTC
(In reply to comment #0)
> Issue 3) might cause seemingly random issues in the download process or even
> corrupted upgrades (how exactly fedup behaves when a single package is
> missing on the mirror, let's say Xserver)? That might be a borderline
> blocker I think.

Fortunately, it seems fedup exits properly when a package can't be downloaded. I just simulated that:

> Downloading failed: Errors were encountered while downloading packages.
>   systemd-sysv-195-8.fc18.x86_64: failure: Packages/s/systemd-sysv-195-8.fc18.x86_64.rpm from cmdline-instrepo: [Errno 256] No more mirrors to try.
>   systemd-libs-195-8.fc18.x86_64: failure: Packages/s/systemd-libs-195-8.fc18.x86_64.rpm from cmdline-instrepo: [Errno 256] No more mirrors to try.
>   systemd-195-8.fc18.x86_64: failure: Packages/s/systemd-195-8.fc18.x86_64.rpm from cmdline-instrepo: [Errno 256] No more mirrors to try.
> [root@localhost ~]# echo $?
> 2

That means that issue 3) might not be a problem (but I can't say for sure).

Comment 2 Will Woods 2012-12-11 20:30:27 UTC
Don't use download.fedoraproject.org. It's a round-robin redirect, not a mirrorlist, and it's not intended for this purpose.

It's useful for a single download of a single file, as long as you don't care which mirror you get. For multiple files (as you found out) it's horribly inefficient and error-prone.

Here's the deal: to rotate clients through the available mirrors, the redirector gives *one* of the possible mirror URLs, and explicitly sets "cache-control: no-cache". This forces the client to check again and get a new redirect (and possibly a new mirror) for every request. *This is by design*. 

In fact, this is the exact reason we have mirrorlists.
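
The redirector behaviour described above can be modelled with a toy class. This is only an illustration, not the code behind download.fedoraproject.org: the class name, the mirror URLs in the usage note, and the 302 status code are assumptions.

```python
import itertools


class RoundRobinRedirector:
    """Toy model: each request is answered with the next mirror in turn."""

    def __init__(self, mirrors):
        self._mirrors = itertools.cycle(mirrors)

    def respond(self, path):
        mirror = next(self._mirrors)
        headers = {
            # Only one mirror URL is handed out per request.
            "Location": mirror.rstrip("/") + "/" + path.lstrip("/"),
            # Tells the client not to reuse this redirect for the next request.
            "Cache-Control": "no-cache",
        }
        return 302, headers
```

Calling respond() twice on an instance built from two hypothetical mirrors returns two different Location values, so a client that asks per file gets its metadata and packages from different mirrors.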

Under normal (non-test) circumstances, fedup should get a nice big mirrorlist from mirrormanager[1] and use that. Then it can pick the fastest mirror, skip known-broken mirrors, avoid repeated lookups, use site-specific mirrors when available, and generally work as you expect yum to work.
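
As a rough sketch of what a mirrorlist makes possible (this is not how yum or fedup actually choose mirrors; pick_mirror and its probe callback are hypothetical), a client holding the whole list can skip known-broken entries and keep the fastest one for the rest of the run:

```python
def pick_mirror(mirrors, probe, known_broken=()):
    """Return the fastest reachable mirror from `mirrors`.

    `probe(url)` is a caller-supplied function that returns a latency
    in seconds, or None if the mirror could not be reached.
    """
    timings = []
    for url in mirrors:
        if url in known_broken:
            continue  # skip mirrors already known to be bad
        elapsed = probe(url)
        if elapsed is not None:
            timings.append((elapsed, url))
    if not timings:
        raise RuntimeError("no usable mirror in the list")
    # The lowest latency wins; the caller reuses this mirror for every download.
    return min(timings)[1]
```

The choice is made once, so there are no repeated lookups and no mixing of mirrors within one download.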

--instrepo, on the other hand, is only supposed to be used for testing purposes. Testers should provide the *actual* link to a *single* mirror URL they want to use.

And if you're not sure which single mirror to pick, you can do:
    curl -sI $ROUNDROBIN_URL | grep -i '^location:'
to get the URL the roundrobin would redirect you to.
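
For scripted use, roughly the same lookup can be done in Python. This is a sketch under assumptions: redirect_target is a hypothetical helper name, and it sends a single HEAD request without following the redirect, much like the curl command above.

```python
import http.client
from urllib.parse import urlsplit


def redirect_target(url, timeout=10):
    """Send one HEAD request and return the Location header, or None."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        # http.client never follows redirects, so the header is returned as sent.
        return conn.getresponse().getheader("Location")
    finally:
        conn.close()
```

The returned URL is the single mirror address to pass to --instrepo.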

[1] If memory serves, https://mirrors.fedoraproject.org/metalink?repo=fedora-install-$releasever&arch=$basearch was the URL that mdomsch/dgilmore proposed. It doesn't seem to be active as of this writing, though.

Comment 3 Kamil Páral 2012-12-12 10:01:30 UTC
I didn't know that --instrepo is intended just for test purposes. If it is not used for F18 Final and fedup supports mirrorlists, then this is definitely not an issue.