Bug 675590 - Fedora Project's metalink files don't use block checksums
Summary: Fedora Project's metalink files don't use block checksums
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: distribution
Version: rawhide
Hardware: All
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Matt Domsch
QA Contact: Bill Nottingham
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-02-06 21:11 UTC by Andre Robatino
Modified: 2014-03-17 03:26 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-02-07 16:15:16 UTC
Type: ---



Description Andre Robatino 2011-02-06 21:11:33 UTC
Description of problem:
Metalink files can use block checksums to provide BitTorrent-like robustness to direct downloads, so only bad blocks need to be re-downloaded. However, the automatically generated files only contain a single sha256 hash (like the regular checksum files). By generating the files using the "-d sha1pieces" option described in the metalink man page (in addition to "-d sha256"), it would be possible to repair bad downloads, not just detect them. It would make the metalink file bigger, but that's relatively small regardless.
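
To illustrate the difference between detecting and repairing a bad download, here is a minimal Python sketch. It is not how the metalink tool or MirrorManager works internally; the function names (`piece_hashes`, `bad_pieces`), the 4-byte piece length, and the toy data are all assumptions chosen for readability. Real piece lengths would be far larger.

```python
import hashlib

def piece_hashes(data: bytes, piece_len: int) -> list[str]:
    """Return the SHA-1 hex digest of each fixed-size piece of data."""
    return [
        hashlib.sha1(data[i:i + piece_len]).hexdigest()
        for i in range(0, len(data), piece_len)
    ]

def bad_pieces(data: bytes, expected: list[str], piece_len: int) -> list[int]:
    """Return the indexes of pieces whose hash differs from the expected list."""
    actual = piece_hashes(data, piece_len)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]

# Toy "file" of 10 bytes, split into 4-byte pieces (hypothetical values).
original = b"0123456789"
expected = piece_hashes(original, 4)

# Simulate a download with one corrupted byte inside the second piece.
damaged = b"01234X6789"

# A single whole-file hash only says that something is wrong...
whole_file_ok = (
    hashlib.sha256(damaged).hexdigest() == hashlib.sha256(original).hexdigest()
)
# ...while per-piece hashes say which pieces need to be fetched again.
to_refetch = bad_pieces(damaged, expected, 4)

print(whole_file_ok, to_refetch)  # False [1]
```

With only the sha256 hash, the client learns that the file is bad and must fetch all of it again; with the piece list, it would only need to re-request piece 1.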

For example, take a look at

http://mirrors.fedoraproject.org/metalink?path=pub/fedora/linux/releases/14/Fedora/i386/iso/Fedora-14-i386-DVD.iso

Disclaimer: I just started looking into metalink files, so I might be misunderstanding something.

Comment 1 Bill Nottingham 2011-02-07 15:36:08 UTC
Assigning to MM maintainer; cc'ing yum folks for comments on what's supported there.

Comment 2 seth vidal 2011-02-07 15:41:31 UTC
yum doesn't download the entire ISO, so that's out of scope for yum.

For the data yum cares about, I don't think storing block checksums of all packages or all metadata is a good way to provide partial downloads. It would just create a much larger file that must be downloaded every time you check the repo.

Comment 3 Matt Domsch 2011-02-07 16:15:16 UTC
MirrorManager generates its own metalink documents; it doesn't use any stand-alone metalink creator. You are correct: MM doesn't generate per-block SHA pieces for any file, including ISOs. MM also "suggests" a single downloader connection per file (I say "suggests" because it can't force client behavior), rather than encouraging multiple parallel connections to multiple mirrors, each pulling partial files. This is an intentional omission on my part. Treating mirrors as if they were BitTorrent servers, without the mirrors running as BitTorrent seeds, thrashes the mirror buffer cache and increases disk I/O, reducing the mirrors' capacity to serve a larger number of users. See "Bittorrent Considered Harmful" by John Hawley, kernel.org admin.
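
A rough way to check both points in a given metalink document is sketched below in Python. It assumes the Metalink 3.0 layout, where piece hashes live in a `<pieces>` element under `<verification>` and the connection hint is the `maxconnections` attribute on `<resources>`. The thread does not say which mechanism MM uses for its suggestion, so treat that attribute as an assumption. The `SAMPLE` document is hand-written for illustration (placeholder hash, hypothetical mirror URL), not real MirrorManager output, and `summarize` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Metalink 3.0 XML namespace.
NS = {"ml": "http://www.metalinker.org/"}

# Hand-written sample document; the hash value and URL are placeholders.
SAMPLE = """<metalink version="3.0" xmlns="http://www.metalinker.org/">
  <files>
    <file name="example.iso">
      <verification>
        <hash type="sha256">0000</hash>
      </verification>
      <resources maxconnections="1">
        <url protocol="http" type="http">http://mirror.example.org/example.iso</url>
      </resources>
    </file>
  </files>
</metalink>
"""

def summarize(xml_text: str) -> dict:
    """Report, per file, whether piece hashes exist and the connection hint."""
    root = ET.fromstring(xml_text)
    summary = {}
    for f in root.findall("ml:files/ml:file", NS):
        pieces = f.find("ml:verification/ml:pieces", NS)
        resources = f.find("ml:resources", NS)
        summary[f.get("name")] = {
            "has_piece_hashes": pieces is not None,
            "maxconnections": (
                resources.get("maxconnections") if resources is not None else None
            ),
        }
    return summary

result = summarize(SAMPLE)
print(result)
```

For the sample above, the summary shows no piece hashes and a connection hint of "1", which matches the behavior described in this comment. A client remains free to ignore the hint.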

Comment 4 Andre Robatino 2011-02-07 16:45:44 UTC
But does providing block checksums make it more likely that a client will use parallel connections? (I'm ignorant in this area.)

