Red Hat Bugzilla – Bug 146500
concurrent download & install
Last modified: 2014-01-21 17:51:14 EST
Description of problem:
During a network install, the download and installation phases strictly
alternate. Each can take considerable time, during which the other
resource (network or disk) sits relatively idle and is wasted. Installation
would be much faster if the process were pipelined: downloads could proceed
continuously, in a thread independent of installation (rpm -i). This would
keep both the machine and the network busy, as they should be.
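For illustration only, the pipelining idea could be sketched in Python as a producer/consumer pair. This is not yum code: pipelined_install, download and install are hypothetical names, and the two callables stand in for the real network fetch and the rpm -i step.

```python
import queue
import threading

def pipelined_install(packages, download, install):
    """Download packages in one thread while installing them in another.

    `download` and `install` are hypothetical callables supplied by the
    caller; they stand in for the network fetch and the rpm step.
    """
    ready = queue.Queue()
    done = object()  # sentinel marking the end of the download stream

    def downloader():
        try:
            for name in packages:
                ready.put(download(name))  # network-bound step
        finally:
            ready.put(done)  # always unblock the installer, even on error

    worker = threading.Thread(target=downloader)
    worker.start()

    installed = []
    while True:
        item = ready.get()  # blocks until the next download has finished
        if item is done:
            break
        install(item)  # disk-bound step, overlaps with the next download
        installed.append(item)
    worker.join()
    return installed
```

The installer only ever waits when the network is the slower side, and the downloader never waits for the disk, which is the point of the request. The sketch installs in download order and ignores dependencies entirely.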
Yeah, I also thought of that. The only problem is packages with dependencies: either
they should all be installed with rpm --force, or they should only be installed
once their dependencies are there too. The first option would probably be the
fastest, but what if a user tries to run a package before its deps have been
installed?
This would hopefully make yum twice as fast, which is quite a lot, as it sometimes
takes half an hour or so if you haven't run it for a while.
btw, shouldn't the bug be changed from fc3 to fc4, and from normal to high priority?
I don't see how making it high priority makes any sense.
If the downloads are ordered in a suitable dependency order (topological sort),
there may well be points where it is correct to install subsets of the entire
session, without resorting to --nodeps etc.
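As a rough sketch of that ordering, assuming the dependency information is available as a plain mapping (which is not how yum's resolver stores it), Python's standard graphlib module (3.9+) can split a session into batches where each batch depends only on earlier ones. install_batches and the package names are made up for the example.

```python
from graphlib import TopologicalSorter

def install_batches(requires):
    """Group packages into batches that depend only on earlier batches.

    `requires` is a hypothetical mapping of package name -> set of the
    package names it needs; it is not a yum data structure.
    """
    sorter = TopologicalSorter(requires)
    sorter.prepare()  # raises graphlib.CycleError on looping dependencies
    batches = []
    while sorter.is_active():
        batch = sorted(sorter.get_ready())  # everything installable right now
        batches.append(batch)
        sorter.done(*batch)  # mark the batch installed, unlocking dependents
    return batches

batches = install_batches({
    "app": {"libfoo", "libbar"},
    "libfoo": {"libc"},
    "libbar": {"libc"},
    "libc": set(),
})
```

Each batch is a point where a subset of the session could be installed correctly. Packages with looping dependencies would have to be collapsed into a single node before sorting, since prepare() rejects cycles.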
I don't know the guts of yum's resolver, but it must already create an
ordered to-do list. Perhaps an extra field could be added that identifies a
single rpm transaction, numbering from 1 for packages whose transaction
only requires the package itself to be updated (and where other packages don't
require the currently installed version). Do this throughout the list, leaving the harder
ones (like the kernel with its looping dependencies) at a higher value in the list.
Meanwhile (back in Gotham city), a second thread has begun downloading the
lowest numbered packages. Might as well get them - gonna need em in a few
minutes anyway - and the network/disk is idle, while the dependency calculation
goes near max CPU.
Once the dependency list is resolved, the first of the downloaded packages can be
installed (since it had no other deps/reqs that would have been broken). (I
assume that yum performs single rpm transactions... rather than a single command
with all the packages it has worked out as needed for success).
Also, if the user has given the -y "go do it" switch (or scheduled install),
then there is no reason to wait to download any rpm...if the repo says there is
an update and you already have an earlier version installed then yum must
download it, the earlier the better (perhaps even save getting the headers
separately for these directly mentioned packages, since user has already said
"yes mate, stop stuffing around and get on with the job" ;) . Extract the header
from the rpm if need be.
Does the resolver open the rpm database to check the requires etc.? If not, or
if it doesn't block rpm, then yum could, in another thread, install each package
that completes download and doesn't require other package changes (keeping the
disk/cpu revving) while the resolver is doing its thing.
When yum gets to a point where it needs, say, 10 packages to complete a single
update, the rpm process would pause until all rpms in that group are downloaded.
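A minimal sketch of that gating, assuming the transaction groups have already been worked out and are handed in as a mapping. GroupGate is a made-up helper, not part of yum or rpm.

```python
class GroupGate:
    """Release a transaction group only once all of its rpms have arrived.

    `groups` is a hypothetical mapping of group number -> package names
    that must be installed together in one rpm transaction.
    """

    def __init__(self, groups):
        self.pending = {gid: set(pkgs) for gid, pkgs in groups.items()}
        self.members = {gid: sorted(pkgs) for gid, pkgs in groups.items()}
        self.owner = {pkg: gid for gid, pkgs in groups.items() for pkg in pkgs}

    def downloaded(self, pkg):
        """Record a finished download; return the group to install, or None."""
        gid = self.owner[pkg]
        remaining = self.pending[gid]
        remaining.discard(pkg)
        # The group is ready only when nothing in it is still downloading.
        return self.members[gid] if not remaining else None

gate = GroupGate({1: {"bash"}, 2: {"udev", "hal", "initscripts"}})
```

The download thread would call downloaded() after each fetch and pass any returned list to the install side, so single-package groups go through immediately while larger groups wait for their last member.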
This process could even help with rawhide when newly created packages
conflict with other packages. Instead of all users being unable to
successfully complete yum update until the repo is "brought
into line", yum would simply install only the packages whose dependencies
resolve, rather than bailing out entirely (this happened a couple of times in
2006-01, e.g. initscripts v udev v hal). This would be a nice improvement, and would help to
get community test machines more up to date, so that resolvable updated packages are
tested sooner.
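One way to picture the "install what resolves" behaviour is a fixed-point filter over the candidate updates. This is only a sketch under simplifying assumptions: it models missing requirements rather than explicit conflicts, and resolvable_subset, the mapping and the "udev-new" requirement are invented for the example.

```python
def resolvable_subset(requires, provided):
    """Return the candidate packages whose whole dependency chain resolves.

    `requires` maps each candidate package to the set of names it needs;
    `provided` is the set of names already satisfied on the system.
    Both are hypothetical inputs, not yum data structures.
    """
    keep = set(requires)
    changed = True
    while changed:
        changed = False
        for pkg in sorted(keep):
            # A requirement must be met by the system or by a kept candidate.
            if not requires[pkg] <= (keep | provided):
                keep.remove(pkg)  # dropping it may strand its dependents
                changed = True
    return keep

updates = {
    "hal": {"udev-new"},      # needs something the repo does not offer yet
    "initscripts": {"hal"},   # stranded once hal is dropped
    "bash": {"glibc"},
}
installable = resolvable_subset(updates, provided={"glibc"})
```

The loop repeats until no more packages fall out, so anything that depends, directly or indirectly, on a broken package is held back while the rest of the update proceeds.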
To keep the yum CLI simple, concurrent downloading and installing isn't going to
happen (it makes the progress bar UI way too odd). The infrastructure is there and
is used in cases where it helps a lot, such as anaconda.