From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020529

Description of problem:
Package Install takes too long. The activity light isn't "On" all the time for the CD-ROM [or the NIC during a network install]. Also, on machines with small RAM (less than 64MB), Package Install can take hours even for Personal Workstation (531 packages, 1770MB installed).

Independent measurements suggest that Package Install could be faster by a factor of 2 on current machines (1GHz, 512MB, UDMA100) and by a factor of 5 on old machines (90MHz, 32MB, PIO). For example, on a modern machine "time rpm --verify $(rpm -qa)" shows {122sec real, 26sec user, 10sec sys}, while the media check for one 700MB CD-ROM takes 232sec [3MB/sec ==> 7MB/sec decompressed]. So Package Install ought to take about 1770/7 = 253 seconds, while it actually takes 605 seconds. On an old machine, the rpm verify takes {28min27sec real, 6min19sec user, 7min26sec sys}, while the media check takes 6min10sec. So Package Install ought to take around 35 minutes (28.5 + 1770/(1.89*2.3)/60; the CMD640 synchronizes IDE channels, too), but instead it takes 194 minutes.

Observing carefully, the displayed behavior for each package has three phases. During the first phase, the CD-ROM activity light is On but the progress bar shows 0%; this is unpacking the .rpm with gzip. During the second phase, the CD-ROM activity light is Off while the progress bar advances from 0% to 100%; this is unpacking the internal .bz2 with bzip2. During the third phase, the CD-ROM activity light is Off, the progress bar shows 100%, and the harddrive activity light is On; this is %post processing. Then the three phases repeat for the next package. By overlapping the three phases (gzip, bzip2, %post) in a software pipeline, the elapsed time could be reduced nearly to the total time for the longest phase, plus small amounts of setup, teardown, and overhead.
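The three-phase overlap described above can be sketched as a simple queue-connected pipeline. This sketch uses threads and queues purely for illustration (the report proposes fork()-isolated processes, but the dataflow is identical); the stage work callables and package names are assumptions, not anaconda/rpm code.

```python
import queue
import threading

def stage(work, inbox, outbox):
    # One pipeline stage: phase N of package i runs concurrently with
    # phase N-1 of package i+1, so no stage sits idle between packages.
    for pkg in iter(inbox.get, None):
        work(pkg)            # e.g. gunzip the .rpm, bunzip2 the payload, or run %post
        outbox.put(pkg)
    outbox.put(None)         # propagate the shutdown sentinel downstream

def run_pipeline(packages, works):
    # One queue between each pair of adjacent stages, plus a final
    # queue that collects finished packages in completion order.
    queues = [queue.Queue() for _ in range(len(works) + 1)]
    workers = [threading.Thread(target=stage, args=(w, queues[i], queues[i + 1]))
               for i, w in enumerate(works)]
    for t in workers:
        t.start()
    for pkg in packages:
        queues[0].put(pkg)
    queues[0].put(None)
    done = list(iter(queues[-1].get, None))
    for t in workers:
        t.join()
    return done
```

With three stages (gzip, bzip2, %post) the elapsed time approaches the duration of the longest stage rather than the sum of all three.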
Most current code can be reused by isolating each phase with fork() [and no exec()] while adding a small amount of code for pipeline supervision. On machines with small RAM (less than 64MB), the installer appears to suffer significant demand-paging activity [/proc/meminfo shows Committed_AS of 61MB, swap used of 52MB], which should be reduced by keeping the memory footprint smaller than available RAM. Be sure to use files instead of memory arrays for storage with sequential access, such as the list of packages after topological sorting, and for unpacking compressed data (both gzip and bzip2). Wrap significant isolatable tasks with fork()+wait(), which guarantees that the parent returns to its pre-fork() footprint when the child terminates. Split the list of packages into multiple RPM transactions (one transaction per feature group; possibly even one transaction per leaf package or "knot" of interdependent packages).

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Install 2nd limbo beta, Personal Workstation.
2. Time the media check for one CD-ROM, and the Package Install.
3. $ time rpm --verify $(rpm -qa)

Actual Results:
Modern machine: 232sec media check, 10 minutes Package Install, 2 minutes rpm verify.
Old machine: 370sec media check, 194 minutes Package Install, 28.5 minutes rpm verify.

Expected Results:
Less than 5 minutes Package Install on the modern machine (2X faster); 35 to 40 minutes Package Install on the old machine (5X faster).

Additional info:
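The fork()+wait() isolation proposed above can be sketched as follows. The child does the memory-hungry work and exits, so the parent's footprint is unchanged afterwards; the task callable is an illustrative assumption, and this is POSIX-only.

```python
import os

def run_isolated(task, *args):
    # Run task(*args) in a forked child and wait for it, so any memory
    # the task allocates is returned to the OS when the child exits and
    # the parent keeps its pre-fork() footprint.
    pid = os.fork()
    if pid == 0:
        # Child: do the work, then exit without running parent cleanup.
        try:
            task(*args)       # e.g. unpack one package
            os._exit(0)
        except BaseException:
            os._exit(1)
    # Parent: block until the child terminates.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0
```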
Too late to make any large changes for this release
This appears to be still relevant
After 4 years, the minimum RAM on a box has grown from 64MB to 192MB or 256MB, and journalled ext3 [twice the disk traffic for writes] has almost always replaced ext2. Therefore, the first stage of the pipeline should unpack each .rpm into tmpfs residing in 50% of RAM (/dev/shm). [Every .rpm except openoffice.org-core can be unpacked into 90MB.] Keep the installer memory footprint below 50% of RAM to avoid paging. gconftool-2 is a CPU hog during install or update, taking several seconds of CPU time per invocation on a GHz machine. Most GNOME packages invoke gconftool-2, and some use it more than once.
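A minimal sketch of the tmpfs staging step suggested above: the first pipeline stage copies each package off the slow, seek-bound medium into /dev/shm (tmpfs, limited by default to 50% of RAM), and the space is released before the next package so the staging area stays inside the ~90MB budget noted in the comment. The helper names and the staging path parameter are assumptions.

```python
import os
import shutil

def stage_rpm(rpm_path, staging="/dev/shm/install-staging"):
    # First pipeline stage: pull the .rpm off the CD/DVD or network into
    # tmpfs, so the bzip2 and %post stages read at RAM speed.
    os.makedirs(staging, exist_ok=True)
    dest = os.path.join(staging, os.path.basename(rpm_path))
    shutil.copy(rpm_path, dest)
    return dest

def release(dest):
    # Free the tmpfs space before staging the next package, keeping the
    # installer footprint below 50% of RAM.
    os.remove(dest)
```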
The RULE Project (Run Up2date Linux Everywhere) has an installer "slinky" that can put FC5 onto a machine with only 32MB RAM. The installer lives at http://www.fzk.at/SLINKY/ ; the main project web site http://www.rule-project.org/ has been having trouble with hardware+administration for some time.
Platter placement optimization: The average seek on a CD or DVD drive takes about 150 milliseconds (15 times slower than a harddrive), so 600 packages cost 90 seconds if each package requires one average seek. Most drives have a read-ahead buffer of 1MB or 2MB. Tune platter placement to minimize seeks and maximize buffer effectiveness.

Of course, the two most important global strategies [driven mostly by human-interface considerations] are: load each platter at most once during any install, and minimize the number of platters required by each pre-grouped set of packages [Internet, Personal Productivity, Software Development, Server].

After the assignment of packages to platters has been chosen using the global strategies, here is a local strategy for placement+install which tries to minimize install time. Topologically sort the packages on a given platter by dependencies among the packages on that platter. Gather all the packages with no predecessors, sort them by ascending on-platter size, and place them sequentially on the platter in that order. Now a new subset of un-placed packages has no remaining predecessors. Repeat the placement process for this and following subsets. The runtime install strategy parallels the placement. Topological sorting minimizes seeks and transaction overhead for rpm. Sorting by size within each batch tends to maximize effectiveness of the read-ahead buffer, even though any particular install may skip some packages. The ISO9660 filesystem may require "funky" names in TRANS.TBL in order to allow non-alphabetical placement of packages.

For network installs, use two logical threads to fetch packages. Within a subset of packages to be installed that has no un-fetched predecessors, one thread fetches by ascending order of size, and the other thread fetches by descending order of size; they meet in the middle.
The fetch by descending size tends to increase average network utilization that would otherwise be lost due to setup latencies for the smaller packages.
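The batched placement strategy above can be sketched directly: peel off the subset of packages whose dependencies have all been placed, lay that subset down in ascending size order, and repeat. Here deps maps each package to the set of same-platter packages it depends on, and sizes maps each package to its on-platter size; all names and numbers are illustrative.

```python
def platter_order(deps, sizes):
    # deps: package -> set of packages it depends on (same platter only)
    # sizes: package -> on-platter size in bytes
    remaining = {p: set(d) for p, d in deps.items()}
    placed, order = set(), []
    while remaining:
        # Batch = every package whose predecessors are all already placed.
        batch = [p for p, d in remaining.items() if d <= placed]
        if not batch:
            raise ValueError("dependency cycle among remaining packages")
        # Ascending size within the batch favors the drive's read-ahead
        # buffer even when an install skips some packages.
        batch.sort(key=lambda p: sizes[p])
        order.extend(batch)
        placed.update(batch)
        for p in batch:
            del remaining[p]
    return order
```

Installing in this same order at runtime keeps the laser moving forward across the platter, one batch per RPM transaction.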
Jeremy, any comments on this issue? It's been open for more than 5 years already. I'd really like to see huge speed improvements in installation and upgrades.
I'd like to contribute to the code, and I'd appreciate some help with getting started. Please suggest a good develop+debug setup for working on this (for instance, how to debug without requiring a bare machine) and point out exactly where in the source is the loop which chooses the next package and installs it. Thanks.
I think that yum is the correct component here, not anaconda.
John, can you please bring this issue to the yum list and the anaconda-devel list and request comments and feedback? I'm all for speeding up the installer (and upgrades), but you need to reach out to the relevant parties in order to get this going. Also, yum could use threaded downloads (not sure if mentioned before). IIRC, Debian's apt-get is able to download using several threads.
This isn't going to be implemented solely in anaconda. If someone is interested in working on it, we can have a discussion on anaconda-devel-list, but otherwise it's not likely to really happen.