+++ This bug was initially created as a clone of Bug #720481 +++

Description of problem:
Anaconda fails if a yum repo is updated with new packages and metadata while a system is installing from that repo. For example, if a kickstart file contains

repo --name=rhel6.1 --baseurl=http://www.example.com/rhel6.1/
repo --name=custom --baseurl=http://www.example.com/custom/
%packages
@Base
custom-pkg1
custom-pkg2
custom-pkg3
...

And, while a system is installing, the admin adds or updates a new package to the 'custom' repo and runs 'createrepo', Anaconda on that system will fail with:

NoMoreMirrorsRepoError: failure: repodata/filelists.sqlite.bz2 from custom: [Errno 256] No more mirrors to try.

Full traceback is below. Please update Anaconda to handle this situation.

Version-Release number of selected component (if applicable):
anaconda-13.21.117-1.el6

How reproducible:
every time

Steps to Reproduce:
1. kickstart a RHEL 6.1 system with a custom repo
2. update the repo with a new package

Actual results:
anaconda traceback

Expected results:
anaconda reloads the yum repo metadata and continues installing

Additional info:
Anaconda logs:

19:24:25,752 WARNING : Try 1/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:26,582 WARNING : Try 2/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:27,347 WARNING : Try 3/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:28,804 WARNING : Try 4/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:30,842 WARNING : Try 5/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:34,885 WARNING : Try 6/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:43,188 WARNING : Try 7/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:59,242 WARNING : Try 8/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:25:31,510 WARNING : Try 9/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:26:35,822 WARNING : Try 10/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:26:35,822 WARNING : Failed to get http://www.example.com/custom/repodata/filelists.sqlite.bz2 from mirror 1/1, or downloaded file is corrupt
19:26:35,928 INFO : Running kickstart %%traceback script(s)
19:26:35,928 INFO : All kickstart %%traceback script(s) have been run
19:26:35,933 CRITICAL: anaconda 13.21.117 exception report
Traceback (most recent call first):
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 842, in _getFile
    raise Errors.NoMoreMirrorsRepoError, errstr
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1607, in _retrieveMD
    size=thisdata.size)
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 172, in populate
    db_fn = repo._retrieveMD(mydbtype)
  File "/usr/lib/python2.6/site-packages/yum/sqlitesack.py", line 294, in _loadFiles
    self.sack.populate(self.repo, mdtype='filelists')
  File "/usr/lib/python2.6/site-packages/yum/sqlitesack.py", line 372, in returnFileTypes
    self._loadFiles()
  File "/usr/lib/anaconda/yuminstall.py", line 993, in getDownloadPkgs
    for filetype in txmbr.po.returnFileTypes():
  File "/usr/lib/anaconda/yuminstall.py", line 1603, in doPostSelection
    (self.dlpkgs, self.totalSize, self.totalFiles) = self.ayum.getDownloadPkgs()
  File "/usr/lib/anaconda/backend.py", line 233, in doPostSelection
    return anaconda.backend.doPostSelection(anaconda)
  File "/usr/lib/anaconda/dispatch.py", line 208, in moveStep
    rc = stepFunc(self.anaconda)
  File "/usr/lib/anaconda/dispatch.py", line 126, in gotoNext
    self.moveStep()
  File "/usr/lib/anaconda/dispatch.py", line 231, in currentStep
    self.gotoNext()
  File "/usr/lib/anaconda/text.py", line 601, in run
    (step, instance) = anaconda.dispatch.currentStep()
  File "/usr/bin/anaconda", line 1116, in <module>
    anaconda.intf.run(anaconda)
NoMoreMirrorsRepoError: failure: repodata/filelists.sqlite.bz2 from custom: [Errno 256] No more mirrors to try.

--- Additional comment from jbastian on 2011-07-11 14:14:40 CDT ---

This problem was hit internally with early RHEL 6.0 alpha testing and reported in bug 527345 which was closed not-a-bug.

--- Additional comment from jbastian on 2011-07-11 17:22:44 CDT ---

Created attachment 512300 [details]
script to build RPMs and update yum repo

This script generates 1024 RPMs with a 10 second pause between each build. With each package, the script copies it into /var/www/html/custom-repo and then runs createrepo.

--- Additional comment from jbastian on 2011-07-11 17:26:22 CDT ---

Created attachment 512302 [details]
kickstart file to use custom repo

Install a system using this kickstart file while the custom repo is updating with the script from comment 2. Be sure to change www.example.com to your system's hostname in the kickstart file and in the virt-install instructions below.

Starting the rpm generation script:

[rpmbuild@termite ~]$ ./rapid-rebuild.sh
Starting with fresh /var/www/html/custom-repo
Building packages
Building foo-conf-1.0-1.noarch.rpm
Copying package to /var/www/html/custom-repo
Updating repo metadata
...

Meanwhile, fire up a virtual machine:

virt-install --name=rhel61-custom-repo --ram=768 --vcpus=1 \
    --os-type=linux --os-variant=rhel6 \
    --disk=path=/var/lib/libvirt/images/rhel61-custom-repo.img,size=10,bus=virtio \
    --network=bridge=br0,model=virtio \
    --graphics=vnc --noautoconsole \
    --location=http://www.example.com/RHEL-6/6.1/Server/x86_64/os/ \
    --extra-args="ks=http://www.example.com/kickstart/rhel61-custom-repo.ks"

The installation should fail as described above.

--- Additional comment from jbastian on 2011-07-22 15:32:36 CDT ---

We met with the customer this morning and he brought up two specific problems that, if resolved, would go a long way towards fixing this problem.

1. If the checksum on the repo metadata doesn't match, Anaconda assumes there must have been a transmission error and it attempts to re-download the metadata up to 10 times. From anaconda.log:

19:24:25,752 WARNING : Try 1/10 for http://www.example.com/repo/custom-repo/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
...
19:26:35,822 WARNING : Try 10/10 for http://www.example.com/repo/custom-repo/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum

Transmission errors are very unlikely with TCP and modern network gear. If the checksum doesn't match, it's far more likely that the yum repo has changed, so Anaconda should attempt to re-download all the metadata, not just the one filelists.sqlite.bz2 file.

(Note: They do keep old versions of packages in the repo, so it's safe to assume that packages haven't been yanked out from underneath Anaconda. It's just that new packages have been added and thus the checksums and metadata have also changed.)

2. Download all the yum metadata at once. The logs show that repomd.xml is downloaded first, then other activity (e.g., looking at DUD for module updates), and finally the rest of the metadata. If this window can be shortened, it's less likely that the checksums won't match.

From anaconda.log:

19:23:47,590 INFO : added repository custom-repo with URL http://www.example.com/repo/custom-repo
...
19:23:47,613 DEBUG : Grabbing http://www.example.com/mirror/rhel6-x86_64/repodata/repomd.xml
...
19:24:02,128 DEBUG : Checking for DUD module /lib/modules/2.6.32-131.0.15.el6.x86_64/kernel/fs/ext4/ext4.ko.gz
...
19:24:25,752 WARNING : Try 1/10 for http://www.example.com/repo/custom-repo/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum

As you can see, there are 38 seconds between the time it adds the custom-repo repo and the time it finally downloads the filelists.sqlite.bz2 for that repo. If it grabbed the filelists metadata immediately when adding the repo, there would be a much lower chance of the metadata changing due to a repo update in that 38 second window.

Note 1: I believe this is a simple race condition, so there's always a small chance of hitting the problem, but reducing the window from 38 seconds to just 1 or 2 seconds should significantly reduce the odds of hitting it.

Note 2: the repomd.xml file in the log above is for the base OS repo, whereas the problematic filelists.sqlite.bz2 is from their custom-repo repo. I'm not sure why the logs do not contain the "Grabbing repomd.xml" line for the custom-repo repo.

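To make point 1 concrete, here is a minimal Python sketch of "refresh the index on a checksum mismatch" instead of retrying the same file against a stale index. It is not yum or Anaconda code. The function name fetch_verified, the two callables, and the dict standing in for repomd.xml are all hypothetical, and SHA-256 is assumed only for illustration.

```python
import hashlib


def fetch_verified(index, fetch_index, fetch_file, name, max_refreshes=3):
    """Return (data, refreshes) for metadata file `name`.

    index       -- dict mapping file name to expected SHA-256 hex digest;
                   a simplified, hypothetical stand-in for repomd.xml
    fetch_index -- callable returning a fresh copy of that dict
    fetch_file  -- callable taking a name and returning the file's bytes
    """
    refreshes = 0
    while True:
        data = fetch_file(name)
        if hashlib.sha256(data).hexdigest() == index.get(name):
            return data, refreshes
        if refreshes == max_refreshes:
            raise RuntimeError(
                "%s does not match checksum after %d index refreshes"
                % (name, refreshes))
        # A mismatch most likely means the repo changed under us, so
        # re-read the index rather than re-downloading the same file.
        index = fetch_index()
        refreshes += 1
```

With a stale index and an updated server, this converges after one refresh rather than burning ten identical retries. It still gives up after max_refreshes, so a repo that is rebuilt every few seconds can keep losing the race.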
Since all metadata parsing and repo handling is done in Yum, I suppose the Yum people should decide whether they want to handle this situation. My opinion is that if you are recreating a repository, you should do it in a different directory and then do two atomic renames (I know this is only somewhat useful when fetching over HTTP, but still...).

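A rough Python sketch of the two-rename idea, under some assumptions: the new repo has been fully built in a staging directory on the same filesystem as the live one, and the names publish_repo and the ".old" backup suffix are made up for this example.

```python
import os
import shutil
import tempfile


def publish_repo(staged_dir, live_dir):
    """Swap a fully built staging directory into place with two renames.

    Returns the path where the previous live directory was moved.
    """
    live_dir = live_dir.rstrip(os.sep)
    backup_dir = live_dir + ".old"          # hypothetical naming choice
    if os.path.isdir(backup_dir):
        shutil.rmtree(backup_dir)           # drop the backup from the last swap
    os.rename(live_dir, backup_dir)         # rename 1: live -> backup
    os.rename(staged_dir, live_dir)         # rename 2: staged -> live
    return backup_dir


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    live = os.path.join(root, "custom")
    staged = os.path.join(root, "custom.new")
    os.makedirs(os.path.join(live, "repodata"))
    os.makedirs(os.path.join(staged, "repodata"))
    print(publish_repo(staged, live))
```

Each rename is atomic on its own, but there is a brief moment between the two when the live path does not exist. A client that already fetched the old repomd.xml will also still see mismatching files afterwards, which is the "only somewhat useful" caveat above.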
I've added an option to createrepo upstream for this:

--retain-old-md=RETAIN_OLD_MD
    keep around the latest (by timestamp) N copies of the old repodata

So if you are updating a repo, createrepo can keep around the last N copies of the primary, filelists and changelog metadata. Situations like this become less likely because the files a client is looking for will still exist. The packages they refer to may no longer exist, but the metadata download won't abort.

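For illustration only, the retention rule described by that help text ("latest by timestamp, N copies") could look roughly like the Python below. This is not createrepo's implementation. The function prune_old_md is hypothetical, and it assumes each metadata generation gets a distinct file name, with made-up names in the demo.

```python
import os
import tempfile


def prune_old_md(repodata_dir, suffix, current, retain):
    """Delete old metadata files ending in `suffix`, keeping the `retain`
    most recently modified ones plus the `current` file.

    Returns the sorted names of the files that were removed.
    """
    old = [os.path.join(repodata_dir, n)
           for n in os.listdir(repodata_dir)
           if n.endswith(suffix) and n != current]
    old.sort(key=os.path.getmtime, reverse=True)   # newest first
    doomed = old[retain:]
    for path in doomed:
        os.remove(path)
    return sorted(os.path.basename(p) for p in doomed)


if __name__ == "__main__":
    d = tempfile.mkdtemp()
    for i, tag in enumerate(["aaa", "bbb", "ccc", "ddd"]):
        path = os.path.join(d, tag + "-filelists.sqlite.bz2")
        open(path, "w").close()
        os.utime(path, (1000 + i, 1000 + i))       # fake, increasing mtimes
    print(prune_old_md(d, "filelists.sqlite.bz2",
                       "ddd-filelists.sqlite.bz2", retain=2))
```

A client holding a slightly stale repomd.xml can then still find the files it references, as long as no more than N regenerations have happened in the meantime.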
We could maybe just close this out as NaB, or we could change anaconda to set mdpolicy = "group:all" so this can't happen (for the metadata, at least). Or we could rely on the createrepo change (and maybe even change its default to keep around the last 2 copies or so).