Bug 728323 - anaconda fails if yum repo changes during installation
Summary: anaconda fails if yum repo changes during installation
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: anaconda
Version: rawhide
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Anaconda Maintenance Team
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On: 720481 814099
Blocks:
Reported: 2011-08-04 17:28 UTC by Jeff Bastian
Modified: 2014-01-21 23:18 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 720481
Environment:
Last Closed: 2012-04-20 18:14:07 UTC
Type: ---
Embargoed:



Description Jeff Bastian 2011-08-04 17:28:07 UTC
+++ This bug was initially created as a clone of Bug #720481 +++

Description of problem:
Anaconda fails if a yum repo is updated with new packages and metadata while a system is installing from that repo.

For example, if a kickstart file contains
  repo --name=rhel6.1 --baseurl=http://www.example.com/rhel6.1/
  repo --name=custom --baseurl=http://www.example.com/custom/

  %packages
  @Base
  custom-pkg1
  custom-pkg2
  custom-pkg3
  ...

If, while a system is installing, the admin adds a new package to the 'custom' repo (or updates an existing one) and runs 'createrepo', Anaconda on that system will fail with:

NoMoreMirrorsRepoError: failure: repodata/filelists.sqlite.bz2 from custom: [Errno 256] No more mirrors to try.

Full traceback is below.

Please update Anaconda to handle this situation.


Version-Release number of selected component (if applicable):
anaconda-13.21.117-1.el6

How reproducible:
every time

Steps to Reproduce:
1. kickstart a RHEL 6.1 system with a custom repo
2. update the repo with a new package
  
Actual results:
anaconda traceback

Expected results:
anaconda reloads the yum repo metadata and continues installing

Additional info:

Anaconda logs:
19:24:25,752 WARNING : Try 1/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:26,582 WARNING : Try 2/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:27,347 WARNING : Try 3/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:28,804 WARNING : Try 4/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:30,842 WARNING : Try 5/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:34,885 WARNING : Try 6/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:43,188 WARNING : Try 7/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:24:59,242 WARNING : Try 8/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:25:31,510 WARNING : Try 9/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:26:35,822 WARNING : Try 10/10 for http://www.example.com/custom/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
19:26:35,822 WARNING : Failed to get http://www.example.com/custom/repodata/filelists.sqlite.bz2 from mirror 1/1, or downloaded file is corrupt
19:26:35,928 INFO    : Running kickstart %%traceback script(s)
19:26:35,928 INFO    : All kickstart %%traceback script(s) have been run
19:26:35,933 CRITICAL: anaconda 13.21.117 exception report
Traceback (most recent call first):
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 842, in _getFile
    raise Errors.NoMoreMirrorsRepoError, errstr
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 1607, in _retrieveMD
    size=thisdata.size)
  File "/usr/lib/python2.6/site-packages/yum/yumRepo.py", line 172, in populate
    db_fn = repo._retrieveMD(mydbtype)
  File "/usr/lib/python2.6/site-packages/yum/sqlitesack.py", line 294, in _loadFiles
    self.sack.populate(self.repo, mdtype='filelists')
  File "/usr/lib/python2.6/site-packages/yum/sqlitesack.py", line 372, in returnFileTypes
    self._loadFiles()
  File "/usr/lib/anaconda/yuminstall.py", line 993, in getDownloadPkgs
    for filetype in txmbr.po.returnFileTypes():
  File "/usr/lib/anaconda/yuminstall.py", line 1603, in doPostSelection
    (self.dlpkgs, self.totalSize, self.totalFiles)  = self.ayum.getDownloadPkgs()
  File "/usr/lib/anaconda/backend.py", line 233, in doPostSelection
    return anaconda.backend.doPostSelection(anaconda)
  File "/usr/lib/anaconda/dispatch.py", line 208, in moveStep
    rc = stepFunc(self.anaconda)
  File "/usr/lib/anaconda/dispatch.py", line 126, in gotoNext
    self.moveStep()
  File "/usr/lib/anaconda/dispatch.py", line 231, in currentStep
    self.gotoNext()
  File "/usr/lib/anaconda/text.py", line 601, in run
    (step, instance) = anaconda.dispatch.currentStep()
  File "/usr/bin/anaconda", line 1116, in <module>
    anaconda.intf.run(anaconda)
NoMoreMirrorsRepoError: failure: repodata/filelists.sqlite.bz2 from custom: [Errno 256] No more mirrors to try.

--- Additional comment from jbastian on 2011-07-11 14:14:40 CDT ---

This problem was hit internally with early RHEL 6.0 alpha testing and reported in bug 527345 which was closed not-a-bug.

--- Additional comment from jbastian on 2011-07-11 17:22:44 CDT ---

Created attachment 512300
script to build RPMs and update yum repo

This script generates 1024 RPMs with a 10-second pause between builds.  After each build, the script copies the package into /var/www/html/custom-repo and then runs createrepo.

--- Additional comment from jbastian on 2011-07-11 17:26:22 CDT ---

Created attachment 512302
kickstart file to use custom repo

Install a system using this kickstart file while the custom repo is updating with the script from comment 2.

Be sure to change www.example.com to your system's hostname in the kickstart file and in the virt-install instructions below.

Starting the rpm generation script:

[rpmbuild@termite ~]$ ./rapid-rebuild.sh 
Starting with fresh /var/www/html/custom-repo
Building packages
Building foo-conf-1.0-1.noarch.rpm
Copying package to /var/www/html/custom-repo
Updating repo metadata
...

Meanwhile, fire up a virtual machine:

virt-install --name=rhel61-custom-repo --ram=768 --vcpus=1 \
  --os-type=linux --os-variant=rhel6 \
  --disk=path=/var/lib/libvirt/images/rhel61-custom-repo.img,size=10,bus=virtio \
  --network=bridge=br0,model=virtio \
  --graphics=vnc --noautoconsole \
  --location=http://www.example.com/RHEL-6/6.1/Server/x86_64/os/ \
  --extra-args="ks=http://www.example.com/kickstart/rhel61-custom-repo.ks"

The installation should fail as described above.


--- Additional comment from jbastian on 2011-07-22 15:32:36 CDT ---

We met with the customer this morning and he brought up two specific
issues that, if resolved, would go a long way towards fixing this problem.

1. If the checksum on the repo metadata doesn't match, Anaconda assumes there
   must have been a transmission error and attempts to re-download the
   metadata up to 10 times.

   From anaconda.log:
19:24:25,752 WARNING : Try 1/10 for http://www.example.com/repo/custom-repo/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum
...
19:26:35,822 WARNING : Try 10/10 for http://www.example.com/repo/custom-repo/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum

   Transmission errors are very unlikely with TCP and modern network gear.
   If the checksum doesn't match, it's far more likely that the yum repo
   has changed, so Anaconda should re-download all the metadata, not just
   the one filelists.sqlite.bz2 file.

   Note: They do keep old versions of packages in the repo, so it's safe to
   assume that packages haven't been yanked out from underneath Anaconda.
   It's just that new packages have been added and thus the checksums and
   metadata have also changed.

2. Download all the yum metadata at once.  The logs show that repomd.xml is
   downloaded first, then other activity (e.g., looking at DUD for module
   updates), and finally the rest of the metadata.  If this window can be
   shortened, it's less likely that the checksums won't match.

   From anaconda.log:
19:23:47,590 INFO    : added repository custom-repo with URL http://www.example.com/repo/custom-repo
...
19:23:47,613 DEBUG   : Grabbing  http://www.example.com/mirror/rhel6-x86_64/repodata/repomd.xml
...
19:24:02,128 DEBUG   : Checking for DUD module /lib/modules/2.6.32-131.0.15.el6.x86_64/kernel/fs/ext4/ext4.ko.gz
...
19:24:25,752 WARNING : Try 1/10 for http://www.example.com/repo/custom-repo/repodata/filelists.sqlite.bz2 failed: [Errno -1] Metadata file does not match checksum

   As you can see, there are 38 seconds between the time it adds the
   custom-repo repo and the time it finally downloads filelists.sqlite.bz2
   for that repo.  If it grabbed the filelists metadata immediately when it
   added the repo, there would be a much lower chance of the metadata
   changing due to a repo update in that 38-second window.

   Note 1: I believe this is a simple race-condition, so there's always a
   small chance of hitting the problem, but reducing the window from 38
   seconds to just 1 or 2 seconds should significantly reduce the odds of
   hitting it.

   Note 2: the repomd.xml file is for the base OS repo, whereas the
   problematic filelists.sqlite.bz2 is from their custom-repo repo.
   I'm not sure why the logs do not contain the "Grabbing repomd.xml" for the
   custom-repo repo.
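
The behaviour requested in item 1 can be sketched roughly as follows. This is only an illustration of the idea, not yum or Anaconda code: fetch_index and fetch_file are hypothetical callables standing in for "download and parse repomd.xml" and "download one metadata file", and max_refreshes is an assumed limit. On a checksum mismatch the loop re-reads the index and starts over instead of retrying the same file against a stale checksum.

```python
import hashlib


class ChecksumMismatch(Exception):
    """Raised when a downloaded metadata file does not match the index."""


def fetch_metadata(fetch_index, fetch_file, names, max_refreshes=3):
    """Fetch every file in `names`, restarting from a fresh index on mismatch.

    fetch_index() -> dict mapping file name to expected sha256 hex digest
                     (hypothetical stand-in for reading repomd.xml)
    fetch_file(name) -> bytes (hypothetical stand-in for an HTTP GET)
    """
    for _ in range(max_refreshes):
        expected = fetch_index()  # re-read the index on every round
        files = {}
        try:
            for name in names:
                data = fetch_file(name)
                if hashlib.sha256(data).hexdigest() != expected[name]:
                    raise ChecksumMismatch(name)
                files[name] = data
        except ChecksumMismatch:
            continue  # the repo probably changed: refresh everything
        return files
    raise ChecksumMismatch(
        "metadata kept changing after %d refreshes" % max_refreshes)
```

Because the index is re-read at the start of each round, a repo that was updated once during the install is picked up on the second round rather than failing after 10 identical retries.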

Comment 1 Martin Sivák 2011-08-11 11:03:53 UTC
Since all metadata parsing and repo handling is done in Yum, I suppose Yum people should decide if they want to handle this situation.

My opinion is that if you are recreating a repository, you should do it in a different directory and then do two atomic renames (I know it is only somewhat useful while fetching through http, but still...).
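
A minimal sketch of the staging-directory approach described above, assuming the repo lives on a local filesystem. The function name publish_repodata, the repodata.old directory name, and the build_metadata callback are hypothetical; in practice the callback might run createrepo with a separate output directory.

```python
import os
import shutil
import tempfile


def publish_repodata(repo_dir, build_metadata):
    """Build new metadata in a staging directory, then swap it into place.

    build_metadata(staging_dir) is a caller-supplied function that writes
    the new repodata files into staging_dir.
    """
    live = os.path.join(repo_dir, "repodata")
    old = os.path.join(repo_dir, "repodata.old")

    # Stage inside repo_dir so both renames stay on one filesystem.
    staging = tempfile.mkdtemp(prefix="repodata.new.", dir=repo_dir)
    os.chmod(staging, 0o755)  # mkdtemp creates the directory as 0700
    build_metadata(staging)

    if os.path.isdir(old):
        shutil.rmtree(old)    # drop the copy kept from the previous run
    if os.path.isdir(live):
        os.rename(live, old)  # rename 1: move current metadata aside
    os.rename(staging, live)  # rename 2: expose the new metadata
```

Each rename is atomic on its own, but there is still a short gap between the two, and a client that already fetched the old repomd.xml will still see changed files afterwards. This narrows the window rather than closing it.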

Comment 2 seth vidal 2011-08-11 15:01:14 UTC
I've added code to createrepo upstream which does this:


--retain-old-md=RETAIN_OLD_MD
                        keep around the latest (by timestamp) N copies of the
                        old repodata


So if you are updating a repo, we can keep around the last N copies of the primary, filelists and changelog metadata, so situations like this will be less likely to happen since the files will still exist.

Now - the pkgs they refer to may not exist anymore but the metadata won't abort.
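
To illustrate the retention idea only (this is not createrepo's actual code), here is a small sketch that keeps the newest N old metadata files by timestamp. The function name and the file names in the usage example are made up for illustration.

```python
def split_by_retention(entries, keep):
    """Split old metadata files into those to retain and those to expire.

    entries: iterable of (name, mtime) pairs for old metadata files
    keep:    number of most recent files to retain
    Returns (retained, expired) as lists of names, newest first.
    """
    ordered = sorted(entries, key=lambda entry: entry[1], reverse=True)
    retained = [name for name, _ in ordered[:keep]]
    expired = [name for name, _ in ordered[keep:]]
    return retained, expired


# Usage with hypothetical file names and timestamps:
old_files = [
    ("a1-filelists.sqlite.bz2", 100),
    ("b2-filelists.sqlite.bz2", 300),
    ("c3-filelists.sqlite.bz2", 200),
]
retained, expired = split_by_retention(old_files, keep=2)
```

With keep=2 the two newest files stay on the server, so a client holding a slightly stale repomd.xml can still download the file it expects.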

Comment 3 James Antill 2012-04-20 16:17:12 UTC
We could maybe just close this out as NaB ... or we could change anaconda to set mdpolicy = "group:all", so this can't happen (at least for the MD). Or we could rely on the createrepo change (and maybe even change its default to keep around the last 2 copies or so).
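
As a rough picture of what fetching all the metadata up front involves, the sketch below lists every metadata file that a repomd.xml refers to, so that all of them could be downloaded right after the index. It is not how yum implements mdpolicy; the function name is hypothetical and the checksums in SAMPLE are placeholders.

```python
import xml.etree.ElementTree as ET

# XML namespace used by repomd.xml.
REPO_NS = "{http://linux.duke.edu/metadata/repo}"


def metadata_locations(repomd_xml):
    """Return {type: (href, checksum)} for every <data> entry."""
    root = ET.fromstring(repomd_xml)
    result = {}
    for data in root.findall(REPO_NS + "data"):
        href = data.find(REPO_NS + "location").get("href")
        checksum = data.find(REPO_NS + "checksum").text
        result[data.get("type")] = (href, checksum)
    return result


# Trimmed example document; the checksum values are placeholders.
SAMPLE = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <checksum type="sha256">PLACEHOLDER-1</checksum>
    <location href="repodata/primary.xml.gz"/>
  </data>
  <data type="filelists_db">
    <checksum type="sha256">PLACEHOLDER-2</checksum>
    <location href="repodata/filelists.sqlite.bz2"/>
  </data>
</repomd>"""

for md_type, (href, checksum) in sorted(metadata_locations(SAMPLE).items()):
    print(md_type, href, checksum)
```

Downloading every listed file immediately after repomd.xml would shrink the window in which a repo update can invalidate the checksums.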

