Bug 923951 - yum is getting stuck in urlgrabber
Summary: yum is getting stuck in urlgrabber
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: python-urlgrabber
Version: 19
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Packaging Maintenance Team
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker
: 923459 924178 928985 (view as bug list)
Depends On:
Blocks: F19Alpha, F19AlphaBlocker
TreeView+ depends on / blocked
 
Reported: 2013-03-20 18:58 UTC by Brian Lane
Modified: 2013-04-02 15:47 UTC (History)
14 users (show)

Fixed In Version: python-urlgrabber-3.9.1-26.fc19
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-04-02 15:47:06 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)

Description Brian Lane 2013-03-20 18:58:42 UTC
While trying to complete an install using the anaconda-19.12 boot iso from http://alt.fedoraproject.org/pub/alt/anaconda/ I am hitting a problem with it getting stuck retrieving files from the network.

The iso was built with:
  yum-3.4.3-77
  python-pycurl-7.19.0-15
  python-urlgrabber-3.9.1-25

The backtrace of the stuck thread looks like this:


Thread 140700039259904
  File "/usr/lib64/python2.7/threading.py", line 524, in __bootstrap
    self.__bootstrap_inner()
  File "/usr/lib64/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/tmp/updates/pyanaconda/threads.py", line 131, in run
    threading.Thread.run(self, *args, **kwargs)
  File "/usr/lib64/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/tmp/updates/pyanaconda/ui/gui/spokes/source.py", line 419, in getRepoMetadata
    self.payload.updateBaseRepo(fallback=False, checkmount=False)
  File "/tmp/updates/pyanaconda/packaging/yumpayload.py", line 461, in updateBaseRepo
    self._configureBaseRepo(self.storage, checkmount=checkmount)
  File "/tmp/updates/pyanaconda/packaging/yumpayload.py", line 749, in _configureBaseRepo
    self._yumCacheDirHack()
  File "/tmp/updates/pyanaconda/packaging/yumpayload.py", line 281, in _yumCacheDirHack
    if not self._yum.conf.cachedir.startswith(self._yum.conf.installroot):
  File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 1001, in <lambda>
    conf = property(fget=lambda self: self._getConfig(),
  File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 338, in _getConfig
    startupconf = config.readStartupConfig(fn, root, releasever)
  File "/usr/lib/python2.7/site-packages/yum/config.py", line 1000, in readStartupConfig
    confpp_obj = ConfigPreProcessor(configfile)
  File "/usr/lib/python2.7/site-packages/yum/parser.py", line 91, in __init__
    fo = self._pushfile( url )
  File "/usr/lib/python2.7/site-packages/yum/parser.py", line 196, in _pushfile
    fo = urlgrabber.grabber.urlopen(absurl)
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 727, in urlopen
    return default_grabber.urlopen(url, **kwargs)
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1087, in urlopen
    return self._retry(opts, retryfunc, url)
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1044, in _retry
    r = apply(func, (opts,) + args, {})
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1086, in retryfunc
    return PyCurlFileObject(url, filename=None, opts=opts)
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1238, in __init__
    self._do_open()
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1558, in _do_open
    self._do_grab()
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1692, in _do_grab
    self._do_perform()
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1449, in _do_perform
    self.curl_obj.perform()

strace of the anaconda process shows it stuck trying to recvfrom:
restart_syscall(<... resuming interrupted call ...>) = 0
clock_gettime(CLOCK_MONOTONIC, {1137, 1197718}) = 0
clock_gettime(CLOCK_MONOTONIC, {1137, 1338858}) = 0
recvfrom(33, 0x156ac84, 4096, 0, 0, 0)  = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=24, events=POLLIN}, {fd=33, events=POLLIN}, {fd=31, events=POLLIN}, {fd=34, events=POLLIN}, {fd=39, events=POLLIN}, {fd=38, events=0}, {fd=27, events=POLLIN}], 7, 999) = 0 (Timeout)
clock_gettime(CLOCK_MONOTONIC, {1138, 1638731}) = 0
clock_gettime(CLOCK_MONOTONIC, {1138, 1746082}) = 0
recvfrom(33, 0x156ac84, 4096, 0, 0, 0)  = -1 EAGAIN (Resource temporarily unavailable)

Comment 1 Adam Williamson 2013-03-20 19:45:58 UTC
I've been seeing some weird behaviour in yum downloads in general in F19, which may be related. There just seems to be something borked somewhere. Maybe multiple somethings.

When I'm composing F19 live images on an F19 host and the livecd-creator package cache is populated and there are changes since the last compose, it seems like it goes out and grabs the changed packages, and then - for no apparent reason - redownloads _all the existing packages_ multiple times each, then discards them. So if I did a live image with packages a-1.0, b-1.0, c-1.0 etc etc etc then b-1.1 came out, and I tried to compose a live image, I'd see this:

yum downloads repo data
yum downloads b-1.1
yum downloads a-1.0
yum downloads a-1.0 again
yum downloads a-1.0 again
yum downloads a-1.0 again
yum downloads a-1.0 again
yum downloads a-1.0 again
yum downloads a-1.0 again
yum downloads c-1.0
yum downloads c-1.0 again
yum downloads c-1.0 again
yum downloads c-1.0 again
yum downloads c-1.0 again
yum downloads c-1.0 again
yum downloads c-1.0 again
yum downloads d-1.0
yum downloads d-1.0 again...

Also, on my production F19 desktop, I hit a case where yum would keep throwing weird HTTP errors trying to download certain packages - 'Requested range not available', IIRC. I worked around it by yum localinstalling the offending packages, but someone else was reporting the same thing in #fedora-releng yesterday.

Comment 2 Adam Williamson 2013-03-20 19:50:14 UTC
*** Bug 923459 has been marked as a duplicate of this bug. ***

Comment 3 Adam Williamson 2013-03-20 19:51:12 UTC
Transferring blocker status from dupe: Discussed at 2013-03-20 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2013-03-20/f19alpha-blocker-review-2.2013-03-20-16.00.log.txt . Accepted as a blocker per criterion https://fedoraproject.org/wiki/Fedora_19_Alpha_Release_Criteria#Remote_package_sources :

"When using the dedicated installer images, the installer must be able to use either HTTP or FTP repositories (or both) as package sources. The network install image must default to a valid publicly-accessible package source."

Comment 4 Zdeněk Pavlas 2013-03-21 09:38:46 UTC
>    startupconf = config.readStartupConfig(fn, root, releasever)
>    ..
>    fo = urlgrabber.grabber.urlopen(absurl)

So urlgrabber is trying to urlopen() yum.conf.  It's almost certainly not a remote file.  One reason this might fail is that some other thread is using urlgrabber simultenously.

> Also, on my production F19 desktop, I hit a case where yum would keep throwing 
> weird HTTP errors trying to download certain packages - 'Requested range not 
> available', IIRC.

This occurs when a Yum API client calls yumbase.downloadPkgs(pkg_list), and some packages (more specific, their local paths) are listed more than once.

Usually this was not a problem, because before trying to DL the 2nd copy, we saw the file is already there, so we checksummed it and when ok, the 2nd DL was skipped.

But this no longer works when downloading files in parallel, so we may start the 2nd DL while 1st one is still running.  It's also possible that the file is incomplete when checked, but complete when we calculate the reget lenght, we may send a range request for "0 remaining bytes".  Instead of sending an empty content, the server responds with this.

Since the DRPMS support was merged in (rawhide -2 weeks) an early duplicate check has been added to downloadPkgs(), and warn(_("ignoring a dupe of %s") % po) is logged, and the duplicated package is ignored.

Comment 5 Chris Lumens 2013-03-21 13:44:11 UTC
*** Bug 924178 has been marked as a duplicate of this bug. ***

Comment 6 Brian Lane 2013-03-21 17:51:02 UTC
Here is a different example of it getting stuck, this time it was actually trying to download packages (it didn't get past the first one though):

Thread 139898963576576
  File "/usr/lib64/python2.7/threading.py", line 524, in __bootstrap
    self.__bootstrap_inner()
  File "/usr/lib64/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/tmp/updates/pyanaconda/threads.py", line 131, in run
    threading.Thread.run(self, *args, **kwargs)
  File "/usr/lib64/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/tmp/updates/pyanaconda/install.py", line 132, in doInstall
    payload.install()
  File "/tmp/updates/pyanaconda/packaging/yumpayload.py", line 1513, in install
    self._yum.runTransaction(cb=rpmcb)
  File "/usr/lib/python2.7/site-packages/yum/__init__.py", line 1702, in runTransaction
    errors = self.ts.run(cb.callback, '')
  File "/usr/lib64/python2.7/site-packages/rpm/transaction.py", line 86, in run
    rc = TransactionSetCore.run(self, callback, data, self._probFilter)
  File "/tmp/updates/pyanaconda/packaging/yumpayload.py", line 1696, in callback
    package_path = repo.getPackage(txmbr.po, checkfunc=checkfunc)
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 1031, in getPackage
    **kwargs
  File "/usr/lib/python2.7/site-packages/yum/yumRepo.py", line 997, in _getFile
    **kwargs
  File "/usr/lib/python2.7/site-packages/urlgrabber/mirror.py", line 447, in urlgrab
    return self._mirror_try(func, url, kw)
  File "/usr/lib/python2.7/site-packages/urlgrabber/mirror.py", line 424, in _mirror_try
    return func_ref( *(fullurl,), opts=opts, **kw )
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1152, in urlgrab
    return self._retry(opts, retryfunc, url, filename)
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1044, in _retry
    r = apply(func, (opts,) + args, {})
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1137, in retryfunc
    fo = PyCurlFileObject(url, filename, opts)
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1238, in __init__
    self._do_open()
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1558, in _do_open
    self._do_grab()
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1692, in _do_grab
    self._do_perform()
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1449, in _do_perform
    self.curl_obj.perform()

Comment 7 Adam Williamson 2013-03-21 17:57:47 UTC
zdenek: I can't quite parse if you're saying 'ok, we accept there are bugs here, we're fixing them' or 'everything is working as expected and other tools are doing things wrong'? thanks!

Comment 8 Zdeněk Pavlas 2013-03-22 08:15:54 UTC
I'm pretty sure it's a bug in anaconda.  It's very likely using urlgrabber in multiple threads. Referring to the original traceback: probably some metadata or package download was in progress, and thread 140700039259904 at yumpayload.py:281 touched the .conf property of a not yet initialised YumBase instance.  This triggers loading of yum.conf and *.repo files through urlgrabber.

The second part of my comment #4 is a completely different Yum bug.

Comment 9 Adam Williamson 2013-03-22 20:35:33 UTC
zdenek: thanks! is there a report for the second bug? I'm still seeing it or something like it here today (a bunch of "Package does not match intended download. Suggestion: run yum --enablerepo=fedora clean metadata" errors on webkitgtk3 this morning).

Comment 10 Brian Lane 2013-03-23 16:23:42 UTC
We are locking our yum use so that only one thread at a time can touch it. In one of the failure cases i have seen metadata download has been finished for quite some time (used to populate the software spoke).

I have also attempted to move the yum transaction execution into a multiprocessing process instead of python thread, with similar problems. It is looking to me like yum/urlgrabber/rpm cannot be reliably used in a threaded environment anymore.

In F18 we didn't have any of these issues so something has changed (and from my perspective, broken) since then. At this point I'm pretty well stuck. Anaconda can't install packages, so any suggestions for debugging would be appreciated.

Comment 11 Mark Hamzy 2013-03-24 13:59:36 UTC
I am also wondering about a report on a second bug. I see:

Error Downloading Packages:
  kmod-12-3.fc19.ppc64: Downloaded package kmod-12-3.fc19.ppc64, from fedora, but it was invalid.
  kmod-12-3.fc19.ppc64: [Errno 256] No more mirrors to try.

inside a mock environment. And when I rerun the command, it works.

Here is another example:

[build@bluebill ~]$ mock -r fedora-19branched-ppc64 --install @buildsys-build vim-enhanced pungi startup-notification basesystem automake llvm-libs glibc.ppc yum yum-utils vi wget less fpaste pungi MAKEDEV fedora-kickstarts fedora-packager
...
Error Downloading Packages:
  libgudev1-198-6.fc19.ppc64: [Errno 256] No more mirrors to try.
  libgudev1-198-6.fc19.ppc64: Downloaded package libgudev1-198-6.fc19.ppc64, from local, but it was invalid.
  dracut-026-54.git20130316.fc19.ppc64: [Errno 256] No more mirrors to try.
  dracut-026-54.git20130316.fc19.ppc64: Downloaded package dracut-026-54.git20130316.fc19.ppc64, from local, but it was invalid.

Comment 12 Zdeněk Pavlas 2013-03-25 09:38:36 UTC
re Comment #11:  What version of Yum are you using?  Could you check that this patch http://yum.baseurl.org/gitweb?p=yum.git;a=commitdiff;h=fe21657d863 is applied?

If not, it's probably a new yum bug.  "Downloaded package %s, from %s, but it was invalid." error is raised when (after downloading) the po.verifyLocalPkg() method fails.  Please fill a BZ, and attach your /etc/mock/fedora-19branched-ppc64.cfg so I can try to reproduce it. (I'm not that familiar with mock).

Comment 13 Mark Hamzy 2013-03-25 12:53:07 UTC
I'm running mock within Fedora 18 ppc64 client.  So the version of yum is either:
yum-3.4.3-47.fc18.noarch.rpm or yum-3.4.3-51.fc18.noarch.
I've run yum update again to be sure that I am running with the latest.

Looking at both source rpms, it does not seem that this patch has been applied.  So I will apply it and see if I can reproduce the problem.

Can you please push that patch (and any other critical fixes) out to Fedora 18 updates?

Comment 14 Mark Hamzy 2013-03-25 13:26:02 UTC
created bug 927240 after more problems.

Comment 15 Brian Lane 2013-03-25 23:55:56 UTC
I've made a bit of progress on this. I've been able to get my multiprocessing thread to at least try to download the file consistently. Now I'm hitting the "Package does not match intended download" error consistently.

This is because it doesn't match. The downloaded file's sha256 is wrong.

It (urlgrabber or something below it) is using range requests to grab the file, starting at byte 1384, which is the same offset it uses for the headers. I'm speculating that after the header download is done the range isn't reset. I'm currently trying to get it to reset it before we call repo.getPackage()

eg. Wireshark capture:

GET /pub/fedora/linux/development/19/x86_64/os/Packages/f/filesystem-3.2-7.fc19.x86_64.rpm HTTP/1.1
Range: bytes=1384-175015
User-Agent: Fedora (anaconda)/19 yum/3.4.3
Host: dl.fedoraproject.org
Accept: */*

HTTP/1.1 206 Partial Content
Date: Mon, 25 Mar 2013 23:49:53 GMT
Server: Apache/2.2.15 (Red Hat)
Last-Modified: Wed, 13 Mar 2013 07:10:53 GMT
ETag: "9805ed-1038b8-4d7c91b48d940"
Accept-Ranges: bytes
Content-Length: 173632
ProxyTime: D=11333
ProxyServer: download01.phx2.fedoraproject.org
Content-Range: bytes 1384-175015/1063096
Content-Type: application/x-rpm

Comment 16 Zdeněk Pavlas 2013-03-26 08:51:48 UTC
> Range: bytes=1384-175015
                    ^^^^^^
This is the problem- Yum thinks the package size is 175015, but it should be
(AIUI) 1063096.  And Yes, 1384 is exactly the header size.

wget http://download.eng.brq.redhat.com/pub/fedora/linux/development/19/x86_64/os/Packages/f/filesystem-3.2-7.fc19.x86_64.rpm

  <size package="1063096" installed="0" archive="183464"/>
  <location href="filesystem-3.2-7.fc19.x86_64.rpm"/>
  <format>
    <rpm:license>Public Domain</rpm:license>
    <rpm:vendor>Fedora Project</rpm:vendor>
    <rpm:group>System Environment/Base</rpm:group>
    <rpm:buildhost>buildvm-04.phx2.fedoraproject.org</rpm:buildhost>
    <rpm:sourcerpm>filesystem-3.2-7.fc19.src.rpm</rpm:sourcerpm>
    <rpm:header-range start="1384" end="1056572"/>

Comment 17 Brian Lane 2013-03-26 13:23:32 UTC
Actually yum knows the right size and the call to urlgrab in _getFile is passing range=(None, None) but PyCurlFilePbj.set_opts doesn't reset the pycurl.RANGE whent he range is non, leaving it set to whatever the last request was.

If I hack in setting it to "" (which isn't right, it still sends a Range: header) or if I call grabber.reset_curl_obj() before the Anaconda call to _yum.runTransaction(cb=rpmcb) it does actually download the correct file and it passes the sha256sum check.

Now it is failing someplace in RPM with a 'BAD PARAMETER' error,that's what I'm going to be working on tracking down today.

Calling reset_curl_obj() seems like a total hack, and I can't see anything that is obviously be related to all of this mess in the commit logs of yum, urlgrabber, pycul or curl.

urlgrabber does have one commit to fix range handling with result code 200, which I have applied, but it didn't make any difference.

Comment 18 Zdeněk Pavlas 2013-03-27 10:01:37 UTC
> if I call grabber.reset_curl_obj() .. it does actually download

Uh, that's me to blame.  The curl_obj.reset() was breaking curl error reporting.  http://yum.baseurl.org/gitweb?p=urlgrabber.git;a=commitdiff;h=6c07f3dd

"Pass a NULL to this option to disable the use of ranges.".. but setopt(pycurl.RANGE, None) does not work (I'll submit a curl bug for this), so we have to revert back to resetting the curl object.

> Calling reset_curl_obj() seems like a total hack,

Yes, there's the "just in case" comment.. unfortunately it is/was pretty important, as it's the only way to clear CURLOPT_RANGE.. :(

Comment 19 Brian Lane 2013-03-28 21:22:33 UTC
Using python-urlgrabber-3.9.1-26.fc19 in my build has fixed this for me, thanks.

Comment 20 Neil 2013-03-28 22:54:42 UTC
*** Bug 928985 has been marked as a duplicate of this bug. ***

Comment 21 Jens Petersen 2013-03-29 06:08:22 UTC
Thanks, I also confirm that python-urlgrabber-3.9.1-26.fc19 in Alpha TC3 fixes this.

Comment 22 Jens Petersen 2013-04-01 03:17:01 UTC
Moving to VERIFIED.

Comment 23 Adam Williamson 2013-04-02 15:47:06 UTC
If Alpha TC3 fixed it, we can close it.


Note You need to log in before you can comment on or make changes to this bug.