Bug 1004089 - yumdownloader can't download package when there's an oversize RPM in the destdir.
yumdownloader can't download package when there's an oversize RPM in the dest...
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: yum-utils (Show other bugs)
6.6
Unspecified Unspecified
unspecified Severity medium
: rc
: ---
Assigned To: packaging-team-maint
Karel Srot
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-09-03 18:13 EDT by Dmitry Borodaenko
Modified: 2014-10-14 00:38 EDT (History)
6 users (show)

See Also:
Fixed In Version: yum-utils-1.1.30-23.el6
Doc Type: Bug Fix
Doc Text:
Cause: try to download an rpm using yumdownloader when there's already the same rmp of bigger size in the destination directory Consequence: rpm doesn't get redownloaded, user sees a confusing error Fix: patch for handling already downloaded rpms Result: if the checksum of the rpm doesn't match the intended, yumdownloader will redownload the rpm
Story Points: ---
Clone Of:
: 1104995 (view as bug list)
Environment:
Last Closed: 2014-10-14 00:38:21 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
Check if local file exists even when cache is disabled (1.01 KB, patch)
2013-09-03 18:13 EDT, Dmitry Borodaenko
no flags Details | Diff

  None (edit)
Description Dmitry Borodaenko 2013-09-03 18:13:57 EDT
Created attachment 793379 [details]
Check if local file exists even when cache is disabled

Description of problem:
When repo cache is disabled (e.g. as used in yumdownloader), YumRepository._getFile() would never check if a local file exists (both for metadata files and for RPMs), so the url is always passed to urlgrabber. The latter then finds that the target file on the local file system already exists and generates a GET with Range: header set to the size of the local file.

This of itself is a problem as it generates unnecessary HTTP load on the repo server, but it gets worse because curl doesn't check Content-Length header on receiving HTTP status 416 (https://tools.ietf.org/html/rfc2616#section-10.4.17), not to mention that some HTTP servers wouldn't add Content-Length header when returning 416. In that case, yum download fails and the requested packages are not installed.

Version-Release number of selected component (if applicable):
yum-3.4.3

How reproducible:
Always.

Steps to Reproduce:
1. Install yum-utils.
2. Download a package.rpm into a local directory dir.
3. Run yumdownloader --destdir=dir package

Actual results:
http://.../package.rpm: [Errno 14] PYCURL ERROR 22 - "The requested URL returned error: 416 Requested Range Not Satisfiable"

Expected results:
dir/package.rpm already exists and appears to be complete

Additional info:
Attached patch makes this problem go away. I didn't check what the impact on use cases outside of the yumdownloader scenario outlined above is going to be.
Comment 1 Zdeněk Pavlas 2013-09-04 08:33:45 EDT
IMO the "if self.cache == 1" branch in _getFile is just a leftover, and it should never run when the local file exists. Eg. getPackage() or yumdownloader handle it, and it does the checksum stuff, too.  I can't reproduce this bug, have you any idea why this yumdownloader code didn't run? Can you check that the file size indeed matches?

                local = os.path.join(opts.destdir, local)
                if (os.path.exists(local) and
                    os.path.getsize(local) == int(download.returnSimple('packagesize'))):
                    self.logger.error("%s already exists and appears to be complete" % local)
                    continue

It might be possible that the local file is actually larger, in this case we should unlink it (as we do in getPackage or retriveveMD).
Comment 2 Dmitry Borodaenko 2013-09-04 18:23:37 EDT
Good call on the local file being larger, turns out most RPMs in my local cache have a second copy of the 8 xz footer magic bytes. I'm still investigating how that happened, but that's most likely unrelated to this bug.

yumdownloader does use getPackage(), but the latter only checks the system cache via _preload_pkg_from_system_cache(), so it wouldn't have any impact on files in the non-default destination directory used by yumdownloader.

Maybe the verifyLocalPkg() call in getPackage() can be decoupled from _preload_pkg_from_system_cache() so that it would check the local file even if system cache was missed? That way, the code you quoted (which is essentially an inferior duplication of functionality of verifyLocalPkg()) wouldn't be necessary.
Comment 3 Zdeněk Pavlas 2013-09-09 09:47:27 EDT
Not quite.  getPackage() calls YumBase.verifyPkg().. This figures out that checksum does not match, and the following code runs:

            cursize = os.stat(fo)[6]
            totsize = long(po.size)
            if cursize >= totsize and not po.repo.cache:
                # if the path to the file is NOT inside the cachedir then don't
                # unlink it b/c it is probably a file:// url and possibly
                # unlinkable
                if fo.startswith(po.repo.cachedir):
                   ^^^^^^^^^^^^^^
                    os.unlink(fo)

The root of the problem is that since "fo" does not point to the cachedir, we don't dare to unlink it (it might be a nfs-mounted file:// repository). I can fix this test upstream, but EL6 uses very different yumdownloader.py, it'd be much easier to just handle the "os.path.getsize(local) > packagesize" case here.

--- yum-utils-1.1.30/yumdownloader.py.old	2013-09-09 15:39:21.451378664 +0200
+++ yum-utils-1.1.30/yumdownloader.py	2013-09-09 15:46:35.099294723 +0200
@@ -241,10 +241,13 @@ class YumDownloader(YumUtilBase):
                 if not os.path.exists(opts.destdir):
                     os.makedirs(opts.destdir)
                 local = os.path.join(opts.destdir, local)
-                if (os.path.exists(local) and 
-                    os.path.getsize(local) == int(download.returnSimple('packagesize'))):
+                c = os.path.exists(local) and cmp(os.path.getsize(local), int(download.returnSimple('packagesize')))
+                if c == 0:
                     self.logger.error("%s already exists and appears to be complete" % local)
                     continue
+                elif c == 1:
+                    self.logger.error("%s already exists and over the size. Unlinking." % local)
+                    yum.misc.unlink_f(local)
                 # Disable cache otherwise things won't download
                 repo.cache = 0
                 download.localpath = local # Hack: to set the localpath we want.
Comment 5 Karel Srot 2014-03-26 07:53:59 EDT
Testing the current behaviour with yum-utils-1.1.20-17 I am getting:

1. for local file smaller than the download:
Package does not match intended download. Suggestion: run yum clean metadata 

the message is not very descriptive and yum clean metadata won't help here.
Something like: rpm already exists but doesn't match the intended download but be better I guess

2. for already downloaded file:
xy.rpm already downloaded and appears to be complete

This is IMO OK

3. for a local file larger than the download:
The requested URL returned error: 416 Requested Range Not Satisfiable.


Could you please clarify how this is supposed to be fixed? I guess that the best option for 1. and 3. would be something similar to 2., e.g. just write that the local copy already exists but doesn't match
Comment 6 Jan Zeleny 2014-03-26 09:08:06 EDT
Please set needinfo to concrete people, it's more likely you get answers that way. Directing to Valentina ...
Comment 7 Valentina Mukhamedzhanova 2014-03-28 09:47:31 EDT
This is supposed to work pretty much like in upstream, so in cases 1 and 3 the existing file will be deleted and a new one will be downloaded.

By the way, the above patch is buggy, here is the fixed version:
--- /usr/bin/yumdownloader	2014-02-03 10:03:04.000000000 +0100
+++ ./yumdownloader	2014-03-28 16:44:37.976004627 +0100
@@ -241,10 +241,14 @@ class YumDownloader(YumUtilBase):
                 if not os.path.exists(opts.destdir):
                     os.makedirs(opts.destdir)
                 local = os.path.join(opts.destdir, local)
-                if (os.path.exists(local) and 
-                    os.path.getsize(local) == int(download.returnSimple('packagesize'))):
+                path_exists = os.path.exists(local)
+                size_diff = path_exists and cmp(os.path.getsize(local), int(download.returnSimple('packagesize')))
+                if path_exists and size_diff == 0:
                     self.logger.error("%s already exists and appears to be complete" % local)
                     continue
+                elif size_diff != 0:
+                    self.logger.error("%s already exists, but the size doesn't match. Unlinking." % local)
+                    yum.misc.unlink_f(local)
                 # Disable cache otherwise things won't download
                 repo.cache = 0
                 download.localpath = local # Hack: to set the localpath we want.
Comment 12 errata-xmlrpc 2014-10-14 00:38:21 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1411.html

Note You need to log in before you can comment on or make changes to this bug.