Bug 442165 - urlgrab broken when specifying range with reget turned on
Status: CLOSED UPSTREAM
Product: Fedora
Classification: Fedora
Component: python-urlgrabber
Version: 12
Hardware: All
OS: Linux
Priority: low
Severity: medium
Assigned To: James Antill
QA Contact: Fedora Extras Quality Assurance
Duplicates: 443886
Reported: 2008-04-12 05:17 EDT by Tim Wegener
Modified: 2014-01-21 01:10 EST

Doc Type: Bug Fix
Last Closed: 2010-09-03 17:00:42 EDT


Attachments
Unit tests to verify urlgrabber range reget (2.26 KB, text/x-python)
2010-09-03 21:00 EDT, Tim Wegener
Description Tim Wegener 2008-04-12 05:17:01 EDT
Description of problem:
Running urlgrabber.urlgrab with reget turned on (reget='simple') and specifying
a byte range results in incorrect behaviour, and generally a corrupted download
file, when resuming a previous partial download.

Version-Release number of selected component (if applicable):
python-urlgrabber-3.0.0-3.fc8

How reproducible:
Always

Steps to Reproduce:
1. Run the following python code:
{{{
import urlgrabber
url = 'http://www.ietf.org/rfc/rfc2822.txt'
# Grab a small chunk
urlgrabber.urlgrab(url, 
                   range=(0, 5000), 
                   reget='simple',
                   )
# Grab a larger chunk, resuming from the end of the smaller chunk
urlgrabber.urlgrab(url, 
                   range=(0, 10000), 
                   reget='simple',
                   )
}}}

urlgrabber.urlgrab(url, 
                   filename=dest_filename,
                   range=(0, end_byte), 
                   reget='simple',
                   )
  
Actual results:
$ ls -l rfc2822.txt
-rw-rw-r-- 1 tim tim 15000 2001-04-19 01:13 rfc2822.txt
(i.e. 5000+10000 bytes downloaded)

Expected results:
$ ls -l rfc2822.txt
-rw-rw-r-- 1 tim tim 10000 2001-04-19 01:13 rfc2822.txt
(i.e. 5000+5000 bytes downloaded)

Additional info:
Problem appears to be in line 1154 of
/usr/lib/python2.5/site-packages/urlgrabber/grabber.py
if rt[0]: rt = (rt[0] + reget_length, rt[1])

It probably should be something like:
if rt[0] is None:
    rt = (0, rt[1])
rt = (rt[0] + reget_length, rt[1])
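
To illustrate why the existing line misbehaves: with range=(0, 10000), rt[0] is 0, which is falsy in Python, so the range start is never shifted by the length of the partial file. The following is a minimal standalone sketch of that logic, not the actual grabber.py code. The function names buggy_adjust, fixed_adjust and resumed_size are hypothetical, and it assumes the range is half-open, so that (0, 5000) yields 5000 bytes, which matches the file sizes reported above.

```python
def buggy_adjust(rt, reget_length):
    # Mirrors the reported line: a start of 0 (or None) is falsy,
    # so the range is left unshifted when resuming.
    if rt[0]:
        rt = (rt[0] + reget_length, rt[1])
    return rt


def fixed_adjust(rt, reget_length):
    # Mirrors the suggested change: treat a missing start as 0,
    # then always shift the start by the bytes already on disk.
    if rt[0] is None:
        rt = (0, rt[1])
    return (rt[0] + reget_length, rt[1])


def resumed_size(adjust, rt, reget_length):
    # File size after appending the adjusted range to the partial file.
    # Assumes rt[1] is not None (an explicit end byte was given).
    start, end = adjust(rt, reget_length)
    return reget_length + (end - (start or 0))


print(resumed_size(buggy_adjust, (0, 10000), 5000))  # 15000
print(resumed_size(fixed_adjust, (0, 10000), 5000))  # 10000
```

The first result corresponds to the corrupted 15000-byte file, the second to the expected 10000-byte file.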

There does not seem to be a workaround that can be used without fixing urlgrabber.
Comment 1 Jeremy Katz 2008-05-28 16:03:45 EDT
*** Bug 443886 has been marked as a duplicate of this bug. ***
Comment 2 James Antill 2008-05-28 23:00:54 EDT
I might have accidentally fixed this recently; can you check the latest
urlgrabber in Fed-9/rawhide?
Comment 3 Matteo Castellini 2008-05-30 07:58:46 EDT
I can confirm that this bug is still present both in F9
(python-urlgrabber-3.0.0-8.fc9) and rawhide (python-urlgrabber-3.0.0-8.fc10).
Comment 4 Tim Wegener 2008-06-07 09:25:03 EDT
I can also confirm that this bug is still present in F9
(python-urlgrabber-3.0.0-8.fc9.noarch). 
Comment 5 Bug Zapper 2008-11-26 05:28:50 EST
This message is a reminder that Fedora 8 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 8.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '8'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 8's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 8 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 6 Matteo Castellini 2008-12-23 07:20:17 EST
I have updated the 'version' of this bug, as it is still present in F10 (python-urlgrabber-3.0.0-10.fc10.noarch).
Comment 7 Bug Zapper 2009-11-18 05:11:27 EST
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.
Comment 8 Matteo Castellini 2009-11-18 09:25:36 EST
Updating the 'version' of this bug again, as it is still present in F12 (python-urlgrabber-3.9.1-2.fc12.noarch).
Comment 9 seth vidal 2010-09-03 17:00:42 EDT
Okay, I think you're right about the cases here. I've taken your patch and tested it out; it seems to produce sane results and doesn't break the test cases.

I'm going to commit it and credit you for it. Please test out urlgrabber from latest git to make sure it works right for you.

thanks
Comment 10 Tim Wegener 2010-09-03 21:00:43 EDT
Created attachment 443014
Unit tests to verify urlgrabber range reget

Thanks!

I've attached some unit tests that verify the correct behaviour.
One test fails for python-urlgrabber-3.9.1-4.2.fc12.noarch.
All the tests pass using the urlgrabber git snapshot from 2010-09-04.

Perhaps this is worth including with the main urlgrabber test suite.
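
The attached tests are not reproduced in this report. As a rough idea of the cases such tests might cover, here is a small unittest sketch against a hypothetical stand-in function, adjust_range, that follows the fix suggested in the description. It does not call urlgrabber itself and is not the attached file.

```python
import unittest


def adjust_range(rt, reget_length):
    # Hypothetical stand-in for the range adjustment on resume:
    # a missing start counts as 0, and the start is always shifted
    # by the number of bytes already downloaded.
    if rt[0] is None:
        rt = (0, rt[1])
    return (rt[0] + reget_length, rt[1])


class RangeRegetTest(unittest.TestCase):
    def test_zero_start_is_shifted(self):
        # The case from this bug: range=(0, 10000) with 5000 bytes on disk.
        self.assertEqual(adjust_range((0, 10000), 5000), (5000, 10000))

    def test_none_start_is_treated_as_zero(self):
        self.assertEqual(adjust_range((None, 10000), 5000), (5000, 10000))

    def test_nonzero_start_is_shifted(self):
        self.assertEqual(adjust_range((100, 10000), 5000), (5100, 10000))


# Run the suite explicitly so the outcome is available as a result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RangeRegetTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```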

BTW, shouldn't this bug remain open until the updated Fedora python-urlgrabber package is pushed out?
