Bug 442165 - urlgrab broken when specifying range with reget turned on
Summary: urlgrab broken when specifying range with reget turned on
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: python-urlgrabber
Version: 12
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: James Antill
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 443886
Depends On:
Blocks:
Reported: 2008-04-12 09:17 UTC by Tim Wegener
Modified: 2014-01-21 06:10 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-09-03 21:00:42 UTC
Type: ---
Embargoed:


Attachments
Unit tests to verify urlgrabber range reget (2.26 KB, text/x-python)
2010-09-04 01:00 UTC, Tim Wegener

Description Tim Wegener 2008-04-12 09:17:01 UTC
Description of problem:
Running urlgrabber.urlgrab with reget turned on (reget='simple') and specifying
a byte range results in incorrect behaviour, and generally a corrupted download
file, when resuming a previous partial download.

Version-Release number of selected component (if applicable):
python-urlgrabber-3.0.0-3.fc8

How reproducible:
Always

Steps to Reproduce:
1. Run the following python code:
{{{
import urlgrabber
url = 'http://www.ietf.org/rfc/rfc2822.txt'
# Grab a small chunk
urlgrabber.urlgrab(url, 
                   range=(0, 5000), 
                   reget='simple',
                   )
# Grab a larger chunk, resuming from the end of the smaller chunk
urlgrabber.urlgrab(url, 
                   range=(0, 10000), 
                   reget='simple',
                   )
}}}

urlgrabber.urlgrab(url, 
                   filename=dest_filename,
                   range=(0, end_byte), 
                   reget='simple',
                   )
  
Actual results:
$ ls -l rfc2822.txt
-rw-rw-r-- 1 tim tim 15000 2001-04-19 01:13 rfc2822.txt
(i.e. 5000+10000 bytes downloaded)

Expected results:
$ ls -l rfc2822.txt
-rw-rw-r-- 1 tim tim 10000 2001-04-19 01:13 rfc2822.txt
(i.e. 5000+5000 bytes downloaded)
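
The size arithmetic above can be reproduced without a network connection. This is only a sketch: `fetch` and `resume` are hypothetical stand-ins, not urlgrabber functions, and it assumes the range is half-open like a Python slice (which matches the 5000-byte first chunk reported here) and that a 'simple' reget appends to the existing partial file.

```python
def fetch(source, byte_range):
    """Hypothetical stand-in for an HTTP range request: bytes [start, end)."""
    start, end = byte_range
    return source[start:end]


def resume(partial, source, requested, skip_existing):
    """Append newly fetched bytes to the partial download.

    skip_existing=True  -> start after the bytes already on disk (expected).
    skip_existing=False -> request the whole range again (observed).
    """
    start, end = requested
    if skip_existing:
        start += len(partial)
    return partial + fetch(source, (start, end))


# Fake 20000-byte "remote file" with non-repeating-looking content.
source = bytes(i % 251 for i in range(20000))

# First call: range=(0, 5000) leaves a 5000-byte partial file.
partial = fetch(source, (0, 5000))

# Second call: range=(0, 10000) with reget turned on.
observed = resume(partial, source, (0, 10000), skip_existing=False)
expected = resume(partial, source, (0, 10000), skip_existing=True)

print(len(observed), observed == source[:10000])  # 15000 False
print(len(expected), expected == source[:10000])  # 10000 True
```

The 15000-byte result is not just too long: its first 5000 bytes are duplicated, so the file no longer matches the start of the remote file.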

Additional info:
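The problem appears to be at line 1154 of
/usr/lib/python2.5/site-packages/urlgrabber/grabber.py:
if rt[0]: rt = (rt[0] + reget_length, rt[1])
When the range starts at byte 0 (or the start is None), rt[0] is falsy, so the
start offset is never advanced by reget_length and the whole range is fetched
again and appended to the partial file.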

It probably should be something like:
if rt[0] is None:
    rt = (0, rt[1])
rt = (rt[0] + reget_length, rt[1])

There does not seem to be a workaround that can be used without fixing urlgrabber.
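
The two snippets quoted above can be compared in isolation. In this sketch the function names `adjust_range_current` and `adjust_range_proposed` are made up for illustration; only the bodies come from the lines quoted in this report, and the surrounding grabber.py code is not reproduced.

```python
def adjust_range_current(rt, reget_length):
    """Mirrors the line quoted from grabber.py."""
    if rt[0]:
        rt = (rt[0] + reget_length, rt[1])
    return rt


def adjust_range_proposed(rt, reget_length):
    """Mirrors the change suggested in this report."""
    if rt[0] is None:
        rt = (0, rt[1])
    rt = (rt[0] + reget_length, rt[1])
    return rt


# 5000 bytes already on disk; try different range starts.
for start in (None, 0, 100):
    rt = (start, 10000)
    print(rt, adjust_range_current(rt, 5000), adjust_range_proposed(rt, 5000))
# (None, 10000) (None, 10000) (5000, 10000)
# (0, 10000) (0, 10000) (5000, 10000)
# (100, 10000) (5100, 10000) (5100, 10000)
```

The two versions only differ when the range start is 0 or None, which is consistent with the reproduction steps using range=(0, ...).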

Comment 1 Jeremy Katz 2008-05-28 20:03:45 UTC
*** Bug 443886 has been marked as a duplicate of this bug. ***

Comment 2 James Antill 2008-05-29 03:00:54 UTC
I might have accidentally fixed this recently. Can you check the latest
urlgrabber in Fed-9/rawhide?


Comment 3 Matteo Castellini 2008-05-30 11:58:46 UTC
I can confirm that this bug is still present both in F9
(python-urlgrabber-3.0.0-8.fc9) and rawhide (python-urlgrabber-3.0.0-8.fc10).

Comment 4 Tim Wegener 2008-06-07 13:25:03 UTC
I can also confirm that this bug is still present in F9
(python-urlgrabber-3.0.0-8.fc9.noarch). 


Comment 5 Bug Zapper 2008-11-26 10:28:50 UTC
This message is a reminder that Fedora 8 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 8.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '8'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 8's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 8 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 6 Matteo Castellini 2008-12-23 12:20:17 UTC
I have updated the 'version' of this bug, as it is still present in F10 (python-urlgrabber-3.0.0-10.fc10.noarch).

Comment 7 Bug Zapper 2009-11-18 10:11:27 UTC
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 10 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 8 Matteo Castellini 2009-11-18 14:25:36 UTC
Updating the 'version' of this bug again, as it is still present in F12 (python-urlgrabber-3.9.1-2.fc12.noarch).

Comment 9 seth vidal 2010-09-03 21:00:42 UTC
Okay, I think you're right about the cases here. I've taken your patch and tested it out; it seems to produce sane results and doesn't break the test cases.

I'm going to commit it and credit you for it. Please test out urlgrabber from latest git to make sure it works right for you.

thanks

Comment 10 Tim Wegener 2010-09-04 01:00:43 UTC
Created attachment 443014
Unit tests to verify urlgrabber range reget

Thanks!

I've attached some unit tests that verify the correct behaviour.
One test fails for python-urlgrabber-3.9.1-4.2.fc12.noarch.
All the tests pass using the urlgrabber git snapshot from 2010-09-04.

Perhaps this is worth including with the main urlgrabber test suite.

BTW, shouldn't this bug remain open until the updated Fedora python-urlgrabber package is pushed out?

