Bug 443886

Summary: broken urlgrab when regetting an already downloaded file
Product: [Fedora] Fedora
Reporter: Matteo Castellini <self>
Component: python-urlgrabber
Assignee: Jeremy Katz <katzj>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Priority: low
Version: 8
Hardware: noarch
OS: Linux
Last Closed: 2008-05-28 20:03:44 UTC

Description Matteo Castellini 2008-04-23 21:06:54 UTC
Description of problem:
While using an instance of urlgrabber.grabber.URLGrabber with reget turned on
(either reget='simple' or reget='check_timestamp'), trying to get a file that
has already been completely downloaded raises an exception.

Version-Release number of selected component (if applicable):
python-urlgrabber-3.0.0-3.fc8

How reproducible:
Always

Steps to Reproduce:
1. Execute the following Python code:
from urlgrabber import grabber
grab = grabber.URLGrabber(reget='simple')
grab.urlgrab('http://www.ietf.org/rfc/rfc1.txt')
grab.urlgrab('http://www.ietf.org/rfc/rfc1.txt')
  
Actual results:
>>> from urlgrabber import grabber
>>> grab = grabber.URLGrabber(reget='simple')
>>> grab.urlgrab('http://www.ietf.org/rfc/rfc1.txt')
'rfc1.txt'
>>> grab.urlgrab('http://www.ietf.org/rfc/rfc1.txt')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 928, in urlgrab
    return self._retry(opts, retryfunc, url, filename)
  File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 846, in _retry
    r = apply(func, (opts,) + args, {})
  File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 914, in retryfunc
    fo = URLGrabberFileObject(url, filename, opts)
  File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 1002, in __init__
    self._do_open()
  File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 1069, in _do_open
    fo, hdr = self._make_request(req, opener)
  File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 1174, in _make_request
    raise URLGrabError(9, str(e))
urlgrabber.grabber.URLGrabError: [Errno 9] Requested Range Not Satisfiable

Expected results:
>>> from urlgrabber import grabber
>>> grab = grabber.URLGrabber(reget='simple')
>>> grab.urlgrab('http://www.ietf.org/rfc/rfc1.txt')
'rfc1.txt'
>>> grab.urlgrab('http://www.ietf.org/rfc/rfc1.txt')
'rfc1.txt'

Additional info:
Everything works fine with reget=None. This behaviour does not occur for ftp://
and file:// URLs. It can also be reproduced in rawhide (python-urlgrabber-3.0.0-6.fc9).
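
The error message is consistent with how a simple HTTP reget works: the client
asks for a byte range starting at the size of the local copy, and when the local
file is already complete the server answers 416 Requested Range Not Satisfiable,
which urlgrabber surfaces as URLGrabError errno 9. A minimal sketch of the guard
a fix would need (pure Python; the function name and signature are hypothetical
illustrations, not urlgrabber's actual API):

```python
def should_send_range(local_size, remote_size):
    """Decide whether a resuming (Range) request should be sent.

    Hypothetical guard: if the local copy already covers the whole
    remote file, no download is needed at all -- sending
    'Range: bytes=<remote_size>-' would make the server reply
    416 Requested Range Not Satisfiable.
    """
    if remote_size is not None and local_size >= remote_size:
        return False           # local file is complete; skip the request
    return local_size > 0      # resume only when a partial copy exists
```

With a guard like this, the second urlgrab call would notice the complete local
copy and return the filename instead of issuing a doomed ranged request.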

Comment 1 Jeremy Katz 2008-05-28 20:03:44 UTC

*** This bug has been marked as a duplicate of 442165 ***