Bug 443886

Summary: broken urlgrab when regetting an already downloaded file
Product: [Fedora] Fedora
Reporter: Matteo Castellini <self>
Component: python-urlgrabber
Assignee: Jeremy Katz <katzj>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Priority: low
Version: 8
Hardware: noarch
OS: Linux
Last Closed: 2008-05-28 20:03:44 UTC

Description Matteo Castellini 2008-04-23 21:06:54 UTC
Description of problem:
While using an instance of urlgrabber.grabber.URLGrabber with reget turned on
(either reget='simple' or reget='check_timestamp'), trying to get a file that
has already been completely downloaded raises an exception.

Version-Release number of selected component (if applicable):
python-urlgrabber-3.0.0-3.fc8

How reproducible:
Always

Steps to Reproduce:
1. Execute the following Python code:
from urlgrabber import grabber
grab = grabber.URLGrabber(reget='simple')
grab.urlgrab('http://www.ietf.org/rfc/rfc1.txt')
grab.urlgrab('http://www.ietf.org/rfc/rfc1.txt')
  
Actual results:
>>> from urlgrabber import grabber
>>> grab = grabber.URLGrabber(reget='simple')
>>> grab.urlgrab('http://www.ietf.org/rfc/rfc1.txt')
'rfc1.txt'
>>> grab.urlgrab('http://www.ietf.org/rfc/rfc1.txt')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 928, in urlgrab
    return self._retry(opts, retryfunc, url, filename)
  File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 846, in _retry
    r = apply(func, (opts,) + args, {})
  File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 914, in retryfunc
    fo = URLGrabberFileObject(url, filename, opts)
  File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 1002, in __init__
    self._do_open()
  File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 1069, in _do_open
    fo, hdr = self._make_request(req, opener)
  File "/usr/lib/python2.5/site-packages/urlgrabber/grabber.py", line 1174, in _make_request
    raise URLGrabError(9, str(e))
urlgrabber.grabber.URLGrabError: [Errno 9] Requested Range Not Satisfiable

Expected results:
>>> from urlgrabber import grabber
>>> grab = grabber.URLGrabber(reget='simple')
>>> grab.urlgrab('http://www.ietf.org/rfc/rfc1.txt')
'rfc1.txt'
>>> grab.urlgrab('http://www.ietf.org/rfc/rfc1.txt')
'rfc1.txt'

Additional info:
Everything works fine with reget=None. This behaviour does not occur for ftp://
and file:// URLs. It can also be reproduced in rawhide (python-urlgrabber-3.0.0-6.fc9).
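
The error message is consistent with how a simple HTTP reget works: the client
asks for a byte range starting at the size of the local copy, and when the local
file is already complete the server answers 416 Requested Range Not Satisfiable,
which urlgrabber surfaces as URLGrabError errno 9. A minimal sketch of the guard
a fix would need (pure Python; the function name and signature are hypothetical
illustrations, not urlgrabber's actual API):

```python
def should_send_range(local_size, remote_size):
    """Decide whether a resuming (Range) request should be sent.

    Hypothetical guard: if the local copy already covers the whole
    remote file, no download is needed at all -- sending
    'Range: bytes=<remote_size>-' would make the server reply
    416 Requested Range Not Satisfiable.
    """
    if remote_size is not None and local_size >= remote_size:
        return False           # local file is complete; skip the request
    return local_size > 0      # resume only when a partial copy exists
```

With a guard like this, the second urlgrab call would notice the complete local
copy and return the filename instead of issuing a doomed ranged request.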

Comment 1 Jeremy Katz 2008-05-28 20:03:44 UTC

*** This bug has been marked as a duplicate of 442165 ***