Bug 1000841

Summary: python-urlgrabber incorrectly parses headers from googledrive.com
Product: [Fedora] Fedora Reporter: Krzysztof Pawlik <krzysiek.pawlik>
Component: python-urlgrabber Assignee: Packaging Maintenance Team <packaging-team-maint>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 19 CC: jzeleny, packaging-team-maint, vmukhame
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-03-12 15:36:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Krzysztof Pawlik 2013-08-25 21:08:56 UTC
Description of problem:
When python-urlgrabber fetches the repomd.xml file from a repository hosted on https://googledrive.com/ (the public URL for directories shared via Google Drive), it misparses the response headers. It looks for Content-Length like this:

1292             if self.scheme in ['http','https']:
1293                 if buf.lower().find('content-length') != -1:
1294                     length = buf.split(':')[1]
1295                     self.size = int(length)

But the headers from googledrive.com contain:

Content-Length: 2973
...
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: false
Access-Control-Allow-Headers: Accept, Accept-Language, Authorization, Cache-Control, Content-Disposition, Content-Encoding, Content-Language, Content-Length, Content-MD5, Content-Range, Content-Type, Date, GData-Version, Host, If-Match, If-Modified-Since, If-None-Match, If-Unmodified-Since, Origin, OriginToken, Pragma, Range, Slug, Transfer-Encoding, X-ClientDetails, X-GData-Client, X-GData-Key, X-Goog-AuthUser, X-Goog-Encode-Response-If-Executable, X-Goog-Correlation-Id, X-Goog-Upload-Command, X-Goog-Upload-Content-Disposition, X-Goog-Upload-Content-Length, X-Goog-Upload-Content-Type, X-Goog-Upload-Offset, X-Goog-Upload-Protocol, X-HTTP-Method-Override, X-JavaScript-User-Agent, X-Origin, X-Referer, X-Upload-Content-Length, X-Upload-Content-Type, X-Use-HTTP-Status-Code-Override, X-YouTube-VVT, X-YouTube-Page-CL, X-YouTube-Page-Timestamp

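Because the Access-Control-Allow-Headers line also contains the string 'Content-Length', the substring check in the above code matches that line too, and int() is then called on a list of header names instead of a number. It breaks with: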

~ # yum update
Loaded plugins: langpacks, ps, refresh-packagekit, remove-with-leaves
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/urlgrabber/grabber.py", line 1295, in _hdr_retrieve
    self.size = int(length)
ValueError: invalid literal for int() with base 10: 'Accept, Accept-Language, Authorization, Cache-Control, Content-Disposition, Content-Encoding, Content-Language, Content-Length, Content-MD5, Content-Range, Content-Type, Date, GData-Version, Host, If-'
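
For illustration, here is a minimal sketch of a stricter check that compares the header name itself instead of searching the whole line for a substring. It is not the upstream patch; parse_content_length is a hypothetical helper name, and it assumes each header line arrives as a text string, as in the Python 2.7 code quoted above.

```python
def parse_content_length(header_line):
    """Return the Content-Length value as an int, or None for any other line.

    Hypothetical helper for illustration; not part of urlgrabber.
    """
    # Split on the first colon only, so the header name and value are separated
    # even if the value itself contains colons.
    name, sep, value = header_line.partition(':')
    if not sep or name.strip().lower() != 'content-length':
        return None
    try:
        return int(value.strip())
    except ValueError:
        return None  # malformed value: leave the size unknown


# Header lines modelled on the report (the CORS list is abbreviated here).
cors = 'Access-Control-Allow-Headers: Accept, Content-Length, Content-Type\r\n'
plain = 'Content-Length: 2973\r\n'

# The substring test from grabber.py matches both lines...
assert cors.lower().find('content-length') != -1
assert plain.lower().find('content-length') != -1
# ...whereas comparing the header name accepts only the real Content-Length.
assert parse_content_length(cors) is None
assert parse_content_length(plain) == 2973
```

With a check along these lines, the Access-Control-Allow-Headers line would simply be skipped and the size would still be taken from the real Content-Length header.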

Version-Release number of selected component (if applicable):
3.9.1-27.fc19

How reproducible:
Always.

Steps to Reproduce:
1. Upload repomd.xml to a shared directory on Google Drive and share the directory with everyone
2. Configure the directory's public URL as a repository in Yum
3. Run yum update

Actual results:
Above exception.

Expected results:
Successful download of repository metadata.

Additional info:
-- none --

Comment 1 Zdeněk Pavlas 2013-08-26 07:11:50 UTC
Thanks for the report!  Fixed upstream http://yum.baseurl.org/gitweb?p=urlgrabber.git;a=commitdiff;h=4374b6b4c6196f1c770954d2ceb136daa62fbb66

Comment 2 Valentina Mukhamedzhanova 2014-03-12 15:36:28 UTC
Closing, as this is fixed in Fedora 19.