Bug 669339

Summary: rpmlint seems to complain about valid source urls.
Product: [Fedora] Fedora
Reporter: Terje Røsten <terjeros>
Component: rpmlint
Assignee: Tom "spot" Callaway <tcallawa>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 14
CC: a.badger, tcallawa, tmz, ville.skytta, wolfy
Last Closed: 2011-01-13 14:22:46 EST

Description Terje Røsten 2011-01-13 07:29:04 EST
Description of problem:

When running rpmlint on an SRPM with fully valid source URLs
(as far as I can tell), rpmlint still issues a warning.

Version-Release number of selected component (if applicable):

rpmlint-1.0-2.fc14.noarch

How reproducible:

Run rpmlint on the SRPM in this review request:

 https://bugzilla.redhat.com/show_bug.cgi?id=665853


The source URL is valid (wget fetches it just fine), but rpmlint
still outputs:

 Source0: http://h5py.googlecode.com/files/h5py-1.3.1.tar.gz HTTP Error 404: Not Found

Any ideas?

Comment 1 Tom "spot" Callaway 2011-01-13 11:46:52 EST
I'm not sure, but I've definitely seen this from googlecode links before.

Comment 2 Terje Røsten 2011-01-13 13:54:43 EST
After some more digging, this seems to be an issue with the Google server. Using HEAD, I get:

$ HEAD http://h5py.googlecode.com/files/h5py-1.3.1.tar.gz
404 Not Found
Connection: close
Date: Thu, 13 Jan 2011 18:47:27 GMT
Server: codesite_downloads
Content-Length: 1377
Content-Type: text/html; charset=UTF-8
Client-Date: Thu, 13 Jan 2011 18:47:27 GMT
Client-Peer: 74.125.79.82:80
Client-Response-Num: 1
Set-Cookie: PREF=ID=b4aa6f58e6d8ec71:TM=1294944447:LM=1294944447:S=yoO_OYm6izYGqdo3; expires=Sat, 12-Jan-2013 18:47:27 GMT; path=/; domain=h5py.googlecode.com
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block


rpmlint uses a similar HEAD request in AbstractCheck.py. I have simplified the code to:

from sys import argv
import socket
import urllib2


class _HeadRequest(urllib2.Request):
    # Force the HTTP method to HEAD so that only headers are fetched
    def get_method(self):
        return "HEAD"


class _HeadRedirectHandler(urllib2.HTTPRedirectHandler):
    # Keep using HEAD when following redirects (urllib2 would switch to GET)
    def redirect_request(*args):
        res = urllib2.HTTPRedirectHandler.redirect_request(*args)
        if res:
            res = _HeadRequest(res.get_full_url())
        return res


def check_url(url):
    timeout = 10
    print 'checking-url', url, '(timeout %s seconds)' % (timeout, )
    # Could use timeout kwarg to urlopen, but that's python >= 2.6 only
    socket.setdefaulttimeout(timeout)
    res = None
    try:
        opener = urllib2.build_opener(_HeadRedirectHandler())
        opener.addheaders = [('User-Agent',
                              'rpmlint/%s' % '1.0')]
        res = opener.open(_HeadRequest(url))
    except Exception, e:
        # Fall back to repr() or the type for exceptions with an empty message
        errstr = str(e) or repr(e) or type(e)
        print 'invalid-url', url, errstr
    else:
        print 'success'
    if res:
        res.close()


if __name__ == '__main__':
    url = argv[1]
    check_url(url)

A very quick review of the HTTP spec indicates a bug in the Google server: GET and HEAD
requests should return identical headers, which is not the case here.
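
To check whether a given server answers HEAD and GET differently, something like the following could be used. This is only a rough sketch in Python 3 (urllib.request rather than the urllib2 code above); the function names probe and head_get_mismatch are made up for illustration and are not part of rpmlint.

```python
import urllib.error
import urllib.request


def probe(url, method, timeout=10):
    """Return the HTTP status code the server sends for `method` on `url`."""
    req = urllib.request.Request(url, method=method)
    try:
        # Only the status is inspected; the body is never read.
        # Note: redirects use urllib's default handling, which may
        # re-issue the redirected request as GET.
        with urllib.request.urlopen(req, timeout=timeout) as res:
            return res.status
    except urllib.error.HTTPError as e:
        # 4xx/5xx responses are raised as HTTPError; keep just the code
        return e.code


def head_get_mismatch(url, timeout=10):
    """Return (head_status, get_status, differ) for `url`."""
    head = probe(url, "HEAD", timeout)
    get = probe(url, "GET", timeout)
    return head, get, head != get
```

Calling head_get_mismatch() on a source URL returns both status codes plus a flag that is True when they disagree, which is the symptom described above (404 for HEAD while a plain download works).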

Comment 3 Tom "spot" Callaway 2011-01-13 14:22:46 EST
Thanks to Chris DiBona and Ali Pasha for letting me know that this is a known upstream issue with Googlecode, tracked here:

http://code.google.com/p/support/issues/detail?id=660