From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.6) Gecko/20011120

Description of problem:
When retrieving an https web page using URLopener.open() in urllib, the object returned is missing the protocol in its 'url' member. Specifically, the class method open_https(), which is called from URLopener.open(), drops the protocol.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
The following example is from an interactive Python session:

$ python
Python 2.1.1 (#1, Aug 13 2001, 19:37:40)
[GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-96)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> import urllib
>>> retriever = urllib.URLopener()
>>> webpage = retriever.open('http://www.yahoo.com')
>>> print webpage.url
http://www.yahoo.com
>>> webpage = retriever.open('https://aims.parl.com')
>>> print webpage.url
//aims.parl.com
>>>

Actual Results:
Notice that when retrieving a web page with http protocol, the object returned has its protocol intact in its 'url' class member. When retrieving a page using the https protocol, the protocol is missing in the 'url' class member.

Expected Results:
The object returned from the https web page should have the protocol included in its 'url' class member.

Additional info:
Bug is caused in urllib module located in /usr/lib/python2.1/urllib.py. Comparing the open_http() and open_https() URLopener class methods, you can see the difference (and cause of bug) in the return statement that calls addinfourl(). I will upload a patch to fix this.
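For readers without the urllib source at hand, the difference described under "Additional info" can be modelled roughly as below. This is a simplified sketch, not the actual urllib.py code and not the attached patch: FakeResponse, split_scheme, open_url and the open_https_buggy / open_https_fixed pair are hypothetical names made up for illustration. It assumes that the dispatcher strips the scheme before calling the per-protocol handler, so the handler has to put the scheme back when it records the URL, which is consistent with the '//aims.parl.com' output shown in the session.

```python
# Simplified model of the behaviour in the report; NOT the real urllib source.
# All names below are hypothetical stand-ins chosen for illustration.


class FakeResponse(object):
    """Stand-in for the object that addinfourl() would return."""

    def __init__(self, url):
        self.url = url


def split_scheme(fullurl):
    """Split 'https://host/path' into ('https', '//host/path')."""
    scheme, _sep, rest = fullurl.partition(":")
    return scheme, rest


def open_http(rest):
    # The scheme is restored before the URL is recorded on the response.
    return FakeResponse("http:" + rest)


def open_https_buggy(rest):
    # Models the reported behaviour: the scheme-less remainder is stored as-is.
    return FakeResponse(rest)


def open_https_fixed(rest):
    # Models the expected behaviour: mirror open_http and restore the scheme.
    return FakeResponse("https:" + rest)


def open_url(fullurl, https_handler):
    """Dispatch on the scheme, passing only the remainder to the handler."""
    scheme, rest = split_scheme(fullurl)
    if scheme == "http":
        return open_http(rest)
    if scheme == "https":
        return https_handler(rest)
    raise ValueError("unsupported scheme: %r" % scheme)


if __name__ == "__main__":
    print(open_url("http://www.yahoo.com", open_https_buggy).url)
    print(open_url("https://aims.parl.com", open_https_buggy).url)
    print(open_url("https://aims.parl.com", open_https_fixed).url)
```

With the buggy handler the https case yields a URL without its protocol, matching the "Actual Results" above; with the fixed handler it matches the "Expected Results".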
Created attachment 40188
patches /usr/lib/python2.1/urllib.py
Fixed in 2.2.1-16, and most likely other 2.2 versions as well.