57295 – Python 2.1.1 urllib.py open_https() drops protocol

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	python
Sub Component:
Version:	7.2
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Trond Eivind Glomsrød
QA Contact:	Brock Organ
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2001-12-09 02:35 UTC by Joe Antao
Modified:	2007-04-18 16:38 UTC (History)
CC List:	0 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2002-08-14 16:35:57 UTC
Embargoed:

Attachments
patches /usr/lib/python2.1/urllib.py (634 bytes, patch) 2001-12-09 02:36 UTC, Joe Antao

Description Joe Antao 2001-12-09 02:35:07 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.6) Gecko/20011120

Description of problem:
When retrieving an https web page using URLopener.open() in urllib, the
object returned is missing the protocol in its 'url' member.

Specifically, the method open_https(), which is called from
URLopener.open(), drops the protocol.




Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
The following example is from an interactive python session:

$ python
Python 2.1.1 (#1, Aug 13 2001, 19:37:40) 
[GCC 2.96 20000731 (Red Hat Linux 7.1 2.96-96)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> import urllib
>>> retriever = urllib.URLopener()
>>> webpage = retriever.open('http://www.yahoo.com')
>>> print webpage.url
http://www.yahoo.com
>>> webpage = retriever.open('https://aims.parl.com')
>>> print webpage.url
//aims.parl.com
>>> 


Actual Results:  When a web page is retrieved over the http protocol,
the returned object has the protocol intact in its 'url' attribute.

When a page is retrieved over the https protocol, the protocol is missing
from the 'url' attribute.

Expected Results:  The object returned for the https web page should
include the protocol in its 'url' attribute.

Additional info:

The bug is in the urllib module, located at /usr/lib/python2.1/urllib.py.

Comparing the open_http() and open_https() URLopener class methods, you can
see the difference (and cause of bug) in the return statement that calls
addinfourl(). 
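
The difference can be illustrated with a small stand-alone sketch. It is not
the urllib source: it assumes that open() strips the scheme before dispatching
and that open_http() puts "http:" back when building the result while
open_https() does not. All names below (FakeResponse, split_scheme,
open_https_buggy, open_https_fixed, open_url) are hypothetical, no network
access is made, and the code is written for a current Python rather than 2.1.

```python
class FakeResponse:
    """Hypothetical stand-in for the object addinfourl() returns."""
    def __init__(self, url):
        self.url = url


def split_scheme(fullurl):
    """Split 'scheme:rest' into (scheme, rest)."""
    scheme, _, rest = fullurl.partition(":")
    return scheme, rest


def open_http(rest):
    # Assumed behaviour: the scheme is re-attached before returning.
    return FakeResponse("http:" + rest)


def open_https_buggy(rest):
    # Assumed cause of the bug: the scheme-less remainder is passed through.
    return FakeResponse(rest)


def open_https_fixed(rest):
    # Mirrors open_http(): re-attach the scheme.
    return FakeResponse("https:" + rest)


def open_url(fullurl, https_handler):
    """Dispatch on the scheme, handing the handler only the remainder."""
    scheme, rest = split_scheme(fullurl)
    if scheme == "http":
        return open_http(rest)
    if scheme == "https":
        return https_handler(rest)
    raise ValueError("unsupported scheme: %r" % scheme)


print(open_url("http://www.yahoo.com", open_https_buggy).url)
print(open_url("https://aims.parl.com", open_https_buggy).url)
print(open_url("https://aims.parl.com", open_https_fixed).url)
```

With the buggy handler the second line printed lacks the scheme, matching the
interactive session above; with the fixed handler the full URL is preserved.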

I will upload a patch to fix this.

Comment 1 Joe Antao 2001-12-09 02:36:22 UTC

Created attachment 40188
patches /usr/lib/python2.1/urllib.py

Comment 2 Trond Eivind Glomsrød 2002-08-14 16:50:39 UTC

Fixed in 2.2.1-16, and most likely other 2.2 versions as well.
