Issue 7920: urllib2.HTTPRedirectHandler incorrect redirect

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/52168

classification

Title:	urllib2.HTTPRedirectHandler incorrect redirect
Type:		Stage:
Components:	Library (Lib)	Versions:	Python 2.7, Python 2.6, Python 2.5

process

Status:	closed	Resolution:	wont fix
Dependencies:		Superseder:
Assigned To:	orsenthil	Nosy List:	andresriancho, l0nwlf, orsenthil
Priority:	normal	Keywords:

Created on 2010-02-12 21:44 by andresriancho, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (4)
msg99292 - (view)	Author: Andres Riancho (andresriancho)	Date: 2010-02-12 21:44
Buggy code: """ if 'location' in headers: newurl = headers.getheaders('location')[0] elif 'uri' in headers: newurl = headers.getheaders('uri')[0] else: return newurl = urlparse.urljoin(req.get_full_url(), newurl) """ You might end up being redirected to some "strange" location if for some reason the value of "location" is C:\boot.ini, and you urlparse.urljoin the current URL with that one, you end up with C:\boot.ini . When the urllib2 library opens that, it will open a local file. What I did to fix it, is to verify that the protocol of the newurl is http or https. """ correct_protocol = newurl.startswith('http://') or newurl.startswith('https://') if not correct_protocol: return """ The fix should be applied just below the dangerous urlparse.urljoin.
msg99294 - (view)	Author: Shashwat Anand (l0nwlf)	Date: 2010-02-12 23:02
In Python 2.7a3+, trunk:78165 the code is as follows: (note that there is a segment of code to fix malformed URLs) """ if 'location' in headers: newurl = headers.getheaders('location')[0] elif 'uri' in headers: newurl = headers.getheaders('uri')[0] else: return # fix a possible malformed URL urlparts = urlparse.urlparse(newurl) if not urlparts.path: urlparts = list(urlparts) urlparts[2] = "/" newurl = urlparse.urlunparse(urlparts) newurl = urlparse.urljoin(req.get_full_url(), newurl) """
msg99318 - (view)	Author: Andres Riancho (andresriancho)	Date: 2010-02-13 14:43
The problem is still there in 2.7: >>> urlparts = urlparse.urlparse('C:\\boot.ini') >>> urlparts ('c', '', '\\boot.ini', '', '', '') >>> if not urlparts.path: ... urlparts = list(urlparts) ... urlparts[2] = "/" ... >>> urlparts ('c', '', '\\boot.ini', '', '', '') >>> newurl = urlparse.urlunparse(urlparts) >>> newurl 'c:\\boot.ini' >>> newurl = urlparse.urljoin( 'http://host.tld/spam/eggs.py', newurl) >>> newurl 'c:\\boot.ini' >>>
msg99818 - (view)	Author: Senthil Kumaran (orsenthil) *	Date: 2010-02-22 19:48
I disagree with this bug report. First the responsibility of checking if something is not malicious URL lies at the application/client end, urllib in its redirection never urlopen()'s the redirected url, so you are not harmed in anyway. If you wanted to be extra careful of the BAD Server, you can always subclass the redirection methods and use those handlers. Closing this bug as won't fix.

History
Date	User	Action	Args
2022-04-11 14:56:57	admin	set	github: 52168
2010-02-22 19:48:50	orsenthil	set	status: open -> closed assignee: orsenthil resolution: wont fix messages: + msg99818
2010-02-13 14:43:12	andresriancho	set	messages: + msg99318 versions: + Python 2.7
2010-02-13 04:33:37	l0nwlf	set	nosy: + orsenthil
2010-02-12 23:02:42	l0nwlf	set	nosy: + l0nwlf messages: + msg99294
2010-02-12 21:44:43	andresriancho	create