classification
Title: urllib2 doesn't escape spaces in http requests
Type: behavior Stage: test needed
Components: Library (Lib) Versions: Python 3.3, Python 3.2, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: davide.rizzo, ezio.melotti, kiilerix, krisys, maker, orsenthil, ramchandra.apte, sandro.tosi
Priority: normal Keywords: patch

Created on 2011-11-06 20:13 by davide.rizzo, last changed 2012-01-12 15:35 by kiilerix.

Files
File name Uploaded Description
issue13359.patch krisys, 2011-11-09 11:26 percent encoding of urls to fix the issue reported.
issue13359.patch maker, 2012-01-12 15:04 review
issue13359_py2.patch maker, 2012-01-12 15:30 review
Messages (7)
msg147180 - (view) Author: Davide Rizzo (davide.rizzo) Date: 2011-11-06 20:13
urllib2.urlopen('http://foo/url and spaces') will send an HTTP request line like this to the server:

GET /url and spaces HTTP/1.1

which the server obviously does not understand. This contrasts with urllib's behaviour which replaces the spaces (' ') in the url with '%20'.
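The difference can be reproduced without touching the network. The following is only a rough sketch using Python 3's urllib.parse; the helpers request_line and escape_spaces are hypothetical names that approximate how a request line is derived from a URL, not the actual urllib2 code.

```python
from urllib.parse import urlsplit, urlunsplit


def request_line(url, method="GET"):
    """Build an HTTP/1.1 request line from a URL without any escaping."""
    parts = urlsplit(url)
    selector = parts.path or "/"
    if parts.query:
        selector += "?" + parts.query
    return "{} {} HTTP/1.1".format(method, selector)


def escape_spaces(url):
    """Replace literal spaces in the path and query with %20."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme,
        parts.netloc,
        parts.path.replace(" ", "%20"),
        parts.query.replace(" ", "%20"),
        parts.fragment,
    ))


url = "http://foo/url and spaces"
print(request_line(url))                 # GET /url and spaces HTTP/1.1
print(request_line(escape_spaces(url)))  # GET /url%20and%20spaces HTTP/1.1
```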

Related: #918368 #1153027
msg147349 - (view) Author: Krishna Bharadwaj (krisys) Date: 2011-11-09 11:26
I have used the quote method to percent encode the url for spaces and similar characters. This is my first patch. Please let me know if there is anything wrong. I will correct and re-submit it. I ran the test_urllib2.py which gave an OK for 34 tests.

Changes are made in two instances:
1. in the open method.
2. in the __init__ of Request class to ensure that the same issue is addressed at the time of creating Request objects.
msg149441 - (view) Author: Ramchandra Apte (ramchandra.apte) Date: 2011-12-14 12:08
Seems good.
msg151126 - (view) Author: Michele Orrù (maker) * Date: 2012-01-12 15:04
Patch attached for python3, with unit tests.
msg151127 - (view) Author: Mads Kiilerich (kiilerix) Date: 2012-01-12 15:10
FWIW, I don't think it is a good idea to escape automatically. It will change the behaviour in a non-backward compatible way for existing applications that pass encoded urls to this function.

I think the existing behaviour is better. The documentation and the failure mode for passing URLs with spaces could however be improved.
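To illustrate the compatibility concern, here is a small sketch of what could happen to a caller that already passes an encoded URL. It assumes the automatic escaping is done with urllib.parse.quote (Python 3 spelling) and a particular safe set; the function names are hypothetical and the attached patches may choose differently.

```python
from urllib.parse import quote


def naive_escape(url):
    """Quote everything except ':' and '/' (assumed safe set)."""
    return quote(url, safe=":/")


def lenient_escape(url):
    """Same, but also leave '%' alone so existing escapes survive."""
    return quote(url, safe=":/%")


already_encoded = "http://foo/url%20and%20spaces"

# '%' is not safe here, so the existing escapes are escaped again.
print(naive_escape(already_encoded))    # http://foo/url%2520and%2520spaces

# Treating '%' as safe keeps the URL intact ...
print(lenient_escape(already_encoded))  # http://foo/url%20and%20spaces

# ... but then a literal '%' in the input is never escaped.
print(lenient_escape("http://foo/100% sure"))  # http://foo/100%%20sure
```

Either choice changes what some existing callers send to the server.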
msg151129 - (view) Author: Michele Orrù (maker) * Date: 2012-01-12 15:30
Here is the patch for Python 2.


kiilerix, RFC 1738 explicitly says that the space character shall not be used.
msg151131 - (view) Author: Mads Kiilerich (kiilerix) Date: 2012-01-12 15:35
Yes, the url sent by urllib2 must not contain spaces. In my opinion the only way to handle that correctly is to not pass urls with spaces to urlopen. Escaping the urls is not a good solution - even if the API was to be designed from scratch. It would be better to raise an exception if it is passed an invalid url.
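A minimal sketch of the "raise an exception" alternative, assuming a hypothetical check_url helper called before the request is built. It only rejects whitespace; a real check would have to cover the other characters the RFCs disallow as well.

```python
def check_url(url):
    """Return url unchanged, or raise ValueError if it contains whitespace."""
    for ch in url:
        if ch.isspace():
            raise ValueError("URL contains whitespace: %r" % url)
    return url


check_url("http://foo/url%20and%20spaces")  # passes through unchanged

try:
    check_url("http://foo/url and spaces")
except ValueError as exc:
    print("rejected:", exc)
```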

Note for example that '/' and the %-encoding of '/' are different, and it must thus be possible to pass a URL containing both to urlopen. That is not possible if it automatically escapes.
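A short sketch of why the two forms differ, using Python 3's urllib.parse. The path_segments helper is hypothetical and stands in for how a server might split a path on '/' before decoding each segment.

```python
from urllib.parse import quote, unquote


def path_segments(path):
    """Split an encoded path on '/' first, then decode each segment."""
    return [unquote(segment) for segment in path.split("/") if segment]


print(path_segments("/a/b/c"))    # ['a', 'b', 'c']
print(path_segments("/a%2Fb/c"))  # ['a/b', 'c']

# Decoding and re-encoding cannot recover the original: the escaped
# slash comes back as a plain path separator.
print(quote(unquote("/a%2Fb/c")))  # /a/b/c
```

Only the caller knows which of the two was meant, so the distinction has to be made before the URL reaches urlopen.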
History
Date User Action Args
2012-01-12 15:35:57  kiilerix  set  messages: + msg151131
2012-01-12 15:30:04  maker  set  files: + issue13359_py2.patch; messages: + msg151129
2012-01-12 15:10:58  kiilerix  set  nosy: + kiilerix; messages: + msg151127
2012-01-12 15:04:33  maker  set  files: + issue13359.patch; nosy: + maker; messages: + msg151126
2011-12-14 12:08:23  ramchandra.apte  set  nosy: + ramchandra.apte; messages: + msg149441
2011-12-14 10:55:00  sandro.tosi  set  nosy: + sandro.tosi
2011-11-09 11:26:12  krisys  set  files: + issue13359.patch; nosy: + krisys; messages: + msg147349; keywords: + patch
2011-11-06 20:14:48  ezio.melotti  set  nosy: + ezio.melotti; stage: test needed
2011-11-06 20:13:46  davide.rizzo  create