classification
Title: urllib2 passes fragment identifier to server
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: dstanek, eric.araujo, naoki, orsenthil
Priority: normal Keywords: patch

Created on 2010-04-01 16:08 by naoki, last changed 2010-08-08 11:47 by orsenthil. This issue is now closed.

Files
File name Uploaded Description
fragment.patch dstanek, 2010-08-03 22:38
Messages (4)
msg102103 - (view) Author: INADA Naoki (naoki) * Date: 2010-04-01 16:08
>>> urllib2.urlopen("http://wave-robot-python-client.googlecode.com/svn/trunk/pydocs/index.html#module-wavelet")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\usr\Python2.6\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\usr\Python2.6\lib\urllib2.py", line 398, in open
    response = meth(req, response)
  File "C:\usr\Python2.6\lib\urllib2.py", line 511, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\usr\Python2.6\lib\urllib2.py", line 436, in error
    return self._call_chain(*args)
  File "C:\usr\Python2.6\lib\urllib2.py", line 370, in _call_chain
    result = func(*args)
  File "C:\usr\Python2.6\lib\urllib2.py", line 519, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 400: Bad Request

This also happens when the redirected URL contains a fragment.

>>> urllib2.urlopen("http://goo.gl/z1d5")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\usr\Python2.6\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\usr\Python2.6\lib\urllib2.py", line 398, in open
    response = meth(req, response)
  File "C:\usr\Python2.6\lib\urllib2.py", line 511, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\usr\Python2.6\lib\urllib2.py", line 430, in error
    result = self._call_chain(*args)
  File "C:\usr\Python2.6\lib\urllib2.py", line 370, in _call_chain
    result = func(*args)
  File "C:\usr\Python2.6\lib\urllib2.py", line 606, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "C:\usr\Python2.6\lib\urllib2.py", line 398, in open
    response = meth(req, response)
  File "C:\usr\Python2.6\lib\urllib2.py", line 511, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\usr\Python2.6\lib\urllib2.py", line 436, in error
    return self._call_chain(*args)
  File "C:\usr\Python2.6\lib\urllib2.py", line 370, in _call_chain
    result = func(*args)
  File "C:\usr\Python2.6\lib\urllib2.py", line 519, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 400: Bad Request

urllib2.Request.get_selector() should be:

    def get_selector(self):
        return self.__r_host.split('#')[0]
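
The idea behind the suggested change can be sketched outside urllib2. This is a minimal illustration only, written against Python 3's `urllib.parse`; the helper name `selector_without_fragment` and the example.com URLs are hypothetical and are not part of urllib2 or of the attached patch.

```python
from urllib.parse import urldefrag, urlsplit


def selector_without_fragment(url):
    """Return the path (plus query) to put on the HTTP request line.

    Hypothetical helper: the fragment is for the client only, so it is
    dropped before the selector is built.
    """
    # urldefrag() splits "scheme://host/path?query#frag" into the URL
    # without the fragment and the fragment itself.
    defragged, _fragment = urldefrag(url)
    parts = urlsplit(defragged)
    # An empty path is requested as "/".
    selector = parts.path or "/"
    if parts.query:
        selector += "?" + parts.query
    return selector


print(selector_without_fragment("http://example.com/pydocs/index.html#module-wavelet"))
```

With the fragment removed, the server only sees `/pydocs/index.html`, which is presumably what avoids the 400 response shown in the tracebacks above.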
msg112712 - (view) Author: David Stanek (dstanek) Date: 2010-08-03 22:38
Added a patch to fix this behavior.
msg112721 - (view) Author: David Stanek (dstanek) Date: 2010-08-03 23:19
I have also uploaded my patch to http://codereview.appspot.com/1918042 for easier viewing.
msg113250 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-08-08 11:47
Fixed in revisions 83818 (py3k), 83819 (release31-maint) and 83820 (release27-maint).

David, a couple of comments on your patch.
- The Request method is from urllib2, so the proper place for the tests is test_urllib2. It already had Request tests, so only a few additional tests were required.
- Also, now that fragments are removed before the request is sent to the server, a proper response can be obtained; this can be tested via the 'network' tests in test_urllib2net.
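
A check of the kind described in the first comment might look roughly like the following. This is a sketch written against Python 3's `urllib.request` (the successor of urllib2); the class name, test name and URL are illustrative assumptions and are not copied from the committed tests.

```python
import unittest
from urllib.request import Request


class FragmentSelectorTest(unittest.TestCase):  # hypothetical test class
    def test_fragment_not_in_selector(self):
        url = "http://www.example.com/index.html#module-wavelet"
        req = Request(url)
        # The selector is what ends up on the HTTP request line,
        # so it must not carry the fragment.
        self.assertEqual(req.selector, "/index.html")
        # The fragment is kept on the client side and is still part
        # of the full URL reported by the request object.
        self.assertEqual(req.get_full_url(), url)
```

This test needs no network access; a 'network' test would additionally open such a URL and check that a normal response comes back.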

You might check the svn diffs to see the changes made. 
Thanks for the patch.
History
Date User Action Args
2010-08-08 11:47:46 orsenthil set status: open -> closed; resolution: accepted -> fixed; stage: resolved
2010-08-08 11:47:21 orsenthil set messages: + msg113250
2010-08-03 23:19:44 dstanek set messages: + msg112721
2010-08-03 22:38:18 dstanek set files: + fragment.patch; nosy: + dstanek; messages: + msg112712; keywords: + patch
2010-04-06 05:08:14 orsenthil set assignee: orsenthil; resolution: accepted
2010-04-01 16:16:10 eric.araujo set nosy: + eric.araujo
2010-04-01 16:12:31 brian.curtin set nosy: + orsenthil; versions: - Python 2.5, Python 3.3
2010-04-01 16:09:24 naoki set type: behavior; components: + Library (Lib); versions: + Python 2.6, Python 2.5, Python 3.1, Python 2.7, Python 3.2, Python 3.3
2010-04-01 16:08:31 naoki create