classification
Title: http_error_302() crashes with 'HTTP/1.1 400 Bad Request'
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 2.7, Python 2.6
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: orsenthil
Nosy List: ajaksu2, jepler, jhylton, jjlee, orsenthil, pristine777, robzed, vstinner
Priority: normal
Keywords: easy

Created on 2005-02-27 20:16 by pristine777, last changed 2019-04-10 10:19 by vstinner. This issue is now closed.

Files
File name Uploaded Description
302_with_spaces.diff ajaksu2, 2009-02-09 01:18 Replace spaces with "%20"
Messages (9)
msg60681 - Author: pristine777 (pristine777) Date: 2005-02-27 20:16
I could reach a website with both IE and Firefox, but my
Python code kept getting an HTTP 400 Bad Request error.
To debug, I called set_http_debuglevel(1), as in the
following code:

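import urllib2

hh = urllib2.HTTPHandler()
hh.set_http_debuglevel(1)  # print each request and response as it is sent
# self.cj is the caller's cookie jar (presumably a cookielib.CookieJar)
opener = urllib2.build_opener(hh, urllib2.HTTPCookieProcessor(self.cj))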

The printed debug messages show that this crash 
happens when there is a space in the redirected 
location. Here's a cut-and-paste of the relevant debug 
messages (note the line starting with send that 
http_error_302 is sending):

reply: 'HTTP/1.1 302 Moved Temporarily\r\n'
header: Connection: close
header: Date: Sun, 27 Feb 2005 19:52:51 GMT
header: Server: Microsoft-IIS/6.0
<---other header data-->
send: 'GET /myEmail/User?asOf=02/26/2005 11:38:12 PM&ddn=87cb51501730
<---remaining header data-->
reply: 'HTTP/1.1 400 Bad Request\r\n'
header: Content-Type: text/html
header: Date: Sun, 27 Feb 2005 19:56:45 GMT
header: Connection: close
header: Content-Length: 20

To fix this, I first tried to encode the redirected location 
in the function http_error_302() in urllib2 using the 
methods urllib.quote and urllib.urlencode but to no avail 
(they encode other data as well). 

A temporary solution that works is to replace any space
in the redirected URL with '%20'. Below is a snippet of the
function http_error_302 in urllib2 with this suggested fix:


def http_error_302(self, req, fp, code, msg, headers):
    # Some servers (incorrectly) return multiple Location headers
    # (so probably same goes for URI).  Use first header.
    if 'location' in headers:
        newurl = headers.getheaders('location')[0]
    elif 'uri' in headers:
        newurl = headers.getheaders('uri')[0]
    else:
        return
    # TEMP FIX: inserting this line temporarily fixes this problem
    newurl = newurl.replace(' ', '%20')
    newurl = urlparse.urljoin(req.get_full_url(), newurl)
    # <--- remainder of this function -->


Thanks!
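
The difference between quoting the whole redirect target and escaping only the spaces can be sketched as follows. This is an illustration, not the code from the report: it uses Python 3's urllib.parse.quote (the successor to urllib.quote), and treating all ASCII punctuation as "safe" is an assumption made for this sketch.

```python
import string
from urllib.parse import quote

# Path and query taken from the debug output in the report.
target = "/myEmail/User?asOf=02/26/2005 11:38:12 PM&ddn=87cb51501730"

# Default quoting keeps only "/" (plus unreserved characters), so the
# "?", "=", ":" and "&" delimiters are escaped along with the spaces.
over_quoted = quote(target)

# The temporary fix from the report: touch nothing but literal spaces.
space_only = target.replace(" ", "%20")

# Assumption for this sketch: treat all ASCII punctuation (including "%")
# as safe, so delimiters and existing escapes survive unchanged.
punct_safe = quote(target, safe=string.punctuation)

print(over_quoted)
print(space_only)
print(punct_safe)
```

For this particular target the last two results are identical, while the first no longer contains a usable query string because its delimiters have been escaped.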

msg60682 - Author: Jeff Epler (jepler) Date: 2005-03-01 17:41

When the server sends the 302 response with 'Location:
http://example.com/url%20with%20whitespace', urllib2 seems
to work just fine.

I believe, based on reading RFC 2396, that a URL that contains
spaces must contain quoted spaces (%20), not literal spaces,
because space is not an "unreserved character" [2.3] and
"[d]ata must be escaped if it does not have a representation
using an unreserved character" [2.4].
msg60683 - Author: John J Lee (jjlee) Date: 2005-05-19 19:30

Sure, but if Firefox and IE do it, probably we should do the
same.

I think cookielib.escape_path(), or something similar
(perhaps without the case normalisation) is probably the
right thing to do.  That's not part of any documented API; I
suppose that function or a similar one should be added to
module urlparse, and used by urllib2 and urllib when
redirecting.
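
A rough sketch of what such a helper might look like, written for Python 3's urllib.parse. The name escape_redirect_url and the REDIRECT_SAFE character set are hypothetical: the set is modelled on the kind of path-safe list an escape_path-style function would use, and no case normalisation of existing escapes is done. It is not the function that was eventually committed.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Hypothetical safe set for this sketch: "%" is included so that escapes
# already present in the URL are not escaped a second time.
REDIRECT_SAFE = "%/;:@&=+$,!~*'()"


def escape_redirect_url(url):
    """Percent-encode unsafe characters in the path and query of url."""
    parts = urlsplit(url)
    path = quote(parts.path, safe=REDIRECT_SAFE)
    # "?" may legitimately appear inside a query string, so keep it too.
    query = quote(parts.query, safe=REDIRECT_SAFE + "?")
    return urlunsplit(
        (parts.scheme, parts.netloc, path, query, parts.fragment)
    )


print(escape_redirect_url(
    "http://example.com/my file?asOf=02/26/2005 11:38:12 PM&ddn=1"
))
```

Because "%" is treated as safe, running the helper on an already escaped URL leaves it unchanged, which matters when a server does send a correctly quoted Location header.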
msg81430 - Author: Daniel Diniz (ajaksu2) (Python triager) Date: 2009-02-09 01:18
As always with urllib, the fix is trivial but adding a test is hard.
msg81474 - Author: Rob Probin (robzed) Date: 2009-02-09 19:10
Appears to be the same as issue 918368
msg81491 - Author: John J Lee (jjlee) Date: 2009-02-09 21:13
This bug refers to urllib2.  Issue 918368 refers to urllib.  It's the
same problem in each case, though.
msg87269 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2009-05-05 18:43
Fixed in revision 43132 (lowercase 'r' so that Roundup auto-hyperlinks it). :)
msg87271 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2009-05-05 18:46
Sorry, I meant fixed in revision 72351.
msg339843 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-04-10 10:19
> Sorry, I meant fixed in revision 72351.

Commit in Git:

commit 690ce9b353bc0a86d0886470adbaa50e813de3b8
Author: Senthil Kumaran <orsenthil@gmail.com>
Date:   Tue May 5 18:41:13 2009 +0000

    Fix for issue1153027, making Py3k changes similar to fix in issue918368.
    This will address:
    a) urllib/ in py3k,
    b) urllib in py2x is addressed by issue918368.
    c) urllib2 in py2x was already addressed in Revision 43132.
History
Date                 User         Action  Args
2019-04-10 10:19:19  vstinner     set     nosy: + vstinner; messages: + msg339843
2009-05-05 18:46:08  orsenthil    set     messages: + msg87271
2009-05-05 18:45:28  orsenthil    set     messages: - msg87268
2009-05-05 18:43:57  orsenthil    set     messages: + msg87269
2009-05-05 18:42:12  orsenthil    set     status: open -> closed; assignee: orsenthil; nosy: + orsenthil; messages: + msg87268; resolution: fixed; stage: test needed -> resolved
2009-04-22 12:46:56  ajaksu2      set     keywords: + easy, - patch
2009-03-28 12:24:55  jhylton      set     nosy: + jhylton
2009-02-12 18:28:48  ajaksu2      link    issue918368 dependencies
2009-02-12 00:45:04  ajaksu2      set     versions: + Python 2.6
2009-02-12 00:43:57  ajaksu2      set     type: behavior; stage: test needed
2009-02-09 21:13:12  jjlee        set     messages: + msg81491
2009-02-09 19:10:09  robzed       set     nosy: + robzed; messages: + msg81474
2009-02-09 01:18:19  ajaksu2      set     files: + 302_with_spaces.diff; nosy: + ajaksu2; messages: + msg81430; keywords: + patch; versions: + Python 2.7, - Python 2.4
2005-02-27 20:16:58  pristine777  create