Message 60681 - Python tracker

Author	pristine777
Date	2005-02-27.20:16:58

Content
I was able to reach a website using both IE and Firefox, but my
Python code kept getting an HTTP 400 Bad Request error. To debug,
I called set_http_debuglevel(1), as in the following code:

import urllib2

hh = urllib2.HTTPHandler()
hh.set_http_debuglevel(1)
opener = urllib2.build_opener(hh, urllib2.HTTPCookieProcessor(self.cj))

The printed debug messages show that the error occurs when the
redirected location contains a space. Here is a cut-and-paste of
the relevant debug messages (note the line starting with send,
which is the request sent as a result of http_error_302):

reply: 'HTTP/1.1 302 Moved Temporarily\r\n'
header: Connection: close
header: Date: Sun, 27 Feb 2005 19:52:51 GMT
header: Server: Microsoft-IIS/6.0
<---other header data-->
send: 'GET /myEmail/User?asOf=02/26/2005 11:38:12 PM&ddn=87cb51501730
<---remaining header data-->
reply: 'HTTP/1.1 400 Bad Request\r\n'
header: Content-Type: text/html
header: Date: Sun, 27 Feb 2005 19:56:45 GMT
header: Connection: close
header: Content-Length: 20

To fix this, I first tried encoding the redirected location inside
http_error_302() in urllib2 with urllib.quote and urllib.urlencode,
but to no avail: they encode other characters in the URL as well.
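To illustrate why the default quoting did not help, here is a minimal sketch. It uses only the part of the request path visible in the debug output, and the variable names are made up for the example. The import falls back to urllib.parse so the snippet also runs on Python 3. Passing safe=string.punctuation is one possible way to leave the URL delimiters alone; it is an assumption of this sketch, not something tried in this report.

```python
import string

try:
    from urllib import quote  # Python 2, as used in this report
except ImportError:
    from urllib.parse import quote  # Python 3 location of the same function

# Portion of the redirected request path visible in the debug output.
location = "/myEmail/User?asOf=02/26/2005 11:38:12 PM&ddn=87cb51501730"

# Default call: only '/' is kept as-is, so '?', '=', ':' and '&' are
# escaped along with the spaces, which changes the meaning of the URL.
over_quoted = quote(location)

# Treat all ASCII punctuation as safe. For this URL, that leaves the
# spaces as the only characters that get percent-encoded.
space_quoted = quote(location, safe=string.punctuation)

print(over_quoted)
print(space_quoted)
```

urllib.urlencode is a poor fit for a different reason: it builds a query string from key/value pairs rather than escaping an existing URL.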

A temporary solution that works is to replace every space in the
redirected URL with '%20'. Below is a snippet of http_error_302
in urllib2 with this suggested fix:


def http_error_302(self, req, fp, code, msg, headers):
    # Some servers (incorrectly) return multiple Location headers
    # (so probably same goes for URI).  Use first header.
    if 'location' in headers:
        newurl = headers.getheaders('location')[0]
    elif 'uri' in headers:
        newurl = headers.getheaders('uri')[0]
    else:
        return
    # <<< TEMP FIX - inserting this line temporarily fixes this problem
    newurl = newurl.replace(' ', '%20')
    newurl = urlparse.urljoin(req.get_full_url(), newurl)
    # <--- remainder of this function -->
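The same workaround could probably be applied without editing urllib2 itself, by subclassing the redirect handler and overriding its redirect_request() hook. This is only a sketch: the class name SpaceEscapingRedirectHandler and the example.com URLs are hypothetical, and the import falls back to urllib.request so the code also runs on Python 3. It assumes the installed library passes the redirect target to redirect_request() unescaped; newer versions may already escape it, in which case the override is harmless.

```python
try:
    from urllib2 import HTTPRedirectHandler, Request, build_opener  # Python 2
except ImportError:
    from urllib.request import HTTPRedirectHandler, Request, build_opener  # Python 3


class SpaceEscapingRedirectHandler(HTTPRedirectHandler):
    """Hypothetical handler that escapes spaces in the redirect target."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Same temporary fix as in the snippet above: only spaces are touched.
        newurl = newurl.replace(' ', '%20')
        return HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, headers, newurl)


# Passing an instance replaces the default redirect handler in the opener.
opener = build_opener(SpaceEscapingRedirectHandler())

# Offline check: call the hook directly instead of contacting a server.
req = Request("http://example.com/start")
redirected = SpaceEscapingRedirectHandler().redirect_request(
    req, None, 302, "Moved Temporarily", {},
    "http://example.com/myEmail/User?asOf=02/26/2005 11:38:12 PM")
print(redirected.get_full_url())
```

With this approach the fix stays in application code, so it survives a Python upgrade and can be dropped once the library handles spaces itself.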


Thanks!

History
Date	User	Action	Args
2008-01-20 09:57:34	admin	link	issue1153027 messages
2008-01-20 09:57:34	admin	create