classification
Title: http client error
Type: compile error Stage:
Components: Library (Lib) Versions: Python 3.1
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: jhylton Nosy List: cober, jhylton
Priority: normal Keywords:

Created on 2009-02-19 09:05 by cober, last changed 2009-03-27 21:20 by jhylton. This issue is now closed.

Files
File name Uploaded Description
client.py cober, 2009-02-19 09:04 http.client
Messages (5)
msg82465 - (view) Author: cober J (cober) Date: 2009-02-19 09:04
Trying to send multi-byte UTF-8 data through http.client raises an error:

File "E:\DEVELOP\python\lib\http\client.py", line 904, in _send_request
    self.endheaders(body.encode('ascii'))
UnicodeEncodeError: 'ascii' codec can't encode character '\u7231' in position 119: ordinal not in range(128)


I modified lib/http/client.py to fix this problem; the modified file is attached.
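The failure in the traceback above can be reproduced without any network access, since it comes from the `encode('ascii')` call alone. A minimal sketch, assuming a made-up payload (the original request body is not shown in the report):

```python
# Hypothetical payload; the reporter's actual body is not included in the issue.
body = "msg=\u7231"

try:
    # This mirrors the body.encode('ascii') call shown in the traceback.
    body.encode("ascii")
except UnicodeEncodeError as exc:
    error = exc
    print(exc)
```

The character position in the message depends on the payload, so it will differ from the position 119 reported above.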
msg84227 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2009-03-26 21:58
I'm not sure what to do here.  I guess changing to utf-8 is safe insofar
as the current code only accepts ascii, so the only code that breaks
will be code that depends on the encode() call raising an exception.  It
seems like the client ought to specify the encoding, though.  We could fix
that via documentation.
msg84279 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2009-03-27 19:08
The documentation is pretty vague on this point.  If you send something
other than plain ascii, it gets a bit tricky to figure out what other
headers need to be added.  It would be safer for the client to pick an
encoding (e.g. utf-8) and encode the string before calling request(). 
It affects the content-length and presumably also the content-type.
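To illustrate why the encoding affects both headers, here is a rough sketch of encoding on the client side first. The helper name `utf8_body_and_headers` and the `text/plain` media type are assumptions made for the example, not part of http.client:

```python
def utf8_body_and_headers(text):
    """Encode text as UTF-8 and build matching headers (hypothetical helper)."""
    body = text.encode("utf-8")
    headers = {
        # The charset parameter tells the server how to decode the bytes.
        "Content-Type": "text/plain; charset=utf-8",
        # Content-Length counts bytes, not characters.
        "Content-Length": str(len(body)),
    }
    return body, headers


text = "I \u7231 Python"
body, headers = utf8_body_and_headers(text)
print(len(text), headers["Content-Length"])
```

The string has 10 characters but encodes to 12 bytes, so a length computed from the unencoded string would be wrong.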
msg84280 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2009-03-27 20:22
Ok.  Discovered that RFC 2616 says that iso-8859-1 is the default
charset, so I will use that to encode strings instead of ascii.  If you
want utf-8, you could encode the string yourself before calling
request().  Presumably, you should also add a content-type that explains
the charset.  I'll clarify this in the docs.
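A sketch of what this means for callers, under stated assumptions: `encodable`, `post_text`, the `/submit` path, and `RecordingConnection` are hypothetical names used only for illustration. The stand-in connection lets the example run offline; in real use `conn` would be an `http.client.HTTPConnection`.

```python
def encodable(text, charset):
    """Return True if text can be encoded with the given charset."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


def post_text(conn, path, text, charset="utf-8"):
    """Encode text explicitly and declare the charset (hypothetical helper).

    conn is expected to behave like http.client.HTTPConnection.
    """
    body = text.encode(charset)
    headers = {"Content-Type": "text/plain; charset=" + charset}
    conn.request("POST", path, body=body, headers=headers)


class RecordingConnection:
    """Offline stand-in for HTTPConnection; it only records the call."""

    def request(self, method, url, body=None, headers=None):
        self.sent = (method, url, body, headers)


latin1_ok = encodable("caf\u00e9", "iso-8859-1")  # U+00E9 is in Latin-1
cjk_ok = encodable("\u7231", "iso-8859-1")        # U+7231 is not

conn = RecordingConnection()
post_text(conn, "/submit", "\u7231")
```

A string body that falls outside iso-8859-1, such as the character from the original report, still cannot be encoded with the default charset, so encoding it yourself and sending bytes remains the way to use UTF-8.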
msg84281 - (view) Author: Jeremy Hylton (jhylton) (Python triager) Date: 2009-03-27 21:20
Committed revision 70638.
History
Date User Action Args
2009-03-27 21:20:32 jhylton set status: open -> closed; resolution: fixed; messages: + msg84281
2009-03-27 20:22:46 jhylton set assignee: jhylton; messages: + msg84280
2009-03-27 19:08:29 jhylton set messages: + msg84279
2009-03-26 21:58:22 jhylton set nosy: + jhylton; messages: + msg84227
2009-02-19 09:05:00 cober create