
Author Emil Stenström
Recipients Emil Stenström, ezio.melotti, vstinner
Date 2016-01-07.22:27:16
Message-id <1452205637.36.0.0359255361227.issue26045@psf.upfronthosting.co.za>
Content
This issue is in response to this thread on python-ideas: https://mail.python.org/pipermail/python-ideas/2016-January/037678.html

Note that Cory did a lot of encoding background work here:
https://mail.python.org/pipermail/python-ideas/2016-January/037680.html

---
Bug description:

When posting an unencoded unicode string directly with python-requests you get the following stacktrace:

import requests
r = requests.post("http://example.com", data="Celebrate 🎉") 
...
  File "../lib/python3.4/http/client.py", line 1127, in _send_request
    body = body.encode('iso-8859-1')
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 14-15: ordinal not in range(256) 

This is because requests uses http.client, and http.client assumes the encoding to be latin-1 when it is given a unicode string. This is a very common source of bugs for beginners, who assume that a unicode string will automatically be encoded as utf-8, as it is in the libraries of many other languages.
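The failure can be reproduced without any network access. This is only a minimal sketch of the encoding step shown in the traceback, not the http.client code itself; the variable names and the Content-Type header in the comment are my own assumptions.

```python
body = "Celebrate 🎉"

# ISO-8859-1 (Latin-1) only covers code points 0-255, so the emoji
# cannot be represented and encode() raises UnicodeEncodeError.
try:
    body.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError as exc:
    latin1_ok = False
    print(exc)

# Workaround available today: encode explicitly and send bytes.
payload = body.encode("utf-8")
print(payload)

# With requests this would look something like the following
# (the header value is an assumption, adjust it to your service):
#   requests.post(url, data=payload,
#                 headers={"Content-Type": "text/plain; charset=utf-8"})
```

Since the body is already bytes when it reaches http.client, the latin-1 encoding step is never applied to it.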

The simplest fix here is to catch the UnicodeEncodeError and improve the error message to something that points beginners in the right direction.
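A rough sketch of what that could look like follows. The helper name `encode_body` and the wording of the message are hypothetical, and this is not a patch against the real http.client source; it only shows the idea of re-raising the same exception type with a hint appended.

```python
def encode_body(body):
    """Hypothetical helper: encode str as Latin-1, with a friendlier error."""
    if not isinstance(body, str):
        return body  # bytes (or other objects) are passed through untouched
    try:
        return body.encode("iso-8859-1")
    except UnicodeEncodeError as err:
        # Re-raise the same exception type so existing "except" clauses
        # keep working, but point the user at the explicit UTF-8 option.
        raise UnicodeEncodeError(
            err.encoding,
            err.object,
            err.start,
            err.end,
            "%s (body is not valid Latin-1; use body.encode('utf-8') "
            "if you want to send it encoded as UTF-8)" % err.reason,
        ) from None


try:
    encode_body("Celebrate 🎉")
except UnicodeEncodeError as exc:
    print(exc)
```

Keeping the exception type unchanged means this variant should not break code that already catches UnicodeEncodeError.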

Another option would be to:
- Keep encoding in latin-1 first, and if that fails try utf-8
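The fallback could be sketched roughly as below. The function name `encode_with_fallback` is hypothetical, and returning the encoding name alongside the bytes is my own addition so the caller could declare the charset.

```python
def encode_with_fallback(body):
    """Hypothetical helper: try Latin-1 first, fall back to UTF-8."""
    try:
        return body.encode("iso-8859-1"), "iso-8859-1"
    except UnicodeEncodeError:
        return body.encode("utf-8"), "utf-8"


for text in ("café", "Celebrate 🎉"):
    data, encoding = encode_with_fallback(text)
    print(repr(text), "->", encoding, data)
```

One caveat with this approach: the encoding now depends on the content of the string, so the receiving side cannot know which one was used unless the charset is declared somewhere, for example in the Content-Type header.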

Other possible solutions (that would be backwards incompatible) include:
- Changing the default encoding to utf-8 instead of latin-1
- Detecting an unencoded unicode string and failing with a descriptive error message instead of encoding it
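To illustrate why both of these would be backwards incompatible, here is a small sketch. The `require_bytes` helper and its message are hypothetical and only stand in for the second option.

```python
# Option 1: changing the default changes the bytes on the wire for any
# non-ASCII text that works today.
text = "café"
latin1_bytes = text.encode("iso-8859-1")  # current behaviour: b'caf\xe9'
utf8_bytes = text.encode("utf-8")         # proposed default: b'caf\xc3\xa9'
print(latin1_bytes, utf8_bytes)


# Option 2: refuse str bodies outright instead of guessing an encoding.
def require_bytes(body):
    """Hypothetical strict check, not an existing http.client function."""
    if isinstance(body, str):
        raise TypeError(
            "body must be bytes, not str; encode it first, "
            "for example body.encode('utf-8')"
        )
    return body
```

With the first option a server that expects latin-1 would start receiving different bytes for the same string, and with the second option even plain ASCII strings that work today would start raising.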

---

Just to show that this is a problem that exists in the wild, here are a few examples that all crash on the same line in http.client (not all of them go through the requests library):

- https://github.com/kennethreitz/requests/issues/2838
- https://github.com/kennethreitz/requests/issues/1822
- http://stackoverflow.com/questions/34618149/post-unicode-string-to-web-service-using-python-requests-library
- https://www.reddit.com/r/learnpython/comments/3violw/unicodeencodeerror_when_searching_ebay_with/
- https://github.com/codecov/codecov-python/issues/35
- https://github.com/google/google-api-python-client/issues/145
- https://bugs.launchpad.net/ubuntu/+source/lazr.restfulclient/+bug/1414063
History
Date User Action Args
2016-01-07 22:27:17  Emil Stenström  set  recipients: + Emil Stenström, vstinner, ezio.melotti
2016-01-07 22:27:17  Emil Stenström  set  messageid: <1452205637.36.0.0359255361227.issue26045@psf.upfronthosting.co.za>
2016-01-07 22:27:17  Emil Stenström  link  issue26045 messages
2016-01-07 22:27:16  Emil Stenström  create