Title: Content-Type when sending data with urlopen()
Components: Documentation Versions: Python 3.6, Python 3.5, Python 2.7
Created on 2015-02-01 12:43 by martin.panter, last changed 2022-04-11 14:58 by admin.

msg235166 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-01 12:43
Currently the documentation gives the impression that the “data” parameter to Request() has to be in the application/x-www-form-urlencoded format. However I suspect that you can override the type by supplying a Content-Type header, and I would like to document this; see uploaded patch.
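A minimal sketch of the override described above, not taken from the uploaded patch. It assumes that calling the HTTP handler's http_request() pre-processing step directly is a fair stand-in for what happens before a request is sent. The URL and payloads are placeholders, and no connection is opened.

```python
import urllib.request

# Attach an HTTPHandler to an opener so that its request pre-processing
# can run without opening a connection.
opener = urllib.request.OpenerDirector()
handler = urllib.request.HTTPHandler()
opener.add_handler(handler)

url = "http://example.com/upload"  # placeholder URL, never contacted

# No Content-Type supplied: the handler fills in the form-encoded default.
default_req = handler.http_request(
    urllib.request.Request(url, data=b"spam=1&eggs=2")
)

# Explicit Content-Type supplied: the handler leaves it untouched.
json_req = handler.http_request(
    urllib.request.Request(
        url,
        data=b'{"spam": 1}',
        headers={"Content-Type": "application/json"},
    )
)

# Request stores header names capitalized, hence "Content-type".
print(default_req.get_header("Content-type"))
print(json_req.get_header("Content-type"))
```

If the suspicion is right, the first line printed is the form-encoded default and the second is the custom type.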

I noticed that test_urllib2.HandlerTests.test_http() already seems to test the default Content-Type and a custom Content-Type with a Request() object, although I did not see a test for the default Content-Type when supplying “data” directly to urlopen().

Also I understand the “charset” parameter on application/x-www-form-urlencoded is not standardized. Would it correspond to the encoding of the %XX codes from urlencode(), which is typically UTF-8, not Latin-1? Or would it correspond to the subsequent string-to-bytes encoding stage, which could just be ASCII since non-ASCII characters are already encoded? Maybe it would be best to drop the advice to set a “charset” parameter. It was added for Issue 11082.
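To illustrate the two encoding stages in question, here is a small sketch with made-up form data. The encoding argument of urlencode() controls the %XX escapes, while the later string-to-bytes step only ever sees ASCII.

```python
from urllib.parse import urlencode

fields = {"name": "café"}  # illustrative form data

# Stage 1: percent-encoding. UTF-8 is the default; Latin-1 is opt-in.
utf8_form = urlencode(fields)
latin1_form = urlencode(fields, encoding="latin-1")

print(utf8_form)    # name=caf%C3%A9
print(latin1_form)  # name=caf%E9

# Stage 2: string-to-bytes. The text is already pure ASCII, so ASCII
# is enough regardless of which encoding stage 1 used.
body = utf8_form.encode("ascii")
print(body)
```

A "charset" parameter cannot describe both stages at once, which seems to be the ambiguity raised here.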
msg235219 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-02 02:18
Updated patch to explain that a Request object is generated internally for urlopen(data=...), and added a test to confirm. Also removed some confusing dead code.
msg238811 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-03-21 15:53
The documentation looks contradictory. "The *data* argument must be", but "The *data* argument may also be". "must be a bytes object", but "If *data* is a buffer".

Why not just write "The data argument must be a bytes-like object, an iterable of bytes-like objects, or None"? It doesn't depend on whether url is a string or a Request object.

AFAIK the data argument of Request can be an iterable of bytes-like objects in addition to a bytes-like object or None.

The note about the application/x-www-form-urlencoded format applies not only to a bytes object but also to an iterable of bytes-like objects, i.e. to any acceptable value except None.
msg239684 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-03-31 11:50
I think we should avoid mentioning bytes-like objects until Issue 23740 (http.client support), Issue 23756 (clarify definition), and/or SSLSocket.sendall() support are sorted out.

Changes in non-urlencoded.3.patch:
* Removed iterable object as direct urlopen() argument, since that would require a custom Content-Length and therefore a custom Request object
* Removed Content-Type discussion from urlopen() for similar reasons
* Added iterable object to Request constructor (already tested)
* Clarified default Content-Type whenever data is not None
* Added a test for default Content-Type with iterable object
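The combination described in this list can be sketched as follows. This is an illustration, not code from the patch. It assumes a custom Request carrying an iterable body plus an explicit Content-Length, and it runs only the handler's http_request() pre-processing step, so nothing is sent. The URL and chunks are placeholders.

```python
import urllib.request

# Attach an HTTPHandler to an opener so that its request pre-processing
# can run without opening a connection.
opener = urllib.request.OpenerDirector()
handler = urllib.request.HTTPHandler()
opener.add_handler(handler)

chunks = [b"spam=1", b"&eggs=2"]  # illustrative iterable body
total_length = sum(len(chunk) for chunk in chunks)

# An iterable has no length the library can rely on, so the caller
# supplies Content-Length on a custom Request object.
req = urllib.request.Request(
    "http://example.com/upload",  # placeholder URL, never contacted
    data=chunks,
    headers={"Content-Length": str(total_length)},
)
req = handler.http_request(req)

# The default Content-Type is still added because data is not None.
print(req.get_header("Content-type"))
print(req.get_header("Content-length"))
```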
msg254258 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-11-07 07:01
Patch 4 is just updated to avoid conflicts with the current code. Changes are the same.
msg254259 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-11-07 07:32
Spotted a docstring that needed updating.
msg268708 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-06-17 05:33
Fixed conflicts with recent changes.
msg407143 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-11-27 13:08
Martin, I think you fixed this in
