This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: Content-Type when sending data with urlopen()
Type:
Stage: resolved
Components: Documentation
Versions: Python 3.6, Python 3.5, Python 2.7

process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To: docs@python
Nosy List: demian.brecht, docs@python, iritkatriel, martin.panter, orsenthil, r.david.murray, serhiy.storchaka
Priority: normal
Keywords: patch

Created on 2015-02-01 12:43 by martin.panter, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name  Uploaded
non-urlencoded.patch  martin.panter, 2015-02-01 12:43
non-urlencoded.2.patch  martin.panter, 2015-02-02 02:18
non-urlencoded.3.patch  martin.panter, 2015-03-31 11:50
non-urlencoded.4.patch  martin.panter, 2015-11-07 07:01
non-urlencoded.5.patch  martin.panter, 2015-11-07 07:32
non-urlencoded.6.patch  martin.panter, 2016-06-17 05:33
Messages (8)
msg235166 - Author: Martin Panter (martin.panter) (Python committer) Date: 2015-02-01 12:43
Currently, the documentation gives the impression that the “data” parameter to Request() has to be in the application/x-www-form-urlencoded format. However, I suspect that you can override the type by supplying a Content-Type header, and I would like to document this; see the uploaded patch.
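
A rough illustration of the override described above. This is only a sketch and is not taken from the uploaded patch; the URL and the JSON payload are made up. Request() stores header names via str.capitalize(), so the lookup key is "Content-type". Nothing is sent over the network here.

```python
import json
from urllib.request import Request

# Hypothetical JSON body; any bytes object would do.
payload = json.dumps({"name": "example"}).encode("utf-8")

req = Request(
    "http://example.invalid/api",  # placeholder URL, never opened
    data=payload,
    headers={"Content-Type": "application/json"},
)

# Header names are normalized with str.capitalize() when stored.
content_type = req.get_header("Content-type")
# A Request with data defaults to the POST method.
method = req.get_method()
```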

I noticed that test_urllib2.HandlerTests.test_http() already seems to test the default Content-Type and a custom Content-Type with a Request() object, although I did not see a test for the default Content-Type when supplying “data” directly to urlopen().

Also, I understand that the “charset” parameter on application/x-www-form-urlencoded is not standardized. Would it correspond to the encoding of the %XX codes from urlencode(), which is typically UTF-8, not Latin-1? Or would it correspond to the subsequent string-to-bytes encoding stage, which could just be ASCII since non-ASCII characters are already encoded? Maybe it would be best to drop the advice to set a “charset” parameter. It was added for Issue 11082.
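
To make the two encoding stages in the question concrete, here is a small sketch (the field name and value are invented for illustration). The first stage is the percent-encoding done by urlencode(), which uses UTF-8 unless told otherwise; the second is the str-to-bytes step, where ASCII is enough because the string is already pure ASCII by then.

```python
from urllib.parse import urlencode

# Stage 1: percent-encoding. The default encoding is UTF-8.
utf8_query = urlencode({"name": "caf\u00e9"})
# The same value percent-encoded as Latin-1, for comparison.
latin1_query = urlencode({"name": "caf\u00e9"}, encoding="latin-1")

# Stage 2: string to bytes. The result of stage 1 is ASCII-only,
# so the codec chosen here does not affect the bytes produced.
body = utf8_query.encode("ascii")
```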
msg235219 - Author: Martin Panter (martin.panter) (Python committer) Date: 2015-02-02 02:18
Updated patch to explain that a Request object is generated internally for urlopen(data=...), and added a test to confirm. Also removed some confusing dead code.
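
One way to observe that internal Request object without touching the network is sketched below. This is not the test added by the patch; CapturingHandler is a hypothetical helper, and the URL is a placeholder. The handler inherits the usual HTTP request preprocessing from HTTPHandler but replaces http_open() so that no socket is opened.

```python
import io
from email.message import Message
from urllib.request import HTTPHandler, OpenerDirector, Request
from urllib.response import addinfourl


class CapturingHandler(HTTPHandler):
    """Hypothetical handler: record the outgoing Request, open no socket."""

    def __init__(self):
        super().__init__()
        self.captured = None

    def http_open(self, req):
        # Keep the Request that the opener built and preprocessed.
        self.captured = req
        # Return an empty stand-in response.
        return addinfourl(io.BytesIO(b""), Message(), req.full_url, 200)


handler = CapturingHandler()
opener = OpenerDirector()
opener.add_handler(handler)

# Passing a URL string plus data makes the opener build a Request itself.
opener.open("http://example.invalid/form", data=b"spam=1&eggs=2").close()

captured = handler.captured
default_type = captured.get_header("Content-type")
```

Here the default Content-Type is filled in by the inherited request preprocessing, not by the Request constructor.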
msg238811 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2015-03-21 15:53
The documentation looks contradictory. "The *data* argument must be", but "The *data* argument may also be". "must be a bytes object", but "If *data* is a buffer".

Why not just write "The data argument must be a bytes-like object, an iterable of bytes-like objects, or None"? It doesn't depend on whether url is a string or a Request object.

AFAIK the data argument of Request can be an iterable of bytes-like objects in addition to a bytes-like object or None.

The note about the application/x-www-form-urlencoded format applies not only to a bytes object, but also to an iterable of bytes-like objects, i.e. to any acceptable value except None.
msg239684 - Author: Martin Panter (martin.panter) (Python committer) Date: 2015-03-31 11:50
I think we should avoid mentioning bytes-like objects until Issue 23740 (http.client support), Issue 23756 (clarify definition), and/or SSLSocket.sendall() support are sorted out.

Changes in non-urlencoded.3.patch:
* Removed iterable object as direct urlopen() argument, since that would require a custom Content-Length and therefore a custom Request object
* Removed Content-Type discussion from urlopen() for similar reasons
* Added iterable object to Request constructor (already tested)
* Clarified default Content-Type whenever data is not None
* Added a test for default Content-Type with iterable object
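
The iterable case from the list above might look roughly like this. It is a sketch with a placeholder URL and made-up chunks. The Content-Length is supplied explicitly, as the versions discussed in this thread require for iterable bodies; later releases may handle a missing Content-Length differently.

```python
from urllib.request import Request

# Hypothetical body split into chunks; any iterable of bytes works.
chunks = [b"first line\n", b"second line\n"]
total_length = sum(len(chunk) for chunk in chunks)

req = Request(
    "http://example.invalid/upload",  # placeholder URL, never opened
    data=chunks,
    headers={
        "Content-Type": "text/plain",
        # The length of an iterable body cannot be taken with len(data),
        # so it is computed from the chunks and passed explicitly.
        "Content-Length": str(total_length),
    },
)

content_length = req.get_header("Content-length")
content_type = req.get_header("Content-type")
```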
msg254258 - Author: Martin Panter (martin.panter) (Python committer) Date: 2015-11-07 07:01
Patch 4 is just updated to avoid conflicts with the current code. Changes are the same.
msg254259 - Author: Martin Panter (martin.panter) (Python committer) Date: 2015-11-07 07:32
Spotted a docstring that needed updating.
msg268708 - Author: Martin Panter (martin.panter) (Python committer) Date: 2016-06-17 05:33
Fixed conflicts with recent changes.
msg407143 - Author: Irit Katriel (iritkatriel) (Python committer) Date: 2021-11-27 13:08
Martin, I think you fixed this in https://github.com/python/cpython/commit/3c0d0baf2badfad7deb346d1043f7d83bb92691f#diff-533bd604631e0e26ce55dfa75a878788f3c4d7d7ccb3bbaeaa2ee2a9c956ffe8
History
Date  User  Action  Args
2022-04-11 14:58:12  admin  set  github: 67549
2021-12-06 00:08:26  iritkatriel  set  status: pending -> closed; stage: patch review -> resolved
2021-11-27 13:08:22  iritkatriel  set  status: open -> pending; nosy: + iritkatriel; messages: + msg407143; resolution: out of date
2016-06-17 10:50:56  serhiy.storchaka  set  nosy: + orsenthil, r.david.murray
2016-06-17 05:33:04  martin.panter  set  files: + non-urlencoded.6.patch; messages: + msg268708; versions: + Python 2.7, - Python 3.4
2015-11-07 07:32:30  martin.panter  set  files: + non-urlencoded.5.patch; messages: + msg254259
2015-11-07 07:01:44  martin.panter  set  files: + non-urlencoded.4.patch; messages: + msg254258; versions: + Python 3.6
2015-03-31 16:47:28  demian.brecht  set  nosy: + demian.brecht
2015-03-31 11:50:39  martin.panter  set  files: + non-urlencoded.3.patch; stage: patch review; messages: + msg239684; versions: + Python 3.5
2015-03-21 15:53:45  serhiy.storchaka  set  messages: + msg238811
2015-03-21 13:58:04  serhiy.storchaka  set  nosy: + serhiy.storchaka
2015-02-02 02:18:22  martin.panter  set  files: + non-urlencoded.2.patch; messages: + msg235219
2015-02-01 12:43:32  martin.panter  create