Message 272118 - Python tracker


Author	martin.panter
Recipients	axitkhurana, demian.brecht, martin.panter, r.david.murray, serhiy.storchaka
Date	2016-08-07.12:37:50
Message-id	<1470573471.39.0.0984201692485.issue23740@psf.upfronthosting.co.za>

Content

I’ve decided I would prefer deprecating the Latin-1 encoding, rather than adding more encoding support for iterables and text files. If not deprecating it altogether, at least prefer just ASCII encoding (like Python 2). The urlopen() function already rejects text str, and in Issue 26045 people often want or expect UTF-8.
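
To make the difference between the three candidate encodings concrete, here is a minimal sketch. The helper try_encode() is hypothetical and not part of http.client; it only shows standard codec behaviour for a str that contains non-ASCII characters.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or None if the text cannot be encoded.

    Hypothetical helper for illustration only; not part of http.client.
    """
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# "caf\u00e9" contains one character outside ASCII but inside Latin-1.
print(try_encode("caf\u00e9", "latin-1"))  # b'caf\xe9'
print(try_encode("caf\u00e9", "utf-8"))    # b'caf\xc3\xa9'
print(try_encode("caf\u00e9", "ascii"))    # None

# The euro sign is outside Latin-1, so that encoding fails as well.
print(try_encode("\u20ac", "latin-1"))     # None
```

With ASCII-only encoding, any non-ASCII str fails up front instead of being sent as Latin-1 bytes that a server expecting UTF-8 would misread.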

Here is a summary I made of the different data types handled for the body object, from highest priority to lowest priority.

             HTTPConnection   HTTPConn. default                    
body         _send_output()   Content-Length     urlopen()         
===========  ===============  =================  ==================
None         No body          0 or unset #23539  C-L unset         
Has read()   read() #1065257                     * #5038           
- Text file  encode() #5314                                        
str          encode() r56128  len()              TypeError #11082  
Byte seq.    sendall()        len()              C-L: nbytes       
Bytes-like   * #27340         *                  C-L: nbytes #3243 
Iterable     iter() #3243                        C-L required #3243
Sized                         len()† #23350                        
fstat()                       st_size #1065257                     
Otherwise    TypeError        Unset #1065257     C-L optional†     

* Possible bugs
† Degenerate cases not worth supporting IMO
C-L = Content-Length header field, set by default or required to be specified in various cases
Byte seq. = Sequence of bytes, including “bytes” type, bytearray, array("B"), etc
Bytes-like = Any C-contiguous buffer, including zero- and multi-dimensional arrays, and with items other than bytes
Iterable = Iterable of bytes or bytes-like objects (depending on Issue 27340 about SSL)
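
As a rough sketch, the priority order in the _send_output() column of the table could be written as the chain of checks below. classify_body() is a hypothetical helper for illustration, not the actual http.client code, and the returned labels only echo the table cells. It assumes that text files are detected with io.TextIOBase and that bytes-like objects are detected with memoryview().

```python
import io


def classify_body(body):
    """Return a label for how a body would be sent, highest priority first.

    Hypothetical illustration of the table above; not http.client itself.
    """
    if body is None:
        return "no body"
    if hasattr(body, "read"):
        # Assumption: a text file is any io.TextIOBase instance.
        if isinstance(body, io.TextIOBase):
            return "read() then encode()"
        return "read()"
    if isinstance(body, str):
        return "encode()"
    try:
        # Assumption: anything supporting the buffer protocol is bytes-like.
        memoryview(body)
    except TypeError:
        pass
    else:
        return "sendall()"
    try:
        iter(body)
    except TypeError:
        raise TypeError(
            "unsupported body type: %s" % type(body).__name__
        ) from None
    return "iter()"
```

For example, classify_body(io.StringIO("x")) falls into the text file case, while classify_body([b"a", b"b"]) is treated as an iterable because a list does not support the buffer protocol.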
