Issue 36299: array: Deprecate 'u' type in array module

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/80480

classification

Title:	array: Deprecate 'u' type in array module
Type:		Stage:
Components:	Library (Lib)	Versions:	Python 3.8

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:		Nosy List:	methane, ncoghlan, serhiy.storchaka, skrah, terry.reedy
Priority:	normal	Keywords:	patch

Created on 2019-03-15 05:50 by methane, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL	Status	Linked
PR 12497	closed	methane, 2019-03-22 10:43

Messages (12)
msg337967 - (view)	Author: Inada Naoki (methane) *	Date: 2019-03-15 05:50
The doc says:

> 'u' will be removed together with the rest of the Py_UNICODE API.
> Deprecated since version 3.3, will be removed in version 4.0.
> https://docs.python.org/3/library/array.html

But DeprecationWarning is not raised yet. Let's raise it.

* 3.8 -- PendingDeprecationWarning
* 3.9 -- DeprecationWarning
* 4.0 or 3.10 -- Remove it.
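The first stage of the schedule above can be prototyped in pure Python. This is a minimal sketch, assuming a hypothetical helper `warn_if_deprecated_typecode` that is not part of the array module; the actual change would have to be made in the module's C implementation.

```python
import warnings

# Hypothetical set for illustration only; the thread discusses just 'u'.
DEPRECATED_TYPECODES = {"u"}


def warn_if_deprecated_typecode(typecode):
    """Emit the first-stage warning for a deprecated array type code.

    Returns True if a warning was issued, False otherwise.
    """
    if typecode in DEPRECATED_TYPECODES:
        warnings.warn(
            f"array type code {typecode!r} is deprecated",
            PendingDeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller
        )
        return True
    return False
```

Moving to the second stage would then only require swapping PendingDeprecationWarning for DeprecationWarning.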
msg338031 - (view)	Author: Terry J. Reedy (terry.reedy) *	Date: 2019-03-15 20:45
'4.0' is a stand-in for 'sometime after 2.7.final', scheduled for Jan 2020. A Pending... for 3.8.0, scheduled for Oct 2019, seems reasonable to me. Perhaps we should have a pydev discussion for the general issue of post 2.7 removals of already deprecated items.
msg338595 - (view)	Author: Inada Naoki (methane) *	Date: 2019-03-22 09:13
https://mail.python.org/pipermail/python-dev/2019-March/156807.html

We may be able to convert 'u' from wchar_t to int32_t and un-deprecate it.
msg338598 - (view)	Author: Inada Naoki (methane) *	Date: 2019-03-22 10:49
I found that converting Py_UNICODE to Py_UCS4 was already done once and then reverted. Ref: https://bugs.python.org/issue13072
msg338607 - (view)	Author: Stefan Krah (skrah) *	Date: 2019-03-22 14:44
I think the problem is still whether to use 'u' == UCS2 and 'w' == UCS4 like in PEP-3118.

For the project I'm currently working on I'd need these for buffer exports:

>>> from xnd import *
>>> x = xnd(["abc", "xyz"], dtype="fixed_string(10, 'utf16')")
>>> y = xnd(["abc", "xyz"], dtype="fixed_string(10, 'utf32')")
>>>
>>> memoryview(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: type is not supported by the buffer protocol

The use case is not an array that represents a single utf16 string, but an array of fixed strings with different encodings. So x would be exported with format 'u' and y with format 'w'.
msg338608 - (view)	Author: Stefan Krah (skrah) *	Date: 2019-03-22 15:01
Just to demonstrate what the format would look like, this is working for an array of fixed bytes:

>>> x = xnd([b"123", b"23456"], dtype="fixed_bytes(size=10)")
>>> memoryview(x).format
'10s'

So the formats in the previous message would be '10u' and '10w'.
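For readers unfamiliar with the '10s' notation: it is a repeat count followed by a struct-style type code, here meaning a 10-byte fixed-width field. A small sketch using only the standard struct module (the helper name `pack_fixed_bytes` is made up for illustration and is not related to xnd):

```python
import struct


def pack_fixed_bytes(values, size=10):
    """Pack each bytes value into a fixed-width field of `size` bytes."""
    fmt = f"{size}s"  # e.g. '10s': a 10-byte field
    # struct pads short values with null bytes up to the field width.
    return [struct.pack(fmt, value) for value in values]


packed = pack_fixed_bytes([b"123", b"23456"])
```

By analogy, '10u' and '10w' would describe fields of ten UCS2 or UCS4 code units under the PEP-3118 reading discussed in this thread.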
msg338609 - (view)	Author: Serhiy Storchaka (serhiy.storchaka) *	Date: 2019-03-22 15:03
array('u') is not tied to the legacy Unicode C API. It is possible to use the modern wchar_t-based Unicode C API for it. See issue36346. There are benefits from getting rid of the legacy Unicode C API, but not from getting rid of array('u').
msg338610 - (view)	Author: Stefan Krah (skrah) *	Date: 2019-03-22 15:10
array() uses struct module characters except for 'u'. PEP-3118 was supposed to be implemented in the struct module. If array() continues to use 'u', the only sensible thing would be to remove (or rename) 'a', 'u' and 'w' from PEP-3118.
msg338611 - (view)	Author: Stefan Krah (skrah) *	Date: 2019-03-22 15:25
The funny thing is that array() already knows this:

>>> import array
>>> a = array.array("u", "123")
>>> memoryview(a).format
'w'
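The mismatch between the array type code and the exported buffer format can be checked with a short script. This is a sketch, assuming an interpreter that still accepts the 'u' type code; the helper name is made up, and DeprecationWarning is silenced because newer versions warn when 'u' is used. The reported format depends on the platform's wchar_t width, so no particular output is claimed here.

```python
import array
import warnings


def typecode_and_buffer_format(text):
    """Return (array type code, buffer format) for a 'u' array of `text`."""
    with warnings.catch_warnings():
        # Newer interpreters emit DeprecationWarning for the 'u' type code.
        warnings.simplefilter("ignore", DeprecationWarning)
        a = array.array("u", text)
    return a.typecode, memoryview(a).format


typecode, buffer_format = typecode_and_buffer_format("123")
print(typecode, buffer_format)
```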
msg367000 - (view)	Author: Inada Naoki (methane) *	Date: 2020-04-22 13:16
I closed GH-12497 (Py_UNICODE -> Py_UCS4). I created GH-19653 (Py_UNICODE -> wchar_t) instead.
msg367044 - (view)	Author: Terry J. Reedy (terry.reedy) *	Date: 2020-04-22 19:15
Should this issue be closed, possibly as superseded by #36346, the issue for the new PR-19653?
msg367065 - (view)	Author: Inada Naoki (methane) *	Date: 2020-04-23 00:47
Although array('u') no longer uses the deprecated API with GH-19653, I still don't like 'u' because:

* I don't have any reason to use the platform-dependent wchar_t. [1]
* It is not consistent with PEP-3118.

[1]: https://mail.python.org/pipermail/python-dev/2019-March/156807.html

How about this plan?

* Add 'w' for Py_UCS4.
* Deprecate 'u', and remove it in the future.
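The platform dependence mentioned above can be observed from Python without touching array('u') at all. A minimal sketch, with made-up helper names: ctypes reports the width of wchar_t on the running platform, while a UTF-32 encoding always uses four bytes per code point, which is the fixed-width behaviour a Py_UCS4-based 'w' code would provide.

```python
import ctypes
import sys


def wchar_size():
    """Width of the C wchar_t type on this platform, in bytes."""
    return ctypes.sizeof(ctypes.c_wchar)


def encode_ucs4(text):
    """Encode text as 4-byte code points in native byte order, without a BOM."""
    codec = "utf-32-le" if sys.byteorder == "little" else "utf-32-be"
    return text.encode(codec)


sample = "abc\U0001F600"  # three ASCII characters plus one non-BMP character
encoded = encode_ucs4(sample)
print(wchar_size(), len(encoded))
```

wchar_t is commonly 2 bytes on Windows and 4 bytes on most Unix-like systems, whereas the encoded length here is always four bytes per code point.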

History
Date	User	Action	Args
2022-04-11 14:59:12	admin	set	github: 80480
2020-04-23 00:47:43	methane	set	messages: + msg367065
2020-04-22 19:15:06	terry.reedy	set	messages: + msg367044
2020-04-22 13:16:26	methane	set	messages: + msg367000
2019-03-22 15:56:45	vstinner	set	nosy: - vstinner
2019-03-22 15:25:27	skrah	set	messages: + msg338611
2019-03-22 15:10:07	skrah	set	messages: + msg338610
2019-03-22 15:03:02	serhiy.storchaka	set	nosy: + serhiy.storchaka messages: + msg338609
2019-03-22 15:01:08	skrah	set	messages: + msg338608
2019-03-22 14:44:21	skrah	set	messages: + msg338607
2019-03-22 11:26:51	methane	set	nosy: + ncoghlan, vstinner, skrah stage: patch review -> title: Deprecate 'u' type in array module -> array: Deprecate 'u' type in array module
2019-03-22 10:49:09	methane	set	messages: + msg338598
2019-03-22 10:43:34	methane	set	keywords: + patch stage: patch review pull_requests: + pull_request12447
2019-03-22 09:13:15	methane	set	messages: + msg338595
2019-03-15 20:45:47	terry.reedy	set	nosy: + terry.reedy messages: + msg338031
2019-03-15 05:50:02	methane	create