
classification
Title: 'u' formatted arrays mostly prevent appends of 4 byte characters
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.8, Python 3.7, Python 3.6, Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: bup, iritkatriel
Priority: normal Keywords:

Created on 2019-10-24 10:31 by bup, last changed 2022-04-11 14:59 by admin.

Messages (2)
msg355319 - Author: Dan Snider (bup) Date: 2019-10-24 10:31
Unicode characters with code points above U+FFFF can only be added to the end of an array, and only through a call to the "fromunicode" method. This is because "fromunicode" modifies the array through a different procedure than __new__, __setitem__, append, and extend, all of which eventually call the u_setitem routine, which in turn calls PyArg_Parse with a format spec of "u#". The error occurs in that call and, at first glance, appears to stem from an incorrect length determination for unicode objects of the 4-byte kind.
msg407932 - Author: Irit Katriel (iritkatriel) (Python committer) Date: 2021-12-07 13:01
Can you include a code snippet to demonstrate the problem?
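
A minimal sketch of the kind of snippet requested above. It was not supplied by the reporter; the helper names (attempt, new_from_list, and so on) are hypothetical, and the choice of U+1F600 as the test character is an assumption. It tries each route named in the report and prints what happened rather than assuming a result, because the outcome likely depends on the platform's wchar_t width (reported here as itemsize) and on the Python version. The failure described is most plausible where itemsize is 2, which the report does not state.

```python
import array
import warnings

CH = "\U0001F600"  # one code point above U+FFFF (assumed test character)


def new_from_list():
    # A list initializer is used so the item is stored individually.
    return array.array("u", [CH])


def setitem():
    arr = array.array("u", "a")
    arr[0] = CH
    return arr


def append():
    arr = array.array("u")
    arr.append(CH)
    return arr


def extend():
    arr = array.array("u")
    arr.extend([CH])
    return arr


def fromunicode():
    arr = array.array("u")
    arr.fromunicode(CH)
    return arr


def attempt(func):
    """Run one route and describe the outcome as a string."""
    try:
        with warnings.catch_warnings():
            # Newer versions may warn that the 'u' typecode is deprecated.
            warnings.simplefilter("ignore", DeprecationWarning)
            arr = func()
    except (TypeError, ValueError) as exc:
        return f"{type(exc).__name__}: {exc}"
    return f"ok (itemsize={arr.itemsize}, len={len(arr)})"


routes = (new_from_list, setitem, append, extend, fromunicode)
results = {f.__name__: attempt(f) for f in routes}
for name, outcome in results.items():
    print(f"{name:14} {outcome}")
```

If the report is accurate for a given build, only the fromunicode line should print "ok" there; on builds where every line prints "ok", the issue does not reproduce.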
History
Date                 User         Action  Args
2022-04-11 14:59:22  admin        set     github: 82760
2021-12-07 13:01:07  iritkatriel  set     nosy: + iritkatriel; messages: + msg407932
2019-10-24 10:31:24  bup          create