Message 225860 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	vstinner
Recipients	loewis, vstinner
Date	2014-08-25.01:41:15
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1408930876.51.0.0157656751737.issue22271@psf.upfronthosting.co.za>
In-reply-to

Content
With the PEP 393 implemented in Python 3.3, PyUnicode_AsUnicode() and all similar functions have to convert the compact string to a heavy wchar_t* string (UCS-4 on Linux: 4 bytes per character, UTF-16 on Windows: 2 or 4 bytes per character) which is stored in the string. The heavy "Py_UNICODE*" storage is kept until the string is destroyed, which may only occur at Python exit. To reduce the memory footprint, it would be nice to promote the usage of the PEP 393 and start to emit a DeprecationWarning warning. The Py_UNICODE type and all related functions are already deprecated in the documentation. The deprecate PyUnicode_AsUnicode(), we should stop using it in Python itself. For example, PyUnicode_AsWideCharString() can be used to encode filenames on Windows, it doesn't store the encoded string in the Python object. See for example path_converter() in posixmodule.c.

With the PEP 393 implemented in Python 3.3, PyUnicode_AsUnicode() and all similar functions have to convert the compact string to a heavy wchar_t* string (UCS-4 on Linux: 4 bytes per character, UTF-16 on Windows: 2 or 4 bytes per character) which is stored in the string. The heavy "Py_UNICODE*" storage is kept until the string is destroyed, which may only occur at Python exit.

To reduce the memory footprint, it would be nice to promote the usage of the PEP 393 and start to emit a DeprecationWarning warning. The Py_UNICODE type and all related functions are already deprecated in the documentation.

The deprecate PyUnicode_AsUnicode(), we should stop using it in Python itself. For example, PyUnicode_AsWideCharString() can be used to encode filenames on Windows, it doesn't store the encoded string in the Python object. See for example path_converter() in posixmodule.c.

History
Date	User	Action	Args
2014-08-25 01:41:16	vstinner	set	recipients: + vstinner, loewis
2014-08-25 01:41:16	vstinner	set	messageid: <1408930876.51.0.0157656751737.issue22271@psf.upfronthosting.co.za>
2014-08-25 01:41:16	vstinner	link	issue22271 messages
2014-08-25 01:41:15	vstinner	create