Message 226244 - Python tracker

Author	vstinner
Recipients	loewis, vstinner
Date	2014-09-01.21:32:13
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1409607134.14.0.210351686888.issue22323@psf.upfronthosting.co.za>
In-reply-to

Content
I would like to deprecate PyUnicode_AsUnicode(); see issue #22271 for the rationale (hint: memory footprint). The first step is to rewrite PyUnicode_AsWideChar() and PyUnicode_AsWideCharString() so that they no longer call PyUnicode_AsUnicode().

The attached patch implements this.

The code is based on PyUnicode_AsUnicode(), but it's trickier because PyUnicode_AsWideChar() can truncate the string, and PyUnicode_AsUnicode() does not copy characters if kind == sizeof(wchar_t): in that case PyASCIIObject.wstr "just" points to data.
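
To illustrate the two cases mentioned above (truncation, and the shortcut when kind == sizeof(wchar_t)), here is a rough standalone sketch. It is not the attached patch and does not use the CPython API: copy_as_wchar() is a hypothetical helper, "kind" is taken to mean the number of bytes per character (1, 2 or 4), and the sketch assumes wchar_t is wide enough for every character, so it ignores surrogate pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not CPython code): copy at most `size` characters
   from `data`, which holds `len` characters of `kind` bytes each, into `out`.
   Returns the number of wchar_t written. A trailing L'\0' is written only
   if there is room for it, so the result may not be null-terminated. */
size_t
copy_as_wchar(const void *data, int kind, size_t len,
              wchar_t *out, size_t size)
{
    assert(kind == 1 || kind == 2 || kind == 4);

    /* Truncate: never write more than `size` characters. */
    size_t n = len < size ? len : size;

    if ((size_t)kind == sizeof(wchar_t)) {
        /* Same width: the storage already has the wchar_t layout,
           so a plain memory copy is enough. */
        memcpy(out, data, n * sizeof(wchar_t));
    }
    else {
        /* Different width: widen (or narrow) character by character.
           Assumption: every character fits in a wchar_t. */
        for (size_t i = 0; i < n; i++) {
            uint32_t ch;
            if (kind == 1)
                ch = ((const uint8_t *)data)[i];
            else if (kind == 2)
                ch = ((const uint16_t *)data)[i];
            else
                ch = ((const uint32_t *)data)[i];
            out[i] = (wchar_t)ch;
        }
    }

    if (n < size)
        out[n] = L'\0';
    return n;
}
```

The caller has to compare the return value with the buffer size to know whether the output was truncated and whether it is null-terminated.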

I hate PyUnicode_AsWideChar(), but we must keep it for backward compatibility :-)

It would be possible to write an optimized PyUnicode_AsWideCharString() which computes the length, allocates memory and writes wide characters, but I don't want to have 3 functions converting a Python string to a wide character string. There are already PyUnicode_AsUnicodeAndSize() and unicode_aswidechar() (+ unicode_aswidechar_len()).
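
For reference, the "compute the length, allocate, write" approach could look roughly like the following standalone sketch. It is only an illustration, not the patch and not CPython code: wide_len() and as_wide_string() are hypothetical names, the input is assumed to be an array of UCS-4 code points, and memory comes from malloc() rather than the Python allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: number of wchar_t units needed to store `len`
   UCS-4 code points. With a 16-bit wchar_t, each non-BMP code point
   needs a surrogate pair, i.e. one extra unit. */
size_t
wide_len(const uint32_t *src, size_t len)
{
    size_t n = len;
    if (sizeof(wchar_t) == 2) {
        for (size_t i = 0; i < len; i++) {
            if (src[i] > 0xFFFF)
                n++;
        }
    }
    return n;
}

/* Hypothetical helper: compute the length, allocate, then write.
   Returns a null-terminated string that the caller must free(),
   or NULL on allocation failure. If `out_len` is not NULL, it receives
   the number of wchar_t units, excluding the terminator. */
wchar_t *
as_wide_string(const uint32_t *src, size_t len, size_t *out_len)
{
    size_t n = wide_len(src, len);

    /* Guard against overflow of (n + 1) * sizeof(wchar_t). */
    if (n > SIZE_MAX / sizeof(wchar_t) - 1)
        return NULL;

    wchar_t *buf = malloc((n + 1) * sizeof(wchar_t));
    if (buf == NULL)
        return NULL;

    wchar_t *p = buf;
    for (size_t i = 0; i < len; i++) {
        uint32_t ch = src[i];
        if (sizeof(wchar_t) == 2 && ch > 0xFFFF) {
            /* Encode as a UTF-16 surrogate pair. */
            ch -= 0x10000;
            *p++ = (wchar_t)(0xD800 + (ch >> 10));
            *p++ = (wchar_t)(0xDC00 + (ch & 0x3FF));
        }
        else {
            *p++ = (wchar_t)ch;
        }
    }
    *p = L'\0';

    /* The write pass must agree with the length pass. */
    assert((size_t)(p - buf) == n);
    if (out_len != NULL)
        *out_len = n;
    return buf;
}
```

The length pass and the write pass have to stay in sync, which is part of why adding a third conversion function like this is unattractive.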

History
Date	User	Action	Args
2014-09-01 21:32:14	vstinner	set	recipients: + vstinner, loewis
2014-09-01 21:32:14	vstinner	set	messageid: <1409607134.14.0.210351686888.issue22323@psf.upfronthosting.co.za>
2014-09-01 21:32:14	vstinner	link	issue22323 messages
2014-09-01 21:32:13	vstinner	create