Title: Rewrite PyUnicode_AsWideChar() and PyUnicode_AsWideCharString(): don't cache the result
Components: Versions: Python 3.5
Created on 2014-09-01 21:32 by vstinner, last changed 2022-04-11 14:58 by admin.

unicode_aswidechar.patch vstinner, 2014-09-02 07:52
Author: STINNER Victor (vstinner) Date: 2014-09-01 21:32
I would like to deprecate PyUnicode_AsUnicode(); see issue #22271 for the rationale (hint: memory footprint). The first step is to rewrite PyUnicode_AsWideChar() and PyUnicode_AsWideCharString() so that they no longer call PyUnicode_AsUnicode().

Attached patch implements this.

The code is based on PyUnicode_AsUnicode(), but it's trickier: PyUnicode_AsWideChar() can truncate the string, whereas PyUnicode_AsUnicode() does not copy characters if kind == sizeof(wchar_t); in that case PyASCIIObject.wstr "just" points to the data.
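
For illustration only, here is a minimal sketch of the truncating copy described above. It is not the attached patch and does not use the CPython API: sketch_copy_to_wchar() and its parameters are hypothetical names, the input is a plain buffer whose elements are `kind` bytes wide, and it assumes every code point fits in a single wchar_t (so no surrogate pairs are produced).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper, not part of CPython.
   Copy at most `size` characters from `data` (elements are `kind` bytes
   wide: 1, 2 or 4) into `dst`. The result is null-terminated only if
   there is room left. Returns the number of wchar_t written, or -1 if
   `kind` is not supported.
   Assumption: each code point fits in one wchar_t. */
static ptrdiff_t
sketch_copy_to_wchar(const void *data, int kind, size_t len,
                     wchar_t *dst, size_t size)
{
    assert(data != NULL && dst != NULL);

    /* Truncate silently when the destination is too small. */
    size_t n = len < size ? len : size;
    size_t i;

    switch (kind) {
    case 1: {
        const uint8_t *src = data;
        for (i = 0; i < n; i++)
            dst[i] = (wchar_t)src[i];
        break;
    }
    case 2: {
        const uint16_t *src = data;
        for (i = 0; i < n; i++)
            dst[i] = (wchar_t)src[i];
        break;
    }
    case 4: {
        const uint32_t *src = data;
        for (i = 0; i < n; i++)
            dst[i] = (wchar_t)src[i];
        break;
    }
    default:
        return -1;
    }

    /* Write the terminator only when the copy was not truncated
       to exactly fill the buffer. */
    if (n < size)
        dst[n] = L'\0';
    return (ptrdiff_t)n;
}
```

Note that the sketch always copies, even when kind == sizeof(wchar_t); nothing is cached on the source object.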

I hate PyUnicode_AsWideChar(), but we must keep it for backward compatibility :-)

It would be possible to write an optimized PyUnicode_AsWideCharString() which computes the length, allocates memory and writes wide characters, but I don't want to have 3 functions converting a Python string to a wide character string. There are already PyUnicode_AsUnicodeAndSize() and unicode_aswidechar() (+ unicode_aswidechar_len()).
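
For illustration only, the "compute the length, allocate, write" approach could be sketched as below. This is not the attached patch and does not use the CPython API: sketch_wchar_len() and sketch_to_wchar_string() are hypothetical names, the input is assumed to be an array of valid code points, and overflow checking on the allocation size is omitted for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: number of wchar_t units needed for `len` code
   points, excluding the trailing null. With a 16-bit wchar_t, each
   non-BMP code point needs a surrogate pair. */
static size_t
sketch_wchar_len(const uint32_t *cp, size_t len)
{
    size_t n = len;
    if (sizeof(wchar_t) == 2) {
        for (size_t i = 0; i < len; i++) {
            if (cp[i] > 0xFFFF)
                n++;
        }
    }
    return n;
}

/* Hypothetical helper: allocate a null-terminated wchar_t string and
   fill it. The caller releases the result with free(). Returns NULL on
   allocation failure. If `out_len` is not NULL, it receives the length
   in wchar_t units, excluding the trailing null. */
static wchar_t *
sketch_to_wchar_string(const uint32_t *cp, size_t len, size_t *out_len)
{
    assert(cp != NULL || len == 0);

    /* Step 1: compute the length. */
    size_t n = sketch_wchar_len(cp, len);

    /* Step 2: allocate (overflow check omitted in this sketch). */
    wchar_t *buf = malloc((n + 1) * sizeof(wchar_t));
    if (buf == NULL)
        return NULL;

    /* Step 3: write wide characters. */
    wchar_t *p = buf;
    for (size_t i = 0; i < len; i++) {
        uint32_t c = cp[i];
        if (sizeof(wchar_t) == 2 && c > 0xFFFF) {
            c -= 0x10000;
            *p++ = (wchar_t)(0xD800 + (c >> 10));    /* high surrogate */
            *p++ = (wchar_t)(0xDC00 + (c & 0x3FF));  /* low surrogate */
        }
        else {
            *p++ = (wchar_t)c;
        }
    }
    *p = L'\0';

    if (out_len != NULL)
        *out_len = n;
    return buf;
}
```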
Author: Antoine Pitrou (pitrou) Date: 2014-09-02 02:37
> Attached patch implements this.

There is no patch.
Author: STINNER Victor (vstinner) Date: 2014-09-02 07:52
> There is no patch.

You're right. Here it is.
Author: Antoine Pitrou (pitrou) Date: 2014-09-06 20:17
Hmm... sorry for the delay, there's no review link. Perhaps the patch is not against the latest default?
Author: Serhiy Storchaka (serhiy.storchaka) Date: 2018-07-23 20:12
Oh, I have reimplemented this in issue30863.
