Author vstinner
Recipients georg.brandl, indygreg, methane, petr.viktorin, serhiy.storchaka, vstinner
Date 2021-08-30.15:24:40
Message-id <1630337080.58.0.861233648777.issue45025@roundup.psfhosted.org>
Content
> The macro PyUnicode_KIND is part of the documented public C API.

IMO it was a mistake to expose it as part of the public C API. This is an implementation detail which should not be exposed. The C API should not expose *directly* how characters are stored in memory, but provide an abstract way to read and write Unicode characters.

The PEP 393 implementation broke the old C API in many ways because it exposed too many implementation details. Sadly, the new C API is... not better :-(

If CPython is modified tomorrow to use UTF-8 internally (as PyPy does), the C API will likely be broken *again* in many new (and funny) ways.
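
To illustrate the point about abstraction, here is a minimal sketch in plain C. It is a toy model and not CPython code: `toy_str`, `toy_layout`, `toy_length()`, `toy_read_char()`, `count_non_ascii()` and `toy_same_result_demo()` are all hypothetical names invented for this example. The two layouts stand in for fixed-width storage and UTF-8 storage, and the sketch assumes well-formed UTF-8 and in-range indexes. In the real C API, the comparable accessor functions are `PyUnicode_GetLength()` and `PyUnicode_ReadChar()`, whereas `PyUnicode_KIND()` corresponds to reading the private `layout` field directly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy string type, NOT CPython's PyUnicodeObject. */
typedef enum { TOY_FIXED32, TOY_UTF8 } toy_layout;

typedef struct {
    toy_layout layout;  /* private: callers are not supposed to read this */
    size_t length;      /* number of code points */
    const void *data;   /* private: uint32_t[] or UTF-8 bytes */
} toy_str;

/* Decode the code point starting at p and store its byte width.
   Assumption: the input is well-formed UTF-8. */
static uint32_t utf8_decode(const unsigned char *p, size_t *width)
{
    if (p[0] < 0x80) {
        *width = 1;
        return p[0];
    }
    if (p[0] < 0xE0) {
        *width = 2;
        return ((uint32_t)(p[0] & 0x1F) << 6) | (uint32_t)(p[1] & 0x3F);
    }
    if (p[0] < 0xF0) {
        *width = 3;
        return ((uint32_t)(p[0] & 0x0F) << 12)
             | ((uint32_t)(p[1] & 0x3F) << 6)
             | (uint32_t)(p[2] & 0x3F);
    }
    *width = 4;
    return ((uint32_t)(p[0] & 0x07) << 18)
         | ((uint32_t)(p[1] & 0x3F) << 12)
         | ((uint32_t)(p[2] & 0x3F) << 6)
         | (uint32_t)(p[3] & 0x3F);
}

/* Abstract accessors: the only functions that know about the layout. */
size_t toy_length(const toy_str *s)
{
    return s->length;
}

uint32_t toy_read_char(const toy_str *s, size_t index)
{
    if (s->layout == TOY_FIXED32) {
        return ((const uint32_t *)s->data)[index];
    }
    /* UTF-8 layout: walk from the start (O(n), kept simple on purpose). */
    const unsigned char *p = s->data;
    size_t width;
    uint32_t ch = utf8_decode(p, &width);
    while (index-- > 0) {
        p += width;
        ch = utf8_decode(p, &width);
    }
    return ch;
}

/* Caller code: uses only the accessors, so it does not depend on
   how the characters are stored. */
size_t count_non_ascii(const toy_str *s)
{
    size_t n = 0;
    for (size_t i = 0; i < toy_length(s); i++) {
        if (toy_read_char(s, i) > 0x7F) {
            n++;
        }
    }
    return n;
}

/* Same text (U+0068, U+00E9, U+20AC) in both layouts.
   Returns 1 if the caller code gives the same answer for both. */
int toy_same_result_demo(void)
{
    static const uint32_t wide[] = {0x68, 0xE9, 0x20AC};
    static const unsigned char utf8[] = {0x68, 0xC3, 0xA9, 0xE2, 0x82, 0xAC, 0x00};
    toy_str a = {TOY_FIXED32, 3, wide};
    toy_str b = {TOY_UTF8, 3, utf8};
    return count_non_ascii(&a) == count_non_ascii(&b);
}
```

In this sketch, `count_non_ascii()` keeps working when the storage changes from `TOY_FIXED32` to `TOY_UTF8`. Code that switched on `layout` and indexed `data` itself would have to be rewritten, which is the kind of breakage described above.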

11 years after PEP 393 (Python 3.3), we are only starting to fix the old C API :-( The work will be completed in 2 or 3 Python releases (Python 3.12 or 3.13):

* https://www.python.org/dev/peps/pep-0623/
* https://www.python.org/dev/peps/pep-0624/

The C API for Unicode strings is causing a lot of issues in PyPy, which uses UTF-8 internally. C extensions can fail to build on PyPy if they use functions (macros) like PyUnicode_KIND().
History
Date  User  Action  Args
2021-08-30 15:24:40  vstinner  set  recipients: + vstinner, georg.brandl, petr.viktorin, methane, serhiy.storchaka, indygreg
2021-08-30 15:24:40  vstinner  set  messageid: <1630337080.58.0.861233648777.issue45025@roundup.psfhosted.org>
2021-08-30 15:24:40  vstinner  link  issue45025 messages
2021-08-30 15:24:40  vstinner  create