
classification
Title: Deprecate immortal interned strings: PyUnicode_InternImmortal()
Type:
Stage: resolved
Components: Interpreter Core
Versions: Python 3.10
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: corona10, methane, serhiy.storchaka, shihai1991, vstinner
Priority: normal
Keywords: patch

Created on 2020-09-02 13:59 by vstinner, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
PR 22486: merged (linked by vstinner, 2020-10-01 22:58)
Messages (9)
msg376237 - Author: STINNER Victor (vstinner) (Python committer) Date: 2020-09-02 13:59
Python has the concept of "immortal" interned strings: PyUnicode_InternImmortal().

The feature was first introduced for the Python 2 "str" (bytes) type in bpo-576101 (commit 45ec02aed14685c353e55841b5acbc0dadee76f8), which added the PyString_InternImmortal() function.

commit 45ec02aed14685c353e55841b5acbc0dadee76f8
Author: Guido van Rossum <guido@python.org>
Date:   Mon Aug 19 21:43:18 2002 +0000

    SF patch 576101, by Oren Tirosh: alternative implementation of
    interning.  I modified Oren's patch significantly, but the basic idea
    and most of the implementation is unchanged.  Interned strings created
    with PyString_InternInPlace() are now mortal, and you must keep a
    reference to the resulting string around; use the new function
    PyString_InternImmortal() to create immortal interned strings.

Later, the feature was added to the PyUnicodeObject type, with a new PyUnicode_InternImmortal() function:

commit 1680713e524016d93a94114c4a874ad71a090b95
Author: Walter Dörwald <walter@livinglogic.de>
Date:   Fri May 25 13:52:07 2007 +0000

    Add interning of unicode strings by copying the functionality from
    stringobject.c.
    
    Intern "True" and "False" in bool_repr() again as it was in the
    8bit string era.

Since Python 3.10, (mortal) interned strings are cleared at Python exit in Py_Finalize(). This avoids leaking memory when Python is embedded in an application (bpo-1635741).

commit 666ecfb0957a2fa0df5e2bd03804195de74bdfbf
Author: Victor Stinner <vstinner@python.org>
Date:   Thu Jul 2 01:19:57 2020 +0200

    bpo-1635741: Release Unicode interned strings at exit (GH-21269)
    
    * PyUnicode_InternInPlace() now ensures that interned strings are
      ready.
    * Add _PyUnicode_ClearInterned().
    * Py_Finalize() now releases Unicode interned strings:
      call _PyUnicode_ClearInterned().

--

PyUnicode_InternImmortal() is not used in the Python standard library. I propose to start deprecating the function and remove it in Python 3.12 (PEP 387 requires a deprecation for 2 releases). In Python 3.10, calling the function will emit a DeprecationWarning at runtime.

Note: PyString_InternImmortal() (for bytes strings) was removed in Python 3.0.
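
For readers less familiar with interning, here is a minimal Python-level sketch of the mortal variant, using sys.intern() (which in CPython goes through the mortal interning path). The helper make_name() and the sample string are made up for illustration; nothing here calls PyUnicode_InternImmortal() itself.

```python
import sys


def make_name(parts):
    # Build the string at runtime so that it is not a compile-time constant,
    # which CPython may already have interned on its own.
    return "_".join(parts)


a = make_name(["bpo", "41692", "demo"])
b = make_name(["bpo", "41692", "demo"])

# sys.intern() returns the canonical object stored in the interned table.
interned_a = sys.intern(a)
interned_b = sys.intern(b)

# Both calls return the very same object, so later comparisons can be
# done by identity instead of comparing character by character.
same_object = interned_a is interned_b
print(same_object)
```

A mortal interned string stays in the table only while something else still references it, which is why the caller has to keep the returned reference around.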
msg376271 - Author: Inada Naoki (methane) (Python committer) Date: 2020-09-03 01:19
+1
msg376302 - Author: Dong-hee Na (corona10) (Python committer) Date: 2020-09-03 14:19
+1
msg376428 - Author: Hai Shi (shihai1991) (Python triager) Date: 2020-09-05 10:40
+1
msg377784 - Author: STINNER Victor (vstinner) (Python committer) Date: 2020-10-01 23:00
I proposed PR 22486 to deprecate the function.
msg377808 - Author: STINNER Victor (vstinner) (Python committer) Date: 2020-10-02 12:49
New changeset 583ee5a5b1971a18ebeb877948ce6264da0cc8aa by Victor Stinner in branch 'master':
bpo-41692: Deprecate PyUnicode_InternImmortal() (GH-22486)
https://github.com/python/cpython/commit/583ee5a5b1971a18ebeb877948ce6264da0cc8aa
msg377809 - Author: STINNER Victor (vstinner) (Python committer) Date: 2020-10-02 12:49
The function is now deprecated. Thanks for the review, INADA-san. I am closing the issue. Let's meet in Python 3.12 to remove it ;-)
msg393350 - Author: Inada Naoki (methane) (Python committer) Date: 2021-05-10 04:20
For the record, I noticed that PyUnicode_InternImmortal() is part of the stable ABI.

We may need to keep the function to avoid dynamic link errors.
But we can still change its implementation to just raise an exception.
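
As a rough Python-level analogy only (the real function is C, and this is not the code from PR 22486), the two phases discussed here could look like the sketch below: first keep the behaviour but emit a DeprecationWarning, then keep the symbol but make it raise. The function names, the dict-based registry, and the choice of RuntimeError are all assumptions made for illustration.

```python
import warnings


def intern_immortal_deprecated(registry, value):
    """Hypothetical phase 1: still works, but warns on every call."""
    warnings.warn(
        "intern_immortal_deprecated() is deprecated; use mortal interning",
        DeprecationWarning,
        stacklevel=2,
    )
    # Keep the old behaviour: return the canonical stored object.
    return registry.setdefault(value, value)


def intern_immortal_removed(registry, value):
    """Hypothetical phase 2: the name still exists, so old callers still
    resolve it, but any actual call fails loudly."""
    raise RuntimeError("intern_immortal_removed() is no longer supported")


registry = {}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = intern_immortal_deprecated(registry, "spam")

print(result, [w.category.__name__ for w in caught])
```

Keeping the name in place is the point of the second phase: code built against the older interface can still find the symbol, and only code that really calls it sees an error.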
msg408011 - Author: STINNER Victor (vstinner) (Python committer) Date: 2021-12-08 11:28
I cannot find the "PyUnicode_InternImmortal" pattern in the source code of the PyPI top 5000 projects (December 1, 2021).

I only found a false positive in frozendict-2.1.1:

frozendict/src/3_10/cpython_src/Include/unicodeobject.h: // PyUnicode_InternImmortal() is deprecated since Python 3.10
frozendict/src/3_10/cpython_src/Include/unicodeobject.h: Py_DEPRECATED(3.10) PyAPI_FUNC(void) PyUnicode_InternImmortal(PyObject **);
frozendict/src/3_6/cpython_src/Include/unicodeobject.h: PyAPI_FUNC(void) PyUnicode_InternImmortal(PyObject **);
frozendict/src/3_7/cpython_src/Include/unicodeobject.h: PyAPI_FUNC(void) PyUnicode_InternImmortal(PyObject **);
frozendict/src/3_8/cpython_src/Include/unicodeobject.h: PyAPI_FUNC(void) PyUnicode_InternImmortal(PyObject **);
frozendict/src/3_9/cpython_src/Include/unicodeobject.h: PyAPI_FUNC(void) PyUnicode_InternImmortal(PyObject **);

These are copies of the Python unicodeobject.h header files, but the PyUnicode_InternImmortal() function is not called by frozendict.

I used my download_pypi_top.py and search_pypi_top.py tools, which can be found at:
https://github.com/vstinner/misc/tree/main/cpython
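
To illustrate how header copies like the frozendict ones can be told apart from real calls, here is a minimal sketch of such a scan. It is not the search_pypi_top.py tool mentioned above; the function classify_hits(), the file-extension list, and the "PyAPI_FUNC or comment means declaration" heuristic are assumptions made for this example.

```python
import os


def classify_hits(root, name="PyUnicode_InternImmortal"):
    """Walk `root` and split lines mentioning `name` into likely calls
    and declarations/comments (hypothetical heuristic)."""
    likely_calls, other = [], []
    for dirpath, _dirs, files in os.walk(root):
        for filename in files:
            # Only look at C-like sources and headers.
            if not filename.endswith((".c", ".h", ".cpp", ".pyx")):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as f:
                for lineno, line in enumerate(f, 1):
                    if name not in line:
                        continue
                    stripped = line.strip()
                    # Prototypes use PyAPI_FUNC(...); comment lines start
                    # with //, /* or *. Anything else is treated as a call.
                    is_decl_or_comment = (
                        "PyAPI_FUNC" in stripped
                        or stripped.startswith(("//", "/*", "*"))
                    )
                    target = other if is_decl_or_comment else likely_calls
                    target.append((path, lineno, stripped))
    return likely_calls, other
```

Running classify_hits() on an unpacked source tree returns two lists of (path, line number, text) tuples; a project that only vendors unicodeobject.h would be expected to show up in the second list alone.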
History
Date User Action Args
2022-04-11 14:59:35 admin set github: 85858
2021-12-08 11:28:35 vstinner set messages: + msg408011
2021-05-10 04:20:28 methane set messages: + msg393350
2020-10-02 12:49:46 vstinner set status: open -> closed; resolution: fixed; messages: + msg377809; stage: patch review -> resolved
2020-10-02 12:49:08 vstinner set messages: + msg377808
2020-10-01 23:00:06 vstinner set messages: + msg377784
2020-10-01 22:58:56 vstinner set keywords: + patch; stage: patch review; pull_requests: + pull_request21503
2020-09-05 10:40:01 shihai1991 set messages: + msg376428
2020-09-03 14:19:26 corona10 set messages: + msg376302
2020-09-03 14:19:14 corona10 set nosy: + corona10
2020-09-03 01:19:36 methane set messages: + msg376271
2020-09-02 16:02:12 shihai1991 set nosy: + shihai1991
2020-09-02 13:59:25 vstinner create