classification
Title: Deprecate immortal interned strings: PyUnicode_InternImmortal()
Type:
Stage:
Components: Interpreter Core
Versions: Python 3.10

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: corona10, inada.naoki, serhiy.storchaka, shihai1991, vstinner
Priority: normal
Keywords:

Created on 2020-09-02 13:59 by vstinner, last changed 2020-09-05 10:40 by shihai1991.

Messages (4)
msg376237 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-09-02 13:59
Python has the concept of "immortal" interned strings: PyUnicode_InternImmortal().

The feature was first introduced in the Python 2 "str" (bytes) type by bpo-576101 (commit 45ec02aed14685c353e55841b5acbc0dadee76f8), which added the new PyString_InternImmortal() function.

commit 45ec02aed14685c353e55841b5acbc0dadee76f8
Author: Guido van Rossum <guido@python.org>
Date:   Mon Aug 19 21:43:18 2002 +0000

    SF patch 576101, by Oren Tirosh: alternative implementation of
    interning.  I modified Oren's patch significantly, but the basic idea
    and most of the implementation is unchanged.  Interned strings created
    with PyString_InternInPlace() are now mortal, and you must keep a
    reference to the resulting string around; use the new function
    PyString_InternImmortal() to create immortal interned strings.
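
As a rough illustration of the mortal interning described in that commit message, here is a minimal Python-level sketch using sys.intern(). It only shows that interning maps equal strings to one shared object that the caller must keep a reference to; it does not exercise the C functions themselves. The helper build() and the sample text are made up for the example.

```python
import sys


def build(parts):
    # Hypothetical helper: join at runtime so the string is created
    # dynamically rather than taken from a compile-time constant.
    return "".join(parts)


a = build(["immortal", "_", "demo"])
b = build(["immortal", "_", "demo"])

# sys.intern() returns the canonical object for this text. Keep the
# returned reference: a mortal interned string is only guaranteed to
# stay in the interned table while something still refers to it.
ia = sys.intern(a)
ib = sys.intern(b)

# Both calls yield the same object, so identity comparison succeeds.
same_object = ia is ib
```

An immortal interned string, by contrast, would stay alive even after the caller dropped every reference to it.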

Later, the feature was added to the PyUnicodeObject type with a new PyUnicode_InternImmortal() function:

commit 1680713e524016d93a94114c4a874ad71a090b95
Author: Walter Dörwald <walter@livinglogic.de>
Date:   Fri May 25 13:52:07 2007 +0000

    Add interning of unicode strings by copying the functionality from
    stringobject.c.
    
    Intern "True" and "False" in bool_repr() again as it was in the
    8bit string era.

Since Python 3.10, (mortal) interned strings are cleared at Python exit in Py_Finalize(). This avoids leaking memory when Python is embedded in an application (bpo-1635741).

commit 666ecfb0957a2fa0df5e2bd03804195de74bdfbf
Author: Victor Stinner <vstinner@python.org>
Date:   Thu Jul 2 01:19:57 2020 +0200

    bpo-1635741: Release Unicode interned strings at exit (GH-21269)
    
    * PyUnicode_InternInPlace() now ensures that interned strings are
      ready.
    * Add _PyUnicode_ClearInterned().
    * Py_Finalize() now releases Unicode interned strings:
      call _PyUnicode_ClearInterned().

--

PyUnicode_InternImmortal() is not used in the Python standard library. I propose deprecating the function and removing it in Python 3.12 (PEP 387 requires a deprecation period of two releases). In Python 3.10, calling the function will emit a DeprecationWarning at runtime.
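
To make the proposed runtime behavior concrete, here is a Python-level sketch of the pattern. The real change would be in C, presumably through PyErr_WarnEx() with PyExc_DeprecationWarning. The names intern_immortal() and call_and_collect() and the warning text are hypothetical stand-ins, not part of any proposed patch.

```python
import sys
import warnings


def intern_immortal(s):
    """Hypothetical stand-in for the deprecated C function."""
    # Warn on every call; stacklevel=2 attributes the warning to the
    # caller rather than to this wrapper.
    warnings.warn(
        "intern_immortal() is deprecated; use sys.intern() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    # Assumption: fall back to ordinary (mortal) interning.
    return sys.intern(s)


def call_and_collect(s):
    # Record warnings instead of printing them, so that the caller can
    # inspect what was emitted.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = intern_immortal(s)
    return result, [w.category for w in caught]
```

Code that still calls the function keeps working during the deprecation period; it just sees the warning when DeprecationWarning is not filtered out.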

Note: PyString_InternImmortal() (for bytes strings) was removed in Python 3.0.
msg376271 - Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2020-09-03 01:19
+1
msg376302 - Author: Dong-hee Na (corona10) * (Python committer) Date: 2020-09-03 14:19
+1
msg376428 - Author: hai shi (shihai1991) * Date: 2020-09-05 10:40
+1
History
Date                 User         Action  Args
2020-09-05 10:40:01  shihai1991   set     messages: + msg376428
2020-09-03 14:19:26  corona10     set     messages: + msg376302
2020-09-03 14:19:14  corona10     set     nosy: + corona10
2020-09-03 01:19:36  inada.naoki  set     messages: + msg376271
2020-09-02 16:02:12  shihai1991   set     nosy: + shihai1991
2020-09-02 13:59:25  vstinner     create