Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C API] Remove _Py_NewReference() and _Py_ForgetReference() from the public C API #85161

Closed
vstinner opened this issue Jun 15, 2020 · 11 comments
Closed
Labels
3.10 only security fixes topic-C-API

Comments

@vstinner
Copy link
Member

BPO 40989
Nosy @scoder, @vstinner
PRs
  • bpo-40989: PyObject_INIT() becomes an alias to PyObject_Init() #20901
  • bpo-40989: Move _Py_NewReference() to the internal C API #20904
  • bpo-40989: Make _PyTraceMalloc_NewReference() internal #20915
  • bpo-40989: Fix compiler warning in winreg.c #21722
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-09-21.21:57:43.476>
    created_at = <Date 2020-06-15.21:05:35.371>
    labels = ['expert-C-API', '3.10']
    title = '[C API] Remove _Py_NewReference() and _Py_ForgetReference() from the public C API'
    updated_at = <Date 2021-09-21.21:57:43.475>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2021-09-21.21:57:43.475>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-09-21.21:57:43.476>
    closer = 'vstinner'
    components = ['C API']
    creation = <Date 2020-06-15.21:05:35.371>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 40989
    keywords = ['patch']
    message_count = 11.0
    messages = ['371597', '371600', '371604', '371654', '371656', '371671', '371711', '371715', '374883', '380890', '402377']
    nosy_count = 2.0
    nosy_names = ['scoder', 'vstinner']
    pr_nums = ['20901', '20904', '20915', '21722']
    priority = 'normal'
    resolution = 'rejected'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue40989'
    versions = ['Python 3.10']

    @vstinner
    Copy link
    Member Author

    The _Py_NewReference() and _Py_ForgetReference() functions are tightly coupled to CPython internals.

    _Py_NewReference() is only exposed because it used by the PyObject_INIT() macro which is the fast inlined flavor of PyObject_Init(). If we make PyObject_INIT() as alias to PyObject_Init(), as already done for the limited C API, _Py_NewReference() can be removed from the public C API (moved to the internal C API).

    _Py_ForgetReference() function is only defined if Py_TRACE_REFS macro is defined. I propose to also removed it from the public C API (move it to the internal C API).

    In the CPython code base, _Py_NewReference() is used:

    • to implement the free list optimization
    • in _PyBytes_Resize() and unicode_resize() (resize_compact() to be precise)
    • by PyObject_CallFinalizerFromDealloc() to resurrect an object

    These are corner cases which can be avoided in third party C extension modules.

    @vstinner vstinner added 3.10 only security fixes topic-C-API labels Jun 15, 2020
    @vstinner
    Copy link
    Member Author

    Work on this issue started in Python 3.9 with the following change of bpo-39542:

    commit f58bd7c
    Author: Victor Stinner <vstinner@python.org>
    Date: Wed Feb 5 13:12:19 2020 +0100

    bpo-39542: Make PyObject_INIT() opaque in limited C API (GH-18363)
    
    In the limited C API, PyObject_INIT() and PyObject_INIT_VAR() are now
    defined as aliases to PyObject_Init() and PyObject_InitVar() to make
    their implementation opaque. It avoids to leak implementation details
    in the limited C API.
    
    Exclude the following functions from the limited C API, move them
    from object.h to cpython/object.h:
    
    * _Py_NewReference()
    * _Py_ForgetReference()
    * _PyTraceMalloc_NewReference()
    * _Py_GetRefTotal()
    

    @vstinner
    Copy link
    Member Author

    New changeset 04fc4f2 by Victor Stinner in branch 'master':
    bpo-40989: PyObject_INIT() becomes an alias to PyObject_Init() (GH-20901)
    04fc4f2

    @vstinner
    Copy link
    Member Author

    See "Removal of _Py_ForgetReference from public header in 3.9 issue" discussion on capi-sig:
    https://mail.python.org/archives/list/capi-sig@python.org/thread/4EOCN7P4HI56GQ74FY3TMIKDBIPGKL2G/

    @vstinner
    Copy link
    Member Author

    In the CPython code base, _Py_NewReference() is used:

    • to implement the free list optimization

    I found a similar code pattern outside CPython code base.

    Like Nuitka generic macros for free lists:
    https://github.com/Nuitka/Nuitka/blob/8e2357ee8e9a93d835e98d5a88ca0181cc34dabc/nuitka/build/include/nuitka/freelists.h#L22

    Another example in the old unmaintained gmpy project (last release in 2013):
    https://github.com/LogicalKnight/gmpy/blob/5d758abc9fcb44b08dd000145afe48160bd802f1/src/gmpy2_cache.c#L93-L111

    These functions can opt-in for the internal C API to continue to get access to _Py_NewReference().

    Or maybe free lists should be implemented differently? (is it possible?)

    • in _PyBytes_Resize() and unicode_resize() (resize_compact() to be precise)

    Third-party code can use the public PyUnicode_Resize() function and the private _PyBytes_Resize() function. Later, if needed, we can consider to expose _PyBytes_Resize() in the limited C API.

    • by PyObject_CallFinalizerFromDealloc() to resurrect an object

    I found a few projects which basically reimplement the PEP-442 logic in their tp_dealloc function. Example with pyuv resurrect_object() function:
    https://github.com/saghul/pyuv/blob/9d226dd61162a998745681eb87ef34e1a7d8586a/src/handle.c#L9

    For these, using PEP-442 tp_finalizer here would avoid relying on private functions like _Py_NewReference(), make the code simpler and more reliable.

    @vstinner
    Copy link
    Member Author

    New changeset fcc60e4 by Victor Stinner in branch 'master':
    bpo-40989: Make _PyTraceMalloc_NewReference() internal (GH-20915)
    fcc60e4

    @scoder
    Copy link
    Contributor

    scoder commented Jun 17, 2020

    To add to the list, Cython also calls _Py_NewReference() in its async generator code, which uses a free-list. That code was mostly copied from the CPython-internal implementation.

    Other freelist implementations in Cython call PyObject_INIT() instead, so I guess calling _Py_NewReference() directly here isn't actually required and could be avoided.

    @scoder
    Copy link
    Contributor

    scoder commented Jun 17, 2020

    I looked into the freelists a bit more and (as always) it's not quite that simple. Calling _Py_NewReference() allows keeping the "ob_type" pointer in place, whereas PyObject_INIT() requires a blank object in order to set the type and properly ref-count heap types.

    I think what that means is that there are at least two different cases for freelists: those that only keep the bare object memory alive (and can also support subtypes of the same size), and those that keep the (cleared) object alive, including its type. For the first, PyObject_INIT() works. For the latter, _Py_NewReference() seems the right helper function.

    The advantage of keeping the object as it is is the much simpler freelist code in tp_dealloc(). All it needs to do is 1) clear the ref-counted object fields (if any) and 2) put it in the freelist. No other C-API interaction is needed. If we only want to keep the object memory, then we need C-API support in both tp_new() and tp_dealloc().

    If _Py_NewReference() is removed/hidden, then it would be nice if there was a replacement for the use case of initialising an object from a freelist that already knows its type.

    @vstinner
    Copy link
    Member Author

    vstinner commented Aug 5, 2020

    New changeset 15edaec by Victor Stinner in branch 'master':
    bpo-40989: Fix compiler warning in winreg.c (GH-21722)
    15edaec

    @vstinner
    Copy link
    Member Author

    See also "Removal of _Py_ForgetReference from public header in 3.9 issue" thread on python-dev list:
    https://mail.python.org/archives/list/python-dev@python.org/thread/CQYVR7TZZITURBZKVWIEOBGF343GI52W/#CQYVR7TZZITURBZKVWIEOBGF343GI52W

    And "Re: [Python-Dev] Removal of _Py_ForgetReference from public header in 3.9 issue" thread on capi-sig list:
    https://mail.python.org/archives/list/capi-sig@python.org/thread/4EOCN7P4HI56GQ74FY3TMIKDBIPGKL2G/

    @vstinner
    Copy link
    Member Author

    I still consider that these functions must be moved to the internal C API. But I failed to find time to design a *public* C API for that, or help projects using these functions to avoid it. I prefer to close the isseu for now.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes topic-C-API
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants