classification
Title: [C API] Convert PyTuple_GET_ITEM() macro to a static inline function
Type: Stage: patch review
Components: C API Versions: Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: scoder, vstinner
Priority: normal Keywords: patch

Created on 2020-06-22 14:50 by vstinner, last changed 2020-06-23 09:42 by vstinner.

Pull Requests
URL Status Linked Edit
PR 21056 merged vstinner, 2020-06-22 15:04
PR 21057 merged vstinner, 2020-06-22 15:12
PR 21058 merged vstinner, 2020-06-22 15:38
PR 21059 open vstinner, 2020-06-22 16:10
Messages (12)
msg372091 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-22 14:50
PyTuple_GET_ITEM() can be abused to access directly the PyTupleObject.ob_item member:

   PyObject **items = &PyTuple_GET_ITEM(0);

Giving a direct access to an array or PyObject* (PyObject**) is causing issues with other Python implementations, like PyPy, which don't use PyObject internally.

I propose to convert the PyTuple_GET_ITEM() and PyList_GET_ITEM() macros to static inline functions to disallow that.
msg372093 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-22 15:27
New changeset 384621c42f9102e31ba2c47feba144af09c989e5 by Victor Stinner in branch 'master':
bpo-41078: Rename pycore_tupleobject.h to pycore_tuple.h (GH-21056)
https://github.com/python/cpython/commit/384621c42f9102e31ba2c47feba144af09c989e5
msg372095 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-22 15:39
New changeset c45dbe93b7094fe014442c198727ee38b25541c4 by Victor Stinner in branch 'master':
bpo-41078: Add pycore_list.h internal header file (GH-21057)
https://github.com/python/cpython/commit/c45dbe93b7094fe014442c198727ee38b25541c4
msg372097 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-22 16:02
New changeset c96d00e88ead8f99bb6aa1357928ac4545d9287c by Victor Stinner in branch 'master':
bpo-41078: Fix bltinmodule.c with Py_TRACE_REFS (GH-21058)
https://github.com/python/cpython/commit/c96d00e88ead8f99bb6aa1357928ac4545d9287c
msg372100 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-22 16:31
The PR 21059 breaks Cython. I reported the issue to Cython upstream:
https://github.com/cython/cython/issues/3701

numpy is also affected: code generated by Cython (numpy/random/_mt19937.c) contains the issue.
msg372121 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-22 21:30
PySequence_Fast_ITEMS() must also be deprecated since it also gives a direct access to PyTupleObject.ob_item and PyListObject.ob_item members (PyObject**). PySequence_Fast_GET_ITEM() should be used instead.
msg372123 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-22 21:38
PySequence_Fast_ITEMS() is used in numpy at 7 lines:

numpy/core/src/common/ufunc_override.c:118:        *out_objs = PySequence_Fast_ITEMS(seq);

---
PyUFuncOverride_GetOutObjects(PyObject *kwds, PyObject **out_kwd_obj, PyObject ***out_objs)
---
=> PyObject** is part of the numpy C API (it's even a pointer to a PyObject** array)!


numpy/core/src/multiarray/arrayfunction_override.c:80:    PyObject **items = PySequence_Fast_ITEMS(relevant_args);
---
static int
get_implementing_args_and_methods(PyObject *relevant_args,
                                  PyObject **implementing_args,
                                  PyObject **methods)
{
    int num_implementing_args = 0;
    Py_ssize_t i;
    int j;

    PyObject **items = PySequence_Fast_ITEMS(relevant_args);
    Py_ssize_t length = PySequence_Fast_GET_SIZE(relevant_args);

    for (i = 0; i < length; i++) {
        int new_class = 1;
        PyObject *argument = items[i];
----

=> here PySequence_Fast_GET_ITEM() can be used instead.


numpy/core/src/multiarray/arrayfunction_override.c:167:    PyObject **items = PySequence_Fast_ITEMS(types);
numpy/core/src/multiarray/common.c:434:    objects = PySequence_Fast_ITEMS(seq);
numpy/core/src/multiarray/iterators.c:1368:    ret = multiiter_new_impl(n, PySequence_Fast_ITEMS(fast_seq));
numpy/core/src/multiarray/methods.c:1015:    in_objs = PySequence_Fast_ITEMS(fast);
numpy/core/src/umath/override.c:38:    arg_objs = PySequence_Fast_ITEMS(args);
msg372126 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-06-22 21:55
> Giving a direct access to an array or PyObject* (PyObject**) is causing issues with other Python implementations, like PyPy, which don't use PyObject internally.

I'm wondering – if the intention is to help other implementations, then why do we need to break extensions in *CPython* for that? In CPython, this usage seems perfectly valid with the current data structure. If that ever changes, we can still make this change as well, but do you really see the internal layout of tuples change at some point? (Lists are a different case, I'd say, but tuples?)
msg372128 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-22 21:59
The long rationale is explained in PEP 620:
https://www.python.org/dev/peps/pep-0620/

PyTupleObject and PyListObject structures must become opaque. Also, later, these structures may change to be more efficient.
msg372129 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-06-22 22:20
> Also, later, these structures may change to be more efficient.

Tuples? Really?

Ok, quoting PEP-620:
> Members of … PyTupleObject structures have not changed since the "Initial revision" commit (1990)

I honestly think the reason for that might simply be that there's not so much to improve for tuples.

> nothing prevents a C extension to get or set directly
> PyTupleObject.ob_item[0] (the first item of a tuple).

I certainly understand that that is a problem, especially if "PyObject" may change in the future. And this is essentially what the current "PyTuple_GET_ITEM()" macro does in a binary module.

Should we also turn "_PyTuple_ITEMS()" into a public inline function then to make up for the loss of the "&PyTuple_GET_ITEM(t, 0)" pattern? It would make it explicit what is intended. I think that should be our main goal in the CPython side.
msg372132 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-22 22:43
I suggest you to discuss the PEP 620 on python-dev, rather than on the bug tracker. I just posted it there today.

This issue tracks the implemetation of one part of the PEP 620.
msg372149 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-23 09:42
See "(PEP 620) C API for efficient loop iterating on a sequence of PyObject** or other C types" thread on python-dev:
https://mail.python.org/archives/list/python-dev@python.org/thread/632CV42376SWVYAZTHG4ROOV2HRHOVZ7/
History
Date User Action Args
2020-06-23 09:42:54vstinnersetmessages: + msg372149
2020-06-22 22:43:11vstinnersetmessages: + msg372132
2020-06-22 22:20:18scodersetmessages: + msg372129
2020-06-22 21:59:58vstinnersetmessages: + msg372128
2020-06-22 21:55:57scodersetnosy: + scoder
messages: + msg372126
2020-06-22 21:38:18vstinnersetmessages: + msg372123
2020-06-22 21:30:46vstinnersetmessages: + msg372121
2020-06-22 16:31:52vstinnersetmessages: + msg372100
2020-06-22 16:10:53vstinnersetpull_requests: + pull_request20230
2020-06-22 16:02:56vstinnersetmessages: + msg372097
2020-06-22 15:39:40vstinnersetmessages: + msg372095
2020-06-22 15:38:41vstinnersetpull_requests: + pull_request20229
2020-06-22 15:27:44vstinnersetmessages: + msg372093
2020-06-22 15:12:04vstinnersetpull_requests: + pull_request20228
2020-06-22 15:04:17vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request20227
2020-06-22 14:50:13vstinnercreate