classification
Title: [C API] Add Py_Is(x, y) and Py_IsNone(x) functions
Type: Stage: resolved
Components: C API Versions: Python 3.10
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Carl.Friedrich.Bolz, Mark.Shannon, erlendaasland, mark.dickinson, pablogsal, rhettinger, shihai1991, tim.peters, vstinner
Priority: normal Keywords: patch

Created on 2021-04-06 18:38 by vstinner, last changed 2021-05-18 20:44 by vstinner. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 25227 merged vstinner, 2021-04-06 19:02
Messages (18)
msg390358 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-06 18:38
I propose to add at least Py_Is(x, y) function to the Python C API which would be simply implemented as:

    static inline int Py_Is(PyObject *x, PyObject *y) { return (x == y); }

Right now, there is no benefit for CPython. The idea is to prepare the CPython code base for future optimization, like tagged pointers. It may help PyPy. It can also help to prepare C extensions to migrate to HPy, since HPy requires to use HPy_Is(ctx, x, y): "x == y" (where x and y types is HPy) fails with a compiler error with HPy.

--

CPython uses reference counting and singletons like None and False. The "x is y" operator in Python is implemented with the IS_OP opcode implemented in Python/ceval.c as:

        case TARGET(IS_OP): {
            PyObject *right = POP();
            PyObject *left = TOP();
            int res = (left == right)^oparg;
            PyObject *b = res ? Py_True : Py_False;
            Py_INCREF(b);
            SET_TOP(b);
            Py_DECREF(left);
            Py_DECREF(right);
            PREDICT(POP_JUMP_IF_FALSE);
            PREDICT(POP_JUMP_IF_TRUE);
            DISPATCH();
        }

In short, "x is y" in Python simply compares directly PyObject* pointer values in C: "x == y" where x and y are PyObject* pointers.

PyPy doesn't use reference counting and so implements "x is y" differently: id(x) == id(y) if I understood correctly. In PyPy, id(obj) is more complex than simply getting the object memory address. For example, id(1) is always 17 in PyPy (on Linux/x86-64 at least).

At the C API level, using "x == y" to check if y object "is" x object doesn't work well with PyPy object models. Moreover, "x == Py_None" doesn't work if None is stored as a tagged pointer in CPython.

--

Maybe we can also add functions to check if an object is a singleton:

* Py_IsNone(x): x == Py_None
* Py_IsTrue(x): x == Py_True
* Py_IsFalse(x): x == Py_False

See also bpo-39511 "[subinterpreters] Per-interpreter singletons (None, True, False, etc.)" which may need to replace Py_None singleton with Py_GetNone(). IMO it makes more sense to call Py_IS_NONE(x) function to express that code is run, rather than writing "x == Py_None" as if Py_None is and will always be a variable directly accesssible.

Py_IS_TRUE() name sounds like PyObject_IsTrue(). Can it be an issue? Can someone come with a better name?
msg390360 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-06 18:56
Python has more singletons. In C:

* Py_NotImplemented: NotImplemented
* Py_Ellipsis: Ellipsis

But I don't think that "x == NotImplemented" is common enough to justify a new function.
msg390364 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-06 19:07
Py_IS_TYPE(obj, type) was added to Python 3.9 by bpo-39573:
https://docs.python.org/dev/c-api/structures.html#c.Py_IS_TYPE

commit d905df766c367c350f20c46ccd99d4da19ed57d8
Author: Dong-hee Na <donghee.na92@gmail.com>
Date:   Fri Feb 14 02:37:17 2020 +0900

    bpo-39573: Add Py_IS_TYPE() function (GH-18488)
    
    Co-Author: Neil Schemenauer <nas-github@arctrix.com>

It's currently implemented as:

static inline int _Py_IS_TYPE(const PyObject *ob, const PyTypeObject *type) {
    return Py_TYPE(ob) == type;
}
#define Py_IS_TYPE(ob, type) _Py_IS_TYPE(_PyObject_CAST_CONST(ob), type)
msg390379 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-04-06 21:48
> Right now, there is no benefit for CPython.

Please don't this until we have a clear demonstrable benefit.  As it stands now, this is all cost and no benefit.

Adding unnecessary abstraction layers just makes it more difficult for people to learn to be core devs.  Also, it will likely result in a lot of pointless code churn where each change carries a risk of a new bug being introduced.

The notion of pointer comparison is so fundamental to C code that it is counter productive to try to abstract it away.  It isn't much different than saying that every instance of "a + b" should be abstracted to "add(a, b)" and every "a[b]" should be abstracted to "index_lookup(a, b)" on the hope that maybe it might someday be helpful for PyPy or HPy even though they have never requested such a change.

From my own point of view, these need abstractions just make it harder to tell what code is actually doing.  Further, it will just lead to wasting everyone's time in code reviews where the reviewer insists on the applying the new inline function and there is confusion about whether two pointers are generic pointers or python object pointers, each with their own comparison technique.

Also, there is too much faith in functions marked as "inline" always being inlined. Compilers get to make their own choices and under some circumstances will not inline, especially for cross module calls.  This risks taking code that is currently obviously fast and occasionally, invisibility slowing it down massively — from a step that is 1 cycle at most and is sometimes zero cost to a step that involves an actual function call, possibly needing to save and restore registers.

Lastly, the API changes aren't just for you or the standard library.  In effect, you're telling the entire ecosystem of C extensions that they are doing it wrong.  Almost certainly, some will follow this path and some won't, further fracturing the ecosystem.
msg390408 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-04-07 09:24
For any sane design of tagged pointers, `x == y` (in C) will work fine.

`is` is not well defined except for a small set of values, so the docs for `Py_Is` would have to so vague as to be worthless, IMO.
msg390411 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-07 10:33
> For any sane design of tagged pointers, `x == y` (in C) will work fine.

I wrote a proof-of-concept for using tagged pointners in CPython:
https://github.com/vstinner/cpython/pull/6

"The PR is large because of the first changes which add Py_IS_NONE(), Py_IS_TRUE() and Py_IS_FALSE() macros."

For backward compatibility, I added a function to get a concrete Python object from a tagged pointer: the True as a tagged pointer becomes (PyObject *)&_Py_TrueStruct concrete object. Py_IS_TRUE() has to check for both values: tagged pointer or &_Py_TrueStruct (don't look at my exact implementation, it's wrong ;-)).

In a perfect world, there would be no backward compatibility and you would never have to create Python objects from a tagged pointer.

The other problem is that the stable ABI exposes "Py_True" as "&_Py_TrueStruct" and so C extensions built with the stable ABI uses "&_Py_TrueStruct", not the tagged pointer.

See also bpo-39511 which discuss solutions for these problems.
msg390412 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-07 10:35
> Also, there is too much faith in functions marked as "inline" always being inlined.

I proposed to declare it as a "static inline" function, but I'm fine with a macro as well. I mostly care about the API: call "Py_Is()", not really about the exact implementation. In practice, for now, Py_Is(x, y) will continue to be compiled as "x == y".
msg390435 - (view) Author: Carl Friedrich Bolz-Tereick (Carl.Friedrich.Bolz) * Date: 2021-04-07 14:01
Just chiming in to say that for PyPy this API would be extremely useful, because PyPy's "is" is not implementable with a pointer comparison on the C level (due to unboxing we need to compare integers, floats, etc by value). Right now, C extension code that compares pointers is subtly broken and cannot be fixed by us.
msg390456 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2021-04-07 17:59
> Just chiming in to say that for PyPy this API would be extremely useful

Thanks for that input. Given that there would be some value add, I withdraw my objection.

> I proposed to declare it as a "static inline" function,
> but I'm fine with a macro as well.

Let's use a macro then because inlining in guaranteed and we don't have to be on the lookout for failures to inline.
msg390469 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-07 19:32
PR 25227: I reimplemented Py_Is() as a macro and I added unit tests.

Other added functions simply call Py_Is(), example:

#define Py_IsNone(x) Py_Is(x, Py_None)
msg390526 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-08 11:37
Mark:
> `is` is not well defined except for a small set of values, so the docs for `Py_Is` would have to so vague as to be worthless, IMO.

I don't propose to change the "is" operator semantic: Py_Is(x, y) is C would behave *exaclty* as "x is y" in Python.

Check my implementation: the IS_OP bytecode uses directly Py_Is().
msg390613 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-04-09 12:48
I'd also prefer a Py_IsNotNone() because it's more explicit than !Py_IsNone(); the exclamation mark can be easily missed when reading/writing code, IMO.

Just my 2 cents.
msg390614 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-09 12:50
> I'd also prefer a Py_IsNotNone() because it's more explicit than !Py_IsNone()

I would prefer keep the C API small. I don't think that we need to duplicate all functions testing for something. We provide PyTuple_Check(obj) but we don't provide PyTuple_NotCheck(obj) for example.

IMO !Py_IsNone(obj) makes perfectly sense in Python.

Also, "x == Py_None" is more common than "x != Py_None".
msg390616 - (view) Author: Erlend E. Aasland (erlendaasland) * (Python triager) Date: 2021-04-09 12:55
> I would prefer keep the C API small.

Yes, I see the value of that as well.

I tried applying this API on an extension, and I found the code to be slightly less readable for the "is not" cases.
msg390628 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-09 13:59
> I tried applying this API on an extension, and I found the code to be slightly less readable for the "is not" cases.

FYI you can try upgrade_pythoncapi.py on your project using the following PR to update code to use Py_IsNone/Py_IsTrue/Py_IsFalse:
https://github.com/pythoncapi/pythoncapi_compat/pull/8

For example, use "upgrade_pythoncapi.py -o Py_Is directory/" or "upgrade_pythoncapi.py -o Py_Is file.c".
msg390750 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-10 22:17
New changeset 09bbebea163fe7303264cf4069c51d4d2f22fde4 by Victor Stinner in branch 'master':
bpo-43753: Add Py_Is() and Py_IsNone() functions (GH-25227)
https://github.com/python/cpython/commit/09bbebea163fe7303264cf4069c51d4d2f22fde4
msg390751 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-04-10 22:19
Carl:
> Just chiming in to say that for PyPy this API would be extremely useful, because PyPy's "is" is not implementable with a pointer comparison on the C level (due to unboxing we need to compare integers, floats, etc by value). Right now, C extension code that compares pointers is subtly broken and cannot be fixed by us.

Can you come with a sentence that I can put in the documentation to explain that Py_Is() is written for interoperability with other Python implementations?
msg393907 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-05-18 20:44
Well, nobody came up with a better definition, so let's go with:

"Test if the x object is the y object, the same as x is y in Python."

Thanks for the feedback and reviews! Py_Is(x, y), Py_IsNone(x), Py_IsTrue(x) and Py_IsFalse(x) are now part of Python 3.10 C API.
History
Date User Action Args
2021-05-18 20:44:33vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg393907

stage: patch review -> resolved
2021-04-10 22:19:22vstinnersetmessages: + msg390751
2021-04-10 22:17:46vstinnersetmessages: + msg390750
2021-04-09 13:59:56vstinnersetmessages: + msg390628
2021-04-09 12:55:45erlendaaslandsetmessages: + msg390616
2021-04-09 12:50:46vstinnersetmessages: + msg390614
2021-04-09 12:48:01erlendaaslandsetnosy: + erlendaasland
messages: + msg390613
2021-04-08 11:37:11vstinnersetmessages: + msg390526
2021-04-07 19:32:24vstinnersetmessages: + msg390469
2021-04-07 17:59:31rhettingersetmessages: + msg390456
2021-04-07 14:01:46Carl.Friedrich.Bolzsetnosy: + Carl.Friedrich.Bolz
messages: + msg390435
2021-04-07 10:35:39vstinnersetmessages: + msg390412
2021-04-07 10:33:31vstinnersetmessages: + msg390411
2021-04-07 09:24:56Mark.Shannonsetnosy: + Mark.Shannon
messages: + msg390408
2021-04-07 06:43:23mark.dickinsonsetnosy: + mark.dickinson
2021-04-07 05:00:30shihai1991setnosy: + shihai1991
2021-04-06 21:49:15rhettingersetnosy: + tim.peters, pablogsal
2021-04-06 21:48:49rhettingersetnosy: + rhettinger
messages: + msg390379
2021-04-06 19:07:08vstinnersetmessages: + msg390364
2021-04-06 19:02:28vstinnersettitle: [C API] Add Py_IS(x, y) and Py_IsNone(x) functions -> [C API] Add Py_Is(x, y) and Py_IsNone(x) functions
2021-04-06 19:02:24vstinnersettitle: [C API] Add Py_IS(x, y) macro -> [C API] Add Py_IS(x, y) and Py_IsNone(x) functions
2021-04-06 19:02:02vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request23964
2021-04-06 18:56:53vstinnersetmessages: + msg390360
2021-04-06 18:38:58vstinnercreate