classification
Title: Using typename.__setattr__ in extension type with Py_TPFLAGS_HEAPTYPE is broken (hackcheck too eager?)
Type: behavior Stage: resolved
Components: C API Versions: Python 3.10, Python 3.9, Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: David Caro, Matthias Braun, gvanrossum, miss-islington, petr.viktorin, scoder
Priority: normal Keywords: patch

Created on 2020-03-13 22:31 by Matthias Braun, last changed 2020-09-20 07:49 by scoder. This issue is now closed.

Files
File name Uploaded Description Edit
demomodule.zip Matthias Braun, 2020-03-13 22:31 Minimal extension module overriding tp_setattro
Pull Requests
URL Status Linked Edit
PR 21092 merged scoder, 2020-06-23 19:39
PR 21288 merged miss-islington, 2020-07-03 00:09
PR 21339 merged scoder, 2020-07-05 19:36
PR 21473 closed David Caro, 2020-07-14 16:29
Messages (20)
msg364123 - (view) Author: Matthias Braun (Matthias Braun) * Date: 2020-03-13 22:31
This is about an extension type created via `PyType_FromSpec` that overrides `tp_setattro` (minimal example attached). In this case cpython does not let me grab and use the `__setattr__` function "manually". Example:

```
>>> import demo
>>> mytype_setattr = demo.MyType.__setattr__
>>> i = demo.MyType()
>>> mytype_setattr(i, "foo", "bar")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't apply this __setattr__ to object object
```

I suspect this is related to the loop at the beginning of typeobject.c / hackcheck() that skips over types with Py_TPFLAGS_HEAPOBJECT. (Though removing the loop breaks things like the enum module).
msg371932 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-06-20 11:16
I ran into this, too. I agree that the "hackcheck" loop on heap types is probably wrong.

https://github.com/python/cpython/blob/04fc4f2a46b2fd083639deb872c3a3037fdb47d6/Objects/typeobject.c#L5947-L5977

It was written at a time (Py2.3?) when (practically) only Python implemented types were heap types, not extension types. I think what it tried to do was to find the builtin base type of a Python type, and check that no-one played tricks on the C slot functions of that C implemented type. With extension types implemented as heap types, having a different slot function in there is a perfectly valid thing.

I'll call in a couple of people since I'm also not sure how to fix the "hackcheck".

I'll also leave Py3.7 in the list of affected Python versions since we still have a short week before its final non-secfix-release. :)
msg371976 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-21 00:50
See also test_carloverre() in test_descr.py. I don't recall exactly what this was about, but that test is definitely relevant.
msg372020 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-06-21 20:55
This SO answer by Martijn Pieters explains the background:

https://stackoverflow.com/a/44854477

and links to the original python-dev discussion:

https://mail.python.org/pipermail/python-dev/2003-April/034535.html

The current implementation checks that the function being called is the one defined in the first non-heap type up the hierarchy. As long as heap types are only Python types, this is the first builtin/native/C-implemented (super)type. With extension types as heap types, however, it's no longer necessarily the type that defines the correct slot function.

IMHO, Greg Ewing gives the right direction here:

https://mail.python.org/pipermail/python-dev/2003-April/034569.html

> Providing some way for objects to prevent superclass
> methods from being called on them when they're not looking

So, I think we should do something like walking up the hierarchy to find the C function in it that is currently being called, and then check if it's the one we would expect or if a subclass defines a different one. The current check does not bother to search and just assumes that it knows which (super)type defines the right function to call.
msg372024 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-06-21 22:00
That sounds like the right thing to do. You should be able to recognize pure Python __setattr__ implementations by the presence of slot_tp_setattro in the tp_setattro slot. (Likewise for __delattr__.)

It's been really long since I looked at this -- do we still only support single inheritance at the C level? Because otherwise there would be another backdoor.

There's also tp_setattr (no 'o') but I think that's basically unused, and we ignored that in 2003 so we should be able to ignore it now. :-)
msg372179 - (view) Author: Petr Viktorin (petr.viktorin) * (Python committer) Date: 2020-06-23 15:34
> do we still only support single inheritance at the C level?

Not any more. Heap types may have multiple bases, and they can be created at the C level (since Python 3.2, PEP 384).

>  So, I think we should do something like walking up the hierarchy to find the C function in it that is currently being called, and then check if it's the one we would expect or if a subclass defines a different one. The current check does not bother to search and just assumes that it knows which (super)type defines the right function to call.

Should we be bold and skip the check for heap types?
That would mean that when/if `str` is converted to a heap type, you could do `object.__delattr__(str, "lower")`. That would break your Python, but only at the Python level (similar to things like `import builtins; del builtins.__build_class__`).
Heap types are not shared between interpreters, so that reason is gone. But Guido's [2003 mail] suggests there are other reasons to prevent changing built-in types. What are they?

[2003 mail] https://mail.python.org/pipermail/python-dev/2003-April/034535.html
msg372202 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-06-23 19:40
I chose to go through the MRO, which takes multiple inheritance into account.
msg372882 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-07-02 20:31
I think we missed the train for fixing 3.7 (which was questionable anyway), but I added a test, so it's ready for merging into 3.8+ (minus merge conflicts for the test in 3.8, probably).
msg372895 - (view) Author: miss-islington (miss-islington) Date: 2020-07-03 00:09
New changeset 148f32913573c29250dfb3f0d079eb8847633621 by scoder in branch 'master':
bpo-39960: Allow heap types in the "Carlo Verre" hack check that override "tp_setattro()" (GH-21092)
https://github.com/python/cpython/commit/148f32913573c29250dfb3f0d079eb8847633621
msg372896 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-07-03 00:13
Stefan can you do the 3.8 backport?
msg372898 - (view) Author: miss-islington (miss-islington) Date: 2020-07-03 00:28
New changeset bfec674254ea22ef9c0c335587eb65683f4145c7 by Miss Islington (bot) in branch '3.9':
bpo-39960: Allow heap types in the "Carlo Verre" hack check that override "tp_setattro()" (GH-21092)
https://github.com/python/cpython/commit/bfec674254ea22ef9c0c335587eb65683f4145c7
msg373043 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-07-05 20:12
New changeset 8912c182455de83e27d5c120639ec91b18247913 by scoder in branch '3.8':
bpo-39960: Allow heap types in the "Carlo Verre" hack check that override "tp_setattro()" (GH-21092) (GH-21339)
https://github.com/python/cpython/commit/8912c182455de83e27d5c120639ec91b18247913
msg373044 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-07-05 20:17
Fixed in 3.8+. Closing.
Thanks for the feedback.
msg373052 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-07-05 22:38
You're welcome. Hope this helps Cpython users.
msg373647 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-07-14 17:05
Reopening because there appears to be a problem with the fix (see PR 21473).
msg373648 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-07-14 17:06
@Stefan Do you agree that the fix proposed in PR 21473 needs to be made?
msg373875 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-07-18 06:39
The problem in the test added in PR 21473 is that there is an intermediate base type "object" in the hierarchy:

    class A(type):
        def __setattr__(cls, key, value):
            type.__setattr__(cls, key, value)

    class B:
        pass

    class C(B, A):
        pass

>>> [c for c in C.__mro__]
[<class '__main__.C'>, <class '__main__.B'>, <class '__main__.A'>, <class 'type'>, <class 'object'>]

>>> [c.__setattr__ for c in C.__mro__]
[<function A.__setattr__ at 0x7f3ca4308378>, <slot wrapper '__setattr__' of 'object' objects>, <function A.__setattr__ at 0x7f3ca4308378>, <slot wrapper '__setattr__' of 'type' objects>, <slot wrapper '__setattr__' of 'object' objects>]

I think the change to the algorithm there is too broad, it disables much of what the check was written for (or against).

Given Guido's second (negative) test case, I think it would also not be correct to add "object.__setattr__" to the list of allowed (intermediate) slot methods:

    class A(type):
        def __setattr__(cls, key, value):
            object.__setattr__(cls, key, value)   # this should fail!

    class B:
        pass

    class C(B, A):
        pass

It's difficult to turn this into an algorithm. Is the MRO really the place to look? For "A", we're only really interested in the C-level bases, aren't we?
msg373877 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-07-18 06:41
> intermediate base type "object" in the hierarchy

Sorry, I meant an intermediate base type "B", which inherits its setattr from "object".
msg373882 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-07-18 07:47
I pushed PR 21528 with a new proposal. See issue 41295.
msg377210 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2020-09-20 07:49
Closing again since GH-21528 has been merged in issue 41295.
History
Date User Action Args
2020-09-20 07:49:10scodersetstatus: open -> closed

messages: + msg377210
2020-07-18 07:47:22scodersetmessages: + msg373882
2020-07-18 06:41:07scodersetmessages: + msg373877
2020-07-18 06:39:42scodersetmessages: + msg373875
2020-07-14 17:06:26gvanrossumsetmessages: + msg373648
2020-07-14 17:05:16gvanrossumsetstatus: closed -> open

messages: + msg373647
2020-07-14 16:29:23David Carosetnosy: + David Caro

pull_requests: + pull_request20618
2020-07-05 22:38:21gvanrossumsetmessages: + msg373052
2020-07-05 20:17:22scodersetstatus: open -> closed
resolution: fixed
messages: + msg373044

stage: patch review -> resolved
2020-07-05 20:12:12scodersetmessages: + msg373043
2020-07-05 19:36:44scodersetpull_requests: + pull_request20487
2020-07-03 00:28:50miss-islingtonsetmessages: + msg372898
2020-07-03 00:13:01gvanrossumsetmessages: + msg372896
2020-07-03 00:09:58miss-islingtonsetpull_requests: + pull_request20435
2020-07-03 00:09:32miss-islingtonsetnosy: + miss-islington
messages: + msg372895
2020-07-02 20:31:26scodersetmessages: + msg372882
2020-07-02 06:45:14scodersetversions: - Python 3.7
2020-06-23 19:40:54scodersetstage: patch review
2020-06-23 19:40:37scodersetmessages: + msg372202
stage: patch review -> (no value)
2020-06-23 19:39:10scodersetkeywords: + patch
stage: patch review
pull_requests: + pull_request20258
2020-06-23 15:34:31petr.viktorinsetmessages: + msg372179
2020-06-21 22:00:47gvanrossumsetmessages: + msg372024
2020-06-21 20:55:07scodersetmessages: + msg372020
2020-06-21 00:50:33gvanrossumsetmessages: + msg371976
2020-06-20 11:16:56scodersetversions: + Python 3.8, Python 3.9, Python 3.10
nosy: + petr.viktorin, gvanrossum, scoder

messages: + msg371932

type: behavior
2020-03-13 22:31:04Matthias Brauncreate