classification
Title: Modifying class __dict__ inside __set_name__
Type: behavior Stage: resolved
Components: Interpreter Core Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Martin.Teichmann, Wombatz, abarry, ncoghlan, ned.deily, python-dev, rhettinger, serhiy.storchaka
Priority: Keywords: patch

Created on 2016-11-25 02:48 by Wombatz, last changed 2017-03-31 16:36 by dstufft. This issue is now closed.

Files
File name Uploaded Description Edit
reproduce.py Wombatz, 2016-11-25 02:48 Example of how to reproduce the bug
set_name-dict-copy.patch serhiy.storchaka, 2016-11-25 07:04 review
set_name-filtered-copy.patch serhiy.storchaka, 2016-11-29 07:05 review
Pull Requests
URL Status Linked Edit
PR 552 closed dstufft, 2017-03-31 16:36
Messages (15)
msg281674 - (view) Author: (Wombatz) Date: 2016-11-25 02:48
This behavior occurs at least in python-3.6.0b3 and python-3.6.0b4.

Modifying the class __dict__ inside the __set_name__ method of a descriptor that is used inside that class can lead to a bug that possibly prevents other descriptors from being initialized (the call to their __set_name__ is prevented).

That happens because internally the cpython interpreter iterates the class __dict__ when calling all the __set_name__ methods. Changing the class __dict__ while iterating it leads to undefined behavior. This is the line in the source code https://github.com/python/cpython/blob/master/Objects/typeobject.c#L7010

See the attached file for an example where the bug can be observed. It defines a "desc" class that implements the __set_name__ method where it prints the name and modifies the class __dict__ of the containing class. Later a class is defined that has multiple instances of the "desc" class as class attributes. Depending on the number of attributes not all of them are initialized.

When you see the underlying C-Code the reason for this behavior is obvious but in python itself the problem might not be apparent.

To fix this bug the class __dict__ could be cashed and then the __set_name__ methods could be called while iterating over that copy.
msg281675 - (view) Author: Anilyka Barry (abarry) * (Python triager) Date: 2016-11-25 02:54
Ow!

I can confirm the bug still happens on latest trunk. Nice finding!
msg281681 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-25 07:04
Proposed patch iterates a copy of __dict__.
msg281939 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2016-11-29 03:10
Nick, as the original shepherd of PEP 487 (Issue27366), can you review Serhiy's proposed patch for 3.6.0rc1?  Thanks!
msg281942 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-11-29 03:45
Rather than taking a snapshot of the whole class namespace, I think it will make more sense to make a new empty dictionary of descriptors to notify, and then split the current loop into two loops:

- the first loop adds descriptors with __set_name__ attributes to the notification dict
- the second loop calls __set_name__ on all the descriptors in the notification dict

Serhiy's patch effectively already does that via the initial PyDict_Copy, but that approach also redundantly copies items that don't define __set_name__ into the snapshot and then filters them out on the second pass.

Reviewing the patch also made me realise we're currently missing a specification of the expected behaviour in https://docs.python.org/dev/reference/datamodel.html#creating-the-class-object.

I suggest adding the following paragraph between the one about setting __class__ and the one about calling class descriptors:

"""
When using the default metaclass :cls:`type`, or any metaclass that ultimately calls ``type.__new__``, the following additional customisation steps are invoked:

* first, ``type.__new__`` collects all of the descriptors in the class namespace that define a ``__set_name__`` method
* second, all of these ``__set_name__`` methods are called with the class being defined and the assigned name of that particular descriptor
* finally, the ``__init_subclass__`` hook is called on the immediate parent of the new class in its method resolution order
"""
msg281945 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-29 04:47
I considered this option, but if save only descriptors with the __set_name__ attribute, this would need either double looking up the __set_name__ attribute or packing it in a tuple with a value. All this too cumbersome. I chose copying all dictionary as the simplest way.
msg281949 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-11-29 06:06
I was thinking that the notification dict could consist of mappings from attribute names to already bound __set_name__ methods, so the double lookup would be avoided that way (but hadn't actually sent the review where I suggested that).

That is, on the second loop, the actual method call would be:

    tmp = PyObject_CallFunctionObjArgs(value, type, key, NULL);

Have I missed something that would prevent that from working?
msg281950 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-29 06:46
> Have I missed something that would prevent that from working?

Error message contains value->ob_type->tp_name.
msg281951 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-11-29 06:56
Ah, I'd missed that.

In that case, I think my suggestion to change the name of the local variable is still valid, while the specific approach just needs a comment saying it's handled as a full namespace snapshot so the descriptor type name can be reported in any error messages.
msg281952 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-11-29 07:05
Here is much more complex patch that implements Nick's suggestion.
msg281954 - (view) Author: Martin Teichmann (Martin.Teichmann) * Date: 2016-11-29 07:30
I personally prefer the first patch, which iterates over a copy of __dict__. Making a copy of a dict is actually a pretty fast operation, I would even expect that it is faster than the proposed alternative, creating tuples.

Even if the second approach should be faster, the added code complexity is not worth the effort, as this is a code path where speed shouldn't matter much.
msg281955 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-11-29 07:31
I'd be OK with starting with the simpler patch, and only looking at the more complex one if the benchmark suite shows a significant slowdown (I'd be surprised if it did, as it should mainly impact start-up, and class dicts generally aren't that big)
msg281956 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-11-29 07:32
(Plus, as Martin noted, the tuple creation overhead could easily make the more complex patch slower in many cases)
msg281960 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-11-29 07:56
New changeset 6b8f7d1e2ba4 by Serhiy Storchaka in branch '3.6':
Issue #28797: Modifying the class __dict__ inside the __set_name__ method of
https://hg.python.org/cpython/rev/6b8f7d1e2ba4

New changeset 18ed518d2eef by Serhiy Storchaka in branch 'default':
Issue #28797: Modifying the class __dict__ inside the __set_name__ method of
https://hg.python.org/cpython/rev/18ed518d2eef
msg282321 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2016-12-04 07:13
Regarding the docs suggestion above, I need to make some other docs changes for issue 23722, so I'll roll that update in with those.

Given that, I think we can mark this issue as resolved.
History
Date User Action Args
2017-03-31 16:36:20dstufftsetpull_requests: + pull_request941
2016-12-05 19:52:11ned.deilysetpriority: release blocker ->
2016-12-04 07:13:13ncoghlansetstatus: open -> closed
resolution: fixed
messages: + msg282321

stage: patch review -> resolved
2016-11-29 07:56:26python-devsetnosy: + python-dev
messages: + msg281960
2016-11-29 07:32:43ncoghlansetmessages: + msg281956
2016-11-29 07:31:45ncoghlansetmessages: + msg281955
2016-11-29 07:30:25Martin.Teichmannsetmessages: + msg281954
2016-11-29 07:05:37serhiy.storchakasetfiles: + set_name-filtered-copy.patch

messages: + msg281952
2016-11-29 06:56:30ncoghlansetmessages: + msg281951
2016-11-29 06:46:22serhiy.storchakasetmessages: + msg281950
2016-11-29 06:06:34ncoghlansetmessages: + msg281949
2016-11-29 04:47:31serhiy.storchakasetmessages: + msg281945
2016-11-29 03:45:41ncoghlansetmessages: + msg281942
2016-11-29 03:10:54ned.deilysetnosy: + ncoghlan
messages: + msg281939
2016-11-25 07:04:49serhiy.storchakasetfiles: + set_name-dict-copy.patch
versions: + Python 3.7
messages: + msg281681

keywords: + patch
stage: needs patch -> patch review
2016-11-25 02:58:48abarrysetnosy: + Martin.Teichmann
2016-11-25 02:54:12abarrysetpriority: normal -> release blocker

nosy: + rhettinger, abarry, serhiy.storchaka, ned.deily
messages: + msg281675

stage: needs patch
2016-11-25 02:48:29Wombatzcreate