Title: Type cache is not correctly invalidated on a class defining mro()
Type: crash Stage:
Components: Interpreter Core Versions: Python 3.7, Python 3.6, Python 3.5, Python 2.7
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: ebarry, mdk, serhiy.storchaka, sjpalt, vstinner
Priority: high Keywords: patch

Created on 2016-12-04 14:35 by sjpalt, last changed 2017-02-18 10:43 by mdk.

File name Uploaded Description Edit sjpalt, 2016-12-04 14:35 Steps to reproduce mdk, 2016-12-04 21:34
issue28866.diff mdk, 2016-12-05 21:59 review
Messages (12)
msg282344 - (view) Author: Steve Palt (sjpalt) Date: 2016-12-04 14:35
I ran into a strange bug while experimenting with metaclasses that implement a custom mro() method. It only seems to occur in interactive mode, either returning completely unrelated values or causing a segfault. Python 2 appears unaffected. I have been able to reproduce this with the following code:

# $ python -i
# >>> proxy.x
# 52011448
# >>> proxy.x
# 6160
# ...

class Foo:

class Meta(type):
    def mro(cls):
        return (cls, Foo, object)

    def __setattr__(cls, name, value):
        setattr(Foo, name, value)

proxy = Meta('FooProxy', (), {})

proxy.x = 300
proxy.x  # don't omit
proxy.x = 0
msg282345 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-12-04 15:00
Python 2 is affected if make Foo a new style class.
msg282349 - (view) Author: Emanuel Barry (ebarry) * Date: 2016-12-04 15:40
Additionally, trying to access an attribute before assigning it will result in an AttributeError on all subsequent accesses (even if set).

I didn't manage to get a segfault, however.
msg282353 - (view) Author: Julien Palard (mdk) * Date: 2016-12-04 17:59
FWIW, in _PyType_Lookup I see the "300" PyLong being cached, and later, just before the segfault, I see its address getting out of the cache (a cache hit) but it's no longer a PyLong, it's scrambled, so we're getting a real pointer with scrambled values on the line:

    attribute = _PyType_Lookup(type, name);

segfaulting two lines later in:

    descrgetfunc local_get = Py_TYPE(attribute)->tp_descr_get

When I write scrambled value I mean: 

    (gdb) p *attribute                                                                                                         
    $21 = {_ob_next = 0xdbdbdbdbdbdbdbdb, _ob_prev = 0xdbdbdbdbdbdbdbdb, ob_refcnt = -2604246222170760229, ob_type = 0xdbdbdbd\

To debug interactive session in GDB I used:

    r -i < stdin

with "proxy.x" in the stdin file.
msg282368 - (view) Author: Julien Palard (mdk) * Date: 2016-12-04 21:34
I'm able to reproduce the segmentation fault outside the interactive mode, see attached file.
msg282375 - (view) Author: Julien Palard (mdk) * Date: 2016-12-04 22:51
Problem looks mainly due to the __setattr__ done on a different type than the getattr, and the cache of PyType_Lookup:

 - type_setattro will be called with Foo, "x", and the value to set
 - type_getattro will be called with FooProxy, "x"

The timeline (with my is:

 - type_setattro is called on Foo for attr "x" with the instance of Bar
 - type_getattro is called on FooProxy for attr "x" (the `proxy.x` line)
     During this type_getattro, the instance of Bar is cached in PyType_Lookup for (FooProxy, "x")
 - type_setattro is called on Foo for attr "x" with the value 0
     During this call, insertdict will legitimately call DECREF on the Bar instance, which is deleted
 - type_getattro is called on FooProxy with attr "x" for the print()
     During this call, PyType_Lookup will cache hit for (FooProxy, x) with the old reference to the instance of Bar
msg282430 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-05 16:28
If you call sys._clear_type_cache() after the setattr(), doesn't crash anymore.
msg282468 - (view) Author: Julien Palard (mdk) * Date: 2016-12-05 21:59
I found a way to fix it, but as I'm just discovering cpython internals, I'm not sure it's the right way, review are very welcome.

I fixed it this way because:

The bytecode generated for "proxy.x = 0" (a STORE_ATTR) will call a
PyObject_SetAttr with FooProxy which will soon execute the __setattr__ of Meta, itself calling a PyObject_SetAttr with Foo.

The actual attribute setting is done from the PyObject_SetAttr with Foo (calling in turn type_setattro, and so on), but it's already too late to invalidate the FooProxy type: we no longer have a reference on it and can't guess that FooProxy delegated __setattr__ to Foo.

So the only place when we can invalidate correctly is in the first call of PyObject_SetAttr when we know on which type the attribute will be set, and it make sense: It does not matter what a __setattr__ does and how it does it: invalidation should happen for this attribute on the current type, so invalidating here seems logic.

I did not yet took the time to measure performance loss induced with this patch.

With the patch:

./python -i
>>> proxy.x
>>> proxy.x
msg282485 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-05 23:13
The issue is that the Meta class has a reference to the class Foo in its mro() method, but Foo is not aware of Meta. So when Foo is modified, the Foo cache is invalidated, but not Meta cache.

issue28866.diff always invalidates the cache, so it works. But it is suboptimal, IMO it defeats the whole purpose of a cache.

I never defined a mro() method. I'm not sure that it's possible to have a type cache and a mro() method?


* Disable completely the cache on classes defining mro()
* Modify "Meta" (the C code implementing the type, not the Python code) to track the version tag of each class referenced by mro(). Problem: mro() is dynamic!?
* Somehow, notify Foo that Meta has a reference to its cache, so Foo is able to invalidate Meta cache
msg282486 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-12-05 23:30
> * Somehow, notify Foo that Meta has a reference to its cache, so Foo is able to invalidate Meta cache

This was implemented when the type cache was implemented in Python 2.6, but only for explicit subclasses. PyType_Modified() iterates on tp_subclasses.

PyType_Ready() updates tp_subclasses: it stores a weak reference to sub classes in each base class.

I understand that, if we want to implement this feature, type_mro_modified() should be modified to add a backward reference in each base class of the MRO. type_mro_modified() is called when a type is defined, but also when type.__bases__ is explicitly modified.

It would require to add a new slot to types, and so increase a little bit the memory usage, and slow down the creation of a type, and type.__bases__ (slow down: probably negligible, O(1) since the existing tp_subclasses uses a dict).
msg282598 - (view) Author: Julien Palard (mdk) * Date: 2016-12-07 07:06
> issue28866.diff always invalidates the cache, so it works. But it is suboptimal, IMO it defeats the whole purpose of a cache.

Not sure about defeating the purpose of the cache as I only invalidate in setattr, getattr are still cache hitting. I tried:

$ python3 -m performance run --python=./python --rigorous -b all -o master.json
$ git checkout issue28866; make -j 8
$ python3 -m performance run --python=./python --rigorous -b all -o issue28866.json
$ python3 -m performance compare master.json issue28866.json

And I don't see much differences, probably only noise as I get some faster tests:

### call_method_unknown ###
Median +- Std dev: 56.8 ms +- 3.3 ms -> 52.9 ms +- 1.6 ms: 1.08x faster
Significant (t=12.92)

### pybench.IfThenElse ###
Median +- Std dev: 247 ns +- 3 ns -> 224 ns +- 16 ns: 1.11x faster
Significant (t=15.15)

and some slower:
### pybench.StringPredicates ###
Median +- Std dev: 1.89 us +- 0.05 us -> 2.18 us +- 0.21 us: 1.15x slower
Significant (t=-15.20)

### unpack_sequence ###
Median +- Std dev: 207 ns +- 4 ns -> 231 ns +- 27 ns: 1.12x slower
Significant (t=-11.63)

I'm not yet accustomed to this perf suite, so I may miss something obvious. I'll ran it again on master to measure the noise and should probably fine-tune my system for stability.

I'll also try a benchmark without the cache for comparison.
msg288072 - (view) Author: Julien Palard (mdk) * Date: 2017-02-18 10:43

Tried again, this  time getting some stats with MCACHE_STATS 1, to check if my patch is defeating the cache:

Without my patch:

$ time ./python performance/benchmarks/ --worker -l1 -w0 -n1 --filename chaos.ppm --width=512 --height=512 --iterations 50000
chaos: Median: 2.51 sec
-- Method cache hits        = 16581735 (99%)
-- Method cache true misses = 4092 (0%)
-- Method cache collisions  = 28542 (0%)
-- Method cache size        = 96 KB

With my patch:

$ time ./python performance/benchmarks/ --worker -l1 -w0 -n1 --filename chaos.ppm --width=512 --height=512 --iterations 50000
chaos: Median: 2.53 sec
-- Method cache hits        = 16582260 (99%)
-- Method cache true misses = 4096 (0%)
-- Method cache collisions  = 28012 (0%)
-- Method cache size        = 96 KB
Date User Action Args
2017-02-18 10:43:24mdksetmessages: + msg288072
2016-12-07 07:06:00mdksetmessages: + msg282598
2016-12-05 23:30:05vstinnersetmessages: + msg282486
2016-12-05 23:13:53vstinnersetmessages: + msg282485
title: Unexpected behavior resulting from mro() and __setattr__ in interactive mode -> Type cache is not correctly invalidated on a class defining mro()
2016-12-05 21:59:04mdksetfiles: + issue28866.diff
keywords: + patch
messages: + msg282468
2016-12-05 16:28:04vstinnersetmessages: + msg282430
2016-12-05 13:36:36vstinnersetnosy: + vstinner
2016-12-04 22:51:04mdksetmessages: + msg282375
2016-12-04 21:34:53mdksetfiles: +

messages: + msg282368
2016-12-04 17:59:12mdksetnosy: + mdk
messages: + msg282353
2016-12-04 15:45:33serhiy.storchakasetversions: + Python 2.7
2016-12-04 15:40:23ebarrysetnosy: + ebarry
messages: + msg282349
2016-12-04 15:00:05serhiy.storchakasetpriority: normal -> high
versions: + Python 3.6, Python 3.7, - Python 3.4
nosy: + serhiy.storchaka

messages: + msg282345
2016-12-04 14:35:45sjpaltcreate