classification
Title: multiple inheritance + C extension = possibly unexpected __base__
Type: behavior Stage:
Components: Extension Modules Versions: Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Martin.Teichmann, eric.snow
Priority: normal Keywords:

Created on 2014-02-05 08:30 by Martin.Teichmann, last changed 2014-02-08 14:41 by Martin.Teichmann.

Files
File name Uploaded Description Edit
patch Martin.Teichmann, 2014-02-08 14:41
Messages (3)
msg210291 - Author: Martin Teichmann (Martin.Teichmann) Date: 2014-02-05 08:30
Python behaves oddly with regard to multiple inheritance and classes
written in C. I stumbled over this problem while working with PyQt4,
but soon realized that part of the problem is not actually in that
library, but deep down in the CPython core. To make this post easier
to follow, I still use PyQt4 as an example. For those who don't know
PyQt4, it's an excellent Python binding for the Qt C++ library; for
this post you only need to know that QTimer is a class that inherits
from QObject.

The PyQt4 documentation repeatedly insists that it is not possible to
inherit from more than one of its classes. This is not surprising,
since it is actually a limitation of CPython. What should still be
possible is to inherit from two classes if one is the parent of the
other. Let me give an example:

========================================
from PyQt4.QtCore import QObject, QTimer
# QObject is the parent of QTimer

class A(QObject):
    pass

class B(A, QTimer):
    pass

class C(QTimer, A):
    pass

print(B.__base__, B.__mro__)
print(C.__base__, C.__mro__)
========================================

Both classes B and C technically inherit from both QObject and QTimer,
but given that QTimer inherits from QObject, there is no actual
multiple inheritance here, from the perspective of a class written in
C.

But now the problems start. The metaclass of PyQt4 uses the __base__
class attribute to find out which of its classes the new class
actually descends from (this is called the "best_base" in typeobject.c).
This is the correct behavior; it is exactly
what __base__ is for. Let's see what it contains. For class B, the
second-to-last line prints:

<class '__main__.A'> (<class '__main__.B'>, <class '__main__.A'>, <class 'PyQt4.QtCore.QTimer'>, <class 'PyQt4.QtCore.QObject'>, <class 'sip.wrapper'>, <class 'sip.simplewrapper'>, <class 'object'>)

So, __base__ is set to class A. This is incorrect, as PyQt4 now thinks
it should create a QObject. The reason is the weird algorithm that
typeobject.c uses: it tries to find the most specialized class that
changes the size of its instances (called the solid_base). This
sounds reasonable at first, because only classes written in C can
change the size of their instances. Unfortunately, this does not hold
the other way around: in PyQt4, the instances only contain a pointer
to the actual data structures, so all instances of all PyQt4 classes
have the same size, and __base__ will simply default to __bases__[0].
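
The fallback described above can be reproduced without PyQt4. The following is a minimal sketch using only pure-Python classes and the built-in dict; all class names are made up for illustration. Plain Python subclasses add nothing to the instance layout, so __base__ falls back to the first listed base, whereas a base such as dict, which has its own instance layout, is picked regardless of its position in the list of bases:

```python
# Illustrative stand-ins only; none of these names come from PyQt4.
class Root:
    pass

class Child(Root):      # plays the role of QTimer, but has no layout of its own
    pass

class A(Root):
    pass

class B(A, Child):
    pass

class C(Child, A):
    pass

# All four classes share the instance layout of object,
# so __base__ is simply the first entry of __bases__.
print(B.__base__.__name__)   # A
print(C.__base__.__name__)   # Child

class Mixin:
    pass

class D(Mixin, dict):   # dict has its own instance layout
    pass

# The base with a distinct layout wins even though it is listed last.
print(D.__base__.__name__)   # dict
```

This mirrors class B from the example: since every PyQt4 wrapper reportedly has the same instance size, nothing distinguishes QTimer from the pure-Python class A.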

I then tried to outsmart this algorithm: why not put the PyQt4 class
first in the list of bases? This is what class C is for in my example.
And indeed, the last line of my example prints:

<class 'PyQt4.QtCore.QTimer'> (<class '__main__.C'>, <class 'PyQt4.QtCore.QTimer'>, <class '__main__.A'>, <class 'PyQt4.QtCore.QObject'>, <class 'sip.wrapper'>, <class 'sip.simplewrapper'>, <class 'object'>)

So hooray, __base__ is set to QTimer, the metaclass will inherit from
the correct class! But well, there is a strong drawback: now the MRO,
which I print in the same example, has a mixture of Python and PyQt4
classes in it. Unfortunately, the methods of the PyQt4 classes do not
call super, so they are uncooperative when it comes to multiple
inheritance. This is expected, as they are written in C++, a language
that has a weird concept of cooperative multiple inheritance, if it
has one at all.
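
The effect of an uncooperative class in the MRO can be shown with plain Python. This is only a sketch: Root, Uncooperative and Mixin are hypothetical stand-ins for QObject, QTimer and a user-written mixin, and the event method is invented for illustration.

```python
class Root:
    def event(self):
        return ['Root']

class Uncooperative(Root):
    # Stand-in for a C/C++ class: it never calls super().
    def event(self):
        return ['Uncooperative']

class Mixin(Root):
    # A cooperative class: it passes the call along the MRO.
    def event(self):
        return ['Mixin'] + super().event()

class B(Mixin, Uncooperative):
    pass

class C(Uncooperative, Mixin):
    pass

print(B().event())   # ['Mixin', 'Uncooperative']
print(C().event())   # ['Uncooperative']
```

With the uncooperative class listed first, as in C, the chain stops before Mixin is ever reached.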

So, to conclude: it is sometimes not possible to use python
cooperative multiple inheritance if C base classes are involved. This
is a bummer.

Can we change this behavior? Yes, certainly. The clean way would be to
re-write typeobject.c to actually find the best_base in a sane way.
This would be easiest if we could just find out somehow whether a
class is written in Python or in C, e.g. by adding a tp_flag to
PyTypeObject. best_base would then point to the most specialized
parent written in C.

A simpler solution would be to default to __bases__[-1] for __base__;
then we could tell users to simply put their uncooperative base classes
last in the list of bases.
msg210360 - Author: Eric Snow (eric.snow) (Python committer) Date: 2014-02-06 04:47
So to restate, where some class Spam inherits from multiple classes and at least one was written in C, Spam.__base__ may have an unexpected value.

> So, to conclude: it is sometimes not possible to use python
> cooperative multiple inheritance if C base classes are involved. This
> is a bummer.

Be careful not to muddy the waters here by obscuring the problem you are describing (regarding __base__) with the challenges of making multiple inheritance work.  Multiple inheritance with classes that don't cooperate in your hierarchy is tricky, but solvable, and is simply the nature of the beast.  To move this issue forward I recommend simply focusing on how __base__ (and its use) could be improved.

I think part of the problem is that the metaclass of PyQt4 uses the __base__ class attribute instead of the MRO...

<aside>
Cooperative multiple inheritance is all about the classes involved cooperating.  You already indicated that the Qt classes do not cooperate.  To work around this you could wrap Qt objects with proxies that *do* cooperate in your multiple inheritance scheme.  Alternatively you could fiddle around in __init__ to make it work.  Raymond Hettinger had a great talk on this topic at PyCon 2012 (or was it 2013) that would be worth checking out.  I remember him discussing strategies for fitting uncooperative classes into a multiple inheritance hierarchy.
</aside>
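
One way to read the proxy suggestion in the aside is the adapter pattern. The sketch below is an assumption about what such a proxy could look like, not code from PyQt4 or from the talk; Legacy, LegacyAdapter and describe are hypothetical names.

```python
class Root:
    def describe(self):
        return []            # terminates the cooperative chain

class Legacy:
    """Stand-in for an uncooperative class: it never calls super()."""
    def describe(self):
        return ['Legacy']

class LegacyAdapter(Root):
    """Hypothetical proxy: holds a Legacy object and cooperates on its behalf."""
    def __init__(self):
        super().__init__()
        self._wrapped = Legacy()

    def describe(self):
        return self._wrapped.describe() + super().describe()

class Mixin(Root):
    def describe(self):
        return ['Mixin'] + super().describe()

class First(LegacyAdapter, Mixin):
    pass

class Second(Mixin, LegacyAdapter):
    pass

print(First().describe())    # ['Legacy', 'Mixin']
print(Second().describe())   # ['Mixin', 'Legacy']
```

Both orderings now reach every class in the chain, at the cost of one adapter per wrapped class.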
msg210652 - Author: Martin Teichmann (Martin.Teichmann) Date: 2014-02-08 14:41
I've been working a bit on a solution to this issue; one proposal
is in the attached patch. The patch adds a new flag to tp_flags,
called Py_TPFLAGS_SOLID, which marks a class as being solid, i.e.
its memory layout is incompatible with its parent's layout. C classes
can then set this flag, and CPython will ensure that no incompatible
classes are in the same inheritance graph.

Other solutions are certainly conceivable: Eric proposed using the
MRO instead of __base__. This is a great idea, so why is
CPython itself not doing it? Instead of traversing the inheritance
graph trying to find the solid base, one could simply iterate over
the MRO.
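
Iterating over the MRO could look roughly like the following pure-Python sketch. It is not what typeobject.c does, and the _native marker is a hypothetical class attribute invented here to play the role of a "written in C" flag; neither CPython nor PyQt4 defines it.

```python
class Wrapped:
    """Stand-in for a C-implemented root such as QObject."""
    _native = True       # hypothetical marker, set per class

class Timer(Wrapped):
    """Stand-in for QTimer."""
    _native = True

class A(Wrapped):        # pure-Python subclass, no marker of its own
    pass

def first_native_in_mro(cls):
    """Return the first class after cls in the MRO that carries the marker."""
    for klass in cls.__mro__[1:]:
        # vars() inspects only the class's own namespace, so an
        # inherited marker does not count.
        if vars(klass).get('_native', False):
            return klass
    return object

class B(A, Timer):
    pass

class C(Timer, A):
    pass

print(first_native_in_mro(B).__name__)   # Timer
print(first_native_in_mro(C).__name__)   # Timer
print(B.__base__.__name__)               # A
```

In this sketch the MRO walk picks Timer for both orderings, while __base__ still follows the order of the bases.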

In order to illustrate where the actual problem lies, here is another
example:

=========================================
from PyQt4.QtCore import QObject, QTimer, QTimerEvent
# QObject is the parent of QTimer

class Mixin(QObject):
    # this overrides QObject.timerEvent
    def timerEvent(self, e):
        print('mixed in')
        super().timerEvent(e)

class B(Mixin, QTimer):
    pass

class C(QTimer, Mixin):
    pass

event = QTimerEvent(0)

b = B()
try:
    b.timerEvent(event)
except Exception as e:
    print(e)
print('---------')
c = C()
c.timerEvent(event)
=========================================

I'm writing a mixin class that overrides a method from QObject.
At the end I call this method myself (normally that's done from
within PyQt4). For class B, the mixing in works properly and my
code is executed, but then I get an exception: PyQt4 has
made b an instance of QObject, not QTimer, since this is where
__base__ is pointing. For class C, my code is never called
because PyQt4 is not cooperating. Sure, I could write wrapper
classes for every Qt class that I want to mix into, but what
would be the point of a mixin class then?

I hope I made my problem a bit clearer.
History
Date User Action Args
2014-02-08 14:41:36 Martin.Teichmann set files: + patch
    messages: + msg210652
2014-02-06 04:47:45 eric.snow set title: Weird behavior with multiple inheritance when C classes involved -> multiple inheritance + C extension = possibly unexpected __base__
    nosy: + eric.snow
    messages: + msg210360
    components: + Extension Modules, - Interpreter Core
2014-02-05 08:30:36 Martin.Teichmann create