Author Martin.Teichmann
Recipients Martin.Teichmann
Date 2014-02-05.08:30:35
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1391589036.91.0.744972661823.issue20518@psf.upfronthosting.co.za>
In-reply-to
Content
Python behaves odd with regards to multiple inheritance and classes
written in C. I stumbled over this problem while working with PyQt4,
but soon realized that part of the problem is not actually in that
library, but is deep down in the CPython core. For better
understanding of this post, I still use PyQt4 as an example. For those
who don't know PyQt4, it's an excellent Python binding for some C++
library, for this post you only need to know that QTimer is a class
that inherits from QObject.

The PyQt4 documentation repeatedly insists that it is not possible to
inherit more than one of its classes. This is not astonishing, since
this is actually a limitation of CPython. What should still be
possible is to inherit from two classes if one is the parent of the
other. Let me give an example:

========================================
from PyQt4.QtCore import QObject, QTimer
# QObject is the parent of QTimer

class A(QObject):
    pass

class B(A, QTimer):
    pass

class C(QTimer, A):
    pass

print(B.__base__, B.__mro__)
print(C.__base__, C.__mro__)
========================================

Both classes B and C technically inherit from both QObject and QTimer,
but given that QTimer inherits from QObject, there is no actual
multiple inheritance here, from the perspective of a class written in
C.

But now the problems start. The metaclass of PyQt4 uses the __base__
class attribute to find out which of its classes the new class
actually decends from (this is called the "best_base" in typeobject.c).
This is the correct behavior, this is exactly
what __base__ is for. Lets see what it contains. For the class B, the
second-to-last line prints:

<class '__main__.A'> (<class '__main__.B'>, <class '__main__.A'>, <class 'PyQt4.QtCore.QTimer'>, <class 'PyQt4.QtCore.QObject'>, <class 'sip.wrapper'>, <class 'sip.simplewrapper'>, <class 'object'>)

So, __base__ is set to class A. This is incorrect, as PyQt4 now thinks
it should create a QObject. The reason is the weird algorithm that
typeobject.c uses: it tries to find the most special class that does
not change the size of the its instances (called the solid_base). This
sounds reasonable at first, because only classes written in C can
change the size of their instances. Unfortunately, this does not hold
the other way around: in PyQt4, the instances only contain a pointer
to the actual data structures, so all instances of all PyQt4 classes
have the same size, and __base__ will simply default to __bases__[0].

Now I tried to outsmart this algorithm, why not put the PyQt4 class as
the first parent class? This is what the class C is for in my example.
And indeed, the last line of my example prints:

<class 'PyQt4.QtCore.QTimer'> (<class '__main__.C'>, <class 'PyQt4.QtCore.QTimer'>, <class '__main__.A'>, <class 'PyQt4.QtCore.QObject'>, <class 'sip.wrapper'>, <class 'sip.simplewrapper'>, <class 'object'>)

So hooray, __base__ is set to QTimer, the metaclass will inherit from
the correct class! But well, there is a strong drawback: now the MRO,
which I print in the same example, has a mixture of Python and PyQt4
classes in it. Unfortunately, the methods of the PyQt4 classes do not
call super, so they are uncooperative when it comes to multiple
inheritance. This is expected, as they are written in C++, a language
that has a weird concept of cooperative multiple inheritance, if it
has one at all.

So, to conclude: it is sometimes not possible to use python
cooperative multiple inheritance if C base classes are involved. This
is a bummer.

Can we change this behavior? Yes, certainly. The clean way would be to
re-write typeobject.c to actually find the best_base in a sane way.
This would be easiest if we could just find out somehow whether a
class is written in Python or in C, e.g. by adding a tp_flag to
PyTypeObject. best_base would then point to the most specialized
parent written in C.

A simpler solution would be to default to __bases__[-1] for __base__,
then we can tell users to simply put their uncooperative base classes
last in the list of bases.
History
Date User Action Args
2014-02-05 08:30:36Martin.Teichmannsetrecipients: + Martin.Teichmann
2014-02-05 08:30:36Martin.Teichmannsetmessageid: <1391589036.91.0.744972661823.issue20518@psf.upfronthosting.co.za>
2014-02-05 08:30:36Martin.Teichmannlinkissue20518 messages
2014-02-05 08:30:35Martin.Teichmanncreate