Author gvanrossum
Recipients ecatmur, gvanrossum, lukasz.langa, rhettinger
Date 2013-06-30.00:12:18
Message-id <CAP7+vJKBvtUpj3VYSkFKbC-qR-seny5bUaWsFTuur+yi=PDA3w@mail.gmail.com>
In-reply-to <1372526677.94.0.12961643988.issue18244@psf.upfronthosting.co.za>
Content
> Looks like the priority ordering you mention is not yet documented
> anywhere.

Because up till now it has not been needed -- all you can do with ABCs
is use isinstance/issubclass.

> It definitely makes sense but I'd like to take a step back for
> a moment to consider the following questions:
>
> 1. What additional functionality do our users get with this ordering? In
>    other words, what purpose does this new ordering have?
>
>    Apart from solving the conflict we're discussing, I can't see any.

There doesn't have to be any other functionality. We're just trying to
address how ABCs should be ordered relative to classes explicitly in
the MRO for the purpose of @singledispatch.

> 2. What disadvantages does this ordering bring to the table?
>
>    I think that simplicity is a feature. This ordering creates
>    additional complexity in the language.

But so does not ordering.

The underlying question is how to dispatch when two or more classes in
an object's class hierarchy have a different dispatch rule. This is
fundamentally a question of ordering. For regular method dispatch, the
ordering used is the MRO, and there is always a unique answer: the
class that comes first in the MRO wins. (Ambiguities and
inconsistencies in the MRO are rejected at class definition time.)
This is very convenient because the issue of coming up with a total
ordering of base classes is solved once and for all, and because of
the total ordering we never have to reject a request to dispatch (of
regular methods or attribute lookup) as ambiguous.

(Note: I call it a total ordering, but the total ordering is only
within a specific class's MRO. Any class's explicit bases are totally
ordered in that class's MRO -- but the order of two classes may be
different in a different class's MRO. This is actually relevant for
the definition of C3, and we'll see this below.)

For @singledispatch, we are choosing to support ABCs (because it makes
sense), and so we have to think about how to handle ABCs that are
relevant (isinstance/issubclass returns True) but not in the MRO.

Let's introduce terminology so we can talk about different cases easily.

Relevant: isinstance/issubclass returns True
Explicit: it's in the MRO
Implicit: relevant but not explicit
Registered: Implicit due to a register() call
Inferred: Implicit due to a special method

These categories are not always exclusive, e.g. an ABC may be
registered for one class in the MRO but inferred for a different one.
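
A small sketch of the terminology above, using Sized as the ABC. The
class names ExplicitBox, RegisteredBox and InferredBox are hypothetical
and exist only for illustration.

```python
from collections.abc import Sized


class ExplicitBox(Sized):
    """Explicit: Sized is listed as a base, so it appears in the MRO."""

    def __len__(self):
        return 0


class RegisteredBox:
    """Registered: implicit, relevant only because of the register() call."""


Sized.register(RegisteredBox)


class InferredBox:
    """Inferred: implicit, relevant only because it defines __len__()."""

    def __len__(self):
        return 0


for cls in (ExplicitBox, RegisteredBox, InferredBox):
    relevant = issubclass(cls, Sized)   # "relevant" in the sense above
    explicit = Sized in cls.__mro__     # "explicit" in the sense above
    print(f"{cls.__name__}: relevant={relevant} explicit={explicit}")
```

All three classes are relevant for Sized, but only ExplicitBox has it in
its MRO.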

The registration mechanism tries to avoid outright cycles but
otherwise is not as strict about rejecting ambiguities as C3 is for
the MRO calculation, and this I believe is the reason we're still
debating the order. A simple example of something rejected by C3:

- suppose we have classes B and C derived directly from object
- class X(B, C) has MRO [X, B, C, object]
- class Y(C, B) has MRO [Y, C, B, object]
- class Z(X, Y) is *rejected* because there is no consistent ordering
for B and C in the MRO for Z
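
The rejection described in the list above can be reproduced directly.
This is only a sketch with empty placeholder classes; the exact wording
of the error message is not something to rely on.

```python
class B:
    pass


class C:
    pass


class X(B, C):  # MRO: X, B, C, object
    pass


class Y(C, B):  # MRO: Y, C, B, object
    pass


rejection = None
try:
    # X wants B before C, Y wants C before B: no linearization satisfies both.
    class Z(X, Y):
        pass
except TypeError as exc:
    rejection = str(exc)

print(rejection)
```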

However, if we construct a similar situation using implicit ABCs, e.g.
by removing B and C from the list of explicit bases of X and Y, and
instead registering X and Y as their virtual subclasses -- then Z(X,
Y) is valid and Z is a virtual subclass of both B and C. But there is
no ordering preference in this case, so I think it's good if
@singledispatch refuses the temptation to guess here, in the case
where there are registered dispatches for B and C but none for X, Y or
Z.
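
A sketch of the all-implicit variant, assuming B and C are empty ABCs
and that describe() is a hypothetical generic function with
registrations only for B and C. On the CPython releases of
functools.singledispatch that I know of, the ambiguous call is refused
with RuntimeError; treat that as an observation about those versions,
not as something this message specifies.

```python
from abc import ABCMeta
from functools import singledispatch


class B(metaclass=ABCMeta):
    pass


class C(metaclass=ABCMeta):
    pass


class X:
    pass


class Y:
    pass


# X and Y become virtual subclasses of both ABCs; registration has no order.
for abc_cls in (B, C):
    abc_cls.register(X)
    abc_cls.register(Y)


class Z(X, Y):  # accepted: B and C are not part of any explicit MRO here
    pass


@singledispatch
def describe(obj):
    return "default"


@describe.register(B)
def _(obj):
    return "B"


@describe.register(C)
def _(obj):
    return "C"


try:
    outcome = describe(Z())
except RuntimeError as exc:
    # Nothing orders B relative to C for Z, so dispatch declines to guess.
    outcome = f"refused: {exc}"

print(outcome)
```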

I believe the rest of the discussion is purely about what to do if B
was explicitly listed as a base but C wasn't (or vice versa). There
are still a few different cases; the simplest is this:

class X(B) -- MRO is [X, B, object]
class Y(B) -- MRO is [Y, B, object]
C.register(X)
C.register(Y)
class Z(X, Y) -- MRO is [Z, X, Y, B, object]
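
The setup above written out as runnable code, assuming B and C are
empty ABCs declared with metaclass=ABCMeta (so that the MROs match the
ones listed). It only checks the class relationships, not any dispatch.

```python
from abc import ABCMeta


class B(metaclass=ABCMeta):
    pass


class C(metaclass=ABCMeta):
    pass


class X(B):  # B is explicit for X
    pass


class Y(B):  # B is explicit for Y
    pass


C.register(X)  # C is merely registered for X ...
C.register(Y)  # ... and for Y


class Z(X, Y):
    pass


print([cls.__name__ for cls in Z.__mro__])
print(issubclass(Z, C), C in Z.__mro__)
```

C is relevant for Z (issubclass returns True) but it never shows up in
Z's MRO, while B does.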

IIUC you want @singledispatch to still reject a call for a Z instance
if the only relevant registered dispatches are for B and C -- because
you claim that X, Y and Z are "just as much" subclasses of B as they
are of C. (It doesn't actually matter whether we use X, Y or Z as an
example here -- all of them have the same problem.) But I disagree.
First, in the all-explicit example, we can equally say that X is "just
as much" a subclass of B as it is of C. And yet, because both appear
in the explicit MRO, B wins when dispatching for X -- and C wins when
dispatching for Y, because the explicit base classes are ordered
differently there. So the property of being "just as much" a base
class isn't used -- but the order in the explicit MRO is.
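
A sketch of the all-explicit case, where the order in each class's MRO
decides the dispatch. The generic function describe() is hypothetical;
B and C are plain placeholder classes.

```python
from functools import singledispatch


class B:
    pass


class C:
    pass


class X(B, C):  # B comes before C in X's MRO
    pass


class Y(C, B):  # C comes before B in Y's MRO
    pass


@singledispatch
def describe(obj):
    return "default"


@describe.register(B)
def _(obj):
    return "B"


@describe.register(C)
def _(obj):
    return "C"


result_for_x = describe(X())
result_for_y = describe(Y())
print(result_for_x, result_for_y)
```

X and Y are each "just as much" a B as a C, yet the two calls pick
different implementations purely because of the MRO order.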

It is the nature and intent of ABC registration that it does not have
to occur in the same file as the class definition. In particular, it
is possible that the class definitions of B, C, X, Y and Z all occur
together, but the C.register() calls occur in a different, unrelated
module, which is imported only on the whim of the top-level
application, or as a side effect of some unrelated import made by the
top-level app. This, to me, is enough to consider registered ABC
inheritance as a second-class citizen compared to explicit inclusion
in the list of bases.

Consider this: dispatch(Z) would consider only Z, X, Y, B, object if
the C.register() calls were *not* made, and then it would choose B;
but under your rules, if the app changed to cause the C.register()
calls to be added, dispatch(Z) would complain. This sounds fragile to
me, and it is the main reason why I want explicit ABCs to prevail over
"merely" registered ABCs, rather than to be treated equal and possibly
cause complaints based on what happened elsewhere in the program.
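
A sketch of the behaviour argued for here, with the C.register() calls
standing in for an unrelated module imported later. It assumes B and C
are empty ABCs and describe() is a hypothetical generic function. On
the CPython versions I have checked, the dispatch for Z stays on B both
before and after the registrations.

```python
from abc import ABCMeta
from functools import singledispatch


class B(metaclass=ABCMeta):
    pass


class C(metaclass=ABCMeta):
    pass


class X(B):
    pass


class Y(B):
    pass


class Z(X, Y):
    pass


@singledispatch
def describe(obj):
    return "default"


@describe.register(B)
def _(obj):
    return "B"


@describe.register(C)
def _(obj):
    return "C"


before = describe(Z())  # only Z, X, Y, B, object are considered

# Stand-in for registrations made elsewhere in the program.
C.register(X)
C.register(Y)

after = describe(Z())   # C is now relevant, but B is explicit in the MRO
print(before, after)
```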

Now let's move on to inferred ABCs. Let's assume C is actually Sized,
and both X and Y implicitly subclass Sized because they implement
__len__(). On the one hand, this doesn't have the problem with
register() calls that may or may not be loaded -- the __len__()
definitions are always there. On the other hand, to me this feels even
less explicit than using register() -- while presumably everyone who
uses register() knows at some basic level about ABCs, I do not make
the same assumption about people who write classes that have a
__len__() method. Because of Python's long history of duck typing,
many __len__() methods in code that is still in use (even if it has
evolved in other ways) were probably written before ABCs (or even
MROs) were introduced!

Now assume X's explicit base class B is actually Iterable, and X
implements both __iter__() and __len__(). X's author *could* easily
have made X a subclass of both Iterable and Sized, and thereby avoided
the ambiguity. But instead, they explicitly inherited only Iterable,
and left Sized implicit. (I guess their understanding of ABCs was
limited. :-) I really do think that in this case there is a
discernible difference between the author's attitude towards Iterable
and Sized -- they cared enough about Iterable to inherit from it. So I
think we should assume that *if* we asked them to add Sized as a base
class, they would add it second, thereby resolving the ambiguity in
favor of Iterable. And that's why I think that if they left Sized out,
we are doing them more of a favor by preferring Iterable than by
complaining.
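
A sketch of the Iterable/Sized case, assuming a hypothetical generic
function describe() with registrations for both ABCs. X inherits from
Iterable explicitly and is Sized only by inference; Duck is a
hypothetical class for which both ABCs are inferred. The outcomes noted
in the comments are what I observe on current CPython, not something
this message guarantees.

```python
from collections.abc import Iterable, Sized
from functools import singledispatch


class X(Iterable):
    """Iterable is explicit; Sized is inferred from __len__()."""

    def __iter__(self):
        return iter(())

    def __len__(self):
        return 0


class Duck:
    """Both Iterable and Sized are inferred; neither is in the MRO."""

    def __iter__(self):
        return iter(())

    def __len__(self):
        return 0


@singledispatch
def describe(obj):
    return "default"


@describe.register(Iterable)
def _(obj):
    return "Iterable"


@describe.register(Sized)
def _(obj):
    return "Sized"


explicit_result = describe(X())  # the explicit base is preferred

try:
    duck_result = describe(Duck())
except RuntimeError as exc:
    # With two inferred ABCs there is nothing to break the tie.
    duck_result = f"refused: {exc}"

print(explicit_result)
print(duck_result)
```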

Now on to your other examples.

>    Firstly,

[Off-topic English-major nit: it's better to write "first" instead of
"firstly". English is not a very consistent language. :-) See
http://www.randomhouse.com/wotd/index.pperl?date=20010629 for a
nuanced view that still ends up preferring "first".]

>    there is currently no obvious way for users to distinguish
>    between implicit

(which I called "inferred" above)

>    subclassing (via implementation) or subclassing by
>    `abc.register()`.

What do you mean by "no obvious way to distinguish"? That you can't
tell by merely looking at the MRO and calling isinstance/issubclass?
But you could look in the _abc_registry. Or you could scan the source
for register() calls.

TBH I would be fine if these received exactly the same status -- but
explicit inclusion in the MRO still ought to prevail.

>    This creates the dangerous situation where
>    backwards incompatibility introduced by switching between those ABC
>    registration mechanism is nearly impossible to debug.  Consider an
>    example: version A of a library has a type which only implicitly
>    subclasses an ABC. User code with singledispatch is created that
>    works with the current state of things. Version B of the same library
>    uses `ABC.register(Type)` and suddenly the dispatch changes without
>    any way for the user to see what's going on.  A similar example with
>    explicit subclassing and a different form of registration is easier
>    to debug, but not much, really.

So there are an infinite number of ways to break subtle code like this
by subtle changes in inheritance. I see that mostly as a fact of life,
not something we must contain at all cost. I suppose even with
explicit base classes it's not inconceivable that one developer's
"innocent" fix of a class hierarchy breaks another developer's code.
(See also http://xkcd.com/1172/ :-) So could "innocently" adding or
moving a __len__() method.

>    Secondly, it creates this awkward situation where dispatch for any
>    custom `MutableMapping` can be different from the dispatch for
>    `dict`.  Although the latter is a `MutableMapping` "only" by means of
>    `MutableMapping.register(dict)`, in reality the whole definition of
>    what a mutable mapping is comes from the behaviour found in `dict`.
>    Following your point of view, dict is less of a mutable mapping than
>    a custom type subclassing the ABC explicitly. You're saying the user
>    should "respect the choice of its author" but it's clearly suboptimal
>    here. I strongly believe I should be able to swap a mutable mapping
>    implementation with any other and get consistent dispatch.

Hmm... I need to make this situation explicit to think about it. I
think you are still looking at a case where there are exactly two
possible dispatches, one for Sized and one for Iterable, right? And
the issue is that if the user defines class Foo(MutableMapping), Sized
and Iterable appear explicitly in Foo's MRO, in that order (because MM
explicitly subclasses them in this order), so Sized wins.

But, hm, because there's a call to MutableMapping.register(dict) in
collections/abc.py, the way I see it, if the only dispatches possible
were on Sized and Iterable, Sized should still win, because it comes
first in MM's MRO. Now, if that MM.register(dict) call was absent, you
might be right. But it's there (and you mention it) so I'm not sure
what is going on here -- are you talking about a slightly different
example, or about a different rule than I am proposing?
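
A sketch that checks the ordering claim for an explicit MutableMapping
subclass. Foo is a hypothetical user class, left abstract because only
its MRO is inspected; describe_sized() and describe_iterable() are
hypothetical registrations. What dispatch does for dict itself depends
on the rule being debated here, so the sketch only records that dict is
registered and deliberately makes no assertion about its dispatch.

```python
from collections.abc import Iterable, MutableMapping, Sized
from functools import singledispatch


class Foo(MutableMapping):
    """Explicit subclass; never instantiated in this sketch."""


mro = Foo.__mro__
sized_before_iterable = mro.index(Sized) < mro.index(Iterable)


@singledispatch
def describe(obj):
    return "default"


@describe.register(Sized)
def describe_sized(obj):
    return "Sized"


@describe.register(Iterable)
def describe_iterable(obj):
    return "Iterable"


# dispatch() takes a class and returns the implementation chosen for it.
chosen_for_foo = describe.dispatch(Foo)

# dict is a MutableMapping by registration only, not through its MRO.
dict_is_registered = (
    issubclass(dict, MutableMapping) and MutableMapping not in dict.__mro__
)

print(sized_before_iterable, chosen_for_foo.__name__, dict_is_registered)
```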

>    Thirdly, while I can't support this with any concrete examples,
>    I have a gut feeling that considering all three ABC subclassing
>    mechanisms to be equally binding will end up as a toolkit with better
>    composability. The priority ordering on the other hand designs
>    `abc.register()` and implicit subclassing as inferior MRO wannabes.

Ok, when we're talking gut feelings, you're going to have a hard time
convincing others that your gut is more finely tuned to digesting
Python subtleties than mine. :-) Clearly my gut tells me that explicit
inclusion of an ABC in the list of bases is a stronger signal than
implicit subclassing; at least part of the reason why my gut feels
this way is that the C3 analysis is more strict about rejecting
ambiguous orderings outright than ABC registration or inference is. (But my
gut agrees with yours that there's not much to break a tie between a
registered and an inferred ABC.)

>    Last but not least, the priority ordering will further complicate the
>    implementation of `functools._compose_mro()` and friends. While the
>    complexity of this code is not the reason of my stance on the matter,
>    I think it nicely shows how much the user has to keep in her head to
>    really know what's going on. Especially that we only consider this
>    ordering to be decisive on a single inheritance level.

The implementation of C3 is also complex, and understanding all its
rules is hard -- harder than what I had before. But it is a better
rule, and that's why we use it.

> 3. Why is the ordering MRO -> register -> implicit?

(Note: I already relented on the latter arrow.)

>    The reason I'm asking is that the whole existence of `abc.register()`
>    seems to come from the following:
>
>    * we want types that predate the creation of an ABC to be considered
>      its subclasses;

We certainly want it enough for isinstance/issubclass to work and for
an unambiguous dispatch to work. But we're not sure about the relative
ordering in all cases, because there's no order in the registry nor
for inferred ABCs, and that may affect when dispatch is considered
ambiguous.

>    * we can't use implicit subclassing because either the existence of
>      methods in question is not enough (e.g. Mapping versus Sequence);
>      or the methods are added at runtime and don't appear in __dict__.
>
>    Considering the above, one might argue that the following order is
>    just as well justified: MRO -> implicit -> register. I'm aware that
>    the decision to put register first is because if the user is unhappy
>    with the dispatch, she can override the ordering by registering types
>    which were implicit before. But, while I can register third-party
>    types, I can't unregister any. In other words, I find this ordering
>    arbitrary.

So, if you can handle "MRO is stronger than registered or inferred" we
don't actually disagree on this. :-)

> I hope you don't perceive my position as stubborn, I just care enough to
> insist on this piece of machinery to be clearly defined and as simple as
> possible (but not simpler, sure).

Not at all. You may notice I enjoy the debate. :-)