classification
Title: Adopt C3-based linearization for improved ABC support in functools.singledispatch
Type: behavior Stage: committed/rejected
Components: Library (Lib) Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: lukasz.langa Nosy List: ecatmur, ggenellina, gvanrossum, lukasz.langa, python-dev, rhettinger
Priority: normal Keywords: patch

Created on 2013-06-17 17:34 by ecatmur, last changed 2013-07-02 09:13 by lukasz.langa. This issue is now closed.

Files
File name Uploaded Description Edit
singledispatch-mro-18244.patch ecatmur, 2013-06-17 18:31 singledispatch-mro-18244.patch review
singledispatch-mro-composition.patch ecatmur, 2013-06-19 08:12 review
issue18244.diff lukasz.langa, 2013-06-20 19:30 C3-based approach review
issue18244.diff lukasz.langa, 2013-06-27 14:21 C3-based approach after GvR review review
issue18244.diff lukasz.langa, 2013-06-30 13:39 C3-based approach favoring explicit ABCs review
issue18244.diff lukasz.langa, 2013-06-30 16:40 C3-based approach favoring explicit ABCs (improved) review
issue18244.diff lukasz.langa, 2013-07-01 10:26 C3-based approach favoring explicit ABCs after GvR review review
Repositories containing patches
https://bitbucket.org/ecatmur/singledispatch
Messages (17)
msg191350 - (view) Author: Edward Catmur (ecatmur) Date: 2013-06-17 17:34
Suppose we have a class C with MRO (C, B, A, object).  C virtual-inherits an ABC V, while B virtual-inherits an unrelated ABC W:

       object
      /   |  \
      A   W  |
      | .`  /
      B`  V
      | .`
      C`

Recalling that per PEP 443 singledispatch prefers concrete bases to virtual bases, we would expect the following composed MRO:

    C, B, V, A, W, object

However what actually happens is the composed MRO depends on the order of the haystack; if W is processed first the result is correct, but if V is processed first then (because V does not subclass W) W is inserted in the MRO *before* V:

    C, B, A, object
    C, B, V, A, object
    C, B, W, V, A, object

This results in ambiguity between V and W.

Suggested fix is a slight change to the MRO composition algorithm, considering whether the items already placed in the MRO are concrete base classes.
msg191351 - (view) Author: Edward Catmur (ecatmur) Date: 2013-06-17 17:37
Apologies, the linked repository is for the 2.x backport of singledispatch.  I'll replace it with a proper Python repo.
msg191355 - (view) Author: Edward Catmur (ecatmur) Date: 2013-06-17 18:35
See attachment for patch and test.  Note that reproducing the issue without access to singledispatch internals depends on iteration order of a dict of types and is thus intermittent/environment dependent.
msg191424 - (view) Author: Łukasz Langa (lukasz.langa) (Python committer) Date: 2013-06-18 18:49
Hello, Edward. First of all, thank you for correctly pointing out the problem. We'll integrate your patch but first I'd like to have a unit test that specifically shows why the secondonary `for` loop you introduced is necessary. Currently if we change this:

    for index, base in enumerate(mro[i + 1:], i + 1):
        if not issubclass(base, needle):
            break

to this:

    index = i + 1

the tests still pass.
msg191455 - (view) Author: Edward Catmur (ecatmur) Date: 2013-06-19 08:12
Łukasz, thanks. When the most-derived class virtual-inherits two related ABCs U, V:

       object
      /   |  \
      A   W   V
      | .`  .`
      B`  U`
      | .`
      C`

The secondary `for` loop is necessary to ensure U and V are ordered correctly.  I'll upload a patch with an improved test that covers this case.
msg191535 - (view) Author: Łukasz Langa (lukasz.langa) (Python committer) Date: 2013-06-20 19:30
I meditated on the nature of the problem. A big issue that Edward pointed out
is that at times the computed MRO depends on haystack ordering. We should fix
that. However, my conclusion is that whatever home-grown algorithm we come up
with, it will still be unintuitive for the end user.

We should rather adopt C3 linearisation, which Python uses for calculating the
method resolution order. Unfortunately, it doesn't specify what to do with
virtual bases (either registered or implied). To solve that we need to extend
the algorithm to insert the virtual bases in a "correct" place. We have two
issues here:

1. How can we determine where this "correct" place is?
2. How can we determine which order unrelated abstract classes should be put
   in?

The correct place for a virtual base class to appear in the MRO is on the level
where the functionality it describes first appears. For example:

  >>> from collections import *
  >>> class A(object): pass
  ...
  >>> class B(A):
  ...   def __len__(self):
  ...     return 0   # implies Sized
  ...
  >>> @Container.register
  ... class C(object): pass
  ...
  >>> class D(object): pass   # unrelated
  ...
  >>> class X(D, C, B):
  ...   def __call__(self):
  ...     return 0   # implies Callable
  ...

If we calculate the C3 MRO for strict bases only, we get the obvious:

  >>> from functools import _c3_mro
  >>> _c3_mro(X)
  [<class '__main__.X'>, <class '__main__.D'>, <class '__main__.C'>,
   <class '__main__.B'>, <class '__main__.A'>, <class 'object'>]

If we want to insert related abstract base classes, they consistently appear
right after the class where functionality has been added (note that the order
of the `abcs` here doesn't matter):

  >>> _c3_mro(X, abcs=[Sized, Callable, Container])
  [<class '__main__.X'>, <class 'collections.abc.Callable'>, <class '__main__.D'>,
   <class '__main__.C'>, <class 'collections.abc.Container'>, <class '__main__.B'>,
   <class 'collections.abc.Sized'>, <class '__main__.A'>, <class 'object'>]

This linearisation might be somewhat surprising but I believe it's the correct
way. In fact it's only surprising because it exposes ambiguous dispatch when a
class virtually subclasses multiple ABCs. For example, having a simple generic
function:

  >>> from functools import singledispatch
  >>> from collections import *
  >>> @singledispatch
  ... def f(arg):
  ...   return "base"
  ...
  >>> f.register(Iterable, lambda arg: "iterable")
  >>> f.register(Sized, lambda arg: "sized")
  >>> f.register(Set, lambda arg: "set")

this class evaluates reasonably:

  >>> class C(Sized):
  ...   def __len__(self):
  ...     return 0
  ...
  >>> f(C())
  'sized'

However, when we register it for another ABC, there has to be a conflict:

  >>> Iterable.register(C)
  <class '__main__.C'>
  >>> f(C())
  Traceback (most recent call last):
  ...
  RuntimeError: Ambiguous dispatch: <class 'collections.abc.Iterable'> or
                                    <class 'collections.abc.Sized'>

It doesn't matter that Sized appears explicitly in the MRO. Class C is-a Sized
just as well as it is-a Iterable. However, if we also register it to be a Set
as well (which subclasses both Iterable and Sized), this solves the conflict:

  >>> Set.register(C)
  <class '__main__.C'>
  >>> f(C())
  'set'

Same goes for the case described in PEP 443 where both ABCs appear explicitly
in the MRO. No conflict here becase MRO orders both bases unambiguously:

  >>> class D(Sized, Iterable):
  ...   def __len__(self):
  ...     return 0
  ...   def __iter__(self):
  ...     return iter([])
  ...
  >>> f(D())
  'sized'

The answer to the second question ("How can we determine in which order to put
unrelated abstract classes?") is that we can't in general. There's fortunately
a common special case that can help here: ABCs often have subclasses which are
also bases of the type we're building the MRO for. For example: when inserting
Iterable and Sized to the MRO of dict we can derive the correct order from the
fact that both Iterable and Sized have a subclass called Mapping which also
happens to be a base of dict. This trick cannot be used to fix every case but
it's good enough to handle many.

Summing up, the new algorithm is an extension of C3, which makes it much more
predictable in edge cases. ABCs are introduced in a way that is much more
deterministic (and when it depends on haystack ordering it always ends up as
a RuntimeError anyway). My only real gripe with the new algorithm is that it's
much larger codewise. On the other hand the ABC-aware C3 implementation could
be useful for other purposes as well. 

Nick, Guido, this needs a serious review. Took me a while to get it right.
msg191921 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-06-26 22:43
Wow. This is heady stuff. I can't say I totally get every detail, so I'll just pick on some things you wrote.

In particular, I'm not sure I agree that there should be a conflict when there are two applicable base classes with a different dispatch if one of them is explicit in the MRO and the other isn't.  After all, it both were explicit in the MRO the first one there would be picked, right? And virtual base classes are considered to appear after explicit base classes, right? (Or not?)

I do think the new approach is better (the extra code doesn't bother me), and I may be convinced yet that you made the right choice in the above case.

I have some trivial review comments on your patch to, will send those via Rietveld.
msg191947 - (view) Author: Łukasz Langa (lukasz.langa) (Python committer) Date: 2013-06-27 14:21
The reason why I think it's wrong to special-case ABCs that are
explicitly in the MRO is that it's only one of four methods of virtual
subclassing:

1. Explicit MRO;

2. Abstract functionality implicitly implemented without any form of
   registration;

3. abc.register();

4. One of the above on a base of the type in question.

This creates more possibilities for conflicts than just the example
described in my last message. For instance, what's the preferred ABC
here, Iterable or Sized?

  >>> class E(Sized):
  ...   def __len__(self):
  ...     return 0
  ...   def __iter__(self):
  ...     for i in []:
  ...       yield i

My answer is: neither. E equally is-a Sized and is-a Iterable. If the
dispatcher favors one over the other, you will get people upset about
the decision, no matter which one it is.

Note that the conflict arises only for multiple ABCs which end up *on
the same level* of the MRO. For instance in the example below the
dispatch always chooses the Iterable implementation (the functionality
appears directly on F, whereas Sized appears on a base):

  >>> class HasSize(Sized):
  ...   def __len__(self):
  ...     return 0
  ...
  >>> class F(HasSize):
  ...   def __iter__(self):
  ...     for i in []:
  ...       yield i

If we wanted to favor the ABCs that are explicitly in the MRO, what
should we choose in this case? If we say "Sized", then it breaks the
whole idea of arranging the ABCs next to the class that first implements
them in the MRO.

But it gets better! Suppose you have a generic function with
implementations for Sized and Callable. Which implementation should we
choose for class G?

  >>> class G(MutableMapping):
  ...   def __call__(self):
  ...     return None
  ...   # the rest of the MutableMapping implementation

Seems like Sized because it is explicitly in the MRO, right? What about
H then?

  >>> class H(dict):
  ...   def __call__(self):
  ...     return None

Well, now it's suddenly Callable because Sized is "only" a virtual base
class. I don't think we should let that happen.

It all comes down to the question whether you consider ABCs to be bases
FOR REAL or only sort-of-but-not-really. I believe they're real bases
regardless of the method of registration. Implicit implementation and
abc.register() doesn't make the base any less real.

All in all, the user will ask: "Hey, it's true, I have a tricky type
that subclasses both an ABC1 and an ABC2 and singledispatch raises
a RuntimeError. How do I make this work?" The answer is simple: just
register a more specific implementation on the generic function, even if
it simply selects one of the existing ones:

  generic_func.register(TrickyType, generic_func.dispatch(ABC2))

Explicit is better than implicit.
msg191956 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-06-27 16:51
Hm. I interpret "explicit is better than implicit" very differently.  I see a strict priority ordering from better to worse, in cases that would otherwise be ambiguous:

1. explicit base class (ABC or otherwise)

2. ABC explicitly registered

3. ABC implicitly inferred from presence of special method

I'm all for using all the other heuristics and rules you describe: inferred ABCs occur at the level where they are introduced, for example, and if two different ABCs are both inferred at the same level or both registered at the same level, that should be considered ambiguous.  But if one ABC is listed as an explicit base at some level and another is registered or implicit at the same level, the explicit base should just win -- just as if both ABCs were explicitly listed, the one listed first wins.

So the rule should be that registered and inferred bases are only considered after explicit bases at the same level.  (If you want the registered class to be preferred, you should add it as an explicit base instead -- and if you don't own the code, you should respect the choice of its author, or do something using subclassing.)

If it were me, explicitly registered ABCs would also trump inferred ABCs -- after all an inferred ABC is far from obvious to most readers and might even be an accident, and the situation can be rectified by explicit registration.

IOW, I disagree with your claim that "Class C is-a Sized just as well as it is-a Iterable."  C is a Sized *more* than an Iterable because Size is an explicitly listed base class and Iterable is added through registration.  Same if Iterable were inferred.

(Just to be clear, this only applies if the ambiguity occurs at a single level.  ABCs explicitly listed, registered, or inferred at some base class have explicitly lower priority.)
msg192040 - (view) Author: Łukasz Langa (lukasz.langa) (Python committer) Date: 2013-06-29 17:24
Looks like the priority ordering you mention is not yet documented
anywhere. It definitely makes sense but I'd like to take a step back for
a moment to consider the following questions:

1. What additional functionality do our users get with this ordering? In
   other words, what purpose does this new ordering have?

   Apart from solving the conflict we're discussing, I can't see any.

2. What disadvantages does this ordering bring to the table?

   I think that simplicity is a feature. This ordering creates
   additional complexity in the language.

   Firstly, there is currently no obvious way for users to distinguish
   between implicit subclassing (via implementation) or subclassing by
   `abc.register()`. This creates the dangerous situation where
   backwards incompatibility introduced by switching between those ABC
   registration mechanism is nearly impossible to debug.  Consider an
   example: version A of a library has a type which only implicitly
   subclasses an ABC. User code with singledispatch is created that
   works with the current state of things. Version B of the same library
   uses `ABC.register(Type)` and suddenly the dispatch changes without
   any way for the user to see what's going on.  A similar example with
   explicit subclassing and a different form of registration is easier
   to debug, but not much, really.

   Secondly, it creates this awkward situation where dispatch for any
   custom `MutableMapping` can be different from the dispatch for
   `dict`.  Although the latter is a `MutableMapping` "only" by means of
   `MutableMapping.register(dict)`, in reality the whole definition of
   what a mutable mapping is comes from the behaviour found in `dict`.
   Following your point of view, dict is less of a mutable mapping than
   a custom type subclassing the ABC explicitly. You're saying the user
   should "respect the choice of its author" but it's clearly suboptimal
   here. I strongly believe I should be able to swap a mutable mapping
   implementation with any other and get consistent dispatch.

   Thirdly, while I can't support this with any concrete examples,
   I have a gut feeling that considering all three ABC subclassing
   mechanisms to be equally binding will end up as a toolkit with better
   composability. The priority ordering on the other hand designs
   `abc.register()` and implicit subclassing as inferior MRO wannabes.

   Last but not least, the priority ordering will further complicate the
   implementation of `functools._compose_mro()` and friends. While the
   complexity of this code is not the reason of my stance on the matter,
   I think it nicely shows how much the user has to keep in her head to
   really know what's going on. Especially that we only consider this
   ordering to be decisive on a single interitance level.

3. Why is the ordering MRO -> register -> implicit?

   The reason I'm asking is that the whole existence of `abc.register()`
   seems to come from the following:

   * we want types that predate the creation of an ABC to be considered
     its subclasses;

   * we can't use implicit subclassing because either the existence of
     methods in question is not enough (e.g. Mapping versus Sequence);
     or the methods are added at runtime and don't appear in __dict__.

   Considering the above, one might argue that the following order is
   just as well justified: MRO -> implicit -> register. I'm aware that
   the decision to put register first is because if the user is unhappy
   with the dispatch, she can override the ordering by registering types
   which were implicit before. But, while I can register third-party
   types, I can't unregister any. In other words, I find this ordering
   arbitrary.

I hope you don't perceive my position as stubborn, I just care enough to
insist on this piece of machinery to be clearly defined and as simple as
possible (but not simpler, sure).
msg192060 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-06-30 00:12
> Looks like the priority ordering you mention is not yet documented
> anywhere.

Because up till now it has not been needed -- all you can do with ABCs
is use isinstance/issubclass.

> It definitely makes sense but I'd like to take a step back for
> a moment to consider the following questions:
>
> 1. What additional functionality do our users get with this ordering? In
>    other words, what purpose does this new ordering have?
>
>    Apart from solving the conflict we're discussing, I can't see any.

There doesn't have to be any other functionality. We're just trying to
address how ABCs should be ordered relative to classes explicitly in
the MRO for the purpose of @singledispatch.

> 2. What disadvantages does this ordering bring to the table?
>
>    I think that simplicity is a feature. This ordering creates
>    additional complexity in the language.

But so does not ordering.

The underlying question is how to dispatch when two or more classes in
an object's class hierarchy have a different dispatch rule. This is
fundamentally a question of ordering. For regular method dispatch, the
ordering used is the MRO, and there is always a unique answer: the
class that comes first in the MRO wins. (Ambiguities and
inconsistencies in the MRO are rejected at class definition time.)
This is very convenient because the issue of coming up with a total
ordering of base classes is solved once and for all, and because of
the total ordering we never have to reject a request to dispatch (of
regular methods or attribute lookup) as ambiguous.

(Note: I call it a total ordering, but the total ordering is only
within a specific class's MRO. Any class's explicit bases are totally
ordered in that class's MRO -- but the order of two classes may be
different in a different class's MRO. This is actually relevant for
the definition of C3, and we'll see this below.)

For @singledispatch, we are choosing to support ABCs (because it makes
sense), and so we have to think about how handle ABCs that are
relevant (isinstance/issubclass returns True) but not in the MRO.

Let's introduce terminology so we can talk about different cases easily.

Relevant: isinstance/issubclass returns True
Explicit: it's in the MRO
Implicit: relevant but not explicit
Registered: Implicit due to a register() call
Inferred: Implicit due to a special method

These categories are not always exclusive, e.g. an ABC may be
registered for one class in the MRO but inferred for a different one.

The registration mechanism tries to avoid outright cycles but
otherwise is not as strict about rejecting ambiguities as C3 is for
the MRO calculation, and this I believe is the reason we're still
debating the order. A simple example of something rejected by C3:

- suppose we have classes B and C derived directly from object
- class X(B, C) has MRO [X, B, C, object]
- class Y(C, B) has MRO [Y, C, B, object]
- class Z(X, Y) is *rejected* because there is no consistent ordering
for B and C in the MRO for Z

However, if we construct a similar situation using implicit ABCs, e.g.
by removing B and C from the list of explicit bases of X and Y, and
instead registering X and Y as their virtual subclasses -- then Z(X,
Y) is valid and Z is a virtual subclass of both B and C. But there is
no ordering preference in this case, so I think it's good if
@singledispatch refuses the temptation to guess here, in the case
where there are registered dispatches for B and C but none for X, Y or
Z.

I believe the rest of the discussion is purely about what to do if B
was explicitly listed as a base but C wasn't (or vice versa). There
are still a few different cases; the simplest is this:

class X(B) -- MRO is [X, B, object]
class Y(B) -- MRO is [Y, B, object]
C.register(X)
C.register(Y)
class Z(X, Y) -- MRO is [Z, X, Y, B, object]

IIUC you want @singledispatch to still reject a call for a Z instance
if the only relevant registered dispatches are for B and C -- because
you claim that X, Y and Z are "just as much" subclasses of B as they
are of C. (It doesn't actually matter whether we use X, Y or Z as an
example here -- all of them have the same problem.) But I disagree.
First, in the all-explicit example, we can equally say that X is "just
as much" a subclass of B as it is of C. And yet, because both appear
in the explicit MRO, B wins when dispatching for X -- and C wins when
dispatching for Y, because the explicit base classes are ordered
differently there. So the property of being "just as much" a base
class isn't used -- but the order in the explicit MRO is.

It is the nature and intent of ABC registration that it does not have
to occur in the same file as the class definition. In particular, it
is possible that the class definitions of B, C, X, Y and Z all occur
together, but the C.register() calls occur in a different, unrelated
module, which is imported only on the whim of the top-level
application, or as a side effect of some unrelated import made by the
top-level app. This, to me, is enough to consider registered ABC
inheritance as a second-class citizen compared to explicit inclusion
in the list of bases.

Consider this: dispatch(Z) would consider only Z, X, Y, B, object if
the C.register() calls were *not* made, and then it would choose B;
but under your rules, if the app changed to cause the C.register()
calls to be added, dispatch(Z) would complain. This sounds fragile to
me, and it is the main reason why I want explicit ABCs to prevail over
"merely" registered ABCs, rather than to be treated equal and possibly
cause complaints based on what happened elsewhere in the program.

Now let's move on to inferred ABCs. Let's assume C is actually Sized,
and both X and Y implicitly subclass Sized because they implement
__len__(). On the one hand, this doesn't have the problem with
register() calls that may or may not be loaded -- the __len__()
definitions are always there. On the other hand, to me this feels even
less explicit than using register() -- while presumably everyone who
uses register() knows at some basic level about ABCs, I do not make
the same assumption about people who write classes that have a
__len__() method. Because of Python's long history of duck typing,
many __len__() methods in code that is still in use (even if it has
evolved in other ways) were probably written before ABCs (or even
MROs) were introduced!

Now assume X's explicit base class B is actually Iterable, and X
implements both __iter__() and __len__(). X's author *could* easily
have made X a subclass of both Iterable and Sized, and thereby avoided
the ambiguity. But instead, they explicitly inherited only Iterable,
and left Sized implicit. (I guess their understanding of ABCs was
limited. :-) I really do think that in this case there is a
discernible difference between the author's attitude towards Iterable
and Sized -- they cared enough about Iterable to inherit from it. So I
think we should assume that *if* we asked them to add Sized as a base
class, they would add it second, thereby resolving the ambiguity in
favor of Iterable. And that's why I think that if they left Sized out,
we are doing them more of a favor by preferring Iterable than by
complaining.

Now on to your other examples.

>    Firstly,

[Off-topic English-major nit: it's better to write "first" instead of
"firstly". English is not a very consistent language. :-) See
http://www.randomhouse.com/wotd/index.pperl?date=20010629 for a
nuanced view that still ends up preferring "first".]

>    there is currently no obvious way for users to distinguish
>    between implicit

(which I called "inferred" above)

>    subclassing (via implementation) or subclassing by
>    `abc.register()`.

What do you mean by "no obvious way to distinguish"? That you can't
tell by merely looking at the MRO and calling isinstance/issubclass?
But you could look in the _abc_registry. Or you could scan the source
for register() calls.

TBH I would be fine if these received exactly the same status -- but
explicit inclusion in the MRO still ought to prevail.

>    This creates the dangerous situation where
>    backwards incompatibility introduced by switching between those ABC
>    registration mechanism is nearly impossible to debug.  Consider an
>    example: version A of a library has a type which only implicitly
>    subclasses an ABC. User code with singledispatch is created that
>    works with the current state of things. Version B of the same library
>    uses `ABC.register(Type)` and suddenly the dispatch changes without
>    any way for the user to see what's going on.  A similar example with
>    explicit subclassing and a different form of registration is easier
>    to debug, but not much, really.

So there are an infinite number of ways to break subtle code like this
by subtle changes in inheritance. I see that mostly as a fact of life,
not something we must contain at all cost. I suppose even with
explicit base classes it's not inconceivable that one developer's
"innocent" fix of a class hierarchy breaks another developer's code.
(See also http://xkcd.com/1172/ :-) So could "innocently" adding or
moving a __len__() method.

>    Secondly, it creates this awkward situation where dispatch for any
>    custom `MutableMapping` can be different from the dispatch for
>    `dict`.  Although the latter is a `MutableMapping` "only" by means of
>    `MutableMapping.register(dict)`, in reality the whole definition of
>    what a mutable mapping is comes from the behaviour found in `dict`.
>    Following your point of view, dict is less of a mutable mapping than
>    a custom type subclassing the ABC explicitly. You're saying the user
>    should "respect the choice of its author" but it's clearly suboptimal
>    here. I strongly believe I should be able to swap a mutable mapping
>    implementation with any other and get consistent dispatch.

Hmm... I need to make this situation explicit to think about it. I
think you are still looking at a case where there are exactly two
possible dispatches, one for Sized and one for Iterable, right? And
the issue is that if the user defines class Foo(MutableMapping), Sized
and Iterable appear explicitly in Foo's MRO, in that order (because MM
explicitly subclasses them in this order), so Sized wins.

But, hm, because there's a call to MutableMapping.register(dict) in
collections/abc.py, the way I see it, if the only dispatches possible
were on Sized and Iterable, Sized should still win, because it comes
first in MM's MRO. Now, if that MM.register(dict) call was absent, you
might be right. But it's there (and you mention it) so I'm not sure
what is going on here -- are you talking about a slightly different
example, or about a different rule than I am proposing?

>    Thirdly, while I can't support this with any concrete examples,
>    I have a gut feeling that considering all three ABC subclassing
>    mechanisms to be equally binding will end up as a toolkit with better
>    composability. The priority ordering on the other hand designs
>    `abc.register()` and implicit subclassing as inferior MRO wannabes.

Ok, when we're talking gut feelings, you're going to have a hard time
convincing others that your gut is more finely tuned to digesting
Python subtleties than mine. :-) Clearly my gut tells me that explicit
inclusion of an ABC in the list of bases is a stronger signal than
implicit subclassing; at least part of the reason why my gut feels
this way is that the C3 analysis is more strict about rejecting
ambiguous orderings outright than registered or inferred ABCs. (But my
gut agrees with yours that there's not much to break a tie between a
registered and an inferred ABC.)

>    Last but not least, the priority ordering will further complicate the
>    implementation of `functools._compose_mro()` and friends. While the
>    complexity of this code is not the reason of my stance on the matter,
>    I think it nicely shows how much the user has to keep in her head to
>    really know what's going on. Especially that we only consider this
>    ordering to be decisive on a single inheritance level.

The implementation of C3 is also complex, and understanding all its
rules is hard -- harder than what I had before. But it is a better
rule, and that's why we use it.

> 3. Why is the ordering MRO -> register -> implicit?

(Note: I already relented on the latter arrow.)

>    The reason I'm asking is that the whole existence of `abc.register()`
>    seems to come from the following:
>
>    * we want types that predate the creation of an ABC to be considered
>      its subclasses;

We certainly want it enough for isinstance/issubclass to work and for
an unambiguous dispatch to work. But we're not sure about the relative
ordering in all cases, because there's no order in the registry nor
for inferred ABCs, and that may affect when dispatch is considered
dispatch.

>    * we can't use implicit subclassing because either the existence of
>      methods in question is not enough (e.g. Mapping versus Sequence);
>      or the methods are added at runtime and don't appear in __dict__.
>
>    Considering the above, one might argue that the following order is
>    just as well justified: MRO -> implicit -> register. I'm aware that
>    the decision to put register first is because if the user is unhappy
>    with the dispatch, she can override the ordering by registering types
>    which were implicit before. But, while I can register third-party
>    types, I can't unregister any. In other words, I find this ordering
>    arbitrary.

So, if you can handle "MRO is stronger than registered or inferred" we
don't actually disagree on this. :-)

> I hope you don't perceive my position as stubborn, I just care enough to
> insist on this piece of machinery to be clearly defined and as simple as
> possible (but not simpler, sure).

Not at al. You may notice I enjoy the debate. :-)
msg192069 - (view) Author: Łukasz Langa (lukasz.langa) (Python committer) Date: 2013-06-30 13:39
Having all information in place, I think it's acceptable for both of us
to implement preference for explicit registration, leaving both ways of
implicit registration as equally binding. The latter is future proof in
the sense that if we change our minds later, it's going to be easier to
lift the dispatch conflict than to introduce conflicts where there
weren't any before.

The point you're making about `Abc.register(Cls)` being external to the
definition of Cls and thus somewhat dependant on import order and other
machinery is what convinced me in the end. I also like the proposed
terminology and think it should appear in the documentation.

I created a modified patch. This wasn't as tricky as I feared but
required me to formulate an explicit rule: all implicit ABCs are
inserted in the MRO of a given class directly after the last explicit
ABC in the MRO of said class. One open question is what to do with the
algorithm described in PEP 443 which no longer describes the state of
things exactly. Although the said PEP is final, some parts of the
discussion on this issue just beg to be included in the "Abstract base
classes" section. What do you think we should do?

Answering your questions, neither scanning the source code nor using
a private attribute on ABCMeta can be considered an *obvious* way to
distinguish between registered and inferred ABCs. The former is static
analysis which might involve opening (and understanding) a number of
black boxes, the latter is fragile by definition and breaks the
abstraction (again, opening a black box). This is why we introduced
a public API to get the current cache token [1]_. As for dispatch
differences between MutableMapping and dict, it's in my previous message
(191947) on the issue, classes G and H.

On an unrelated note, thank you for correcting my English. It seems that
after achieving a level of fluency that is bearable to native speakers,
nobody corrects my mistakes anymore. This in turn places me in an
unfortunate plateau.

And yes, I'm well aware that my gut would win no popularity contest,
especially when the BDFL's one is a contender :-) I just hope this
doesn't automatically make the feelings of my own gut invalid.

.. [1] http://bugs.python.org/issue16832
msg192076 - (view) Author: Łukasz Langa (lukasz.langa) (Python committer) Date: 2013-06-30 16:40
Here's an improved patch which doesn't copy data between lists in `_c3_mro()` so much.
msg192096 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013-07-01 00:22
Ok, let's do it.  I just sent you a review of your latest code (admitting I don't actually follow the logic in full detail, so I'm mostly harping on tests and comments...).

Regarding the PEP: feel free to update this.  Clearly nobody read it this carefully before...  "Final" in the Python world doesn't mean "every typo remains forever" and for PEPs this young that also applies to esoteric details that were perhaps not fully, or not correctly, specified before.
msg192128 - (view) Author: Roundup Robot (python-dev) Date: 2013-07-01 14:03
New changeset a2672dc7c805 by Łukasz Langa in branch 'default':
Issue #18244: Adopt C3-based linearization in functools.singledispatch for improved ABC support
http://hg.python.org/cpython/rev/a2672dc7c805
msg192132 - (view) Author: Łukasz Langa (lukasz.langa) (Python committer) Date: 2013-07-01 14:27
I committed the patch with minor docstring and comment fixes; one test was tweaked to be hash randomization-proof.

Edward, thanks for taking the time to file the bug.
Guido, thanks for a thorough and illuminating review process.
msg192184 - (view) Author: Łukasz Langa (lukasz.langa) (Python committer) Date: 2013-07-02 09:13
For the record, the PEP has been updated [1]_ and so has been the backport for 2.6 - 3.3 [2]_.

.. [1] http://hg.python.org/peps/rev/000a8986ef73

.. [2] https://pypi.python.org/pypi/singledispatch/3.4.0.2
History
Date User Action Args
2013-07-02 09:13:29lukasz.langasetmessages: + msg192184
2013-07-01 14:27:05lukasz.langasetstatus: open -> closed
resolution: fixed
messages: + msg192132

stage: patch review -> committed/rejected
2013-07-01 14:03:38python-devsetnosy: + python-dev
messages: + msg192128
2013-07-01 13:59:10lukasz.langasettitle: singledispatch: When virtual-inheriting ABCs at distinct points in MRO, composed MRO is dependent on haystack ordering -> Adopt C3-based linearization for improved ABC support in functools.singledispatch
2013-07-01 10:26:02lukasz.langasetfiles: + issue18244.diff
2013-07-01 05:32:55ggenellinasetnosy: + ggenellina
2013-07-01 00:22:44gvanrossumsetmessages: + msg192096
2013-06-30 16:40:57lukasz.langasetfiles: + issue18244.diff

messages: + msg192076
2013-06-30 13:39:52lukasz.langasetfiles: + issue18244.diff

messages: + msg192069
2013-06-30 00:12:19gvanrossumsetmessages: + msg192060
2013-06-29 17:24:37lukasz.langasetmessages: + msg192040
2013-06-27 16:51:06gvanrossumsetmessages: + msg191956
2013-06-27 14:21:37lukasz.langasetfiles: + issue18244.diff

messages: + msg191947
2013-06-27 04:19:47rhettingersetnosy: + rhettinger
2013-06-26 22:43:28gvanrossumsetnosy: + gvanrossum
messages: + msg191921
2013-06-20 19:30:03lukasz.langasetfiles: + issue18244.diff

messages: + msg191535
2013-06-19 08:12:08ecatmursetfiles: + singledispatch-mro-composition.patch

messages: + msg191455
2013-06-18 18:49:36lukasz.langasetmessages: + msg191424
2013-06-17 18:41:31lukasz.langasetassignee: lukasz.langa
stage: patch review
2013-06-17 18:36:30berker.peksagsetnosy: + lukasz.langa
components: + Library (Lib), - Extension Modules
2013-06-17 18:35:05ecatmursetmessages: + msg191355
2013-06-17 18:31:57ecatmursettype: behavior
2013-06-17 18:31:04ecatmursetfiles: + singledispatch-mro-18244.patch
keywords: + patch
2013-06-17 17:37:34ecatmursetmessages: + msg191351
2013-06-17 17:34:28ecatmurcreate