Message 358929 - Python tracker

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author	cykerway
Recipients	cykerway, steven.daprano
Date	2019-12-28.05:53:09
SpamBayes Score	-1.0
Marked as misclassified	Yes
Message-id	<1577512389.43.0.625937573029.issue39145@roundup.psfhosted.org>
In-reply-to

Content
Thanks for reply. It's not about the Python's implementation of C3 but C3 itself being used as the MRO algorithm in Python. It bites when you remove an independent interface from your class definition and its method calls become something else. I think I can propose an easy fix to C3. I would just call it C3.9 if it's going to be merged in Python 3.9. The difference between C3.9 and C3 is: During a merge, put the list of the parents in front of the linearizations of the parents, namely: Instead of: L[C(B1 ... BN)] = C + merge(L[B1], ..., L[BN], B1 ... BN) We do: L[C(B1 ... BN)] = C + merge(B1 ... BN, L[B1], ... , L[BN]) This preserves the guarantees of C3 (for example, local precedence ordering, monotonicity), plus another property: When you remove a leaf node in the dependency graph, the MRO of other nodes remain the same. Practically, this means when a developer removes an independent interface from his class, he knows the MRO of the remaining classes keep the same. Here the independent interface can even have its own class hierarchy, and the proof goes by removing each leaf one by one in its hierarchy. For example, using the same mro.py: E + [BDA BA DC A] EB + [DA A DC A] EBD + [A A C A] EBDA + [C] EBDAC E + [BDXA BA DC X A] EB + [DXA A DC X A] EBD + [XA A C X A] EBDX + [A A C A] EBDXA + [C] EBDXAC You can see EBDAC is a sub-sequence of EBDXAC. The proof is intuitive. I can give a sketch here. Assume without loss of generality, class E extends A, B, C, D, and we add a new base class X between B and C, now we have: (f1) E + [ABXCD L(A) L(B) X L(C) L(D)] Compare this with: (f0) E + [ABCD L(A) L(B) L(C) L(D)] We can carry all the computation in f1 just as in f0 until X becomes the head of the first list item: (f1') E... + [XCD ... X ... ] (f0') E... + [CD ... ... ] At this time we know we can extract X in (f1') because X is in any tail. After we extract X, (f1') becomes (f0') and we can carry all the operations just as in (f0'). So the result of (f1) is merely an added X and all other elements keep the same order. Intuitively, the performance of C3.9 should be almost the same as C3. The only drawback I can think of is existing code may have a different MRO, but that happened when Python adopted C3. Not sure if this is worth a PEP. If anyone is interested feel free to leave a message.

Thanks for reply. It's not about the Python's implementation of C3 but C3 itself being used as the MRO algorithm in Python. It bites when you remove an independent interface from your class definition and its method calls become something else.

I think I can propose an easy fix to C3. I would just call it C3.9 if it's going to be merged in Python 3.9. The difference between C3.9 and C3 is: During a merge, put the list of the parents in front of the linearizations of the parents, namely:

Instead of: L[C(B1 ... BN)] = C + merge(L[B1], ..., L[BN], B1 ... BN)

We do: L[C(B1 ... BN)] = C + merge(B1 ... BN, L[B1], ... , L[BN])

This preserves the guarantees of C3 (for example, local precedence ordering, monotonicity), plus another property: When you remove a leaf node in the dependency graph, the MRO of other nodes remain the same. Practically, this means when a developer removes an independent interface from his class, he knows the MRO of the remaining classes keep the same. Here the independent interface can even have its own class hierarchy, and the proof goes by removing each leaf one by one in its hierarchy.

For example, using the same mro.py:

E + [BDA BA DC A]
EB + [DA A DC A]
EBD + [A A C A]
EBDA + [C]
EBDAC

E + [BDXA BA DC X A]
EB + [DXA A DC X A]
EBD + [XA A C X A]
EBDX + [A A C A]
EBDXA + [C]
EBDXAC


You can see EBDAC is a sub-sequence of EBDXAC.

The proof is intuitive. I can give a sketch here. Assume without loss of generality, class E extends A, B, C, D, and we add a new base class X between B and C, now we have:

(f1) E + [ABXCD L(A) L(B) X L(C) L(D)]

Compare this with:

(f0) E + [ABCD L(A) L(B) L(C) L(D)]

We can carry all the computation in f1 just as in f0 until X becomes the head of the first list item:

(f1') E... + [XCD ... X ... ]

(f0') E... + [CD ... ... ]


At this time we know we can extract X in (f1') because X is in any tail. After we extract X, (f1') becomes (f0') and we can carry all the operations just as in (f0'). So the result of (f1) is merely an added X and all other elements keep the same order. Intuitively, the performance of C3.9 should be almost the same as C3. The only drawback I can think of is existing code may have a different MRO, but that happened when Python adopted C3.

Not sure if this is worth a PEP. If anyone is interested feel free to leave a message.

History
Date	User	Action	Args
2019-12-28 05:53:09	cykerway	set	recipients: + cykerway, steven.daprano
2019-12-28 05:53:09	cykerway	set	messageid: <1577512389.43.0.625937573029.issue39145@roundup.psfhosted.org>
2019-12-28 05:53:09	cykerway	link	issue39145 messages
2019-12-28 05:53:09	cykerway	create