Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

typing.Protocol silently overrides __init__ method of delivered class #88970

Closed
uriyyo opened this issue Aug 2, 2021 · 18 comments
Closed

typing.Protocol silently overrides __init__ method of delivered class #88970

uriyyo opened this issue Aug 2, 2021 · 18 comments
Labels
3.11 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@uriyyo
Copy link
Member

uriyyo commented Aug 2, 2021

BPO 44807
Nosy @gvanrossum, @ambv, @JelleZijlstra, @uriyyo, @Fidget-Spinner, @adriangb
PRs
  • bpo-44807: Forbid to define __init__ method in Protocol classes #27543
  • bpo-44807: Allow Protocol classes to define __init__ #31628
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-08-02.09:53:46.694>
    labels = ['type-bug', 'library', '3.11']
    title = 'typing.Protocol silently overrides __init__ method of delivered class'
    updated_at = <Date 2022-03-01.06:52:43.672>
    user = 'https://github.com/uriyyo'

    bugs.python.org fields:

    activity = <Date 2022-03-01.06:52:43.672>
    actor = 'adriangb'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2021-08-02.09:53:46.694>
    creator = 'uriyyo'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44807
    keywords = ['patch']
    message_count = 16.0
    messages = ['398742', '402333', '402348', '402355', '402361', '402381', '414169', '414172', '414173', '414236', '414238', '414240', '414241', '414242', '414243', '414244']
    nosy_count = 6.0
    nosy_names = ['gvanrossum', 'lukasz.langa', 'JelleZijlstra', 'uriyyo', 'kj', 'adriangb']
    pr_nums = ['27543', '31628']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue44807'
    versions = ['Python 3.11']

    @uriyyo
    Copy link
    Member Author

    uriyyo commented Aug 2, 2021

    typing.Protocol silently overrides __init__ method of delivered class.

    I think it should be forbidden to define __init__ method at Protocol delivered class in case if cls is not Protocol.

    @uriyyo uriyyo added 3.11 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Aug 2, 2021
    @ambv
    Copy link
    Contributor

    ambv commented Sep 21, 2021

    I'm not sure whether we should downright reject __init__() methods. Sadly, they aren't usable at the moment anyway, but they could. Hear me out.

    There seems to be a gap in the PEP-544 definition of protocols, i.e. whether __init__ should be definable as part of a protocol. Currently this isn't specified and it looks like Mypy is ignoring __init__ in Protocol definitions.

    Consider this example: https://gist.github.com/ambv/0ed9c9ec5b8469878f47074bdcd37b0c

    Mypy doesn't consider __init__() to be part of a protocol, even though it's an instance method like any other. It is valid (albeit weird) to call it again on an already initialized instance:

    >>> class C:
    ...   def __init__(self) -> None:
    ...     print('init!')
    ...
    >>> c = C()
    init!
    >>> c.__init__()
    init!
    >>> c.__init__()
    init!

    In the linked gist example, Mypy currently ignores __init__() but doesn't ignore get_ball(), thus incorrectly judging BallContainer and TwoBallContainer to be equivalent.
    In fact, it only rejects the third ifmain call with object because object lacks a get_ball() method.

    So... would that be useful? Do we ever want to define protocols on the basis of how their initializers look like? My uninformed intuition is that this might be potentially useful, however I fail to come up with a realistic example here.

    @gvanrossum
    Copy link
    Member

    I wonder if part of the problem here isn't that protocols are primarily focused on instances, where the __init__ method is explicitly *not* considered part of the type.

    IIRC supporting protocols for classes was an afterthought.

    That said you have made a good case for seeing it as part of the protocol when the Type[] operator is applied to the protocol.

    So yes, I think we should be careful ruling that out too soon. Do we have any evidence that users are confused and define __init__ methods on protocols that are being ignored by their type checkers? The OP didn't say.

    When in doubt, let the status quo win.

    @ambv
    Copy link
    Contributor

    ambv commented Sep 21, 2021

    Yurri, is the issue you reported your original finding? If you haven't found this issue in the wild, we might want to let this be as-is until we specify exactly whether __init__() should be considered part of a protocol.

    @uriyyo
    Copy link
    Member Author

    uriyyo commented Sep 21, 2021

    Łukasz this issue is my original finding.

    I have found this case when I was working on another issue that was related to typing.Protocol.

    I agree that we can leave it as it is for now because I didn't find any information from python community about this case.

    @ambv
    Copy link
    Contributor

    ambv commented Sep 21, 2021

    Marking issue as "pending" until we figure out how PEP-544 should be amended.

    @adriangb
    Copy link
    Mannequin

    adriangb mannequin commented Feb 28, 2022

    While this is figured out, would it be possible to remove the silent overriding? This seems like something type checkers should be doing, not silent runtime modification of classes. Pyright already (correctly) checks this, so I think it would just need to be added to MyPy.

    >>> class C(Protocol):
    ...   def __init__(self) -> None:
    ...     print('init!')
    ...
    >>> c = C()  # Pyright error, MyPy says it's okay

    I came across this while trying to create a class that requires concrete subclasses to define a specific field, and it seems like Protocol is the only thing that can pull this off because of how type checkers special case it: https://gist.github.com/adriangb/6c1a001ee7035bad5bd56df70e0cf5e6

    I don't particularly like this use of Protocol, but it works (aside from the overriding issue).

    I don't know if this usage of implementing __init__ on a protocol class should be valid or not, but I do think it's interesting that __init__ is never called on the protocol class itself, even if the protocol class is the one defining it. It is only called on MyAPIKey, which is a concrete class that happens to inherit the implementation from a protocol class. So maybe that should be valid? I'm not sure.

    But I do know that making type checkers enforce this instead runtime would allow this use case to work while still prohibiting the (in my opinion invalid) use case of calling __init__ on the protocol class itself.

    @gvanrossum
    Copy link
    Member

    If the problem is in mypy, it should be filed in the mypy tracker, not here.

    @adriangb
    Copy link
    Mannequin

    adriangb mannequin commented Feb 28, 2022

    Apologies if that was noise, I filed an issue on the MyPy issue tracker: python/mypy#12261

    @JelleZijlstra
    Copy link
    Member

    Regardless of mypy's behavior (which isn't impacted by what typing.py does), there's a legitimate complaint here about the runtime behavior: any __init__ method defined in a Protocol gets silently replaced.

    >>> from typing import Protocol
    >>> class X(Protocol):
    ...     def __init__(self, x, y): pass
    ... 
    >>> X.__init__
    <function _no_init_or_replace_init at 0x10de4e5c0>

    Fixing that won't be easy though, unless we give up on making it impossible to instantiate a Protocol.

    @gvanrossum
    Copy link
    Member

    Oops.

    So this is an intentional feature -- Protocol replaces __init__ so that you can't (accidentally) instantiate a protocol. And the code to do this has changed a couple of times recently to deal with some edge cases. At least one of the PRs was by Yurii, who created this issue. I didn't read through all that when I closed the issue, so I'm reopening it.

    Maybe Yurii can devise a solution? (Although apparently their first attempt, #27543 was closed without merging.) Yurii and Lukasz should probably figure this out.

    @gvanrossum gvanrossum reopened this Mar 1, 2022
    @adriangb
    Copy link
    Mannequin

    adriangb mannequin commented Mar 1, 2022

    Agreed.

    What if we allow protocols that implement __init__ but still disallow instantiating a protocol that does not? It's a 1 line change, all existing tests pass and it would still catch what I think was the original intention (trying to instantiate a Protocol class with no init): https://github.com/python/cpython/pull/31628/files#diff-ddb987fca5f5df0c9a2f5521ed687919d70bb3d64eaeb8021f98833a2a716887

    @adriangb
    Copy link
    Mannequin

    adriangb mannequin commented Mar 1, 2022

    Guido, it looks like you replied while I was typing my reply out.

    Yurii can correct me here but I believe PR bpo-27543 was an attempt to disallow defining __init__ on a Protocol completely. What I proposed above is the opposite behavior, while still fixing the issue of __init__ getting silently overridden (which is the crux / title of this issue).

    I'm not sure which approach is right.

    @JelleZijlstra
    Copy link
    Member

    It doesn't make logical sense to instantiate any Protocol, whether it has __init__ or not, because a Protocol is inherently an abstract class. But we can just leave enforcement of that rule to static type checkers, so Adrian's proposed change makes sense to me.

    @gvanrossum
    Copy link
    Member

    Would it make sense to enforce the no-instantiation rule in __new__ instead of __init__?

    @adriangb
    Copy link
    Mannequin

    adriangb mannequin commented Mar 1, 2022

    I am not sure if that solves anything (other than the fact that __new__ is much less common to implement than __init__), but I may just be slow to pick up the implications of moving the check to __new__

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @lepeuvedic
    Copy link

    lepeuvedic commented Apr 11, 2022

    With the current implementation, it is impossible to derive a Mixin class from a Protocol, whether or not any Protocol actually defines an __init__ method: the __init__ chain terminates in _no_init after the __init__ of the Mixin class and the __init__ of the target class never runs. This behaviour occurs even as _no_init determines that instance initialization is legal.

    Using Mixin classes based on Protocols should be a common use case to add optional functionality to classes. For instance, most classes supporting copy, could be adapted as context managers using ContextVar. Some classes could make Iterable support optional using a Mixin class, just to save the auxiliary Iterator class.

    Of course, implementations need not explicitly derive from Protocols they implement. It is just a convenience offered by PEP 544 to inherit a default implementation of some methods. The workaround is simply not to explicitly derive from the Protocol class, and to implement all the methods in each implementation class. A module, which publishes a Protocol, may also publish a concrete generic implementation to avoid semantic protocol bugs and reduce code duplication.

    Protocol._no_init should call super().__init__() once it has found out that instance creation is legal, and should not just assume that the Protocol class is only followed by other Protocol classes in the __mro__. This assumption is violated in multiple inheritance situations, even as Protocols are restricted from inheriting only from other Protocols.

    If Protocols have an __init__ method (currently they cannot run) calling super ().__init__ from _no_init would also give the initialization a chance to run when an implementation class implements more than one Protocol.

    Protocol should probably save the original __init__ method under another name, instead of overwriting it.
    Because before calling the next-in-chain __init__ , it should call its own.

    It probably makes sense to have __init__ defined as part of a Protocol, for the cases where code has to be able to create instances of various implementations of a given Protocol without knowing which one in advance.

    @JelleZijlstra
    Copy link
    Member

    I just merged #31628 addressing this issue.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants