classification
Title: Inconsistent __args__ between typing.Callable and collections.abc.Callable
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: BTaskaya, Zac Hatfield-Dodds, corona10, gvanrossum, hauntsaninja, kj, levkivskyi, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2020-10-29 12:25 by Zac Hatfield-Dodds, last changed 2020-12-01 02:24 by kj.

Pull Requests
URL Status Linked Edit
PR 23060 open kj, 2020-10-31 12:50
Messages (21)
msg379869 - (view) Author: Zac Hatfield-Dodds (Zac Hatfield-Dodds) * Date: 2020-10-29 12:25
The two ways of getting a parametrised Callable have inconsistent __args__:

    >>> import collections.abc, typing
    >>> typing.Callable[[int, int], int].__args__
    (int, int, int)
    >>> collections.abc.Callable[[int, int], int].__args__
    ([int, int], int)

I discovered this while working on PEP 585 support in Hypothesis [1], where it is easy enough to work around but carries a potentially serious performance cost - the list means we cannot use the type as a cache key for non-`...` argument types.  

https://bugs.python.org/issue40494 and https://bugs.python.org/issue40398 may be related.

[1] https://github.com/HypothesisWorks/hypothesis/pull/2653
msg379913 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-10-30 04:17
Good find! I see that typing.Callable has adopted this structure precisely to enable caching.

We should see if we can fix _collections_abc.Callable. It's still early in the life of 3.9 so I think this is reasonable.

We'll need a subclass of GenericAlias so that the repr() of Callable[[int, int], int] still comes out correctly. This is similar to how typing.Callable solves it.

Do you feel up to submitting a PR for this? Otherwise maybe Batuhan feels like contributing a fix for this?
msg379918 - (view) Author: Zac Hatfield-Dodds (Zac Hatfield-Dodds) * Date: 2020-10-30 05:56
Unfortunately I'm overcommitted for the next few weeks-months :-/
msg380031 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-10-31 01:11
@corona10 Do I hear that you'd like to work on this?
msg380040 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2020-10-31 05:53
@gvanrossum

Sorry, Not this time. I just add myself to observe how to solve this issue.

Maybe Batuhan is the proper member to handle this issue.

Just question to this issue.
Since GenericAlias does not have a Py_TPFLAGS_BASETYPE flag, IMHO we have to add Py_TPFLAGS_BASETYPE to type declaration and this behavior does not occur any regression?
msg380048 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2020-10-31 10:08
Would not be better to change typing.Callable?

Attributes __origin__, __args__ and __parameters__ are not documented in the typing module. They are also not mentioned in any PEP except PEP 585. So I don't know what is intention and correct value of these attributes.
msg380059 - (view) Author: Ken Jin (kj) * Date: 2020-10-31 12:50
Hi, I submitted a PR for a proof-of-concept on how the code which subclasses GenericAlias might look like for everyone's consideration.  

Personally, I'd lean more towards what Serhiy said and change typing.Callable rather than GenericAlias which affects many other classes' __class_getitem__. However, considering that typing.Callable has been around for years, I'm hesitant to change its __args__ which might break programs which rely on it.
msg380175 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-11-02 04:41
Actually you can't really change typing.Callable's __args__, because it must be hashable, and lists aren't.

If GenericAlias doesn't cache yet, it might very well do so in the future to gain some speed when e.g. list[int] is used at runtime outside annotations, e.g. in cast(), so it will be important there too.
msg380181 - (view) Author: Ken Jin (kj) * Date: 2020-11-02 05:38
Dear Guido, from what I can see in the typing module, _CallableType already casts the args list to a tuple before creating the __CallableGenericAlias, so it should support cacheing. This is taken from the from _CallableType::

def __getitem__(self, params):
    ...  # (some checking code here)
    args, result = params
    ...  # (some checking code here)

        params = (tuple(args), result)  # args is cast to a tuple
    return self.__getitem_inner__(params)

@_tp_cache   # the cache
def __getitem_inner__(self, params):
    args, result = params
    ...  # (some checking code here)
    
    # This is the suspect code causing the flattening of args
    params = args + (result,)     
    return self.copy_with(params)

def copy_with(self, params):
    return _CallableGenericAlias(self.__origin__, params,
                                 name=self._name, inst=self._inst)

Changing the suspect code from ``params = args + (result,)`` to ``params = (args, result)`` allows typing.Callable to be consistent with collections.abc.Callable's GenericAlias, and also allows for cacheing.

With that change:

>>> from typing import Callable
>>> Callable[[int, ], str].__args__
((<class 'int'>,), <class 'str'>)  # note the args is a tuple

>>> from collections.abc import Callable
>>> Callable[[int, ], str].__args__
([<class 'int'>], <class 'str'>)   # note the args is a list

This isn't fully consistent with collections.abc.Callable's GenericAlias just yet, but it's close.
msg381064 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-11-16 04:32
@Hatfield-Dodds, if we changed typing.Callable to return ((int, int), str) but collections.abc.Callable continued to return ([int, int], str), would that suffice for your purposes?
msg381065 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-11-16 04:33
Also, maybe we should make builtins.callable generic as well?
msg381068 - (view) Author: Zac Hatfield-Dodds (Zac Hatfield-Dodds) * Date: 2020-11-16 05:08
> @Hatfield-Dodds, if we changed typing.Callable to return ((int, int), str) but collections.abc.Callable continued to return ([int, int], str), would that suffice for your purposes?

For performance reasons I'd prefer that the return value be hashable for both, but we've already shipped the workarounds [0,1,2] for 3.9.0 and will maintain that until 3.9 reaches EOL in any case.

Whether we return (int, int, str) or ((int, int), str) doesn't make much difference to me, the latter will require a trivial patch to [2] so please do whatever makes most sense upstream.


> Also, maybe we should make builtins.callable generic as well?

I like this idea :-)


[0] https://hypothesis.readthedocs.io/en/latest/changes.html#v5-39-0
[1] https://github.com/HypothesisWorks/hypothesis/pull/2653/files#diff-c56f048e926cce76dc6cd811924136f5c97e0f68f59625869b4ab01f1dbe10e0L1473-R1480
[2] https://github.com/HypothesisWorks/hypothesis/pull/2653/files#diff-f6a209c019f3f6af11a027a0035e3fc736935d9920fd85da726f9abf4c325d6bR562-R567
msg381073 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-11-16 05:31
In that case I prefer ((int, int), str), in case we ever end up needing to add additional parameters to Callable. I propose we first fix https://bugs.python.org/issue42102.
msg381402 - (view) Author: Ken Jin (kj) * Date: 2020-11-19 10:54
I tried to implement Callable[[int, int], str] as ((int, int), str). However, it breaks much of typing's tests and requires recursion to account for the nested tuples, in both typing, and in the C implementation of GenericAlias.

I'd like to humbly propose a less breaking solution: express __args__ of Callable[[int, int], str] as (Tuple[int, int], str). Almost all the current code in the typing library already supports this. As for collections.abc.Callable, its __args__ simply needs to be expressed as (tuple[int, int], str). This is also an easy fix.

Semantically, this makes sense to me too. Both of the above changes will also still allow caching since Tuple[x] is hashable. This will allow us to fix this issue without depending on issue42102, or at least it can be a stop gap measure. If issue42102 has a resolution, the C implementation can just replace the Python one directly.
msg381642 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-11-23 05:32
I'm still not sold on __args__ == (Tuple[int, int], str); it looks too weird.

However if we introduced a new private type for this purpose that might work? I see that the definition of Tuple in typing.py is

Tuple = _TupleType(tuple, -1, inst=False, name='Tuple')

Maybe we could do something like

_PosArgs = _TupleType(tuple, -1, inst=False, name='_PosArgs')

?

Then __args__ could be (_PosArgs[int, int], str).

However this still leaves collections.abc.Callable different. (We really don't want to import typing there.)

Then again, maybe we should still not rule out ((int, int), str)? It feels less hackish than the others.

And yet another solution would be to stick with (int, int, str) and change collections.abc.Callable to match that. Simple, and more backward compatible for users of the typing module (since no changes at all there).
msg381643 - (view) Author: Ken Jin (kj) * Date: 2020-11-23 05:59
@guido,
Aesthetics-wise, I agree that ((int, int), str) looks by far the best. My gripe with it lies with the implementation - almost every function in typing currently assumes that every object in __args__ is a type, having (int, int) - the tuple object - requires many changes especially to TypeVar substitution and repr.

I thought that (tuple[int, int], str) looked fine because the positional arguments passed to a function can be represented by a single tuple (and also no need for imports in collections.abc :)! ). However, I support (int, int, str) too for the backwards-compatibility reasons you stated. The only downside to that is that Callable's __args__ will be slightly harder to parse if more args representing other things are added in the future.
msg381644 - (view) Author: Shantanu (hauntsaninja) * Date: 2020-11-23 06:31
I think ((int, int), str) is superior to the others and if it can be made to work, that's what we should do.

Note that if variadic type proposals go anywhere (https://docs.google.com/document/d/1oXWyAtnv0-pbyJud8H5wkpIk8aajbkX-leJ8JXsE318/edit#), my understanding is we'd probably have to loosen the assumption that __args__ only contains types for that to work too.

(int, int, str) is temptingly easy, but I think it's a bad idea. For starters, PEP 612 loosens what X can be in Callable[X, str] and life is too short to spend time figuring out whether (P, str) is meant to be Callable[P, str] or Callable[[P], str].

I'm still trying to figure out how I feel about (Tuple[int, int], str). My initial reaction was negative / I think I'd be more comfortable with a (_Posargs[int, int], str). I don't think Tuple[int, int] would be a workable solution in the event that variadic types pose a similar problem.
msg381647 - (view) Author: Ken Jin (kj) * Date: 2020-11-23 07:32
A possible workaround for _PosArgs in collections.abc without depending on typing could be like this:

_PosArgs = type('_PosArgs', (tuple, ), {})

This would help out the future type proposals too, because GenericAlias accepts non-type args.

__args__ of Callable in typing and collections.abc won't be == though. Because all typing types aren't equal to their builtin counterparts. Eg. Tuple[int] == tuple[int] is False.
msg382103 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2020-11-30 01:28
Now that I've seen and reviewed KJ's implementation using _PosArgs, I am worried about it, as it looks quite complicated. @Shantanu, do we really need to worry that Callable[P, R] could be ambiguous? If P is a ParamSpec, wouldn't Callable[[P], R] be invalid?
msg382108 - (view) Author: Shantanu (hauntsaninja) * Date: 2020-11-30 02:25
You're right that Callable[[P], int] is invalid, so a flat representation isn't currently ambiguous. But a flat representation would foist the responsibility for checking that to all users of `__args__`. If your code isn't ParamSpec aware, your failures will probably be less obvious or even silent.

The flat representation trades off a) implementation complexity with b) usability / aesthetics, c) future proof-ness. I'm not morally opposed to such a tradeoff / am happy to defer to people with more experience in making such tradeoffs.
msg382213 - (view) Author: Ken Jin (kj) * Date: 2020-12-01 02:24
FWIW, current code for extracting args type and return type from Callable seems to be something like this (at least from the typing module):

arg_types = __args__[:-1]
return_type = __args__[-1]

Once ParamSpec is added in, library authors would need to check specifically for that in arg_types and return_types. Other than that, I don't have enough expertise or experience to say what will/won't break. 

IMO a flat tuple is ok for now: AFAIK (and I might be completely wrong here), most runtime type checking libraries I've used don't validate the arguments passed in because being too strict may interfere with keyword arguments. (Eg. Callable in Pydantic https://pydantic-docs.helpmanual.io/usage/types/#callable). So the slightly more inconvenient way of reading Callable's __args__ seems to be an acceptable tradeoff for backwards compatibility and simplicity.
History
Date User Action Args
2020-12-01 02:24:57kjsetmessages: + msg382213
2020-11-30 02:25:13hauntsaninjasetmessages: + msg382108
2020-11-30 01:28:53gvanrossumsetmessages: + msg382103
2020-11-23 07:32:35kjsetmessages: + msg381647
2020-11-23 06:31:50hauntsaninjasetnosy: + hauntsaninja
messages: + msg381644
2020-11-23 05:59:44kjsetmessages: + msg381643
2020-11-23 05:32:48gvanrossumsetmessages: + msg381642
2020-11-19 10:54:55kjsetmessages: + msg381402
2020-11-16 05:31:17gvanrossumsetmessages: + msg381073
2020-11-16 05:08:07Zac Hatfield-Doddssetmessages: + msg381068
2020-11-16 04:33:35gvanrossumsetmessages: + msg381065
2020-11-16 04:32:35gvanrossumsetmessages: + msg381064
2020-11-02 05:38:42kjsetmessages: + msg380181
2020-11-02 04:41:31gvanrossumsetmessages: + msg380175
2020-10-31 12:50:52kjsetmessages: + msg380059
2020-10-31 12:50:23kjsetkeywords: + patch
nosy: + kj

pull_requests: + pull_request21979
stage: patch review
2020-10-31 10:08:32serhiy.storchakasetmessages: + msg380048
2020-10-31 05:53:27corona10setmessages: + msg380040
2020-10-31 01:11:33gvanrossumsetmessages: + msg380031
2020-10-30 07:01:20corona10setnosy: + corona10
2020-10-30 05:56:16Zac Hatfield-Doddssetmessages: + msg379918
2020-10-30 04:17:55gvanrossumsetnosy: + BTaskaya
messages: + msg379913
2020-10-29 12:25:19Zac Hatfield-Doddscreate