This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: dataclasses: Allow typing.Annotated to wrap dataclasses-specific annotations
Type: behavior Stage: patch review
Components: Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: GBeauregard, JelleZijlstra, eric.smith
Priority: normal Keywords: patch

Created on 2022-01-25 09:02 by GBeauregard, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 30997 open GBeauregard, 2022-01-28 21:23
Messages (9)
msg411567 - (view) Author: Gregory Beauregard (GBeauregard) * Date: 2022-01-25 09:02
In https://bugs.python.org/issue46491 the typing runtime behavior was changed so that `Annotated[Classvar[...]]` is now valid at runtime in order to alleviate tension between typing and non-typing annotation space uses. dataclasses.py should likely follow suit in its runtime use of `ClassVar` and `InitVar`.

Reviewing the code I see two areas that would need addressed:

1) `InitVar` needs changed so `Annotated[InitVar[...]]` is no longer a runtime error. This is currently a runtime error completely by accident: typing.py expects special type forms to be `callable()`, usually by implementing a `__call__` that throws an error, but `InitVar` does not implement this. Adding an  implementation like in typing.py would fix the runtime error:
https://github.com/python/cpython/blob/b1a3446f077b7d56b89f55d98dadb8018986a3e5/Lib/typing.py#L391-L392

2) The dataclasses-specific typehint introspection implementation needs modified to accommodate being wrapped by an `Annotated` annotation. I see in the comments the code is performance sensitive so I'm not sure what you want to do; f.ex. the regex needs modified, but it's not clean.

What are your thoughts?
msg411614 - (view) Author: Jelle Zijlstra (JelleZijlstra) * (Python committer) Date: 2022-01-25 15:26
Related: issue44799 about InitVar.
msg411637 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2022-01-25 17:22
My thoughts are that I'd like PEP 563 to go away, and PEP 649 to be accepted, and also never allow string literal annotations like the string "Annotated[ClassVar[int]]". But since we'll no doubt have to support string-ized annotations even if PEP 649 is accepted, that's a pipe dream.

I think your suggestion for #1 seems reasonable.

For #2, in the case where typing has been imported and the annotation isn't a string, I assume it's simple enough to look inside the Annotated object and extract the InitVar (or ClassVar) object. I haven't delved in to Annotated object yet.

For #2, with a string annotation, it does look like it will get ugly. I'll have to spend some time looking at _MODULE_IDENTIFIER_RE. I'm guessing we'll need another re that looks for the same basic thing with "Annotated[" or "module.Annotated[" prepended and then look inside that. Or maybe one re to do both. I don't think we should support cases like:

from __future__ import annotations
myAnnotated = typing.Annotated

@dataclass
class Foo:
    a: myAnnotated[ClassVar[int]]

(That is, we won't recognize the string "myAnnotated[ClassVar[int]]" as an Annotated ClassVar.

Maybe we should also restrict it to "Annotated" or "typing.Annotated", but that would prevent someone from using "import typing as _typing", for example. This is why the current code accepts any module name, not strictly "typing". At least that's my recollection, more study is needed.

Or maybe it's time to give up and use typing.get_type_hints() or inspect.get_annotations(), but I suspect there are enough corner cases that will fail that we'll never get that to work right. Nested classes, anyone?

The whole runtime inspection of string-ized annotations is a mess.

On the "performance is important" comment: I'm not sure this is really an issue any more. There was some PEP that was supposed to speed up importing typing, and I never looked at the performance once it was merged. But then again, I'm not sure we want to always have dataclasses import typing, either. If a program doesn't use dataclasses that using the typing module, there's no sense importing it and enlarging the working set of modules.

I welcome any insights on any of these issues. I'm not a typing expert.
msg411709 - (view) Author: Gregory Beauregard (GBeauregard) * Date: 2022-01-26 07:15
Thanks for getting back so quickly.

Annotated is set up to be 'transparent' by default to `typing.get_type_hints` so in the case of using `typing.py` it can be made straightforward by chaining with `typing.get_origin`, I think.

I don't see any reasonable way to do the regex that allows `Annotated` to be renamed, so I agree your suggested restriction. I think the regex would be something like this:

^\s*(?:(?:\w+\s*\.)?\s*Annotated\[)?(?:\s*(\w+)\s*\.)?\s*(\w+)

I'm a bit worried people who are into Annotated annotations might be concerned about line length and more likely to rename `Annotated`, but I don't know if this is a realistic concern. And a part of me wants to say always importing typing.py should be okay since dataclasses was designed to work with type hints after all.

On the other hand, I'm also a bit worried that if we made dataclasses always import typing.py it would rub people the wrong way even if we profiled it and decided it was okay.

I'm going to spend some more time digesting the code tomorrow and try to decide if there's any major speed bumps to a full `typing` approach. I'll also look at import time and such.
msg411710 - (view) Author: Gregory Beauregard (GBeauregard) * Date: 2022-01-26 07:19
Or rather,

^\s*(?:(?:\w+\s*\.)?\s*Annotated\s*\[)?(?:\s*(\w+)\s*\.)?\s*(\w+)
msg411945 - (view) Author: Gregory Beauregard (GBeauregard) * Date: 2022-01-28 00:22
Hi Eric,

to follow up on https://bugs.python.org/msg411943

I'm currently a bit negative on moving to get_type_hints, even though I got it working for the test suite. I think your worries with nesting are well placed, particularly with namespaces and such.

In that vein, I suggest we move forward with patching the existing implementation with the discussed regex restrictions. I'm not sure if you want to remove the test cases with leading spaces; it seems not too important.

While we're there I found a bug in the test suite, a missing comma that can be fixed at the same time:
https://github.com/python/cpython/blob/b1a3446f077b7d56b89f55d98dadb8018986a3e5/Lib/test/test_dataclasses.py#L3080

Do you have any other concerns before I take a stab at this?
msg411948 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2022-01-28 00:45
I was hoping to wait until the PEP 649 / PEP 563 thing was decided. But I realize that no matter how that turns out, there will be a need to deal with string annotations.

So I think I'm okay with the regex changes. Personally, I think we should remove support for leading spaces and should remove the tests, too. I guess there's some argument that there should be a deprecation period, I think it's just invalid syntax and shouldn't be supported.

Go ahead and put together a PR.
msg412033 - (view) Author: Gregory Beauregard (GBeauregard) * Date: 2022-01-28 21:31
I had a few style, approach, and testing preference questions, but I decided they're probably best addressed in a code review so I went ahead and posted the PR.
msg413094 - (view) Author: Gregory Beauregard (GBeauregard) * Date: 2022-02-11 19:22
It occurred to be that we do need to add the __call__ to KW_ONLY, but for a different reason than this bpo:

If you call get_type_hints on a dataclass with a KW_ONLY parameter when PEP 563 is enabled, the entire call will fail if KW_ONLY isn't callable(). This can also happen if you stringize KW_ONLY without PEP 563. I made a bpo to suggest removing the callable() check entirely, but it's waiting discussion currently: https://bugs.python.org/issue46644

My feeling is you probably wanted to wait on making changes of this kind for the 5.11 PEP 563 et al decision to play out, but on the other hand I think the KW_ONLY (and similar __call__ method for InitVar this patch already adds) change would likely be backportable so we may want to make them anyway for that purpose. Do you have an opinion on this?

This patch might be backportable to mirror https://bugs.python.org/issue46491 but I don't have a strong opinion on that.
History
Date User Action Args
2022-04-11 14:59:55adminsetgithub: 90669
2022-02-11 19:22:33GBeauregardsetmessages: + msg413094
2022-01-28 21:31:30GBeauregardsetmessages: + msg412033
2022-01-28 21:23:01GBeauregardsetkeywords: + patch
stage: patch review
pull_requests: + pull_request29178
2022-01-28 00:45:56eric.smithsetmessages: + msg411948
2022-01-28 00:22:30GBeauregardsetmessages: + msg411945
2022-01-26 07:19:09GBeauregardsetmessages: + msg411710
2022-01-26 07:15:31GBeauregardsetmessages: + msg411709
2022-01-25 17:22:32eric.smithsetmessages: + msg411637
2022-01-25 15:26:12JelleZijlstrasetmessages: + msg411614
2022-01-25 09:02:48GBeauregardcreate