msg316333 - (view) |
Author: Rick Teachey (Ricyteach) * |
Date: 2018-05-09 21:38 |
This is broken in 3.7 (both beta 3 and 4):
from __future__ import annotations
from dataclasses import dataclass
from typing import ClassVar, Any
@dataclass
class C():
class_var: ClassVar[Any] = object()
data: str
Traceback:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\ricky\AppData\Local\Programs\Python\Python37\lib\dataclasses.py", line 850, in dataclass
return wrap(_cls)
File "C:\Users\ricky\AppData\Local\Programs\Python\Python37\lib\dataclasses.py", line 842, in wrap
return _process_class(cls, init, repr, eq, order, unsafe_hash, frozen)
File "C:\Users\ricky\AppData\Local\Programs\Python\Python37\lib\dataclasses.py", line 763, in _process_class
else 'self',
File "C:\Users\ricky\AppData\Local\Programs\Python\Python37\lib\dataclasses.py", line 442, in _init_fn
raise TypeError(f'non-default argument {f.name!r} '
TypeError: non-default argument 'data' follows default argument
|
msg316335 - (view) |
Author: Rick Teachey (Ricyteach) * |
Date: 2018-05-09 21:39 |
To be clear: it works just fine with the annotations import.
|
msg316336 - (view) |
Author: Rick Teachey (Ricyteach) * |
Date: 2018-05-09 21:40 |
Sorry, mean to say it works just fine *without* the import.
|
msg316338 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2018-05-09 21:50 |
This is a known issue, but it wasn't being tracked here. So, thanks for opening the issue.
https://github.com/ericvsmith/dataclasses/issues/92#issuecomment-382473127
Not to put Łukasz on the spot (he's sitting behind me even as we speak), but I think we missed the window for 3.7.0 for this. I'll discuss it with him.
|
msg316345 - (view) |
Author: Łukasz Langa (lukasz.langa) * |
Date: 2018-05-10 03:08 |
Well, this is a rather big deal and I'd like to see it solved for 3.7.0. Ned, this is after the last beta but as far as I understand, we're still fine committing fixes that maintain ABI until RC1, right?
Note that this isn't specific to the `annotations` future import. If a user actually writes:
field: "ClassVar[SomeTypeReferencedLater]" = get_some_type_object()
the effect is the same.
There are two ways to solve it, the right way and the fast way.
The right way is to call `get_type_hints()` on the class which forces evaluation of all type annotations (including in base classes, if any). That way you can keep using your existing hack to check if a thing is a ClassVar. However, it's slow because it:
- forces evaluation of all type annotations at import time, which is even slower than without the `annotations` future import because now you're tokenizing, creating the AST and the code objects, too; and
- you force an import of typing for all users, regardless whether they use any annotations or not.
This is why attrs went with the fast way which covers most (but not all) bases in this case. If the annotation is a string, just check if it starts with "typing.ClassVar", "t.ClassVar", or just "ClassVar". That's a fast check and doesn't ever require importing typing. On the flip side, the 0.001% of users [1]_ who import ClassVar differently, will not have their class variables recognized.
So, Eric, unless you really want to do the right thing here and make dataclasses forever slower to start up than attrs, I would be happy to provide you with a patch for this during sprints.
[1] .. Figure made up on the spot.
|
msg316346 - (view) |
Author: Ned Deily (ned.deily) * |
Date: 2018-05-10 03:44 |
> Ned, this is after the last beta but as far as I understand, we're still fine committing fixes that maintain ABI until RC1, right?
Yes
|
msg316350 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2018-05-10 07:59 |
See also [2]_ for a brief discussion of forward references, which makes get_type_hints() undesirable in this case.
> This is why attrs went with the fast way which covers most (but not all) bases in this case. If the annotation is a string, just check if it starts with "typing.ClassVar", "t.ClassVar", or just "ClassVar". That's a fast check and doesn't ever require importing typing. On the flip side, the 0.001% of users [1]_ who import ClassVar differently, will not have their class variables recognized.
I'm okay with the fast check for the string "ClassVar". My only concern is how the user would figure out what's going on, if for example they "import typing as T". The generated __init__ wouldn't have the params they expect, but with default values thrown in I'm not so sure how quickly they'd notice. Hopefully they'd figure out soon enough there's a problem, but I'm not sure they'd know the cause if it.
Maybe we could do some additional checking if typing has been imported? If we see "T.ClassVar" ("[identifier].ClassVar" appears at the beginning of the string) then if typing has been imported, further check that "T is typing"? Although this isn't going to help with "from typing import ClassVar as CV" (but only 0.00004% of users would do that [3]_) and I don't want to use regex's for this. str.partition() is likely good enough, if we decide to go this route.
Is there any scenario where the following code would be useful, or expected to work, if "import typing as T" hadn't been executed before @dataclass runs? After all, if we do decide to call get_type_hints() it wouldn't work without it.
field: "T.ClassVar[SomeTypeReferencedLater]" = get_some_type_object()
But again, unless [2]_ is addressed, get_type_hints() will fail unless both T and SomeTypeReferencedLater are known when @dataclass runs, if we used get_type_hints().
So, I guess this is my roundabout way of saying we should do the best we can with string inspection, and not use get_type_hints(). Maybe we can discuss it with Guido at the sprints.
For all of this, I'm assuming we'll do something similar with InitVar. Although we obviously know it's been imported, it doesn't solve all of the other issues with get_type_hints.
[2] .. https://github.com/python/typing/issues/508
[3] .. Also a made up number.
|
msg316351 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2018-05-10 08:10 |
I see that https://github.com/python/typing/issues/508 is also referenced in https://github.com/python-attrs/attrs/issues/361, where it contributed to attrs using string inspection.
|
msg316362 - (view) |
Author: Rick Teachey (Ricyteach) * |
Date: 2018-05-10 13:48 |
> Is there any scenario where the following code would be useful...
Perhaps if someone, in search of a speedup, were sort of rolling their own lighter-weight equivalent to the typing module (in order to avoid importing the full typing module), but duck typed enough to fool the average type checker? Is that possible?
|
msg316391 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2018-05-11 11:04 |
The more I think about this, the more I think Łukasz is correct that just checking for strings starting with "ClassVar", "typing.ClassVar", or "t.ClassVar" is the correct thing to do. This is the change he made in https://github.com/python-attrs/attrs/pull/367.
In some private email I had discussed extracting the module name, if any, from the string annotation, and looking it up in sys.modules. This doesn't buy you much because you have to know how the caller imported typing. If they used "import typing as t", then you can't look up "t" in sys.modules. You could do some horrible frame trick to find out what the caller knew as "t", but that still won't work in plenty of cases. I don't want to use a complex solution that still doesn't work in all cases, and would indeed work in fewer places than just examining the string. The only name we could reliably detect is "typing.ClassVar", all others wouldn't be in sys.modules correctly.
So, that leaves us with just looking at the string and guessing what the caller meant. I think the three strings that Łukasz suggested are good enough.
This will need to be well documented. The documentation should note that things like this will break:
from __future__ import annotations
from typing import ClassVar as CV
@dataclass
class C:
x: CV
x will not be a class variable here, since @dataclass will see the annotation as "CV" and not know that it means typing.ClassVar.
InitVar has a similar problem. What strings to use there? "InitVar" and "dataclasses.InitVar", of course. But maybe "dc.InitVar"? It's hard to say with no in-the-field usage examples to search for. I'll start with just the first two strings.
I really don't want to use regexes here. But a refinement would be to not just use .startswith, and instead use a word boundary. After all, "ClassVarOfMine" should probably not be considered a ClassVar. I'll leave that issue for another day, though.
I'll have a patch ready before the sprints are over.
|
msg316392 - (view) |
Author: Inada Naoki (methane) * |
Date: 2018-05-11 11:35 |
"t.ClassVar" looks ugly...
How about dropping ClassVar support of dataclass module for 3.7?
|
msg316394 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2018-05-11 11:44 |
We can't break the API at this point in the release cycle. But I am open to what string prefixes we should allow.
|
msg316396 - (view) |
Author: Inada Naoki (methane) * |
Date: 2018-05-11 11:55 |
> We can't break the API at this point in the release cycle.
We hadn't release RC, and we hadn't documented dataclass module yet.
How about making "dataclass" module as porvisional state?
I think we've learned lesson that we shouldn't use typing in modules other than typing...
> But I am open to what string prefixes we should allow.
If it is really needed, I think we should only allow "typing.ClassVar".
All other prefixes seems ambiguous.
|
msg316411 - (view) |
Author: Łukasz Langa (lukasz.langa) * |
Date: 2018-05-11 18:47 |
> I think we've learned lesson that we shouldn't use typing in modules other than typing...
This is a blanket statement that as hurtful as it is factually incorrect. Let me address the latter.
1. Dataclasses are entirely dependent on annotations *as described by the typing module*. While it would be entirely possible for users to pass an arbitrary expression as an annotation to denote it's a field of the dataclass, then it's confusing the reader of that code... what's the point? The fact that dataclasses isn't itself importing `typing` is an implementation detail of dataclasses.
2. The problem isn't even specific to `typing` but stringified type annotations in general. If the magic sentinel annotation came from the dataclasses module, it would be just as tricky to get it right as it is now when this sentinel lives in `typing`.
3. singledispatch now also allows registering functions that are type-annotated. In fact, that is a nicer API than the old one. That usage also imports `typing`. I don't see how this is a problem.
> If it is really needed, I think we should only allow "typing.ClassVar". All other prefixes seems ambiguous.
I came up with "typing", "t", and straight "ClassVar" by grepping through typed codebases I work with. I agree that "t" is rather arbitrary so I'm totally fine with leaving that one out.
|
msg316412 - (view) |
Author: Rick Teachey (Ricyteach) * |
Date: 2018-05-11 19:02 |
Lending my voice to the suggestion of limiting the class attribute check to `typing.ClassVar` and `ClassVar`. It can always be expanded later if it is needed.
|
msg316417 - (view) |
Author: Inada Naoki (methane) * |
Date: 2018-05-12 00:16 |
> > I think we've learned lesson that we shouldn't use typing in modules
other than typing...
> This is a blanket statement that as hurtful as it is factually
incorrect. Let me address the latter.
> 1. Dataclasses are entirely dependent on annotations *as described by the
typing module*. While it would be entirely possible for users to pass an
arbitrary expression as an annotation to denote it's a field of the
dataclass, then it's confusing the reader of that code... what's the
point? The fact that dataclasses isn't itself importing `typing` is an
implementation detail of dataclasses.
When I said "use typing module", I meant "using type information from
annotation value" preciously.
Using only name of the annotation is not harmful.
> 2. The problem isn't even specific to `typing` but stringified type
annotations in general. If the magic sentinel annotation came from the
dataclasses module, it would be just as tricky to get it right as it is now
when this sentinel lives in `typing`.
Yes. I think we shouldn't parse annotation value until we can expect
issues like this and estimate performance overhead of
massive calling of get_type_hints() on application startup.
> 3. singledispatch now also allows registering functions that are
type-annotated. In fact, that is a nicer API than the old one. That usage
also imports `typing`. I don't see how this is a problem.
How about performance?
Isn't performance overhead of it is much larger than expected by people who
don't know internal of get_type_hints()?
I agree that annotation could very useful.
But I fear that utilizing annotation value may make ugly hack like
"ClassVar" prefix
or unexpected slow import time. So I want to be very conservative about
utilizing annotation in stdlib.
Once it is happen, it's very hard to remove it from stdlib.
I think we should learn more about runtime usage of type hints in third
party library,
before using it in stdlib.
|
msg316419 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2018-05-12 02:21 |
At this stage in the release cycle, if you really feel strongly about this, you should take it up with Guido directly.
|
msg316481 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2018-05-13 21:40 |
There have been comments on the PR, but I'd like to focus the higher level issue back here. Specifically, see my comment https://github.com/python/cpython/pull/6768#discussion_r187813919
To summarize: I still think string inspections are the best we can do.
I'm going to try to organize a meeting with Guido, Ivan, and Łukasz at the sprints on Monday.
|
msg316523 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2018-05-14 15:46 |
Followup from our meeting at the sprints: we're going to go with inspecting the type annotation string and use heuristics to determine if the type is a ClassVar or InitVar. I'll follow up with more specifics on the approach.
This will obviously need to make it in to the documentation.
|
msg316755 - (view) |
Author: Eric V. Smith (eric.smith) * |
Date: 2018-05-16 02:44 |
New changeset 2a7bacbd913cf2bf568b3c0f85a758946d3cf4e9 by Eric V. Smith in branch 'master':
bpo-33453: Handle string type annotations in dataclasses. (GH-6768)
https://github.com/python/cpython/commit/2a7bacbd913cf2bf568b3c0f85a758946d3cf4e9
|
msg316762 - (view) |
Author: miss-islington (miss-islington) |
Date: 2018-05-16 04:22 |
New changeset c73268aad7645a146b3e0e088c198a1fb74d57ff by Miss Islington (bot) in branch '3.7':
bpo-33453: Handle string type annotations in dataclasses. (GH-6768)
https://github.com/python/cpython/commit/c73268aad7645a146b3e0e088c198a1fb74d57ff
|
|
Date |
User |
Action |
Args |
2022-04-11 14:59:00 | admin | set | github: 77634 |
2020-05-04 11:56:05 | FFY00 | set | nosy:
+ FFY00
|
2018-05-16 07:15:34 | eric.smith | set | status: open -> closed versions:
+ Python 3.8 priority: release blocker -> keywords:
patch, patch resolution: fixed stage: patch review -> resolved |
2018-05-16 04:22:15 | miss-islington | set | nosy:
+ miss-islington messages:
+ msg316762
|
2018-05-16 02:45:40 | miss-islington | set | pull_requests:
+ pull_request6562 |
2018-05-16 02:44:30 | eric.smith | set | messages:
+ msg316755 |
2018-05-14 19:14:04 | gregory.p.smith | set | keywords:
patch, patch nosy:
+ gregory.p.smith
|
2018-05-14 15:46:35 | eric.smith | set | keywords:
patch, patch
messages:
+ msg316523 |
2018-05-13 21:40:08 | eric.smith | set | keywords:
patch, patch
messages:
+ msg316481 |
2018-05-13 01:45:13 | levkivskyi | set | keywords:
patch, patch nosy:
+ levkivskyi
|
2018-05-12 02:21:34 | eric.smith | set | keywords:
patch, patch assignee: eric.smith messages:
+ msg316419
|
2018-05-12 02:19:22 | eric.smith | set | keywords:
+ patch stage: patch review pull_requests:
+ pull_request6455 |
2018-05-12 02:19:20 | eric.smith | set | keywords:
+ patch stage: (no value) pull_requests:
+ pull_request6456 |
2018-05-12 00:16:24 | methane | set | messages:
+ msg316417 |
2018-05-11 19:02:42 | Ricyteach | set | messages:
+ msg316412 |
2018-05-11 18:47:34 | lukasz.langa | set | assignee: eric.smith -> (no value) messages:
+ msg316411 |
2018-05-11 11:55:49 | methane | set | messages:
+ msg316396 |
2018-05-11 11:44:15 | eric.smith | set | messages:
+ msg316394 |
2018-05-11 11:35:32 | methane | set | nosy:
+ methane messages:
+ msg316392
|
2018-05-11 11:04:53 | eric.smith | set | assignee: eric.smith messages:
+ msg316391 |
2018-05-10 13:48:45 | Ricyteach | set | messages:
+ msg316362 |
2018-05-10 08:10:37 | eric.smith | set | messages:
+ msg316351 |
2018-05-10 07:59:16 | eric.smith | set | priority: normal -> release blocker
messages:
+ msg316350 |
2018-05-10 03:44:03 | ned.deily | set | messages:
+ msg316346 |
2018-05-10 03:08:41 | lukasz.langa | set | messages:
+ msg316345 |
2018-05-10 03:08:11 | lukasz.langa | set | messages:
- msg316344 |
2018-05-10 03:07:33 | lukasz.langa | set | nosy:
+ ned.deily messages:
+ msg316344
|
2018-05-09 21:50:52 | eric.smith | set | nosy:
+ lukasz.langa
messages:
+ msg316338 title: from __future__ import annotations breaks dataclasses ClassVar handling -> from __future__ import annotations breaks dataclasses ClassVar and InitVar handling |
2018-05-09 21:40:36 | Ricyteach | set | messages:
+ msg316336 |
2018-05-09 21:39:41 | Ricyteach | set | messages:
+ msg316335 |
2018-05-09 21:38:42 | Ricyteach | create | |