classification
Title: from __future__ import annotations makes dataclasses.Field.type a string, not type
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.9, Python 3.8, Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: ARF1, drhagen, eric.smith, lopek
Priority: normal Keywords:

Created on 2020-01-24 11:48 by lopek, last changed 2020-11-19 09:50 by ARF1.

Messages (9)
msg360611 - (view) Author: Wojciech Łopata (lopek) Date: 2020-01-24 11:48
I've checked this behaviour under Python 3.7.5 and 3.8.1.

```
from __future__ import annotations
from dataclasses import dataclass, fields

@dataclass
class Foo:
    x: int

print(fields(Foo)[0].type)
```

With the annotations import, the `type` attribute of the Field object is a string holding the name of the type, and the program prints int (the string 'int').

Without the import, the `type` attribute is the type itself, and the program prints <class 'int'>.
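
For reference, a minimal sketch of how the string can be turned back into a type at runtime with `typing.get_type_hints`. It is an illustration, not a proposed fix. To keep the snippet independent of where the `__future__` import sits, the annotation is written as the string `"int"` by hand; that is an assumption standing in for what the import does to `x: int`.

```python
from dataclasses import dataclass, fields
from typing import get_type_hints


@dataclass
class Foo:
    # Hand-written string annotation; stands in for what
    # `from __future__ import annotations` does to `x: int`.
    x: "int"


raw_type = fields(Foo)[0].type            # the unevaluated annotation, a str
resolved_type = get_type_hints(Foo)["x"]  # evaluated on demand, a type

print(repr(raw_type))   # 'int'
print(resolved_type)    # <class 'int'>
```

The dataclass itself is unchanged here; only the caller decides when the annotation gets evaluated.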

I found this out when using the dataclasses_serialization module. The following code works fine once the annotations import is removed; with the import it fails:

```
from __future__ import annotations
from dataclasses import dataclass
from dataclasses_serialization.json import JSONSerializer

@dataclass
class Foo:
    x: int

JSONSerializer.deserialize(Foo, {'x': 42})
```

TypeError: issubclass() arg 1 must be a class
msg360613 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-01-24 12:31
Isn't that the entire point of "from __future__ import annotations"?

Also, please show the traceback when reporting errors so that I can see what's going on.
msg360617 - (view) Author: Wojciech Łopata (lopek) Date: 2020-01-24 13:18
> Isn't that the entire point of "from __future__ import annotations"?
I'm not complaining about Foo.__annotations__ storing strings instead of types. I'm complaining about dataclasses.Field.type being a string instead of a type. I don't think the former needs to imply the latter. I'm trying to access Field objects at runtime, when it should already be possible to resolve the types, as far as I understand.


> Also, please show the traceback when reporting errors so that I can see what's going on.

That's the error I get trying to use dataclasses_serialization module:

$ cat test.py 
from __future__ import annotations
from dataclasses import dataclass
from dataclasses_serialization.json import JSONSerializer

@dataclass
class Foo:
    x: int

JSONSerializer.deserialize(Foo, {'x': 42})
$ python3 test.py 
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/dataclasses_serialization/serializer_base.py", line 125, in dict_to_dataclass
    for fld, fld_type in zip(flds, fld_types)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/dataclasses_serialization/serializer_base.py", line 126, in <dictcomp>
    if fld.name in dct
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/toolz/functoolz.py", line 303, in __call__
    return self._partial(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/dataclasses_serialization/serializer_base.py", line 234, in deserialize
    if issubclass(cls, type_):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/dataclasses_serialization/serializer_base.py", line 72, in issubclass
    return original_issubclass(cls, classinfo)
TypeError: issubclass() arg 1 must be a class

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 9, in <module>
    JSONSerializer.deserialize(Foo, {'x': 42})
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/toolz/functoolz.py", line 303, in __call__
    return self._partial(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/dataclasses_serialization/serializer_base.py", line 238, in deserialize
    return self.deserialization_functions[dataclass](cls, serialized_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/toolz/functoolz.py", line 303, in __call__
    return self._partial(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/dataclasses_serialization/serializer_base.py", line 131, in dict_to_dataclass
    cls
dataclasses_serialization.serializer_base.DeserializationError: Missing one or more required fields to deserialize {'x': 42} as <class '__main__.Foo'>
msg360628 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-01-24 16:33
Well the type comes from the annotation, so this makes sense to me. If dataclasses were to call get_type_hints() for every field, it would defeat the purpose of PEP 563 (at least for dataclasses).
msg360640 - (view) Author: David Hagen (drhagen) Date: 2020-01-24 19:03
Should `dataclasses.Field.type` become a property that evaluates the annotation at runtime, much in the same way that `get_type_hints` works?
msg360641 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-01-24 19:27
> Should `dataclasses.Field.type` become a property that evaluates the annotation at runtime, much in the same way that `get_type_hints` works?

I think not. But maybe a function that evaluates all of the field types. Or maybe an @dataclass parameter to cause it to happen at definition time.

At this point, this seems more like fodder for python-ideas.
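
A "function that evaluates all of the field types" does not exist in `dataclasses`. The sketch below shows what such a helper might look like if built on `typing.get_type_hints`. The name `resolved_field_types` and its `globalns` parameter are hypothetical, and the string annotations are written by hand to imitate the effect of the `__future__` import.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional, get_type_hints


def resolved_field_types(
    cls: type, globalns: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Hypothetical helper (not part of dataclasses).

    Maps each dataclass field name to its evaluated annotation.
    """
    hints = get_type_hints(cls, globalns=globalns)
    # fields() skips ClassVar/InitVar pseudo-fields, so index by field name.
    return {f.name: hints[f.name] for f in fields(cls)}


@dataclass
class Point:
    # String annotations imitate `from __future__ import annotations`.
    x: "int"
    label: "Optional[str]" = None


# Passing globals() makes the lookup namespace explicit for this sketch.
types_by_name = resolved_field_types(Point, globalns=globals())
print(types_by_name)
```

Evaluation happens only when the helper is called, so class definition stays as cheap as PEP 563 intends.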
msg360760 - (view) Author: Wojciech Łopata (lopek) Date: 2020-01-27 12:46
I thought of this behaviour as a bug, because PEP 563 mentions breaking "applications depending on arbitrary objects to be directly present in annotations", while it is also breaking users of dataclasses.fields(), which is part of the standard library. But if it's not something worth fighting for, feel free to close this issue.
msg381398 - (view) Author: ARF1 (ARF1) Date: 2020-11-19 09:25
One problem I have with the current behaviour is that users of library code need to know the exact namespace in which a library has defined a dataclass.

An example is if a library writer had to deconflict the name of a type he used in a user-facing dataclass.

Below is a "typical" use case which will become very fragile to implement. (E.g. imagine the dataclass with dynamically generated fields, the implementation of which I have omitted for the sake of brevity.)


=== some_library_typing.py ===
mytype = str  # library author defines some type alias


=== some_library_module_a.py ===
from __future__ import annotations
import dataclasses
from some_library_typing import mytype as mytype_deconflicted

mytype = int

@dataclasses.dataclass
class MyClass:
    var1: mytype_deconflicted = 'foo'

    def method1(self, val: mytype) -> mytype:
        return val + 1


=== user_code.py ===
from __future__ import annotations
import dataclasses
from some_library_typing import mytype
from some_library_module_a import MyClass

inst = MyClass('bar')

for f in dataclasses.fields(inst):
    if f.type is mytype:
        print('mytype found')
        break
else:
    print('mytype not found')


The `if f.type is mytype` comparison obviously won't work any more. But neither will `if f.type == 'mytype'`. The user will have to be aware that the library author had to deconflict the identifier `mytype` to `mytype_deconflicted` to write his code.

Of course, the library writer could have written the following to make the code work:

=== some_library_module_a.py ===
from __future__ import annotations
import dataclasses
from some_library_typing import mytype as mytype_deconflicted

mytype = int

@dataclasses.dataclass
class MyClass:
    var1: mytype = 'foo'

    def method1(self, val: mytype):
        return val + 1

That is a phenomenally obscure and counter-intuitive way of writing code!

Whichever way one turns this, the current behaviour either seems to require library authors to take extraordinary care with their namespaces when defining dataclasses or forces them to write hard-to-read code or seems to require from users detailed knowledge about the implementation specifics of a library they use.

If this behaviour is kept as is, some clear warnings and guidance on how to deal with this in practice should be given in the docs. From what I can see in the 3.10 docs, that is not yet the case.
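
One possible way for user code to avoid depending on the library's identifier names, sketched under assumptions: the three files above are collapsed into a single file, the string annotation is written by hand to imitate the `__future__` import, and the name `wanted` is introduced only for illustration. The comparison is made against evaluated types from `typing.get_type_hints` rather than against `f.type`.

```python
import dataclasses
from typing import get_type_hints

# Stand-ins for the library side (collapsed into one file for the sketch).
mytype_deconflicted = str   # the library's alias, imported under another name
mytype = int                # unrelated name that forced the rename


@dataclasses.dataclass
class MyClass:
    # String annotation imitating `from __future__ import annotations`.
    var1: "mytype_deconflicted" = "foo"


# Stand-in for the user side: the user only holds the type object itself.
wanted = str

inst = MyClass("bar")

# Evaluate the annotations in the namespace where MyClass was defined.
hints = get_type_hints(MyClass, globalns=globals())

raw = [f.type for f in dataclasses.fields(inst)]
matches = [f.name for f in dataclasses.fields(inst) if hints[f.name] is wanted]

print(raw)      # ['mytype_deconflicted']
print(matches)  # ['var1']
```

In a real multi-file layout, `get_type_hints(MyClass)` without `globalns` looks names up in the module recorded in `MyClass.__module__`, so the user would not need to know the alias the library chose.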
msg381399 - (view) Author: ARF1 (ARF1) Date: 2020-11-19 09:50
Another counter-intuitive behaviour is the different behaviour of dataclasses depending on whether they were defined with the decorator or the make_dataclass factory method:


from __future__ import annotations
import dataclasses

mytype = int

@dataclasses.dataclass
class MyClass1:
    foo: mytype = 1

MyClass2 = dataclasses.make_dataclass(
    'MyClass2',
    [('foo', mytype, 1)]
)

print(dataclasses.fields(MyClass1)[0].type)
print(dataclasses.fields(MyClass2)[0].type)


Results in:

mytype
<class 'int'>
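
A small sketch of how a caller could get the same answer for both kinds of class by going through `typing.get_type_hints` instead of reading `Field.type` directly. As an assumption for illustration, the decorated class uses a hand-written string annotation in place of the `__future__` import.

```python
import dataclasses
from typing import get_type_hints

mytype = int


@dataclasses.dataclass
class MyClass1:
    # String annotation imitating `from __future__ import annotations`.
    foo: "mytype" = 1


# make_dataclass receives the type object itself, never a string.
MyClass2 = dataclasses.make_dataclass("MyClass2", [("foo", mytype, 1)])

classes = (MyClass1, MyClass2)
raw = [dataclasses.fields(c)[0].type for c in classes]
resolved = [get_type_hints(c, globalns=globals())["foo"] for c in classes]

print(raw)       # ['mytype', <class 'int'>]
print(resolved)  # [<class 'int'>, <class 'int'>]
```

`get_type_hints` leaves annotations that are already objects as they are, so the two definitions only differ in the raw `Field.type` values.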
History
Date                 User        Action  Args
2020-11-19 09:50:09  ARF1        set     messages: + msg381399
2020-11-19 09:25:34  ARF1        set     nosy: + ARF1; messages: + msg381398; versions: + Python 3.9
2020-01-27 12:46:43  lopek       set     messages: + msg360760
2020-01-24 19:27:13  eric.smith  set     messages: + msg360641
2020-01-24 19:03:40  drhagen     set     nosy: + drhagen; messages: + msg360640
2020-01-24 16:33:59  eric.smith  set     messages: + msg360628
2020-01-24 13:18:59  lopek       set     messages: + msg360617
2020-01-24 12:31:24  eric.smith  set     messages: + msg360613
2020-01-24 12:01:19  xtreak      set     nosy: + eric.smith
2020-01-24 11:49:56  lopek       set     title: from __future__ import annotations breaks dataclasses.Field.type -> from __future__ import annotations makes dataclasses.Field.type a string, not type
2020-01-24 11:48:40  lopek       create