Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dataclasses.asdict breaks with defaultdict fields #79721

Closed
wrmsr mannequin opened this issue Dec 19, 2018 · 7 comments
Closed

dataclasses.asdict breaks with defaultdict fields #79721

wrmsr mannequin opened this issue Dec 19, 2018 · 7 comments
Assignees
Labels
3.7 (EOL) end of life stdlib Python modules in the Lib dir

Comments

@wrmsr
Copy link
Mannequin

wrmsr mannequin commented Dec 19, 2018

BPO 35540
Nosy @ericvsmith, @ilevkivskyi, @pganssle, @wrmsr, @remilapeyre, @alexcoca, @ryx2, @kwsp
PRs
  • bpo-35540: Add collections.defaultdict support to dataclasses.{asdict,astuple} #11361
  • bpo-35540: Add collections.defaultdict support to dataclasses.{asdict,astuple} #11361
  • bpo-35540: Add collections.defaultdict support to dataclasses.{asdict,astuple} #11361
  • Proof of concept for a class registry in dataclasses.asdict #16356
  • bpo-35540 dataclasses.asdict support defaultdict fields #32056
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ericvsmith'
    closed_at = None
    created_at = <Date 2018-12-19.22:06:44.140>
    labels = ['3.7', 'library']
    title = 'dataclasses.asdict breaks with defaultdict fields'
    updated_at = <Date 2022-03-22.18:28:58.077>
    user = 'https://github.com/wrmsr'

    bugs.python.org fields:

    activity = <Date 2022-03-22.18:28:58.077>
    actor = 'kwsp'
    assignee = 'eric.smith'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2018-12-19.22:06:44.140>
    creator = 'wrmsr'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 35540
    keywords = ['patch', 'patch', 'patch']
    message_count = 6.0
    messages = ['332166', '332741', '353092', '353116', '390850', '405243']
    nosy_count = 8.0
    nosy_names = ['eric.smith', 'levkivskyi', 'p-ganssle', 'wrmsr', 'remi.lapeyre', 'alexcoca', 'greenfish6', 'kwsp']
    pr_nums = ['11361', '11361', '11361', '16356', '32056']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue35540'
    versions = ['Python 3.7']

    @wrmsr
    Copy link
    Mannequin Author

    wrmsr mannequin commented Dec 19, 2018

    _asdict_inner attempts to manually recursively deepcopy dicts by calling type(obj) with a generator of transformed keyvalue tuples @

    elif isinstance(obj, dict):
    . defaultdicts are dicts so this runs but unlike other dicts their first arg has to be a callable or None:

        import collections
        import dataclasses as dc
    
        @dc.dataclass()
        class C:
            d: dict
    
        c = C(collections.defaultdict(lambda: 3, {}))
        d = dc.asdict(c)
    assert isinstance(d['d'], collections.defaultdict)
    assert d['d']['a'] == 3
    

    =>

        Traceback (most recent call last):
          File "boom.py", line 9, in <module>
            d = dc.asdict(c)
          File "/Users/spinlock/.pyenv/versions/3.7.1/lib/python3.7/dataclasses.py", line 1019, in asdict
            return _asdict_inner(obj, dict_factory)
          File "/Users/spinlock/.pyenv/versions/3.7.1/lib/python3.7/dataclasses.py", line 1026, in _asdict_inner
            value = _asdict_inner(getattr(obj, f.name), dict_factory)
          File "/Users/spinlock/.pyenv/versions/3.7.1/lib/python3.7/dataclasses.py", line 1058, in _asdict_inner
            for k, v in obj.items())
        TypeError: first argument must be callable or None

    I understand that it isn't this bit of code's job to support every dict (and list etc.) subclass under the sun but given defaultdict is stdlib it's imo worth supporting explicitly.

    @wrmsr wrmsr mannequin added 3.7 (EOL) end of life stdlib Python modules in the Lib dir labels Dec 19, 2018
    @ericvsmith ericvsmith self-assigned this Dec 19, 2018
    @remilapeyre
    Copy link
    Mannequin

    remilapeyre mannequin commented Dec 30, 2018

    Hi @wrmsr, this happens because the constructor for collections.defaultdict differs from the one of dict.

    I think the argument that collections.defaultdict is in the stdlib and should be supported is right, the changes in PR bpo-11361 should do what you want.

    @pganssle
    Copy link
    Member

    Considering that namedtuple is special-cased, I think it's reasonable to special-case defaultdict as well, though it may be worth considering more general solutions that will also work for things other than the standard library. One would be to solve this the same way that other "subclasses may have a different constructor" problems are solved (e.g. float, int, formerly datetime) and ignore the subclass (or selectively ignore it if it's a problem), for example changing _asdict_inner to something like this:

    if isinstance(obj, dict):
        new_keys = tuple((_asdict_inner(k, dict_factory),
                          _asdict_inner(v, dict_factory))
                          for k, v in obj.items())
    try:
        return type(obj)(new_keys)
    except Exception:
        return dict(new_keys)
    

    Another more general alternative would be to add a type registry for asdict, either as an additional parameter or with a new transformer class of some sort. I created a quick proof of concept for this in #60560 to see one way it could look.

    In any case I think it's quite unfortunate that we can't easily just support anything that has a deepcopy defined. There may be some crazy solution that involves passing a class with a custom getitem to the memo argument of copy.deepcopy, but if it's even possible (haven't thought about it enough) I'm not sure it's advisable.

    @pganssle
    Copy link
    Member

    I checked and it appears that attrs handles this by creating all dicts using the default dict_factory (similar to my original suggestion of just using dict instead of the specific type), if I'm reading this right: https://github.com/python-attrs/attrs/blob/master/src/attr/_funcs.py#L102

    Using attr.asdict seems to bear this out, as defaultdict attributes are converted to dict when the dict factory is not specified.

    I think changing the default behavior like that would be a backwards-incompatible change at this point (and one that it's really hard to warn about, unfortunately), but we could still use the "fall back to dict_factory" behavior by trying to construct a type(obj)(...) and in the case of an exception return dict_factory(...).

    @alexcoca
    Copy link
    Mannequin

    alexcoca mannequin commented Apr 12, 2021

    I was wondering if this issue is still being tracked for resolution? I found the same bug in Python 3.8.

    @ryx2
    Copy link
    Mannequin

    ryx2 mannequin commented Oct 28, 2021

    I am seeing this bug with 3.9.7

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @ericvsmith
    Copy link
    Member

    Fixed in #32056. Thanks!

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life stdlib Python modules in the Lib dir
    Projects
    Status: Done
    Development

    No branches or pull requests

    2 participants