classification
Title: Add a way to customize iteration over fields in asdict() for the nested dataclasses
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.7
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: eric.smith Nosy List: eric.smith, mkurnikov, rhettinger
Priority: normal Keywords:

Created on 2018-08-14 23:11 by mkurnikov, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (7)
msg323542 - (view) Author: (mkurnikov) * Date: 2018-08-14 23:11
Suppose I have two dataclasses:

@dataclass
class NestedDataclass(object):
    name: str
    options: Dict[str, Any] = field(default_factory=dict)

@dataclass
class RootDataclass(object):
    nested_list: List[NestedDataclass]

I want the dict stored under the "options" key to be merged into the NestedDataclass dict in the output of dataclasses.asdict(root_dcls_instance).

According to the docs, I need to pass dict_factory= to dataclasses.asdict() for that.

The problem is that, in the current implementation, when asdict() encounters a dataclass instance there is no way to customize how the resulting dict is built, because the dataclass instance itself is never passed to dict_factory:

    if _is_dataclass_instance(obj):
        result = []
        for f in fields(obj):
            value = _asdict_inner(getattr(obj, f.name), dict_factory)
            result.append((f.name, value))
        return dict_factory(result)

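Yes, I can intercept the "result" list inside dict_factory (which is what I did in the end):

def root_dataclass_dict_factory(obj):
    # obj is the list of (field name, value) pairs built by asdict()
    dataclass_dict = dict(obj)
    if 'options' in dataclass_dict:
        # merge the nested "options" dict into its parent dict
        dataclass_dict.update(dataclass_dict.pop('options'))
    # return the merged dict, not a fresh dict(obj), or the merge is lost
    return dataclass_dict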

The problem is that the type of the dataclass is lost once it has been reduced to a list of pairs, so if RootDataclass ever gains its own "options" key, I will have no clean way to tell the two apart.

Another solution is to walk the RootDataclass dictionary, follow the path to the NestedDataclass, and change the dictionary there, but that is even uglier.

It would be nice to be able to hook into the field traversal of a dataclass instance.
msg323543 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-08-14 23:31
Could you show some example dataclass instances? Also, show the output you get with asdict(), and show what output you'd like to get instead.

I'm not sure I understand it correctly from the description you've given.

Thanks!
msg323544 - (view) Author: (mkurnikov) * Date: 2018-08-14 23:53
from pprint import pprint
from typing import List, Any, Dict

import dataclasses
from dataclasses import field


def service_interface_dict_factory(obj: Any) -> Dict[str, Any]:
    print(type(obj)) # <- type(obj) here is a list, but there's no way to understand whether it's a ServiceInterface or
    # InputVar except for looking for the presence of certain keys which is not very convenient
    return dict(obj)


@dataclasses.dataclass
class InputVar(object):
    name: str
    required: bool = False
    options: Dict[str, Any] = field(default_factory=dict)


@dataclasses.dataclass
class ServiceInterface(object):
    input: List[InputVar] = field(default_factory=list)


if __name__ == '__main__':
    inputvar_inst = InputVar(name='myinput',
                             required=False,
                             options={'default': 'mytext'})
    interface = ServiceInterface(input=[inputvar_inst])

    outdict = dataclasses.asdict(interface, dict_factory=service_interface_dict_factory)
    print('outdict', end=' ')
    pprint(outdict)

    # prints:
    # outdict {'input': [{'name': 'myinput',
    #         'options': {'default': 'mytext'},
    #         'required': False}]}

    # desirable output
    # {'input': [{
    #     'name': 'myinput',
    #     'required': False,
    #     'default': 'mytext'
    # }]}
    # "default" key moved to the root of the dictionary (inside list)
msg325053 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-09-11 19:01
I've been thinking about this, but I don't have a suggestion on how to improve the API. Maybe some sort of visitor pattern? I'm open to concrete ideas.
msg325065 - (view) Author: (mkurnikov) * Date: 2018-09-11 21:18
The cleanest thing I could think of is:

1. Extract a dataclass_asdict function from _asdict_inner:

def dataclass_asdict(obj, dict_factory):
    result = []
    for f in fields(obj):
        value = _asdict_inner(getattr(obj, f.name), dict_factory)
        result.append((f.name, value))
    return dict_factory(result)

2. Add an "asdict" parameter to the dataclass decorator (defaulting to the dataclass_asdict function):

@dataclass(asdict=specific_dcls_dict_factory)
class MyDataclass:
    pass

3. Change the check in _asdict_inner to:

def _asdict_inner(obj, dict_factory):
    if _is_dataclass_instance(obj):
        return getattr(obj, _PARAMS).asdict(obj) 

    # ... other code


Another solution could be to add a parameter such as "directly_serializable" (or something like that) and check for it:

def _asdict_inner(obj, dict_factory):
    if _is_dataclass_instance(obj):
        if getattr(obj, _PARAMS).directly_serializable: 
            return dict_factory(obj)

    # ... other code

and force the user to process everything in the dict_factory function.
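
For illustration, the first idea can be approximated outside the standard library. This is only a rough sketch under stated assumptions: the registry _ASDICT_HOOKS, the decorator register_asdict, and the functions asdict_with_hooks and flatten_options are hypothetical names, not part of dataclasses. Only dataclasses.fields() and dataclasses.is_dataclass() are real library calls. Unlike the real asdict(), it does not treat namedtuples specially.

```python
import copy
import dataclasses
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

Pairs = List[Tuple[str, Any]]

# Hypothetical registry mapping a dataclass type to its own dict builder.
_ASDICT_HOOKS: Dict[type, Callable[[Pairs], Dict[str, Any]]] = {}


def register_asdict(hook):
    """Hypothetical class decorator: use `hook` to build dicts for this class."""
    def wrap(cls):
        _ASDICT_HOOKS[cls] = hook
        return cls
    return wrap


def asdict_with_hooks(obj: Any) -> Any:
    """Recursive conversion that consults the per-class registry."""
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        pairs = [(f.name, asdict_with_hooks(getattr(obj, f.name)))
                 for f in dataclasses.fields(obj)]
        # Fall back to a plain dict when no hook is registered for this type.
        return _ASDICT_HOOKS.get(type(obj), dict)(pairs)
    if isinstance(obj, list):
        return [asdict_with_hooks(v) for v in obj]
    if isinstance(obj, tuple):
        # Simplification: a namedtuple would come back as a plain tuple.
        return tuple(asdict_with_hooks(v) for v in obj)
    if isinstance(obj, dict):
        return {k: asdict_with_hooks(v) for k, v in obj.items()}
    return copy.deepcopy(obj)


def flatten_options(pairs: Pairs) -> Dict[str, Any]:
    """Merge the nested "options" dict into the parent dict."""
    result = dict(pairs)
    result.update(result.pop('options', {}))
    return result


@register_asdict(flatten_options)
@dataclass
class InputVar:
    name: str
    required: bool = False
    options: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ServiceInterface:
    input: List[InputVar] = field(default_factory=list)


interface = ServiceInterface(
    input=[InputVar(name='myinput', options={'default': 'mytext'})])
print(asdict_with_hooks(interface))
# {'input': [{'name': 'myinput', 'required': False, 'default': 'mytext'}]}
```

Because the hook is looked up by type(obj), only InputVar is flattened. A ServiceInterface that later gained its own "options" field would be left untouched.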
msg325153 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2018-09-12 18:21
I recommend passing on this feature request as being too specialized and beyond the scope of what data classes are intended to do.  

FWIW, the information needed by a user to write their own customized iteration patterns is publicly available.  One of the reasons for having a general purpose programming language is to enable users to write customizations that best fit their particular use cases.
msg325155 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-09-12 18:28
Thanks, Raymond. I agree that this request is too specialized to add to dataclasses. Every proposal here, and every one I've been able to think of, complicates the API for the much more common use case of not needing asdict() specialization.

To the original poster: I suggest you implement the functionality you need in your own custom version of asdict(). As noted, all of the information needed for a custom asdict() is publicly available.
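
As a hedged illustration of that suggestion, here is a minimal sketch of a custom asdict() built only on the public dataclasses.fields() and dataclasses.is_dataclass(). The names typed_asdict and merge_nested_options are hypothetical. The "options" field on RootDataclass is an assumption added purely to show that the two classes can now be told apart. Namedtuple handling is omitted for brevity.

```python
import copy
import dataclasses
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

# Unlike dict_factory in dataclasses.asdict(), this factory also receives the type.
TypedFactory = Callable[[type, List[Tuple[str, Any]]], Any]


def typed_asdict(obj: Any, factory: TypedFactory) -> Any:
    """Hypothetical asdict() variant that passes the dataclass type to the factory."""
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        pairs = [(f.name, typed_asdict(getattr(obj, f.name), factory))
                 for f in dataclasses.fields(obj)]
        return factory(type(obj), pairs)
    if isinstance(obj, list):
        return [typed_asdict(v, factory) for v in obj]
    if isinstance(obj, tuple):
        # Simplification: a namedtuple would come back as a plain tuple.
        return tuple(typed_asdict(v, factory) for v in obj)
    if isinstance(obj, dict):
        return {k: typed_asdict(v, factory) for k, v in obj.items()}
    return copy.deepcopy(obj)


@dataclass
class NestedDataclass:
    name: str
    options: Dict[str, Any] = field(default_factory=dict)


@dataclass
class RootDataclass:
    nested_list: List[NestedDataclass]
    # Assumption for illustration: the root also has an "options" field.
    options: Dict[str, Any] = field(default_factory=dict)


def merge_nested_options(cls: type, pairs: List[Tuple[str, Any]]) -> Dict[str, Any]:
    result = dict(pairs)
    if cls is NestedDataclass:
        # Flatten "options" only for NestedDataclass, never for the root.
        result.update(result.pop('options'))
    return result


root = RootDataclass(nested_list=[NestedDataclass('a', {'x': 1})],
                     options={'y': 2})
print(typed_asdict(root, merge_nested_options))
# {'nested_list': [{'name': 'a', 'x': 1}], 'options': {'y': 2}}
```

Passing the type alongside the pairs keeps the factory a single function while avoiding the need to guess the class from the keys that happen to be present.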
History
Date                 User        Action  Args
2022-04-11 14:59:04  admin       set     github: 78590
2018-09-12 18:28:08  eric.smith  set     status: open -> closed; resolution: rejected; messages: + msg325155; stage: resolved
2018-09-12 18:21:08  rhettinger  set     nosy: + rhettinger; messages: + msg325153
2018-09-11 21:18:48  mkurnikov   set     messages: + msg325065
2018-09-11 19:01:53  eric.smith  set     messages: + msg325053
2018-08-14 23:53:57  mkurnikov   set     messages: + msg323544
2018-08-14 23:31:34  eric.smith  set     assignee: eric.smith
2018-08-14 23:31:22  eric.smith  set     nosy: + eric.smith; messages: + msg323543; versions: + Python 3.7, - Python 3.6
2018-08-14 23:11:20  mkurnikov   create