Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dataclasses should allow frozendict default value #88840

Closed
gianni mannequin opened this issue Jul 19, 2021 · 12 comments
Closed

dataclasses should allow frozendict default value #88840

gianni mannequin opened this issue Jul 19, 2021 · 12 comments
Assignees
Labels
3.9 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@gianni
Copy link
Mannequin

gianni mannequin commented Jul 19, 2021

BPO 44674
Nosy @rhettinger, @ericvsmith, @CCLDArjun
PRs
  • bpo-44674: Use unhashability as a proxy for mutability for default dataclass __init__ arguments. #29867
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ericvsmith'
    closed_at = <Date 2021-12-11.21:13:04.100>
    created_at = <Date 2021-07-19.14:06:32.276>
    labels = ['type-bug', 'library', '3.9']
    title = 'dataclasses should allow frozendict default value'
    updated_at = <Date 2021-12-12.09:27:52.990>
    user = 'https://bugs.python.org/gianni'

    bugs.python.org fields:

    activity = <Date 2021-12-12.09:27:52.990>
    actor = 'eric.smith'
    assignee = 'eric.smith'
    closed = True
    closed_date = <Date 2021-12-11.21:13:04.100>
    closer = 'eric.smith'
    components = ['Library (Lib)']
    creation = <Date 2021-07-19.14:06:32.276>
    creator = 'gianni'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44674
    keywords = ['patch']
    message_count = 10.0
    messages = ['397799', '397800', '398098', '398105', '398118', '403103', '403108', '408335', '408365', '408370']
    nosy_count = 4.0
    nosy_names = ['rhettinger', 'eric.smith', 'CCLDArjun', 'gianni']
    pr_nums = ['29867']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue44674'
    versions = ['Python 3.9']

    @gianni
    Copy link
    Mannequin Author

    gianni mannequin commented Jul 19, 2021

    Using a frozendict as a default value should not cause an error in dataclasses. The check for mutability is:

       isinstance(f.default, (list, dict, set))

    It appears frozendict has been changed to have a dict base class and it now raises an exception.

    There should be a way to indicate object mutability as the purpose of the isinstance(f.default, (list, dict, set)) check is for mutable default values.

    Using default_factory to work around this issue is cumbersome.

    @gianni gianni mannequin added build The build process and cross-build 3.9 only security fixes stdlib Python modules in the Lib dir labels Jul 19, 2021
    @ericvsmith
    Copy link
    Member

    I agree that would be an improvement.

    @CCLDArjun
    Copy link
    Mannequin

    CCLDArjun mannequin commented Jul 23, 2021

    Which frozendict does your message concern? I found this in some test:

    class frozendict(dict):
    or are you suggesting any user defined immutable object?

    I'm not sure indicating that an object should be immutable to dataclasses will be less cumbersome than default_factory.

    Currently, you have to do this:

    x: frozendict = field(default_factory=frozendict)

    But, indicating mutability would be something like this:

    x: frozendict = field(mutable=false)

    @ericvsmith
    Copy link
    Member

    When I originally read this, I read it as frozenset. I agree that there's no good way of telling if an arbitrary class is immutable, so I'm not sure we can do anything here.

    So I think we should close this as a rejected idea.

    @gianni
    Copy link
    Mannequin Author

    gianni mannequin commented Jul 24, 2021

    @arjun - this is about default values (See the bug description - Using a frozendict as a default value)

    Try this:

    from frozendict import frozendict
    from dataclasses import dataclass
    
    @dataclass
    class A:
        a: frozendict = frozendict(a=1)

    This used to work until frozendict became a subclass of dict.

    Perhaps another fix is to convert any dict to a frozendict? Maybe not.

    How would you handle this case? The only thing I figured was:
    from frozendict import frozendict, field
    from dataclasses import dataclass

    AD=frozendict(a=1)
    @dataclass
    class A:
        a: frozendict = field(default_factory=lambda:AD)

    Which imho is cumbersome.

    @rhettinger
    Copy link
    Contributor

    [Eric]

    I agree that there's no good way of telling if an
    arbitrary class is immutable, so I'm not sure we can
    do anything here.

    Consider using non-hashability as a proxy indicator for immutability.

    -  isinstance(f.default, (list, dict, set))
    +  f.default.__hash__ is None
    

    While this is imperfect, it would be more reliable than what we have now.

    @ericvsmith
    Copy link
    Member

    That's a good idea, Raymond.

    >>> [x.__hash__ is None for x in (list, dict, set, frozenset)]
    [True, True, True, False]

    I don't think this change would cause any backward compatibility issues, except it would now allow a default of something bad like:

    >>> class BadList(list):
    ...   def __hash__(self): return 0
    ...
    >>> isinstance(BadList(), list), BadList.__hash__ is None
    (True, False)

    I can't say I care too much about now allowing things that didn't used to be allowed, especially if they're poorly designed like BadList.

    I'll put together a PR.

    @ericvsmith ericvsmith added type-bug An unexpected behavior, bug, or error and removed build The build process and cross-build labels Oct 3, 2021
    @ericvsmith
    Copy link
    Member

    New changeset e029c53 by Eric V. Smith in branch 'main':
    bpo-44674: Use unhashability as a proxy for mutability for default dataclass __init__ arguments. (GH-29867)
    e029c53

    @gianni
    Copy link
    Mannequin Author

    gianni mannequin commented Dec 12, 2021

    Excellent. Thanks!

    @ericvsmith
    Copy link
    Member

    @gianni: can you verify that your use case works in 3.11?

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    skshetry added a commit to skshetry/hydra that referenced this issue Oct 15, 2022
    When trying to import hydra in Python 3.11, it throws following error:
    ```python
      File "/home/user/projects/iterative/hydra/hydra/conf/__init__.py", line 75, in JobConf
        @DataClass
         ^^^^^^^^^
      File "/home/user/.pyenv/versions/3.11-dev/lib/python3.11/dataclasses.py", line 1221, in dataclass
        return wrap(cls)
               ^^^^^^^^^
      File "/home/user/.pyenv/versions/3.11-dev/lib/python3.11/dataclasses.py", line 1211, in wrap
        return _process_class(cls, init, repr, eq, order, unsafe_hash,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/user/.pyenv/versions/3.11-dev/lib/python3.11/dataclasses.py", line 959, in _process_class
        cls_fields.append(_get_field(cls, name, type, kw_only))
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/user/.pyenv/versions/3.11-dev/lib/python3.11/dataclasses.py", line 816, in _get_field
        raise ValueError(f'mutable default {type(f.default)} for field '
    ValueError: mutable default <class 'hydra.conf.JobConf.JobConfig.OverrideDirname'> for field override_dirname is not allowed: use default_factory
    ```
    
    This is due to a recent change in dataclasses in Python 3.11 that
    disallows mutable to be set as a default value. This is done
    via check for hashability.
    
    See python/cpython#88840.
    Note that I am trying to add support for Python 3.11 to hydra. This
    seems like a major blocker.
    
    For reproducibility, I used the following command to search for these:
    ```console
    rg ".*: .* = [A-Z].*\(.*\)"
    ```
    ajyoon added a commit to DigiScore/neoscore that referenced this issue Dec 20, 2022
    @carlosgmartin
    Copy link

    Consider using non-hashability as a proxy indicator for immutability.

    Unhashability is used to approximate mutability.

    Is there a more precise alternative? For example, JAX arrays are immutable but unhashable. The current mutability "check" causes errors like these (and the ones linked above).

    @carlosgmartin
    Copy link

    At least, perhaps the error message could be changed to say

    unhashable default X for field Y is not allowed: use default_factory

    since that's what's actually being checked, to avoid confusion?

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants