Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

int.from_bytes() is broken for subclasses #67828

Closed
serhiy-storchaka opened this issue Mar 11, 2015 · 24 comments
Closed

int.from_bytes() is broken for subclasses #67828

serhiy-storchaka opened this issue Mar 11, 2015 · 24 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@serhiy-storchaka
Copy link
Member

BPO 23640
Nosy @warsaw, @rhettinger, @facundobatista, @mdickinson, @vstinner, @bitdancer, @skrah, @ethanfurman, @eltoder, @serhiy-storchaka
Files
  • issue23640.stoneleaf.01.patch
  • 0001-int.from_bytes-calls-constructor-for-subclasses.patch
  • 0002-int.from_bytes-calls-constructor-for-subclasses.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2016-05-12.07:49:10.222>
    created_at = <Date 2015-03-11.13:34:36.466>
    labels = ['interpreter-core', 'type-bug']
    title = 'int.from_bytes() is broken for subclasses'
    updated_at = <Date 2016-05-12.12:23:16.965>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2016-05-12.12:23:16.965>
    actor = 'pitrou'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2016-05-12.07:49:10.222>
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation = <Date 2015-03-11.13:34:36.466>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['38444', '38447', '40606']
    hgrepos = []
    issue_num = 23640
    keywords = ['patch']
    message_count = 24.0
    messages = ['237864', '237881', '237887', '237888', '237891', '237896', '239619', '239622', '240053', '240065', '240085', '251750', '251751', '251833', '251863', '252091', '252092', '265104', '265334', '265352', '265355', '265372', '265373', '265397']
    nosy_count = 16.0
    nosy_names = ['barry', 'rhettinger', 'facundobatista', 'mark.dickinson', 'vstinner', 'Arfrever', 'r.david.murray', 'eli.bendersky', 'skrah', 'ethan.furman', 'python-dev', 'eltoder', 'serhiy.storchaka', 'bru', 'juggernaut', 'iyogeshjoshi']
    pr_nums = []
    priority = 'high'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue23640'
    versions = ['Python 3.5', 'Python 3.6']

    @serhiy-storchaka
    Copy link
    Member Author

    Example:

    >>> import socket
    >>> x = socket.AddressFamily.from_bytes(b'\1', 'big')
    >>> type(x)
    <enum 'AddressFamily'>
    >>> int(x)
    1
    >>> str(x)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/serhiy/py/cpython/Lib/enum.py", line 464, in __str__
        return "%s.%s" % (self.__class__.__name__, self._name_)
    AttributeError: 'AddressFamily' object has no attribute '_name_'
    >>> repr(x)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/serhiy/py/cpython/Lib/enum.py", line 461, in __repr__
        self.__class__.__name__, self._name_, self._value_)
    AttributeError: 'AddressFamily' object has no attribute '_name_'

    @serhiy-storchaka serhiy-storchaka added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Mar 11, 2015
    @ethanfurman
    Copy link
    Member

    The only solution that is coming to mind is to have EnumMeta go through the other base classes, wrap any classmethods it finds, and ensure that they return their appropriate type and not an Enum type.

    Any other ideas?

    @bru
    Copy link
    Mannequin

    bru mannequin commented Mar 11, 2015

    Hi,
    I feel like this behaviour does not only affect IntEnum & related but anything that inherits from int.

    Example:
    >>> class foo(int):
    ...     def __init__(self, value, base=10):
    ...         if value == 2:
    ...             raise ValueError("not that!!")
    ...         super(foo, self).__init__(value, base=base)
    ...
    >>> x = foo.from_bytes(b"\2", "big")
    >>> x
    2
    >>> type(x)
    foo

    2 solutions come to mind:

    • always return an int, and not the type. deleting Objects/longobjects.c:4845,4866 does the job.
    • instantiate the required type with the value as the (sole?) argument, correctly initializing the object to be returned. In Serhyi's example this results in x == AddressFamily.AF_UNIX. the same lines are concerned.

    @serhiy-storchaka
    Copy link
    Member Author

    Not all classmethods are constructors.

    I see two solutions:

    1. Override from_bytes() in IntEnum and every int subclass that needs this.

    2. Change int.from_bytes() to call __new__.

    @ethanfurman
    Copy link
    Member

    I think the classmethod-as-constructor behavior is correct, so it's up to IntEnum (or EnumMeta, or foo, or ...), to work around the issue.

    @bru
    Copy link
    Mannequin

    bru mannequin commented Mar 11, 2015

    I took the liberty of putting together a small patch which makes from_bytes() call & return type(value_found) if type != long (Serhiy's & my 2nd solution).

    Ethan: in that case, type.from_bytes() would always return an int, and not resort to build-a-pseudo-int, right? That way not overriding from_bytes() would not result in strange behaviour.

    @ethanfurman
    Copy link
    Member

    'from_bytes' is a classmethod. As such, it should return the same type as the class it is called on. If that wasn't the intent it would be a staticmethod instead.

    It is the responsibility of the subclass to override base class behavior, not the other way around.

    @bru
    Copy link
    Mannequin

    bru mannequin commented Mar 30, 2015

    I'm not sure how you can have both, those two seem opposite to me:

    • if 'from_bytes' returns the same type as the class it is called on then
      the instantiation of the result object should go through its constructor
      (patch proposed)
    • if the subclass should override base class behaviour then there is no
      reason for Enum.from_bytes to return the same type as the class it is
      called on, and therefore it should be made a classmethod.

    2015-03-30 18:02 GMT+02:00 Ethan Furman <report@bugs.python.org>:

    Ethan Furman added the comment:

    'from_bytes' is a classmethod. As such, it should return the same type as
    the class it is called on. If that wasn't the intent it would be a
    staticmethod instead.

    It is the responsibility of the subclass to override base class behavior,
    not the other way around.

    ----------


    Python tracker <report@bugs.python.org>
    <http://bugs.python.org/issue23640\>


    @ethanfurman
    Copy link
    Member

    With the patch:

      --> import enum
      --> class Huh(enum.IntEnum):
      ...   blah = 2
      ...
      --> Huh.blah.from_bytes(b'\04', 'big')
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "/home/ethan/source/python/issue23640/Lib/enum.py", line 222, in __call__
          return cls.__new__(cls, value)
        File "/home/ethan/source/python/issue23640/Lib/enum.py", line 457, in __new__
          raise ValueError("%r is not a valid %s" % (value, cls.__name__))
      ValueError: 4 is not a valid Huh

    This is not the correct behavior. An IntEnum should act like an int, and in cases where it can't and still be an IntEnum, it becomes an int. But this behavior is Enum specific, and I would not expect other int subclasses to need or want that behavior.

    Also, in cases where class methods are alternate constructors there is no requirement that they go through the main __new__/init constructors to do their job.

    In other words, if IntEnum.from_bytes (which is inherited) is not behaving correctly, it is up to IntEnum to fix it -- it is not the job of int, and this is not a bug in int.

    @ethanfurman ethanfurman self-assigned this Apr 4, 2015
    @pitrou
    Copy link
    Member

    pitrou commented Apr 4, 2015

    The fact that derived_int.from_bytes() doesn't call the derived constructor clearly sounds like a bug to me, regardless of whether IntEnum also has its own bugs.

    @serhiy-storchaka
    Copy link
    Member Author

    This bug allows to create new bool instances.

    >>> false = bool.from_bytes(b'\0', 'big')
    >>> true = bool.from_bytes(b'\1', 'big')
    >>> bool(false)
    False
    >>> bool(true)
    True
    >>> false is False
    False
    >>> true is True
    False
    >>> false
    False
    >>> true
    False

    @ethanfurman ethanfurman removed their assignment Jul 21, 2015
    @ethanfurman ethanfurman changed the title Enum.from_bytes() is broken int.from_bytes() is broken for subclasses Jul 21, 2015
    @serhiy-storchaka
    Copy link
    Member Author

    There are similar issues with Decimal.from_float() (C implementation only), chain.from_iterable(), epoll.fromfd() and kqueue.fromfd(). All these alternative constructors don't call __new__ or __init__.

    But float.fromhex() calls the constructor.

    >>> import enum
    >>> class M(float, enum.Enum):
    ...     PI = 3.14
    ... 
    >>> M.PI
    <M.PI: 3.14>
    >>> M.fromhex((3.14).hex())
    <M.PI: 3.14>
    >>> M.fromhex((2.72).hex())
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/home/serhiy/py/cpython/Lib/enum.py", line 241, in __call__
        return cls.__new__(cls, value)
    File "/home/serhiy/py/cpython/Lib/enum.py", line 476, in __new__
        raise ValueError("%r is not a valid %s" % (value, cls.__name__))
    ValueError: 2.72 is not a valid M

    And this behavior looks correct to me.

    @serhiy-storchaka
    Copy link
    Member Author

    Here is simplified Bruno's patch with additional tests for float, int, bool and enum. But I'm not sure that correct solution is so simple.

    @vstinner
    Copy link
    Member

    0002-int.from_bytes-calls-constructor-for-subclasses.patch looks good to me, but see my review on Rietveld for 2 minor comments.

    @skrah
    Copy link
    Mannequin

    skrah mannequin commented Sep 29, 2015

    There are similar issues with Decimal.from_float() (C implementation only), chain.from_iterable(), epoll.fromfd() and kqueue.fromfd(). All these alternative constructors don't call __new__ or __init__.

    Could you create new issues? I need a summary. :)

    @juggernaut
    Copy link
    Mannequin

    juggernaut mannequin commented Oct 2, 2015

    hi

    i am ashutosh. I am running kubuntu 14.04.
    before applying the patch. It failed in 2 test i.e

    test___all__
    test_re

    After applying patch. It again gave me 2 error
    test___all__
    test_re

    So,it worked for me.

    @iyogeshjoshi
    Copy link
    Mannequin

    iyogeshjoshi mannequin commented Oct 2, 2015

    I applied patch to module and ran all the tested on Mac Os El Captain, only 4 tests were failing with or without applying patch, following are the tests failing:
    test___all__ test_calendar test_re test_socket

    @ethanfurman
    Copy link
    Member

    Not sure what I was thinking at the time, but several of my comments were supportive of classmethods not calling subclass' new; I actually do not think that, and am +1 on the patch.

    @eltoder
    Copy link
    Mannequin

    eltoder mannequin commented May 11, 2016

    There's a similar issue with replace() methods on date/time/datetime classes. They create instances of derived types without calling derived __new__/init, thus potentially leaving those uninitialized.

    >>> from datetime import date
    >>> class D(date):
    ...   def __init__(self, y, m, d):
    ...     self.y = y
    >>> D(2016,1,1).y
    2016
    >>> D(2016,1,1).replace(2015).y
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'D' object has no attribute 'y'

    @bitdancer
    Copy link
    Member

    No, they do not return instances of the derived type, they return instances of the base type. This is an intentional design decision that has been discussed elsewhere (including, I think, somewhere in this tracker).

    @eltoder
    Copy link
    Mannequin

    eltoder mannequin commented May 12, 2016

    They sure do. Check the example in my comment, or if you prefer the source code, it's pretty clear as well: https://hg.python.org/cpython/file/fff0c783d3db/Modules/_datetimemodule.c#l2770

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented May 12, 2016

    New changeset 0af15b8ef3b2 by Serhiy Storchaka in branch '3.5':
    Issue bpo-23640: int.from_bytes() no longer bypasses constructors for subclasses.
    https://hg.python.org/cpython/rev/0af15b8ef3b2

    New changeset df14ea17a517 by Serhiy Storchaka in branch 'default':
    Issue bpo-23640: int.from_bytes() no longer bypasses constructors for subclasses.
    https://hg.python.org/cpython/rev/df14ea17a517

    @serhiy-storchaka
    Copy link
    Member Author

    Discussion on Python-Dev: http://comments.gmane.org/gmane.comp.python.devel/157649 .

    The conclusion is that alternate constructors should return an instance of the subclass, and either call constructor or use pickling machinery for creating an object in correct state.

    Pushed the fix for int.from_bytes() and tests. I'll open separate issues for other alternate constructors (for example bpo-27006 -- for Decimal.from_float()).

    Yes, there is similar bug in date.replace(). C implementation doesn't match Python implementation and bypasses the constructor. Please open separate issue for this Eugene.

    @serhiy-storchaka serhiy-storchaka added interpreter-core (Objects, Python, Grammar, and Parser dirs) and removed stdlib Python modules in the Lib dir labels May 12, 2016
    @bitdancer
    Copy link
    Member

    @eltoder: Ah, you are right, I was looking at the python version. There's an open issue about this, bpo-20371, which indicates that the design decision was to support subclasses. Clearly I was remembering the conversation backward based on looking at the wrong code :)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants