Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datetime.datetime.replace bypasses a subclass's __new__ #64570

Closed
AndrewLutomirski mannequin opened this issue Jan 23, 2014 · 15 comments
Closed

datetime.datetime.replace bypasses a subclass's __new__ #64570

AndrewLutomirski mannequin opened this issue Jan 23, 2014 · 15 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@AndrewLutomirski
Copy link
Mannequin

AndrewLutomirski mannequin commented Jan 23, 2014

BPO 20371
Nosy @abalkin, @bitdancer, @eltoder, @serhiy-storchaka, @pganssle
Superseder
  • bpo-31222: datetime.py implementation of .replace inconsistent with C implementation
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-12-04.17:21:38.025>
    created_at = <Date 2014-01-23.19:47:24.110>
    labels = ['type-bug', 'library']
    title = "datetime.datetime.replace bypasses a subclass's __new__"
    updated_at = <Date 2018-12-04.17:21:38.025>
    user = 'https://bugs.python.org/AndrewLutomirski'

    bugs.python.org fields:

    activity = <Date 2018-12-04.17:21:38.025>
    actor = 'belopolsky'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-12-04.17:21:38.025>
    closer = 'belopolsky'
    components = ['Library (Lib)']
    creation = <Date 2014-01-23.19:47:24.110>
    creator = 'Andrew.Lutomirski'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 20371
    keywords = []
    message_count = 15.0
    messages = ['208980', '208984', '208986', '235773', '265398', '265622', '265625', '265629', '265631', '265633', '265648', '265650', '265651', '265655', '331066']
    nosy_count = 7.0
    nosy_names = ['belopolsky', 'r.david.murray', 'eltoder', 'serhiy.storchaka', 'Andrew.Lutomirski', 'eddygeek', 'p-ganssle']
    pr_nums = []
    priority = 'normal'
    resolution = 'duplicate'
    stage = 'resolved'
    status = 'closed'
    superseder = '31222'
    type = 'behavior'
    url = 'https://bugs.python.org/issue20371'
    versions = ['Python 3.5', 'Python 3.6']

    @AndrewLutomirski
    Copy link
    Mannequin Author

    AndrewLutomirski mannequin commented Jan 23, 2014

    I'll admit that what I'm doing is possibly unhealthy. Nonetheless, I find this behavior *extremely* surprising. This code:

    --- start code ---

    import datetime
    
    class my_dt(datetime.datetime):
        def __new__(cls, *args, **kwargs):
            print('In my_dt.__new__')
            return datetime.datetime.__new__(cls, *args, **kwargs)
    
        def __init__(self, *args, **kwargs):
            print('In my_dt.__init__')
            super(my_dt, self).__init__()
    
    dt = datetime.datetime.now()
    
    print('Create a my_dt')
    t = my_dt(dt.year, dt.month, dt.day, dt.hour, dt.minute, dt.second, dt.microsecond, dt.tzinfo)
    
    
    print('Calling replace')
    t2 = t.replace(tzinfo=None)
    print('Got a %r' % type(t2))

    --- end code ---

    results in:

    Create a my_dt
    In my_dt.__new__
    In my_dt.__init__
    Calling replace
    Got a <class '__main__.my_dt'>

    So datetime.datetime.replace will create an object of type my_dt *without calling __new__ or __init__*. This should (AFAIK) be impossible.

    I think that datetime.datetime.replace should either return an actual datetime object or that it should invoke __new__ and probably __init__.

    @AndrewLutomirski AndrewLutomirski mannequin added the stdlib Python modules in the Lib dir label Jan 23, 2014
    @1st1
    Copy link
    Member

    1st1 commented Jan 23, 2014

    I've just quickly looked the issue:

    1. There is an inconsistency between python & c implementations: datetime.replace always returns "datetime(...)" object, but should return "self.__class__()"

    2. "new_datetime_ex" in c implementation does not call constructors

    @bitdancer
    Copy link
    Member

    It is actually not surprising if you are familiar with the copy/pickle protocol. Presumably that should be mentioned in the replace docs, though (assuming I'm right and that is why the behavior occurs).

    @eddygeek
    Copy link
    Mannequin

    eddygeek mannequin commented Feb 11, 2015

    Here is a workaround for subclasses (2&3-compatible):

    --- start code ---

    class MyDate(datetime):
    
        @classmethod
        def fromDT(cls, d):
            """ :type d: datetime """
            return cls(d.year, d.month, d.day, d.hour, d.minute, d.second, d.microsecond, d.tzinfo)
    
        def replace(self, *args, **kw):
            """ :type other: datetime.timedelta
                :rtype: MyDate """
            return self.fromDT(super(MyDate, self).replace(*args, **kw))

    --- end code ---

    This is really a bug and not a misuse as datetime was specifically adapted to be subclassing-friendly. From a look at the (python) source, this seems to be the only bit that was forgotten.

    The real fix is as per yselivanov comment [and no, this has nothing to do with pickle or copy AFAIK]

    @bitdancer
    Copy link
    Member

    To summarize, it looks like there are three issues here: (1) the python and C versions are out of sync behavior wise (the python should work like the C in this case, it seems) (2) there are some missing tests, since we are running the same tests against both code bases and not getting any errors and (3) replace is doing its copy in a way that does not actually supporting subclassing (which again speaks to missing tests).

    @bitdancer bitdancer added the type-bug An unexpected behavior, bug, or error label May 12, 2016
    @eltoder
    Copy link
    Mannequin

    eltoder mannequin commented May 15, 2016

    Properly supporting subclasses in replace() is hard, at least without some cooperation from subclasses. For a proper replace()

    x.replace(a=newval).b == x.b

    should hold for every b not dependent on a, including ones added by subclasses. That is, it should replicate subclass state. Arguably, ideal replace() would also allow changing attributes defined by subclasses -- otherwise subclasses need to override it anyway, and all this effort was for nothing.

    The best I can think of is to assume that subclasses are immutable and all "primary" properties are settable via constructor arguments with the same names. Then replace() can be implemented like this:

    def replace(self, *args, **kwargs):
        sig = inspect.signature(self.__new__)
        bound = sig.bind_partial(type(self), *args, **kwargs)
        for arg in sig.parameters:
            if arg not in bound.arguments:
                bound.arguments[arg] = getattr(self, arg)
        return self.__new__(*bound.args, **bound.kwargs)

    This will not work for subclasses defined in C, but at least we can show a nice error about that. This will also not work if __new__ uses *args or **kwargs instead of listing every property as its own argument.

    (Another approach I can think of is patching properties on self, making a copy of self via __reduce__, and reverting values on self back. This doesn't rely on any signatures, but gets really dirty really quick.)

    So I don't know if we want to implement this, or if returning base class from replace() is a better choice.

    @bitdancer
    Copy link
    Member

    Why can't you make a copy and then patch the changed attributes on the copy?

    @eltoder
    Copy link
    Mannequin

    eltoder mannequin commented May 15, 2016

    Yes, this is similar to my second approach. (You need to copy via __reduce__, because __copy__ my have been overriden to return self.)

    The general problem with it, is that if a subclass is designed to be immutable (probably a common case here), it may not handle changing of its attributes well. Attributes will be readonly, so you can't support replace(derived_attr=newval). And while you can change base class' attributes under the covers, doing so may break subclass invariants. E.g. it may have cached hashcode or another property derived from other attributes.

    There's also a smaller problem of figuring attribute names for positional arguments of replace() to support derived attributes.

    @bitdancer
    Copy link
    Member

    Hmm. The contract is actually that replace returns a *new* instance with the specified values changed. So I think it would be adequate in this case to simply call the subclass constructor with the values that replace manages, and not worry about anything else. There is actually no (current) contract to preserve any other values of the existing instance.

    A subclass that wants to add other values that can be replaced will have to override replace, which should not be unexpected.

    @eltoder
    Copy link
    Mannequin

    eltoder mannequin commented May 15, 2016

    This seems like it can lead to even more subtle bugs when replace() is not overriden. Currently, any extra state in the subclass is left uninitialized, which usually leads to obvious breakage and is relatively easy to trace back to replace(). (I've done it before.) If we call the constructor passing only base class values, the extra state may get initialized with default values. This is probably not what anyone wants, and is probably harder to debug, because there's no obvious breakage.

    @serhiy-storchaka
    Copy link
    Member

    This looks as expected behavior to me. namedtuple._replace(), Parameter.replace(), Signature.replace() do it this way.

    @eltoder
    Copy link
    Mannequin

    eltoder mannequin commented May 15, 2016

    namedtuple._replace() actually doesn't call subclass' __new__. It calls tuple.__new__ directly, so it has the same problem as datetime classes.

    Parameter and Signature are new in 3.3. I'm not sure if they're expected to be used as base classes.

    @r.david.murray: is that contract specified anywhere? The doc says "Return a datetime with the same attributes, except for those attributes given new values by whichever keyword arguments are specified." This doesn't explicitly mention subclasses, but also doesn't mention the possibility of discarding any attribute values.

    @bitdancer
    Copy link
    Member

    Ah, interesting. The python version (which is based on the original python code that served as the specification for the C version) has docstrings that explicitly say "new XXX". The C docstrings and rest docs do not reflect that subtlety.

    The docstrings of the two modules and the rest docs should be harmonized. A single word can make a difference :)

    Perhaps this should be brought up on python-dev again, as the issue here clearly applies to more than just datetime. This in many ways highlights the heart of the subclassing issue.

    In the email package I called the method "clone", and all properties that can be changed are accessible as keyword arguments...and in fact that set of properties is explicitly the set of all class non-method attributes. Policy instance attributes are otherwise read-only. So I solved the problem by specifying the behavior of the clone method in the face of subclassing, making it an explicit API. That's more like your first option: specifying the API you need to follow in a subclass in order to support replace.

    @bitdancer
    Copy link
    Member

    By the way, when I say "again", I'm referring to the "return type of alternative constructors" thread:

    http://www.mail-archive.com/python-dev@python.org/msg92163.html

    This is effectively an extension of that discussion, because replace is an alternative constructor, with the additional complication of taking an existing instance as input.

    Rereading that thread, I think Guido's answer is that the subclass API needs to be clearly documented:

    http://www.mail-archive.com/python-dev@python.org/msg92191.html

    Which I guess means we get to decide what it is, and then figure out how to implement it.

    @pganssle
    Copy link
    Member

    pganssle commented Dec 4, 2018

    This issue was fixed in Python 3.7, and it turns out bpo-31222 was a duplicate of it. It can be closed now.

    @abalkin abalkin closed this as completed Dec 4, 2018
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants