classification
Title: datetime.datetime.replace bypasses a subclass's __new__
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.6, Python 3.5
process
Status: closed Resolution: duplicate
Dependencies: Superseder: datetime.py implementation of .replace inconsistent with C implementation
View: 31222
Assigned To: Nosy List: Andrew.Lutomirski, belopolsky, eddygeek, eltoder, p-ganssle, r.david.murray, serhiy.storchaka
Priority: normal Keywords:

Created on 2014-01-23 19:47 by Andrew.Lutomirski, last changed 2018-12-04 17:21 by belopolsky. This issue is now closed.

Messages (15)
msg208980 - (view) Author: Andrew Lutomirski (Andrew.Lutomirski) Date: 2014-01-23 19:47
I'll admit that what I'm doing is possibly unhealthy.  Nonetheless, I find this behavior *extremely* surprising.  This code:

--- start code ---

import datetime

class my_dt(datetime.datetime):
    def __new__(cls, *args, **kwargs):
        print('In my_dt.__new__')
        return datetime.datetime.__new__(cls, *args, **kwargs)

    def __init__(self, *args, **kwargs):
        print('In my_dt.__init__')
        super(my_dt, self).__init__()

dt = datetime.datetime.now()

print('Create a my_dt')
t = my_dt(dt.year, dt.month, dt.day, dt.hour, dt.minute, dt.second, dt.microsecond, dt.tzinfo)


print('Calling replace')
t2 = t.replace(tzinfo=None)
print('Got a %r' % type(t2))

--- end code ---

results in:

Create a my_dt
In my_dt.__new__
In my_dt.__init__
Calling replace
Got a <class '__main__.my_dt'>

So datetime.datetime.replace will create an object of type my_dt *without calling __new__ or __init__*.  This should (AFAIK) be impossible.

I think that datetime.datetime.replace should either return an actual datetime object or that it should invoke __new__ and probably __init__.
msg208984 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2014-01-23 20:10
I've just quickly looked the issue:

1. There is an inconsistency between python & c implementations: datetime.replace always returns "datetime(...)" object, but should return "self.__class__()"

2. "new_datetime_ex" in c implementation does not call constructors
msg208986 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-01-23 20:12
It is actually not surprising if you are familiar with the copy/pickle protocol.  Presumably that should be mentioned in the replace docs, though (assuming I'm right and that is why the behavior occurs).
msg235773 - (view) Author: Edward O (eddygeek) * Date: 2015-02-11 22:19
Here is a workaround for subclasses (2&3-compatible):

--- start code ---

class MyDate(datetime):

    @classmethod
    def fromDT(cls, d):
        """ :type d: datetime """
        return cls(d.year, d.month, d.day, d.hour, d.minute, d.second, d.microsecond, d.tzinfo)

    def replace(self, *args, **kw):
        """ :type other: datetime.timedelta
            :rtype: MyDate """
        return self.fromDT(super(MyDate, self).replace(*args, **kw))

--- end code ---

This is really a bug and not a misuse as datetime was specifically adapted to be subclassing-friendly. From a look at the (python) source, this seems to be the only bit that was forgotten.

The real fix is as per yselivanov comment [and no, this has nothing to do with pickle or copy AFAIK]
msg265398 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-05-12 12:28
To summarize, it looks like there are three issues here: (1) the python and C versions are out of sync behavior wise (the python should work like the C in this case, it seems) (2) there are some missing tests, since we are running the same tests against both code bases and not getting any errors and (3) replace is doing its copy in a way that does not actually supporting subclassing (which again speaks to missing tests).
msg265622 - (view) Author: Eugene Toder (eltoder) Date: 2016-05-15 15:40
Properly supporting subclasses in replace() is hard, at least without some cooperation from subclasses. For a proper replace()

x.replace(a=newval).b == x.b

should hold for every b not dependent on a, including ones added by subclasses. That is, it should replicate subclass state. Arguably, ideal replace() would also allow changing attributes defined by subclasses -- otherwise subclasses need to override it anyway, and all this effort was for nothing.

The best I can think of is to assume that subclasses are immutable and all "primary" properties are settable via constructor arguments with the same names. Then replace() can be implemented like this:

def replace(self, *args, **kwargs):
    sig = inspect.signature(self.__new__)
    bound = sig.bind_partial(type(self), *args, **kwargs)
    for arg in sig.parameters:
        if arg not in bound.arguments:
            bound.arguments[arg] = getattr(self, arg)
    return self.__new__(*bound.args, **bound.kwargs)

This will not work for subclasses defined in C, but at least we can show a nice error about that. This will also not work if __new__ uses *args or **kwargs instead of listing every property as its own argument.

(Another approach I can think of is patching properties on self, making a copy of self via __reduce__, and reverting values on self back. This doesn't rely on any signatures, but gets really dirty really quick.)

So I don't know if we want to implement this, or if returning base class from replace() is a better choice.
msg265625 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-05-15 15:50
Why can't you make a copy and then patch the changed attributes on the copy?
msg265629 - (view) Author: Eugene Toder (eltoder) Date: 2016-05-15 16:23
Yes, this is similar to my second approach. (You need to copy via __reduce__, because __copy__ my have been overriden to return self.)

The general problem with it, is that if a subclass is designed to be immutable (probably a common case here), it may not handle changing of its attributes well. Attributes will be readonly, so you can't support replace(derived_attr=newval). And while you can change base class' attributes under the covers, doing so may break subclass invariants. E.g. it may have cached hashcode or another property derived from other attributes.

There's also a smaller problem of figuring attribute names for positional arguments of replace() to support derived attributes.
msg265631 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-05-15 16:43
Hmm.  The contract is actually that replace returns a *new* instance with the specified values changed.  So I think it would be adequate in this case to simply call the subclass constructor with the values that replace manages, and not worry about anything else.  There is actually no (current) contract to preserve any other values of the existing instance.

A subclass that wants to add other values that can be replaced will have to override replace, which should not be unexpected.
msg265633 - (view) Author: Eugene Toder (eltoder) Date: 2016-05-15 17:02
This seems like it can lead to even more subtle bugs when replace() is not overriden. Currently, any extra state in the subclass is left uninitialized, which usually leads to obvious breakage and is relatively easy to trace back to replace(). (I've done it before.) If we call the constructor passing only base class values, the extra state may get initialized with default values. This is probably not what anyone wants, and is probably harder to debug, because there's no obvious breakage.
msg265648 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-15 19:45
This looks as expected behavior to me. namedtuple._replace(), Parameter.replace(), Signature.replace() do it this way.
msg265650 - (view) Author: Eugene Toder (eltoder) Date: 2016-05-15 20:36
namedtuple._replace() actually doesn't call subclass' __new__. It calls tuple.__new__ directly, so it has the same problem as datetime classes.

Parameter and Signature are new in 3.3. I'm not sure if they're expected to be used as base classes.

@r.david.murray: is that contract specified anywhere? The doc says "Return a datetime with the same attributes, except for those attributes given new values by whichever keyword arguments are specified." This doesn't explicitly mention subclasses, but also doesn't mention the possibility of discarding any attribute values.
msg265651 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-05-15 21:02
Ah, interesting.  The python version (which is based on the original python code that served as the specification for the C version) has docstrings that explicitly say "new XXX".  The C docstrings and rest docs do not reflect that subtlety.

The docstrings of the two modules and the rest docs should be harmonized.  A single word can make a difference :)

Perhaps this should be brought up on python-dev again, as the issue here clearly applies to more than just datetime.  This in many ways highlights the heart of the subclassing issue.

In the email package I called the method "clone", and all properties that can be changed are accessible as keyword arguments...and in fact that set of properties is explicitly the set of all class non-method attributes.  Policy instance attributes are otherwise read-only.  So I solved the problem by specifying the behavior of the clone method in the face of subclassing, making it an explicit API.  That's more like your first option: specifying the API you need to follow in a subclass in order to support replace.
msg265655 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-05-15 21:21
By the way, when I say "again", I'm referring to the "return type of alternative constructors" thread:

  http://www.mail-archive.com/python-dev@python.org/msg92163.html

This is effectively an extension of that discussion, because replace is an alternative constructor, with the additional complication of taking an existing instance as input.

Rereading that thread, I think Guido's answer is that the subclass API needs to be clearly documented:

   http://www.mail-archive.com/python-dev@python.org/msg92191.html

Which I guess means we get to decide what it is, and then figure out how to implement it.
msg331066 - (view) Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2018-12-04 17:19
This issue was fixed in Python 3.7, and it turns out issue 31222 was a duplicate of it. It can be closed now.
History
Date User Action Args
2018-12-04 17:21:38belopolskysetstatus: open -> closed
superseder: datetime.py implementation of .replace inconsistent with C implementation
resolution: duplicate
stage: needs patch -> resolved
2018-12-04 17:19:29p-gansslesetnosy: + p-ganssle
messages: + msg331066
2016-05-15 21:21:40r.david.murraysetmessages: + msg265655
2016-05-15 21:02:10r.david.murraysetmessages: + msg265651
2016-05-15 20:37:00eltodersetmessages: + msg265650
2016-05-15 19:45:07serhiy.storchakasetmessages: + msg265648
2016-05-15 18:50:07yselivanovsetnosy: - yselivanov
2016-05-15 17:02:56eltodersetmessages: + msg265633
2016-05-15 16:43:22r.david.murraysetmessages: + msg265631
2016-05-15 16:23:25eltodersetmessages: + msg265629
2016-05-15 15:50:35r.david.murraysetmessages: + msg265625
2016-05-15 15:40:54eltodersetnosy: + eltoder, serhiy.storchaka
messages: + msg265622
2016-05-12 12:29:22r.david.murraysetstage: needs patch
type: behavior
versions: + Python 3.5, Python 3.6, - Python 3.3, Python 3.4
2016-05-12 12:28:59r.david.murraysetmessages: + msg265398
2015-02-11 22:19:31eddygeeksetmessages: + msg235773
versions: + Python 3.4
2015-02-11 21:54:08eddygeeksetnosy: + eddygeek
2014-01-23 20:12:59r.david.murraysetnosy: + r.david.murray
messages: + msg208986
2014-01-23 20:10:19yselivanovsetmessages: + msg208984
2014-01-23 20:04:03yselivanovsetnosy: + belopolsky, yselivanov
2014-01-23 19:47:24Andrew.Lutomirskicreate