Author tim.peters
Recipients larry, mark.dickinson, r.david.murray, tbarbugli, tim.peters, trcarden, vivanov, vstinner
Date 2015-08-21.02:14:47
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1440123288.56.0.707573606269.issue23517@psf.upfronthosting.co.za>
In-reply-to
Content
It is really bad that roundtripping current microsecond datetimes doesn't work.  About half of all microsecond-resolution datetimes fail to roundtrip correctly now.  While the limited precision of a C double means that roundtripping microsecond datetimes "far enough" in the future must eventually fail, that point is about 200 years from now.
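A minimal demonstration of the failure mode, using plain floats rather than datetime objects so the result doesn't depend on which conversion the datetime module itself uses (the epoch-second base below is an arbitrary 2015-era value, chosen for illustration):

```python
base = 1440123288  # arbitrary 2015-era epoch second (illustrative)

bad = []
for us in range(1000):
    ts = base + us / 1e6              # the timestamp as a C double
    back = int((ts - int(ts)) * 1e6)  # truncating conversion back
    if back != us:
        bad.append(us)

print(f"{len(bad)} of 1000 microsecond values fail to roundtrip")
```

Every failing value comes back exactly 1 too small, because the nearest representable double fell a hair below the true decimal value and truncation then lost a full microsecond.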

Rather than argue endlessly about rounding, it's possible instead to make the tiniest possible change to the timestamp _produced_ at the start.  Here's code explaining it:

    import math

    # 'd' is the datetime being converted; 'ts' is its POSIX timestamp.
    ts = d.timestamp()
    # Will microseconds roundtrip correctly?  For times far
    # enough in the future, there aren't enough bits in a C
    # double for that to always work.  But for years through
    # about 2241, there are enough bits.  How does it fail
    # before then?  Very few microsecond datetimes are exactly
    # representable as a binary float.  About half the time, the
    # closest representable binary float is a tiny bit less than
    # the decimal value, and that causes truncating 1e6 times
    # the fraction to be 1 less than the original microsecond
    # value.
    if int((ts - int(ts)) * 1e6) != d.microsecond:
        # Roundtripping fails.  Add 1 ulp to the timestamp (the
        # tiniest possible change) and see whether that repairs
        # it.  It's enough of a change until doubles just plain
        # run out of enough bits.
        mant, exp = math.frexp(ts)
        ulp = math.ldexp(0.5, exp - 52)
        ts2 = ts + ulp
        if int((ts2 - int(ts2)) * 1e6) == d.microsecond:
            ts = ts2
        else:
            # The date is so late in time that a C double's 53
            # bits of precision aren't sufficient to represent
            # microseconds faithfully.  Leave the original
            # timestamp alone.
            pass
    # Now ts exactly reproduces the original datetime,
    # if that's at all possible.

This assumes timestamps are >= 0, and that C doubles have 53 bits of precision.  Note that because a change of 1 ulp is the smallest possible change for a C double, this cannot make closest-possible unequal datetimes produce out-of-order after-adjustment timestamps.
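As a sanity check (not part of the original message), the frexp/ldexp dance above does compute exactly 1 ulp of a positive normal double; it agrees with math.ulp(), which Python later grew in 3.9:

```python
import math

def ulp_via_frexp(ts):
    """1 ulp of a positive normal double, computed as in the snippet above."""
    mant, exp = math.frexp(ts)      # ts == mant * 2**exp, mant in [0.5, 1)
    return math.ldexp(0.5, exp - 52)  # 2**(exp - 53): spacing of doubles near ts

# Agrees with math.ulp (Python 3.9+) across a spread of positive values,
# including exact powers of two and a 2015-era timestamp.
for ts in (1.0, 1.5, 2.0, 1440123288.000123, 2.5e9, 7.0e9):
    assert ulp_via_frexp(ts) == math.ulp(ts)
```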

And, yes, this sucks ;-)  But it's far better than having half of timestamps fail to convert back for the next two centuries.  Alas, it does nothing to get the intended datetime from a microsecond-resolution timestamp produced _outside_ of Python.  That requires rounding timestamps on input - which would be a better approach.
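For the record, "rounding timestamps on input" can be sketched like so (a simplified round-half-even conversion of the fractional second, not the patch actually applied to this issue): rounding the scaled fraction instead of truncating it recovers every microsecond value in the 2015-era range.

```python
base = 1440123288  # arbitrary 2015-era epoch second (illustrative)

def us_truncating(ts):
    """The conversion complained about above: truncate toward zero."""
    return int((ts - int(ts)) * 1e6)

def us_rounding(ts):
    """Round-half-even instead: what rounding on input means here."""
    return round((ts - int(ts)) * 1e6)

trunc_ok = sum(us_truncating(base + us / 1e6) == us for us in range(1000))
round_ok = sum(us_rounding(base + us / 1e6) == us for us in range(1000))
print(trunc_ok, round_ok)
```

Near 2015-era timestamps a double's representation error in the fractional second is under 0.12 microseconds, comfortably below the 0.5 threshold that rounding needs, so us_rounding gets all 1000 values right while us_truncating misses roughly half.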

Whatever theoretical problems may exist with rounding, the change to use truncation here is causing real problems now.  Practicality beats purity.