`tzinfo.fromutc()` fails when used for a fold-aware tzinfo implementation #72788

pganssle · 2016-11-03T18:55:50Z

BPO	28602
Nosy	@tim-one, @abalkin, @pganssle
PRs	bpo-28602: [WIP] Add fold support to fromutc #7425

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = 'https://github.com/abalkin'
closed_at = None
created_at = <Date 2016-11-03.18:55:50.016>
labels = ['type-feature', '3.8']
title = '`tzinfo.fromutc()` fails when used for a fold-aware tzinfo implementation'
updated_at = <Date 2018-06-05.18:37:55.782>
user = 'https://github.com/pganssle'

bugs.python.org fields:

activity = <Date 2018-06-05.18:37:55.782>
actor = 'belopolsky'
assignee = 'belopolsky'
closed = False
closed_date = None
closer = None
components = []
creation = <Date 2016-11-03.18:55:50.016>
creator = 'p-ganssle'
dependencies = []
files = []
hgrepos = []
issue_num = 28602
keywords = ['patch']
message_count = 10.0
messages = ['280007', '280009', '280011', '280012', '280014', '280015', '318743', '318772', '318773', '318780']
nosy_count = 3.0
nosy_names = ['tim.peters', 'belopolsky', 'p-ganssle']
pr_nums = ['7425']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue28602'
versions = ['Python 3.8']

pganssle · 2016-11-03T18:55:50Z

After PEP-495, the default value for non-fold-aware datetimes is that they return the DST side, not the STD side (as was the assumption before PEP-495). This invalidates an assumption made in tz.fromutc(). See lines 991-1000 of datetime.py:

    dtdst = dt.dst()
    if dtdst is None:
        raise ValueError("fromutc() requires a non-None dst() result")
    delta = dtoff - dtdst
    if delta:
        dt += delta
        dtdst = dt.dst()
        if dtdst is None:
            raise ValueError("fromutc(): dt.dst gave inconsistent "
                             "results; cannot convert")

Line 997 (

cpython/Lib/datetime.py

Line 997 in be8de92

dtdst = dt.dst()

) assumes that an ambiguous datetime will return the STD side, not the DST side, and as a result, if you feed it a date in UTC that should resolve to the STD side, it will actually return 1 hour (or whatever the DST offset is) AFTER the ambiguous date that should be returned.

If 997 is changed to:

    dtdst = dt.replace(fold=1).dst()

That will not affect fold-naive zones (which are instructed to ignore the fold parameter) and restore the original behavior. This will allow fold-aware timezones to be implemented as a wrapper around fromutc() rather than a complete re-implementation, e.g.:

class FoldAwareTzInfo(datetime.tzinfo):
    def fromutc(self, dt):
        dt_wall = super(FoldAwareTzInfo, self).fromutc(dt)

        is_fold = self._get_fold_status(dt, dt_wall)

    return dt_wall.replace(fold=is_fold)

abalkin · 2016-11-03T19:16:18Z

I don't think timezones that satisfy the invariant expected by the default fromutc() is common enough that we need to provide special support for them. Note that in most cases it is the UTC to local conversion that is straightforward while Local to UTC is tricky, so the code that reduces a simple task to a harder one has questionable utility.

pganssle · 2016-11-03T19:26:01Z

Of the tzinfo implementations provided by python-dateutil, tzrange, tzstr (GNU TZ strings), tzwin (Windows style time zones) and tzlocal all satisfy this condition. These are basically all implementations of default system time zone information.

With current implementations tzfile and tzical also use the invariant algorithm, though theoretically there are edge cases where this will cause problems, and those should have their fromutc() adjusted.

In any case, I can't think of a single actual downside to this change - all it does is preserve the original behavior of fromutc(). As currently implemented, the algorithm is simply wrong when dst() is affected by fold, and this change would have no effect on zones where dst() is not affected by fold.

abalkin · 2016-11-03T19:31:15Z

Paul G at Github:

"""
To be clear, I'm just saying that fromutc() goes from returning something meaningful (the correct date and time, with no indication of what side of the fold it's on) to something useless (an incorrect date and time) in the case where you're on the STD side of the divide. That change would restore the original behavior.

For most of the tzinfo implementations I'm providing (tzrange, tzwin, etc), there's no problem with an invariant standard time offset, so I'd prefer to fall back on the default algorithm in those cases.

Another advantage with using the standard algorithm as a starting point is that it all the type checking and such that's done in fromutc is done at that level, and I don't have to keep track of handling that.

All that said, it's not a huge deal either way.
"""

Since it is not a big issue for you, I would rather not reopen this can of worms. It may be better to return a clearly erroneous result on fold-aware tzinfos to push the developers like you in the right direction. :-)

After all, how much effort would it save for you in dateutil if you could reuse the base class fromutc?

pganssle · 2016-11-03T19:43:24Z

After all, how much effort would it save for you in dateutil if you could reuse the base class fromutc?

Realistically, this saves me nothing since I have to re-implement it anyway in in all versions <= Python 3.6 (basically just the exact same algorithm with line 997 replaced with enfold(dt, fold=1) rather than dt.replace(fold=1), but I'd rather it fall back to the standard fromutc() in fold-aware versions of Python 3.6.

That said, I don't see how it's a big can of worms to open. If you're going to provide fromutc() functionality, it should not be deliberately broken. As I mentioned above, I see no actual downside in having fromutc() actually work as advertised and as intended.

abalkin · 2016-11-03T19:49:50Z

I can't think of a single actual downside to this change - all it does is preserve the original behavior of fromutc().

You've got me on the fence here. If what you are saying is correct, it would make sense to make this change and better do it before 3.6 is out, but it would take me some effort to convince myself that an implementation that reuses patched fromutc() is indeed correct.

Why don't you implement tzrange.fromutc() in terms of say tzrange.simple_fromutc() which is your own patched version of tzinfo.fromutc(). If this passes your tests and is simpler than a straightforward fromutc() that does not have to look at tzinfo through the needle hole of utcoffset()/dst() pair but knows the internals of your timezone object, we can consider promoting your simple_fromutc() to default stdlib implementation.

Alternatively, you can just convince Tim. :-)

pganssle · 2018-06-05T12:32:45Z

So I created a little patch for this, and when I was working on it I realized that it's actually possible to implement full fold support in a generic way in fromutc under the assumption that utcoffset - dst holds at all points (which is the fundamental assumption of the builtin fromutc function).

It adds a bit of overhead, but I think any current fold-aware operations need to be implemented in pure Python anyway (and involve even more overhead), so I think it would be a net gain.

The big downside I see here is that it cannot take advantage of any sort of "is_ambiguous" function and has to rely on the method of "change the fold and see if the answer is different" to determine ambiguity. I think this is a small cost to pay for a generic implementation, though. I'm pretty busy this week and next week will be hectic too, but towards the end of the month I can probably come up with a test suite for this and look at some actual performance numbers.

abalkin · 2018-06-05T17:01:37Z

it's actually possible to implement full fold support in a generic way

I am aware of that. In fact, some of the draft versions of PEP-495 implementation did contain such code. The problem is that any such tz.fromutc() implementation would necessarily change the behavior of the old programs. Moreover, any implementation of tz.fromutc() in terms of tz.utcoffset() is more complicated and less efficient than code that he's direct access to a database of transition times.

pganssle · 2018-06-05T17:19:46Z

I am aware of that. In fact, some of the draft versions of PEP-495 implementation did contain such code. The problem is that any such tz.fromutc() implementation would necessarily change the behavior of the old programs.

This I'm not sure about - the implementation I've provided gives the same answer for any program that pays no attention to fold, because I'm relying on utcoffset and dst's reaction to a change in fold. Any code that's not explicitly handling fold will give the same answer as it always has.

Moreover, any implementation of tz.fromutc() in terms of tz.utcoffset() is more complicated and less efficient than code that he's direct access to a database of transition times.

While true, that does not argue in favor of having a version of fromutc that doesn't respect fold, because anyone looking for a more efficient implementation will have to reimplement fromutc anyway if necessary.

Another argument in favor of having fromutc respect fold by default is that it makes it very simple to support fold, particularly for people who aren't optimizing for speed. As it is now, you have to duplicate a lot of fold-handling and transition lookup logic in both fromutc and utcoffset / dst, because they are getting the same information but they are not implemented in terms of one another (and it's not even amazingly easy to write a function that factors out the common code). That's more code to write and maintain and test.

At the end of the day, almost everyone who cares about efficiency will use dateutil or pytz for their application, and dateutil and pytz can and do re-implement fromutc when appropriate - though there are still some dateutil tzinfo subclasses where this would be more efficient than what's there now. I don't see why the perfect has to be the enemy of the good here, though; having fold support in fromutc is a net benefit for anyone using fromutc and for adoption of PEP-495 in a wider context.

abalkin · 2018-06-05T18:37:56Z

Paul,

In your opening post to this issue you suggested to change one line 1 in Lib/datetime.py from

dtdst = dt.dst()

to

dtdst = dt.replace(fold=1).dst()

This looks like a rather innocuous change, but it does not by itself make fromutc() return properly "enfolded" instances. IIRC, the best algorithm that Tim and I were able to come up with to derive the fold value required something like six utcoffset() probes.

PR 7425 that you submitted looks somewhat involved. Can you submit an equivalent datetime.py patch?

pganssle added type-bug An unexpected behavior, bug, or error 3.7 (EOL) end of life labels Nov 3, 2016

abalkin self-assigned this Nov 3, 2016

abalkin added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Nov 3, 2016

pganssle added the 3.8 only security fixes label Apr 22, 2018

abalkin removed the 3.7 (EOL) end of life label Jun 5, 2018

ezio-melotti transferred this issue from another repository Apr 10, 2022

abalkin mentioned this issue Feb 6, 2023

bpo-28602: [WIP] Add fold support to fromutc #7425

Draft

iritkatriel added the extension-modules C modules in the Modules dir label Nov 27, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`tzinfo.fromutc()` fails when used for a fold-aware tzinfo implementation #72788

`tzinfo.fromutc()` fails when used for a fold-aware tzinfo implementation #72788

pganssle commented Nov 3, 2016

pganssle commented Nov 3, 2016

abalkin commented Nov 3, 2016

pganssle commented Nov 3, 2016

abalkin commented Nov 3, 2016

pganssle commented Nov 3, 2016

abalkin commented Nov 3, 2016

pganssle commented Jun 5, 2018

abalkin commented Jun 5, 2018

pganssle commented Jun 5, 2018

abalkin commented Jun 5, 2018

tzinfo.fromutc() fails when used for a fold-aware tzinfo implementation #72788

tzinfo.fromutc() fails when used for a fold-aware tzinfo implementation #72788

Comments

pganssle commented Nov 3, 2016

pganssle commented Nov 3, 2016

abalkin commented Nov 3, 2016

pganssle commented Nov 3, 2016

abalkin commented Nov 3, 2016

pganssle commented Nov 3, 2016

abalkin commented Nov 3, 2016

pganssle commented Jun 5, 2018

abalkin commented Jun 5, 2018

pganssle commented Jun 5, 2018

abalkin commented Jun 5, 2018

`tzinfo.fromutc()` fails when used for a fold-aware tzinfo implementation #72788

`tzinfo.fromutc()` fails when used for a fold-aware tzinfo implementation #72788