classification
Title: datetime.datetime.strptime get day error
Type: behavior Stage: needs patch
Components: Library (Lib) Versions: Python 3.9, Python 3.8, Python 3.7, Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: belopolsky, eric.smith, karlcow, p-ganssle, zhanying
Priority: normal Keywords:

Created on 2020-04-09 09:55 by zhanying, last changed 2020-07-28 11:27 by karlcow.

Messages (9)
msg366039 - (view) Author: zhanying (zhanying) Date: 2020-04-09 09:55
In [7]: datetime.datetime.strptime("2024-0-3 00:00:00", "%Y-%W-%w %H:%M:%S")
Out[7]: datetime.datetime(2024, 1, 3, 0, 0)

In [8]: datetime.datetime.strptime("2024-1-3 00:00:00", "%Y-%W-%w %H:%M:%S")
Out[8]: datetime.datetime(2024, 1, 3, 0, 0)
msg366046 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-04-09 12:45
Can you tell us what platform you're on? Also, please include the header that's printed out when you run python from the command line. For example, mine shows:

$ python3
Python 3.7.6 (default, Jan 30 2020, 10:29:04) 
[GCC 9.2.1 20190827 (Red Hat 9.2.1-1)] on linux
msg366055 - (view) Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2020-04-09 14:40
I can reproduce this on Linux with Python 3.8.2.

I think this may be a bug, but it may also just be platform-specific weirdness. Either way it's very curious behavior:


>>> datetime.strptime("2023-0-0", "%Y-%W-%w")                         
datetime.datetime(2023, 1, 1, 0, 0)
>>> datetime.strptime("2023-0-1", "%Y-%W-%w")                         
datetime.datetime(2022, 12, 26, 0, 0)

The definition for %W (and %U, which is related) goes like this:


> Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0.

2024 starts on a Monday, so there should be no week 0 in that year at all. It seems to me that the behavior is undefined when the input string contains an invalid value for "%Y-%W-%w".
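
To make the week 0 rule concrete, here is a minimal sketch that checks whether a year has any days before its first Monday. The helper name `has_week_zero` is hypothetical and is not part of the standard library; it only restates the %W definition quoted above.

```python
from datetime import date

def has_week_zero(year: int) -> bool:
    """Return True if the year has days before its first Monday (a %W week 0)."""
    # date.weekday() is 0 for Monday, so a year that starts on a Monday
    # has no days preceding its first Monday.
    return date(year, 1, 1).weekday() != 0

print(has_week_zero(2023))               # True: 2023-01-01 is a Sunday
print(has_week_zero(2024))               # False: 2024-01-01 is a Monday
print(date(2024, 1, 1).strftime("%W"))   # '01'
```

Under this reading, "2024-0-3" names a week that strftime would never produce for 2024.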

Seems to me that we are just passing through the behavior of `time.strptime` in this case (which just calls out to what the platform does):

>>> time.strptime("2024-0-3", "%Y-%W-%w")                             
time.struct_time(tm_year=2024, tm_mon=1, tm_mday=3, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=2, tm_yday=3, tm_isdst=-1)


I am open to discussion about trying to rationalize this behavior - it would be a bit tricky but if we moved to our own implementation of the algorithm to calculate %W we could detect this situation and throw an exception. I'd rather see if this is intended behavior in the underlying C implementation first, though. If this is consistent across platforms and not just some random implementation detail, people may be relying on it.

I propose that we:

1. Determine what happens on different platforms (might be easy to just make a PR that asserts the current behavior and see if/how it breaks on any of the supported platforms).
2. Determine why it works the way it does.
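
Step 1 could start from a small probe like the one below, run on each platform of interest. This is only a sketch: the function name `probe` and the list `CASES` are hypothetical, and the inputs are the ones quoted in this report. It records whatever the interpreter returns (or the ValueError it raises) rather than asserting a particular result.

```python
import platform
import sys
from datetime import datetime

# Inputs taken from this report; week 0 is the case under discussion.
CASES = ["2023-0-0", "2023-0-1", "2024-0-3", "2024-1-3"]

def probe(cases=CASES, fmt="%Y-%W-%w"):
    """Return (input, outcome) pairs, recording a ValueError instead of raising."""
    results = []
    for text in cases:
        try:
            outcome = datetime.strptime(text, fmt).date().isoformat()
        except ValueError as exc:
            outcome = f"ValueError: {exc}"
        results.append((text, outcome))
    return results

if __name__ == "__main__":
    # Print enough context to compare reports across platforms and versions.
    print(platform.platform(), sys.version.split()[0])
    for text, outcome in probe():
        print(f"{text!r:12} -> {outcome}")
```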


After that, at the very least we should document the behavior with a warning or a footnote or something. If we make any changes to the behavior they would be 3.9+, but the documentation changes can be backported.

Thanks for the bug report zhanying! Very interesting!
msg366062 - (view) Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2020-04-09 15:10
Likely relevant is bpo-23136, where they dealt with similar issues in the past.

I don't see any explicit test for this behavior, but it seems that the solution is to try to be consistent and to not raise a ValueError.

Looking at this issue, I think it's a manifestation of a similar bug that hits when a year starts with a Monday.

It seems like the behavior is that the following days (%W-%w) should be sequential in any year: 00-1, 00-2, 00-3, 00-4, 00-5, 00-6, 00-0, 01-1, 01-2, ...

Since 2024 starts on a Monday, the first day of the year should be 2024-01-1, and the 2024-00-1 week should start on 2023-12-25 rather than duplicating the following week.
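
The sequential scheme described above can be sketched as follows. This is an illustration of the proposed behavior, not the current `_strptime` code; the function name `sequential_week_date` is hypothetical.

```python
from datetime import date, timedelta

def sequential_week_date(year: int, week: int, weekday: int) -> date:
    """Map (%Y, %W, %w) to a date so that consecutive week numbers never overlap.

    weekday follows %w: 0 is Sunday, 1 is Monday, ..., 6 is Saturday.
    """
    jan1 = date(year, 1, 1)
    # The first Monday of the year is day 1 of week 1 under %W.
    first_monday = jan1 + timedelta(days=(7 - jan1.weekday()) % 7)
    # Position within a Monday-first week: Monday -> 0, ..., Sunday -> 6.
    offset = (weekday - 1) % 7
    return first_monday + timedelta(weeks=week - 1, days=offset)

print(sequential_week_date(2024, 0, 1))  # 2023-12-25
print(sequential_week_date(2024, 1, 1))  # 2024-01-01
print(sequential_week_date(2023, 0, 1))  # 2022-12-26
print(sequential_week_date(2023, 0, 0))  # 2023-01-01
```

With this mapping, week 0 of a year that starts on a Monday falls entirely in the previous year instead of aliasing week 1.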

I think there's an equivalent issue with dates of the form "%Y-%U-%w", but happening on years that start with a Sunday.
msg366070 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2020-04-09 15:30
I thought that strptime is platform specific (which is why I asked for the platform info). But, looking at the existing docs https://docs.python.org/3.5/library/time.html#time.strptime

"But strptime() is independent of any platform and thus does not necessarily support all directives available that are not documented as supported."

I'm not exactly sure what that means overall, with the double negative. But it says it's not platform specific.
msg366100 - (view) Author: zhanying (zhanying) Date: 2020-04-10 02:47
My platform is this.

#python
Python 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31)
[GCC 7.3.0] on linux

Type "help", "copyright", "credits" or "license" for more information.

msg366101 - (view) Author: zhanying (zhanying) Date: 2020-04-10 03:05
I read the source code. The relevant part is this:

def _calc_julian_from_U_or_W(year, week_of_year, day_of_week, week_starts_Mon):
    """Calculate the Julian day based on the year, week of the year, and day of
    the week, with week_start_day representing whether the week of the year
    assumes the week starts on Sunday or Monday (6 or 0)."""
    first_weekday = datetime_date(year, 1, 1).weekday()
    # If we are dealing with the %U directive (week starts on Sunday), it's
    # easier to just shift the view to Sunday being the first day of the
    # week.
    if not week_starts_Mon:
        first_weekday = (first_weekday + 1) % 7
        day_of_week = (day_of_week + 1) % 7
    # Need to watch out for a week 0 (when the first day of the year is not
    # the same as that specified by %U or %W).
    week_0_length = (7 - first_weekday) % 7
    if week_of_year == 0:
        return 1 + day_of_week - first_weekday
    else:
        days_to_week = week_0_length + (7 * (week_of_year - 1))
        return 1 + days_to_week + day_of_week

When first_weekday is 0 (the year starts on a Monday), this function returns the same value for week_of_year equal to 0 and for week_of_year equal to 1.
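
The arithmetic can be checked with a standalone copy of the %W branch. This is a simplified sketch, assuming the caller has already converted the weekday to Monday-based numbering (0 is Monday); the name `calc_julian_from_W` is hypothetical and the %U branch is omitted.

```python
from datetime import date

def calc_julian_from_W(year, week_of_year, day_of_week):
    """Simplified copy of the %W branch; day_of_week is 0 for Monday."""
    first_weekday = date(year, 1, 1).weekday()
    week_0_length = (7 - first_weekday) % 7
    if week_of_year == 0:
        return 1 + day_of_week - first_weekday
    days_to_week = week_0_length + 7 * (week_of_year - 1)
    return 1 + days_to_week + day_of_week

WEDNESDAY = 2  # %w value 3 in Monday-based numbering
MONDAY = 0     # %w value 1 in Monday-based numbering

# 2024 starts on a Monday: first_weekday == 0 and week_0_length == 0,
# so both branches reduce to 1 + day_of_week.
print(calc_julian_from_W(2024, 0, WEDNESDAY))  # 3
print(calc_julian_from_W(2024, 1, WEDNESDAY))  # 3

# 2023 starts on a Sunday, so week 0 and week 1 stay distinct.
print(calc_julian_from_W(2023, 0, MONDAY))     # -5, i.e. 2022-12-26
print(calc_julian_from_W(2023, 1, MONDAY))     # 2, i.e. 2023-01-02
```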

msg374480 - (view) Author: karl (karlcow) * Date: 2020-07-28 08:41
Same on macOS 10.15.6 (19G73)


Python 3.8.3 (v3.8.3:6f8c8320e9, May 13 2020, 16:29:34) 
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> datetime.datetime.strptime("2024-0-3 00:00:00", "%Y-%W-%w %H:%M:%S")
datetime.datetime(2024, 1, 3, 0, 0)
>>> datetime.datetime.strptime("2024-1-3 00:00:00", "%Y-%W-%w %H:%M:%S")
datetime.datetime(2024, 1, 3, 0, 0)


Also 
https://pubs.opengroup.org/onlinepubs/007908799/xsh/strptime.html

Note that ISO 8601 week numbering doesn't have this issue.
%V - ISO 8601 week of the year as a decimal number [01, 53].
https://en.wikipedia.org/wiki/ISO_week_date
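
For comparison, a short sketch of the ISO 8601 directives, assuming Python 3.6 or later (where strptime accepts %G, %V and %u). The variable names are illustrative only.

```python
from datetime import date, datetime

# ISO 8601 weeks are numbered from 1, so there is no week 0 to collide with week 1.
iso_2024 = tuple(date(2024, 1, 3).isocalendar())
iso_2023 = tuple(date(2023, 1, 1).isocalendar())
print(iso_2024)  # (2024, 1, 3)
print(iso_2023)  # (2022, 52, 7): the day belongs to the last ISO week of 2022

# %G is the ISO year, %V the ISO week, %u the ISO weekday (1 is Monday).
parsed = datetime.strptime("2024-01-3", "%G-%V-%u")
print(parsed)    # 2024-01-03 00:00:00
```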
msg374486 - (view) Author: karl (karlcow) * Date: 2020-07-28 11:27
Also this.

>>> import datetime
>>> d0 = datetime.datetime.strptime("2024-0-3 00:00:00", "%Y-%W-%w %H:%M:%S")
>>> d0.strftime("%Y-%W-%w %H:%M:%S")
'2024-01-3 00:00:00'
>>> d1 = datetime.datetime.strptime("2024-1-3 00:00:00", "%Y-%W-%w %H:%M:%S")
>>> d1.strftime("%Y-%W-%w %H:%M:%S")
'2024-01-3 00:00:00'
>>> d2301 = datetime.datetime.strptime("2023-0-1 00:00:00", "%Y-%W-%w %H:%M:%S")
>>> d2311 = datetime.datetime.strptime("2023-1-1 00:00:00", "%Y-%W-%w %H:%M:%S")
>>> d2301
datetime.datetime(2022, 12, 26, 0, 0)
>>> d2311
datetime.datetime(2023, 1, 2, 0, 0)
>>> d2311.strftime("%Y-%W-%w %H:%M:%S")
'2023-01-1 00:00:00'
>>> d2301.strftime("%Y-%W-%w %H:%M:%S")
'2022-52-1 00:00:00'


Week 0 2023 became Week 52 2022 (which is correct but might lead to surprises)
History
Date User Action Args
2020-07-28 11:27:34 karlcow set messages: + msg374486
2020-07-28 08:41:12 karlcow set nosy: + karlcow
messages: + msg374480
2020-04-17 01:10:24 p-ganssle set stage: needs patch
2020-04-10 03:05:38 zhanying set messages: + msg366101
2020-04-10 02:47:53 zhanying set messages: + msg366100
2020-04-09 15:30:23 eric.smith set messages: + msg366070
2020-04-09 15:10:43 p-ganssle set messages: + msg366062
2020-04-09 14:40:49 p-ganssle set messages: + msg366055
versions: + Python 3.7, Python 3.8, Python 3.9
2020-04-09 14:09:54 xtreak set nosy: + belopolsky, p-ganssle
2020-04-09 12:45:51 eric.smith set nosy: + eric.smith
messages: + msg366046
components: + Library (Lib)
2020-04-09 09:55:31 zhanying create