classification
Title: time.strptime() unexpectedly gives the same result for %U and %W for 2018
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.8, Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Paul Keating, belopolsky, eamanu, p-ganssle, xtreak
Priority: normal Keywords:

Created on 2018-12-19 12:54 by Paul Keating, last changed 2019-01-28 16:06 by eamanu.

Files
File name Uploaded Description Edit
test.c eamanu, 2019-01-28 16:06
Messages (4)
msg332135 - Author: Paul Keating (Paul Keating) Date: 2018-12-19 12:54
This was originally reported on Stack Overflow (53829118) and I believe the poster has found a genuine issue. He reported a problem converting from Python 2.3 to Python 2.7 in which strptime() produced a different result for %U in the two versions. For lack of an old enough copy of Python, I cannot reproduce the Python 2.3 result, which he reports as follows:

Python 2.3.4
------------
>>> dw='51 0 18' # 51 week number, 0 for Sunday and 18 for year 2018
>>> date=time.strptime(dw,"%U %w %y")
>>> print date 
(2018, 12, 16, 0, 0, 0, 6, 350, -1) # 2018 12 16
[Remark: This output looks like Python 2.1 to me, but the issue is not the datatype of the result but the value of the result.]

Python 2.7.5
------------
>>> dw='51 0 18' # 51 week number, 0 for Sunday and 18 for year 2018
>>> date=time.strptime(dw,"%U %w %y")
>>> print date
time.struct_time(tm_year=2018, tm_mon=12, tm_mday=23, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=6, tm_yday=357, tm_isdst=-1)

The point here is that the day of the month has shifted from 16 December to 23 December, and I believe that 16 December is correct.

In ISO week numbers, week 51 in 2018 runs from Monday 17 to Sunday 23 December. So the Python 2.7.5 result is correct for ISO week numbers. However, as I understand it, ISO week numbers are what directive %W provides, not %U.

%U is supposed to work with the week numbering system common (as I understand it) in North America, where (according to Wikipedia) week 1 begins on a Sunday, and contains both 1 January and the first Saturday of the year. While I am not familiar with that system, Excel 2016 is, and it reports 

=WEEKNUM(DATE(2018,12,16))  as 51
=ISOWEEKNUM(DATE(2018,12,16))  as 50 

But if I do the following in Python (2.6, 2.7 or 3.5) I get the week numbers reported as the same:

>>> dw='51 0 18' # 51 week number, 0 for Sunday and 18 for year 2018
>>> time.strptime(dw,"%U %w %y") == time.strptime(dw,"%W %w %y")
True
[Should be False]

So directives %U and %W are producing equal results for this date, and further checking shows that the same unexpected equality appears for all Sundays in 2018. I get the same unexpected equality for the Sunday of the 51st week of years 2063, 2057, 2052, 2046, 2035, 2027, 2007, 2001. It appears to recur whenever 1 January of a given year is a Monday.
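The observation above can be checked with the standard library alone. The following is a minimal sketch, not part of the original report: the helper names u_and_w_agree and starts_on_monday and the 2000-2100 year range are chosen here purely for illustration. It parses the Sunday of week 51 with both directives and compares the outcome with whether 1 January falls on a Monday.

```python
import calendar
import time


def u_and_w_agree(year, week=51):
    """True if %U and %W parse the Sunday of the given week to the same date."""
    text = f"{week} 0 {year}"  # week number, %w=0 (Sunday), four-digit year
    return time.strptime(text, "%U %w %Y") == time.strptime(text, "%W %w %Y")


def starts_on_monday(year):
    """True if 1 January of the given year is a Monday."""
    return calendar.weekday(year, 1, 1) == calendar.MONDAY


# Years in which the two properties disagree (hypothetical range for illustration).
mismatches = [y for y in range(2000, 2101)
              if u_and_w_agree(y) != starts_on_monday(y)]
print(mismatches)
print(u_and_w_agree(2018), u_and_w_agree(2019))
```

An empty mismatches list means that, over this range, the two directives agree for that Sunday exactly in the years that start on a Monday.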

Now, it may be going too far to say that Excel is right and the Python standard library is wrong. It is clear that the algorithms are just systematically different. On the other hand, it appears that Python 2.3 did it the way that Excel does, and that raises the question of why Python does it differently now. 

A bit of searching reveals that people who complain that Excel's WEEKNUM function is wrong are generally just unaware that there are competing systems. So this difference is not in the same category as Excel's numbering of days before 1 March 1900.
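To make the competing systems concrete, here is a rough sketch comparing three numberings for 16 December 2018. The function weeknum_sunday_system1 is a hypothetical helper written for this illustration; it assumes the convention described above (weeks begin on Sunday and week 1 is the week containing 1 January) and is not Excel's actual implementation.

```python
from datetime import date


def weeknum_sunday_system1(d):
    """Week number where weeks start on Sunday and week 1 contains 1 January.

    Assumption: this mirrors the convention described in the report.
    """
    jan1 = date(d.year, 1, 1)
    # Days between the Sunday on or before 1 January and 1 January itself.
    jan1_offset = (jan1.weekday() + 1) % 7
    day_index = (d - jan1).days  # 0 for 1 January
    return (day_index + jan1_offset) // 7 + 1


d = date(2018, 12, 16)
print(weeknum_sunday_system1(d))           # Sunday-start, week 1 contains 1 January
print(d.isocalendar()[1])                  # ISO 8601 week number
print(d.strftime("%U"), d.strftime("%W"))  # week 0 precedes the first Sunday/Monday
```

Under these assumptions the first value matches the WEEKNUM result quoted above (51) and the second matches ISOWEEKNUM (50), while %U and %W both report 50 for this date.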
msg332146 - Author: Karthikeyan Singaravelan (xtreak) * (Python triager) Date: 2018-12-19 15:50
As a data point, Ruby gives the same results as Python master. From the docs:

%U - Week number of the year (Sunday as the first day of the week) as a zero padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0.
%W - Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0.
%w - Weekday as a decimal number, where 0 is Sunday and 6 is Saturday.

So with 51 as %U, strptime takes week 51 counted with Sunday as the first day of the week, which begins on the 51st Sunday of the year (23/12/2018). With %w as 0 (Sunday) it therefore returns 23/12/2018, the first day of that week. With 51 as %W, strptime takes week 51 counted with Monday as the first day of the week, which begins on the 51st Monday (17/12/2018, since 2018 starts on a Monday). With %w as 0 it returns the Sunday that ends that week, which is again 23/12/2018.

Where it gets a little counter-intuitive is that time.strptime('51 1 2018',"%W %w %Y") returns Monday 17/12/2018, but time.strptime('51 0 2018',"%W %w %Y") returns 23/12/2018, so within the same %W week the lower %w value (0) maps to a later date than the higher one (1).
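The arithmetic described above can be sketched directly from the documented definitions. The helper date_from_week below is hypothetical and written only for this illustration (it is not CPython's internal implementation, and it handles week numbers of 1 or more only).

```python
from datetime import date, timedelta


def date_from_week(year, week, w, monday_first):
    """Date for a week number (>= 1) and a %w-style weekday (0=Sunday .. 6=Saturday).

    monday_first=True follows the %W convention, False follows %U.
    """
    jan1 = date(year, 1, 1)
    first_day = 0 if monday_first else 6  # date.weekday(): Monday=0 .. Sunday=6
    # Week 1 starts on the first occurrence of first_day in the year.
    week1_start = jan1 + timedelta(days=(first_day - jan1.weekday()) % 7)
    weekday = (w - 1) % 7                     # convert %w numbering to date.weekday()
    pos_in_week = (weekday - first_day) % 7   # 0 = first day of the week
    return week1_start + timedelta(weeks=week - 1, days=pos_in_week)


print(date_from_week(2018, 51, 0, monday_first=False))  # %U week 51, Sunday
print(date_from_week(2018, 51, 1, monday_first=True))   # %W week 51, Monday
print(date_from_week(2018, 51, 0, monday_first=True))   # %W week 51, Sunday
```

In a Sunday-first week, Sunday is position 0; in a Monday-first week it is position 6. That is why %w=0 lands on the last day of a %W week.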

CPython master

$ ./python.exe
Python 3.8.0a0 (heads/master:1dd035954b, Dec 18 2018, 10:12:34)
[Clang 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> dw='51 0 18'
>>> time.strptime(dw,"%U %w %y")
time.struct_time(tm_year=2018, tm_mon=12, tm_mday=23, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=6, tm_yday=357, tm_isdst=-1)
>>> time.strptime(dw,"%W %w %y")
time.struct_time(tm_year=2018, tm_mon=12, tm_mday=23, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=6, tm_yday=357, tm_isdst=-1)
>>> time.strptime(dw,"%U %w %y") == time.strptime(dw,"%W %w %y")
True

Ruby

$ irb
irb(main):001:0> require 'date'
=> true
irb(main):002:0> DateTime::strptime("51 0 18", "%W %w %y")
=> #<DateTime: 2018-12-23T00:00:00+00:00 ((2458476j,0s,0n),+0s,2299161j)>
irb(main):003:0> DateTime::strptime("51 0 18", "%U %w %y")
=> #<DateTime: 2018-12-23T00:00:00+00:00 ((2458476j,0s,0n),+0s,2299161j)>
irb(main):004:0> DateTime::strptime("51 0 18", "%U %w %y") == DateTime::strptime("51 0 18", "%W %w %y")
=> true
msg332154 - Author: Paul Ganssle (p-ganssle) * (Python committer) Date: 2018-12-19 17:06
I don't really know what Python was doing in version 2.3, and I don't have immediate access to a Python 2.3 interpreter, but at least for %U and %W, datetime is calling the platform's `strftime` under the hood, so presumably if this is a bug it's a bug in glibc and the other providers of `strftime`.

Digging a bit more, %U and %W appear to be the same for all Sundays if (and only if) the year starts on a Monday:

    import calendar
    from datetime import datetime
    from dateutil import rrule


    rr = rrule.rrule(freq=rrule.WEEKLY,
                     byweekday=rrule.SU,
                     dtstart=datetime(1900, 1, 1),
                     until=datetime(2100, 1, 1))

    for dt in rr:
        is_same = dt.strftime("%U") == dt.strftime("%W")
        year_starts_monday = calendar.weekday(dt.year, 1, 1) == 0
        assert is_same == year_starts_monday


This seems to be the right behavior, because %U and %W count all days before their respective "first day of the week" as "week 0", and week 1 starts with the relevant day of the week. If the year starts with Monday, week 1 is 1 January - 7 January according to %W (week starts on Monday), and week 1 is 7 January - 13 January according to %U (week starts on Sunday), thus all Sundays will be in the same "week number" in both systems.
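The week boundaries quoted above can be listed with the standard library alone. This is a small sketch; week_span is a hypothetical helper introduced here, and it simply scans every day of the year rather than doing anything clever.

```python
from datetime import date, timedelta


def week_span(year, week, directive):
    """First and last date in `year` whose strftime(directive) equals `week`."""
    target = f"{week:02d}"
    day = date(year, 1, 1)
    days = []
    while day.year == year:
        if day.strftime(directive) == target:
            days.append(day)
        day += timedelta(days=1)
    return days[0], days[-1]


print(week_span(2018, 1, "%W"))  # week 1 with Monday as the first day
print(week_span(2018, 1, "%U"))  # week 1 with Sunday as the first day

# Every Sunday of 2018 (the first is 7 January) gets the same number from both.
sundays = [date(2018, 1, 7) + timedelta(weeks=n) for n in range(52)]
sundays_agree = all(d.strftime("%U") == d.strftime("%W") for d in sundays)
print(sundays_agree)
```

The two week-1 spans overlap only on Sunday 7 January, which is the last day of the %W week and the first day of the %U week.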

> %U is supposed to work with the week numbering system common (as I understand it) in North America, where (according to Wikipedia) week 1 begins on a Sunday, and contains both 1 January and the first Saturday of the year. While I am not familiar with that system, Excel 2016 is, and it reports 

The documentation for %U says:

> Week number of the year (Sunday as the first day of the week) as a zero padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0.

This means that week 1 contains both January 1st and the first Saturday of the year only in years that start on a Sunday. The Python documentation is consistent with the man page for strftime(3): http://man7.org/linux/man-pages/man3/strftime.3.html
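As a quick illustration of that point, the sketch below lists the years in which %U places 1 January in week 1 and compares them with the years that start on a Sunday. The helper name jan1_week_u and the 2010-2030 range are arbitrary choices made for this example.

```python
from datetime import date


def jan1_week_u(year):
    """The %U week number (as a string) that strftime assigns to 1 January."""
    return date(year, 1, 1).strftime("%U")


years = range(2010, 2031)  # hypothetical range for illustration
years_with_jan1_in_week1 = [y for y in years if jan1_week_u(y) == "01"]
years_starting_sunday = [y for y in years if date(y, 1, 1).weekday() == 6]

print(years_with_jan1_in_week1)
print(years_starting_sunday)
```

In every other year, 1 January falls in week 0 under %U, which is where this numbering departs from systems that always call the week containing 1 January "week 1".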
msg334477 - Author: Emmanuel Arias (eamanu) * Date: 2019-01-28 16:06
I tried to reproduce and confirm xtreak's example and, as xtreak and p-ganssle explain, I think that the behavior is correct according to the documentation.

I would like to know why there is a difference between 2.3.4 (Paul Keating's example) and CPython master.
History
Date User Action Args
2019-01-28 16:06:38 eamanu set files: + test.c; nosy: + eamanu; messages: + msg334477
2018-12-19 17:06:58 p-ganssle set messages: + msg332154; versions: + Python 3.6, Python 3.7, Python 3.8
2018-12-19 15:50:48 xtreak set nosy: + belopolsky, p-ganssle, xtreak; messages: + msg332146
2018-12-19 12:54:20 Paul Keating create