msg233219 - (view) |
Author: Jim Carroll (jamercee) * |
Date: 2014-12-30 21:04 |
The following bit of code demonstrates a bug in how _strptime handles week 0. The bug is an edge condition that specifically affects how it handles two days before the first day of the new year
>>> for i in range(7):
... datetime.strptime('%s %s %s' % (0, 2015, i), '%W %Y %w').date()
...
datetime.date(2015, 1, 4)
datetime.date(2014, 12, 29)
datetime.date(2015, 1, 1) # <-- ERROR: should be '2014, 12, 30'
datetime.date(2014, 12, 31)
datetime.date(2015, 1, 1)
datetime.date(2015, 1, 2)
datetime.date(2015, 1, 3)
The issue stems from the fact that calls to _calc_julian_from_U_or_W() can return a -1, which under normal circumstances is invalid. The strptime() function tests and corrects when julian values are -1...but it should not do this when the week_of_year is zero.
The fact that it tests for ONLY -1 is why we see the problem in 2015. The error condition only exists when the first day of the year is exactly two days ahead of the date being tested for.
Patch is attached
|
msg233248 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2014-12-31 16:25 |
What's a New Year without fixing a calendar bug! :-)
Jim's logic looks right to me. The patch needs a test before it can be applied.
|
msg233250 - (view) |
Author: Saimadhav Heblikar (Saimadhav.Heblikar) * |
Date: 2014-12-31 16:32 |
Hi,
Here is my understanding of it.
I have used the following C example of strptime and strftime to show that the string '0 2015 2' along with the format specifier '%W %Y %w' is invalid.
For any valid string, strptime followed by strftime should return the same string. This is easily verified. However, when you try the same thing(as in the attached example) with '0 2015 2', we get a different string after strptime followed by strftime, indicating the string is invalid.
Do let me know if I have made any mistake in understanding this.
|
msg233251 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2014-12-31 16:41 |
In a comment for #23134, Serhiy suggested that this may not be a bug (msg233215) because C strptime behavior for this case is undefined.
I disagree. On Mac OSX, strptime man page states:
"""
The %U and %W format specifiers accept any value within the range 00 to 53 without validat-
ing against other values supplied (like month or day of the year, for example).
"""
And for the data="0 2015 2", format="%W %Y %w" case, it produces 2015-12-30 00:00:00 as expected.
|
msg233252 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2014-12-31 16:48 |
See also #12006.
|
msg233253 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2014-12-31 16:51 |
Thank you Jim for your patch. That all right and the patch fixes the issue. But
the code looks a little tricky, and it would be more robust to use other
signal value for julian (e.g. None).
However I'm not sure that strptime should accept weekdays before the start of
year. In C the strptime function doesn't return valid date for such input.
$ ./strptimetest "0 2015 1" "%W %Y %w"
2015-00--2 00:00:00
$ ./strptimetest "0 2015 2" "%W %Y %w"
2015-00--1 00:00:00
$ ./strptimetest "0 2015 3" "%W %Y %w"
2015-00-00 00:00:00
$ ./strptimetest "0 2015 4" "%W %Y %w"
2015-01-01 00:00:00
$ ./strptimetest "0 2015 5" "%W %Y %w"
2015-01-02 00:00:00
$ ./strptimetest "0 2015 6" "%W %Y %w"
2015-01-03 00:00:00
$ ./strptimetest "0 2015 7" "%W %Y %w"
2015-01-00 00:00:00
May be Python strptime should raise an exception for weekdays before the start
of year as well as it raise an exception for weekdays out of range 0-6.
I'm not sure that any patch should be applied to maintained releases.
|
msg233255 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2014-12-31 17:30 |
Here is alternative patch which rejects week-weekday combinations out of specified year.
|
msg233256 - (view) |
Author: Jim Carroll (jamercee) * |
Date: 2014-12-31 17:35 |
I understand. Actually, raising an exception would be perfectly acceptable as well (possibly even more desirable).
I too experimented with the c-lib strptime() and discovered the negative values of tm_mday. These results are good too -- as they clearly show the number of days before the new year.
I don't think it's important for the python version to return the same values, but it would be better for it to at least be predictable.
|
msg233257 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2014-12-31 17:40 |
> For any valid string, strptime followed by strftime should return
> the same string.
Not necessarily. String to datetime mapping implemented by strptime can be many to one. For example,
>>> datetime.strptime("2014 366", "%Y %j") == datetime.strptime("2015 1", "%Y %j")
True
in this case, strptime-strftime may not round-trip.
>>> datetime.strptime("2014 366", "%Y %j").strftime("%Y %j")
'2015 001'
|
msg233258 - (view) |
Author: Saimadhav Heblikar (Saimadhav.Heblikar) * |
Date: 2014-12-31 18:00 |
>>Not necessarily. String to datetime mapping implemented by strptime can be many to one. For example,
I was referring to C strptime and strftime.
But thanks for posting the round trip example. I was unaware of it.
|
msg233259 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2014-12-31 18:05 |
I would prefer to accept "denormalized" "%Y %W %w" combinations. Note that while Mac OSX and glibc versions of strptime produce different results, neither implementation signals an error by returning NULL.
> In C the strptime function doesn't return valid date for such input.
This is not true. In C, struct tm is not required to contain normalized data. I am attaching a modified version of Serhiy's strptimetest.c:
$ diff -u strptimetest.c strptimetest2.c
--- strptimetest.c 2014-12-30 13:45:17.000000000 -0500
+++ strptimetest2.c 2014-12-31 12:56:17.000000000 -0500
@@ -16,6 +16,7 @@
exit(1);
}
strptime(argv[1], argv[2], &tm);
+ mktime(&tm);
strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", &tm);
puts(buf);
return 0;
With this modification, I get on Linux
$ ./a.out "0 2015 2" "%W %Y %w"
2014-11-29 00:00:00
which is still wrong, but I think this is a glibc issue: mktime accepted strptime result as valid even though it failed to normalize it correctly.
|
msg233265 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2014-12-31 18:19 |
Also, given that "2015 0 0", "2015 0 1", and "2015 0 3", all currently work with '%Y %W %w' format specification and produce dates in the year 2014, I think raising an error in all those cases is more likely to break user code than correcting the value in the edge case.
|
msg233323 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2015-01-02 19:47 |
Note that round-trip also fails in weeks 52-53. For example,
>>> datetime.strptime('2014 53 6', '%Y %W %w')
datetime.datetime(2015, 1, 10, 0, 0)
>>> datetime.strptime('2014 53 6', '%Y %W %w').strftime('%Y %W %w')
'2015 01 6'
If we decide to make "2015 0 0" invalid for format '%Y %W %w', we should probably invalidate the entire week 53 in 2014 for consistency. However, I don't think there are any C implementations that would have a problem with such dates.
Overall, I am inclined to accept Jim's solution for 3.5, but I am not sure about the maintenance branches.
|
msg233328 - (view) |
Author: Jim Carroll (jamercee) * |
Date: 2015-01-02 21:50 |
All the proposed patches are acceptable from my point of view, but it would be really great if we could get the fix in the 2.x line. It seems unlikely there could be any legacy code that would depend on the bug to exist (ie: to only be wrong when then date requested is exactly two days before the first date of the new year).
|
msg238540 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2015-03-19 16:38 |
Serhiy,
Your latest patch LGTM.
|
msg238542 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2015-03-19 17:15 |
New changeset 350ea309953e by Serhiy Storchaka in branch '2.7':
Issue #23136: _strptime now uniformly handles all days in week 0, including
https://hg.python.org/cpython/rev/350ea309953e
New changeset 93e1da502338 by Serhiy Storchaka in branch '3.4':
Issue #23136: _strptime now uniformly handles all days in week 0, including
https://hg.python.org/cpython/rev/93e1da502338
New changeset 750c6d53890a by Serhiy Storchaka in branch 'default':
Issue #23136: _strptime now uniformly handles all days in week 0, including
https://hg.python.org/cpython/rev/750c6d53890a
|
msg238543 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-03-19 17:16 |
Thank you for the review Alexander. Thank you for your contribution Jim.
|
msg238544 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2015-03-19 17:18 |
"including Jan 30" - did you mean Dec 31?
|
msg238545 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-03-19 17:51 |
Oh, I meant Dec 30.
|
msg238546 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2015-03-19 17:52 |
I don't think it is worth the effort to try to fix the commit message, but it would be nice to have the NEWS text corrected.
|
msg238547 - (view) |
Author: Roundup Robot (python-dev) |
Date: 2015-03-19 17:55 |
New changeset d9f3dd1d6b48 by Serhiy Storchaka in branch '2.7':
Fixed Misc/NEWS entry for issue #23136.
https://hg.python.org/cpython/rev/d9f3dd1d6b48
New changeset afca9867e1bb by Serhiy Storchaka in branch '3.4':
Fixed Misc/NEWS entry for issue #23136.
https://hg.python.org/cpython/rev/afca9867e1bb
New changeset 18bc81ce6090 by Serhiy Storchaka in branch 'default':
Fixed Misc/NEWS entry for issue #23136.
https://hg.python.org/cpython/rev/18bc81ce6090
|
msg238548 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) * |
Date: 2015-03-19 17:59 |
Is it possible to fix the commit message without breaking any of cloned
repositories that already have pulled it?
|
msg238550 - (view) |
Author: Alexander Belopolsky (belopolsky) * |
Date: 2015-03-19 18:04 |
Probably not.
|
msg238551 - (view) |
Author: Jim Carroll (jamercee) * |
Date: 2015-03-19 19:25 |
Thanks for all your work !
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:11 | admin | set | github: 67325 |
2015-03-19 19:25:55 | jamercee | set | messages:
+ msg238551 |
2015-03-19 18:04:25 | belopolsky | set | messages:
+ msg238550 |
2015-03-19 17:59:01 | serhiy.storchaka | set | messages:
+ msg238548 |
2015-03-19 17:55:39 | python-dev | set | messages:
+ msg238547 |
2015-03-19 17:52:56 | belopolsky | set | messages:
+ msg238546 |
2015-03-19 17:51:14 | serhiy.storchaka | set | messages:
+ msg238545 |
2015-03-19 17:18:13 | belopolsky | set | messages:
+ msg238544 |
2015-03-19 17:16:33 | serhiy.storchaka | set | status: open -> closed resolution: fixed messages:
+ msg238543
stage: commit review -> resolved |
2015-03-19 17:15:06 | python-dev | set | nosy:
+ python-dev messages:
+ msg238542
|
2015-03-19 16:38:55 | belopolsky | set | type: behavior messages:
+ msg238540 stage: patch review -> commit review |
2015-01-11 11:13:08 | serhiy.storchaka | set | files:
+ strptime_julian_none_2.patch |
2015-01-02 21:50:40 | jamercee | set | messages:
+ msg233328 |
2015-01-02 19:47:53 | belopolsky | set | messages:
+ msg233323 |
2014-12-31 19:20:11 | belopolsky | set | nosy:
+ Erik.Cederstrand, aganders3
|
2014-12-31 18:19:55 | belopolsky | set | messages:
+ msg233265 |
2014-12-31 18:05:48 | belopolsky | set | files:
+ strptimetest2.c
messages:
+ msg233259 |
2014-12-31 18:00:19 | Saimadhav.Heblikar | set | messages:
+ msg233258 |
2014-12-31 17:40:25 | belopolsky | set | messages:
+ msg233257 |
2014-12-31 17:35:28 | jamercee | set | messages:
+ msg233256 |
2014-12-31 17:30:15 | serhiy.storchaka | set | files:
+ strptime_check_valid_week.patch
messages:
+ msg233255 |
2014-12-31 16:51:57 | torm | set | nosy:
+ torm
|
2014-12-31 16:51:29 | serhiy.storchaka | set | files:
+ strptimetest.c, strptime_julian_none.patch keywords:
+ patch messages:
+ msg233253
|
2014-12-31 16:48:25 | belopolsky | set | messages:
+ msg233252 |
2014-12-31 16:41:49 | belopolsky | set | messages:
+ msg233251 |
2014-12-31 16:32:31 | Saimadhav.Heblikar | set | files:
+ example.c
messages:
+ msg233250 |
2014-12-31 16:30:33 | belopolsky | link | issue23134 superseder |
2014-12-31 16:25:13 | belopolsky | set | messages:
+ msg233248 |
2014-12-31 15:37:22 | Saimadhav.Heblikar | set | nosy:
+ Saimadhav.Heblikar
|
2014-12-31 12:58:26 | pitrou | set | nosy:
+ serhiy.storchaka
|
2014-12-31 09:00:31 | pitrou | set | nosy:
+ lemburg, belopolsky stage: patch review
versions:
+ Python 3.4, Python 3.5 |
2014-12-30 21:04:37 | jamercee | create | |