New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG in how _strptime() handles week 0 #67325
Comments
The following bit of code demonstrates a bug in how _strptime handles week 0. The bug is an edge condition that specifically affects how it handles two days before the first day of the new year >>> for i in range(7):
... datetime.strptime('%s %s %s' % (0, 2015, i), '%W %Y %w').date()
...
datetime.date(2015, 1, 4)
datetime.date(2014, 12, 29)
datetime.date(2015, 1, 1) # <-- ERROR: should be '2014, 12, 30'
datetime.date(2014, 12, 31)
datetime.date(2015, 1, 1)
datetime.date(2015, 1, 2)
datetime.date(2015, 1, 3) The issue stems from the fact that calls to _calc_julian_from_U_or_W() can return a -1, which under normal circumstances is invalid. The strptime() function tests and corrects when julian values are -1...but it should not do this when the week_of_year is zero. The fact that it tests for ONLY -1 is why we see the problem in 2015. The error condition only exists when the first day of the year is exactly two days ahead of the date being tested for. Patch is attached |
What's a New Year without fixing a calendar bug! :-) Jim's logic looks right to me. The patch needs a test before it can be applied. |
Hi, Here is my understanding of it. For any valid string, strptime followed by strftime should return the same string. This is easily verified. However, when you try the same thing(as in the attached example) with '0 2015 2', we get a different string after strptime followed by strftime, indicating the string is invalid. Do let me know if I have made any mistake in understanding this. |
In a comment for bpo-23134, Serhiy suggested that this may not be a bug (msg233215) because C strptime behavior for this case is undefined. I disagree. On Mac OSX, strptime man page states: """ And for the data="0 2015 2", format="%W %Y %w" case, it produces 2015-12-30 00:00:00 as expected. |
See also bpo-12006. |
Thank you Jim for your patch. That all right and the patch fixes the issue. But However I'm not sure that strptime should accept weekdays before the start of $ ./strptimetest "0 2015 1" "%W %Y %w"
2015-00--2 00:00:00
$ ./strptimetest "0 2015 2" "%W %Y %w"
2015-00--1 00:00:00
$ ./strptimetest "0 2015 3" "%W %Y %w"
2015-00-00 00:00:00
$ ./strptimetest "0 2015 4" "%W %Y %w"
2015-01-01 00:00:00
$ ./strptimetest "0 2015 5" "%W %Y %w"
2015-01-02 00:00:00
$ ./strptimetest "0 2015 6" "%W %Y %w"
2015-01-03 00:00:00
$ ./strptimetest "0 2015 7" "%W %Y %w"
2015-01-00 00:00:00 May be Python strptime should raise an exception for weekdays before the start I'm not sure that any patch should be applied to maintained releases. |
Here is alternative patch which rejects week-weekday combinations out of specified year. |
I understand. Actually, raising an exception would be perfectly acceptable as well (possibly even more desirable). I too experimented with the c-lib strptime() and discovered the negative values of tm_mday. These results are good too -- as they clearly show the number of days before the new year. I don't think it's important for the python version to return the same values, but it would be better for it to at least be predictable. |
Not necessarily. String to datetime mapping implemented by strptime can be many to one. For example, >>> datetime.strptime("2014 366", "%Y %j") == datetime.strptime("2015 1", "%Y %j")
True in this case, strptime-strftime may not round-trip. >>> datetime.strptime("2014 366", "%Y %j").strftime("%Y %j")
'2015 001' |
I was referring to C strptime and strftime. |
I would prefer to accept "denormalized" "%Y %W %w" combinations. Note that while Mac OSX and glibc versions of strptime produce different results, neither implementation signals an error by returning NULL.
This is not true. In C, struct tm is not required to contain normalized data. I am attaching a modified version of Serhiy's strptimetest.c: $ diff -u strptimetest.c strptimetest2.c
--- strptimetest.c 2014-12-30 13:45:17.000000000 -0500
+++ strptimetest2.c 2014-12-31 12:56:17.000000000 -0500
@@ -16,6 +16,7 @@
exit(1);
}
strptime(argv[1], argv[2], &tm);
+ mktime(&tm);
strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", &tm);
puts(buf);
return 0; With this modification, I get on Linux $ ./a.out "0 2015 2" "%W %Y %w"
2014-11-29 00:00:00 which is still wrong, but I think this is a glibc issue: mktime accepted strptime result as valid even though it failed to normalize it correctly. |
Also, given that "2015 0 0", "2015 0 1", and "2015 0 3", all currently work with '%Y %W %w' format specification and produce dates in the year 2014, I think raising an error in all those cases is more likely to break user code than correcting the value in the edge case. |
Note that round-trip also fails in weeks 52-53. For example, >>> datetime.strptime('2014 53 6', '%Y %W %w')
datetime.datetime(2015, 1, 10, 0, 0)
>>> datetime.strptime('2014 53 6', '%Y %W %w').strftime('%Y %W %w')
'2015 01 6' If we decide to make "2015 0 0" invalid for format '%Y %W %w', we should probably invalidate the entire week 53 in 2014 for consistency. However, I don't think there are any C implementations that would have a problem with such dates. Overall, I am inclined to accept Jim's solution for 3.5, but I am not sure about the maintenance branches. |
All the proposed patches are acceptable from my point of view, but it would be really great if we could get the fix in the 2.x line. It seems unlikely there could be any legacy code that would depend on the bug to exist (ie: to only be wrong when then date requested is exactly two days before the first date of the new year). |
Serhiy, Your latest patch LGTM. |
New changeset 350ea309953e by Serhiy Storchaka in branch '2.7': New changeset 93e1da502338 by Serhiy Storchaka in branch '3.4': New changeset 750c6d53890a by Serhiy Storchaka in branch 'default': |
Thank you for the review Alexander. Thank you for your contribution Jim. |
"including Jan 30" - did you mean Dec 31? |
Oh, I meant Dec 30. |
I don't think it is worth the effort to try to fix the commit message, but it would be nice to have the NEWS text corrected. |
New changeset d9f3dd1d6b48 by Serhiy Storchaka in branch '2.7': New changeset afca9867e1bb by Serhiy Storchaka in branch '3.4': New changeset 18bc81ce6090 by Serhiy Storchaka in branch 'default': |
Is it possible to fix the commit message without breaking any of cloned |
Probably not. |
Thanks for all your work ! |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: