Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG in how _strptime() handles week 0 #67325

Closed
jamercee mannequin opened this issue Dec 30, 2014 · 24 comments
Closed

BUG in how _strptime() handles week 0 #67325

jamercee mannequin opened this issue Dec 30, 2014 · 24 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@jamercee
Copy link
Mannequin

jamercee mannequin commented Dec 30, 2014

BPO 23136
Nosy @malemburg, @abalkin, @aganders3, @serhiy-storchaka, @jamercee
Files
  • patch.txt: Bug fix
  • example.c
  • strptimetest.c: Utility for testing C strptime
  • strptime_julian_none.patch: Use None as signal value for julian
  • strptime_check_valid_week.patch: Check if week and weekday fit in the year
  • strptimetest2.c
  • strptime_julian_none_2.patch: Added tests
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2015-03-19.17:16:33.516>
    created_at = <Date 2014-12-30.21:04:37.125>
    labels = ['type-bug', 'library']
    title = 'BUG in how _strptime() handles week 0'
    updated_at = <Date 2015-03-19.19:25:55.584>
    user = 'https://github.com/jamercee'

    bugs.python.org fields:

    activity = <Date 2015-03-19.19:25:55.584>
    actor = 'jamercee'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-03-19.17:16:33.516>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2014-12-30.21:04:37.125>
    creator = 'jamercee'
    dependencies = []
    files = ['37568', '37572', '37573', '37574', '37575', '37576', '37674']
    hgrepos = []
    issue_num = 23136
    keywords = ['patch']
    message_count = 24.0
    messages = ['233219', '233248', '233250', '233251', '233252', '233253', '233255', '233256', '233257', '233258', '233259', '233265', '233323', '233328', '238540', '238542', '238543', '238544', '238545', '238546', '238547', '238548', '238550', '238551']
    nosy_count = 9.0
    nosy_names = ['lemburg', 'belopolsky', 'Erik.Cederstrand', 'python-dev', 'aganders3', 'serhiy.storchaka', 'jamercee', 'Saimadhav.Heblikar', 'torm']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue23136'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

    @jamercee
    Copy link
    Mannequin Author

    jamercee mannequin commented Dec 30, 2014

    The following bit of code demonstrates a bug in how _strptime handles week 0. The bug is an edge condition that specifically affects how it handles two days before the first day of the new year

    >>> for i in range(7):
    ...     datetime.strptime('%s %s %s' % (0, 2015, i), '%W %Y %w').date()
    ... 
    datetime.date(2015, 1, 4)
    datetime.date(2014, 12, 29)
    datetime.date(2015, 1, 1)   # <-- ERROR: should be '2014, 12, 30'
    datetime.date(2014, 12, 31)
    datetime.date(2015, 1, 1)
    datetime.date(2015, 1, 2)
    datetime.date(2015, 1, 3)

    The issue stems from the fact that calls to _calc_julian_from_U_or_W() can return a -1, which under normal circumstances is invalid. The strptime() function tests and corrects when julian values are -1...but it should not do this when the week_of_year is zero.

    The fact that it tests for ONLY -1 is why we see the problem in 2015. The error condition only exists when the first day of the year is exactly two days ahead of the date being tested for.

    Patch is attached

    @jamercee jamercee mannequin added the stdlib Python modules in the Lib dir label Dec 30, 2014
    @abalkin
    Copy link
    Member

    abalkin commented Dec 31, 2014

    What's a New Year without fixing a calendar bug! :-)

    Jim's logic looks right to me. The patch needs a test before it can be applied.

    @SaimadhavHeblikar
    Copy link
    Mannequin

    SaimadhavHeblikar mannequin commented Dec 31, 2014

    Hi,

    Here is my understanding of it.
    I have used the following C example of strptime and strftime to show that the string '0 2015 2' along with the format specifier '%W %Y %w' is invalid.

    For any valid string, strptime followed by strftime should return the same string. This is easily verified. However, when you try the same thing(as in the attached example) with '0 2015 2', we get a different string after strptime followed by strftime, indicating the string is invalid.

    Do let me know if I have made any mistake in understanding this.

    @abalkin
    Copy link
    Member

    abalkin commented Dec 31, 2014

    In a comment for bpo-23134, Serhiy suggested that this may not be a bug (msg233215) because C strptime behavior for this case is undefined.

    I disagree. On Mac OSX, strptime man page states:

    """
    The %U and %W format specifiers accept any value within the range 00 to 53 without validat-
    ing against other values supplied (like month or day of the year, for example).
    """

    And for the data="0 2015 2", format="%W %Y %w" case, it produces 2015-12-30 00:00:00 as expected.

    @abalkin
    Copy link
    Member

    abalkin commented Dec 31, 2014

    See also bpo-12006.

    @serhiy-storchaka
    Copy link
    Member

    Thank you Jim for your patch. That all right and the patch fixes the issue. But
    the code looks a little tricky, and it would be more robust to use other
    signal value for julian (e.g. None).

    However I'm not sure that strptime should accept weekdays before the start of
    year. In C the strptime function doesn't return valid date for such input.

    $ ./strptimetest "0 2015 1" "%W %Y %w"
    2015-00--2 00:00:00
    $ ./strptimetest "0 2015 2" "%W %Y %w"
    2015-00--1 00:00:00
    $ ./strptimetest "0 2015 3" "%W %Y %w"
    2015-00-00 00:00:00
    $ ./strptimetest "0 2015 4" "%W %Y %w"
    2015-01-01 00:00:00
    $ ./strptimetest "0 2015 5" "%W %Y %w"
    2015-01-02 00:00:00
    $ ./strptimetest "0 2015 6" "%W %Y %w"
    2015-01-03 00:00:00
    $ ./strptimetest "0 2015 7" "%W %Y %w"
    2015-01-00 00:00:00

    May be Python strptime should raise an exception for weekdays before the start
    of year as well as it raise an exception for weekdays out of range 0-6.

    I'm not sure that any patch should be applied to maintained releases.

    @serhiy-storchaka
    Copy link
    Member

    Here is alternative patch which rejects week-weekday combinations out of specified year.

    @jamercee
    Copy link
    Mannequin Author

    jamercee mannequin commented Dec 31, 2014

    I understand. Actually, raising an exception would be perfectly acceptable as well (possibly even more desirable).

    I too experimented with the c-lib strptime() and discovered the negative values of tm_mday. These results are good too -- as they clearly show the number of days before the new year.

    I don't think it's important for the python version to return the same values, but it would be better for it to at least be predictable.

    @abalkin
    Copy link
    Member

    abalkin commented Dec 31, 2014

    For any valid string, strptime followed by strftime should return
    the same string.

    Not necessarily. String to datetime mapping implemented by strptime can be many to one. For example,

    >>> datetime.strptime("2014 366", "%Y %j") == datetime.strptime("2015 1", "%Y %j")
    True

    in this case, strptime-strftime may not round-trip.

    >>> datetime.strptime("2014 366", "%Y %j").strftime("%Y %j")
    '2015 001'

    @SaimadhavHeblikar
    Copy link
    Mannequin

    SaimadhavHeblikar mannequin commented Dec 31, 2014

    >Not necessarily. String to datetime mapping implemented by strptime can be many to one. For example,

    I was referring to C strptime and strftime.
    But thanks for posting the round trip example. I was unaware of it.

    @abalkin
    Copy link
    Member

    abalkin commented Dec 31, 2014

    I would prefer to accept "denormalized" "%Y %W %w" combinations. Note that while Mac OSX and glibc versions of strptime produce different results, neither implementation signals an error by returning NULL.

    In C the strptime function doesn't return valid date for such input.

    This is not true. In C, struct tm is not required to contain normalized data. I am attaching a modified version of Serhiy's strptimetest.c:

    $ diff -u strptimetest.c strptimetest2.c
    --- strptimetest.c	2014-12-30 13:45:17.000000000 -0500
    +++ strptimetest2.c	2014-12-31 12:56:17.000000000 -0500
    @@ -16,6 +16,7 @@
             exit(1);
         }
         strptime(argv[1], argv[2], &tm);
    +    mktime(&tm);
         strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", &tm);
         puts(buf);
         return 0;

    With this modification, I get on Linux

    $ ./a.out "0 2015 2" "%W %Y %w"
    2014-11-29 00:00:00

    which is still wrong, but I think this is a glibc issue: mktime accepted strptime result as valid even though it failed to normalize it correctly.

    @abalkin
    Copy link
    Member

    abalkin commented Dec 31, 2014

    Also, given that "2015 0 0", "2015 0 1", and "2015 0 3", all currently work with '%Y %W %w' format specification and produce dates in the year 2014, I think raising an error in all those cases is more likely to break user code than correcting the value in the edge case.

    @abalkin
    Copy link
    Member

    abalkin commented Jan 2, 2015

    Note that round-trip also fails in weeks 52-53. For example,

    >>> datetime.strptime('2014 53 6', '%Y %W %w')
    datetime.datetime(2015, 1, 10, 0, 0)
    >>> datetime.strptime('2014 53 6', '%Y %W %w').strftime('%Y %W %w')
    '2015 01 6'

    If we decide to make "2015 0 0" invalid for format '%Y %W %w', we should probably invalidate the entire week 53 in 2014 for consistency. However, I don't think there are any C implementations that would have a problem with such dates.

    Overall, I am inclined to accept Jim's solution for 3.5, but I am not sure about the maintenance branches.

    @jamercee
    Copy link
    Mannequin Author

    jamercee mannequin commented Jan 2, 2015

    All the proposed patches are acceptable from my point of view, but it would be really great if we could get the fix in the 2.x line. It seems unlikely there could be any legacy code that would depend on the bug to exist (ie: to only be wrong when then date requested is exactly two days before the first date of the new year).

    @abalkin
    Copy link
    Member

    abalkin commented Mar 19, 2015

    Serhiy,

    Your latest patch LGTM.

    @abalkin abalkin added the type-bug An unexpected behavior, bug, or error label Mar 19, 2015
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Mar 19, 2015

    New changeset 350ea309953e by Serhiy Storchaka in branch '2.7':
    Issue bpo-23136: _strptime now uniformly handles all days in week 0, including
    https://hg.python.org/cpython/rev/350ea309953e

    New changeset 93e1da502338 by Serhiy Storchaka in branch '3.4':
    Issue bpo-23136: _strptime now uniformly handles all days in week 0, including
    https://hg.python.org/cpython/rev/93e1da502338

    New changeset 750c6d53890a by Serhiy Storchaka in branch 'default':
    Issue bpo-23136: _strptime now uniformly handles all days in week 0, including
    https://hg.python.org/cpython/rev/750c6d53890a

    @serhiy-storchaka
    Copy link
    Member

    Thank you for the review Alexander. Thank you for your contribution Jim.

    @abalkin
    Copy link
    Member

    abalkin commented Mar 19, 2015

    "including Jan 30" - did you mean Dec 31?

    @serhiy-storchaka
    Copy link
    Member

    Oh, I meant Dec 30.

    @abalkin
    Copy link
    Member

    abalkin commented Mar 19, 2015

    I don't think it is worth the effort to try to fix the commit message, but it would be nice to have the NEWS text corrected.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Mar 19, 2015

    New changeset d9f3dd1d6b48 by Serhiy Storchaka in branch '2.7':
    Fixed Misc/NEWS entry for issue bpo-23136.
    https://hg.python.org/cpython/rev/d9f3dd1d6b48

    New changeset afca9867e1bb by Serhiy Storchaka in branch '3.4':
    Fixed Misc/NEWS entry for issue bpo-23136.
    https://hg.python.org/cpython/rev/afca9867e1bb

    New changeset 18bc81ce6090 by Serhiy Storchaka in branch 'default':
    Fixed Misc/NEWS entry for issue bpo-23136.
    https://hg.python.org/cpython/rev/18bc81ce6090

    @serhiy-storchaka
    Copy link
    Member

    Is it possible to fix the commit message without breaking any of cloned
    repositories that already have pulled it?

    @abalkin
    Copy link
    Member

    abalkin commented Mar 19, 2015

    Probably not.

    @jamercee
    Copy link
    Mannequin Author

    jamercee mannequin commented Mar 19, 2015

    Thanks for all your work !

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants