datetime.strptime emits IndexError on parsing 'z' as %z #87461

Closed
itchyny mannequin opened this issue Feb 22, 2021 · 7 comments
Labels: 3.9 (only security fixes), stdlib (Python modules in the Lib dir), type-bug (an unexpected behavior, bug, or error)

Comments

@itchyny
Mannequin

itchyny mannequin commented Feb 22, 2021

BPO 43295
Nosy @abalkin, @vstinner, @pganssle, @miss-islington, @itchyny, @noormichael
PRs
  • bpo-43295: Fix error handling of datetime.strptime format string '%z' #24627
  • [3.9] bpo-43295: Fix error handling of datetime.strptime format string '%z' (GH-24627) #24728
  • [3.8] bpo-43295: Fix error handling of datetime.strptime format string '%z' (GH-24627) #24729
  • [3.9] bpo-43295: Fix error handling of datetime.strptime format string '%z' (GH-24627) #25695
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2021-05-20.10:06:03.288>
    created_at = <Date 2021-02-22.13:58:19.856>
    labels = ['type-bug', 'library', '3.9']
    title = "datetime.strptime emits IndexError on parsing 'z' as %z"
    updated_at = <Date 2021-05-20.10:06:03.286>
    user = 'https://github.com/itchyny'

    bugs.python.org fields:

    activity = <Date 2021-05-20.10:06:03.286>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-05-20.10:06:03.288>
    closer = 'vstinner'
    components = ['Library (Lib)']
    creation = <Date 2021-02-22.13:58:19.856>
    creator = 'itchyny'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 43295
    keywords = ['patch']
    message_count = 7.0
    messages = ['387514', '387548', '387550', '387701', '388034', '393992', '394007']
    nosy_count = 6.0
    nosy_names = ['belopolsky', 'vstinner', 'p-ganssle', 'miss-islington', 'itchyny', 'noormichael']
    pr_nums = ['24627', '24728', '24729', '25695']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue43295'
    versions = ['Python 3.9']

    @itchyny
    Mannequin Author

    itchyny mannequin commented Feb 22, 2021

    In Python 3.9.2, parsing 'z' (lowercase) as '%z' (time zone offset) with datetime.strptime raises an IndexError.

    >>> from datetime import datetime
    >>> datetime.strptime('z', '%z')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python3.9/_strptime.py", line 568, in _strptime_datetime
        tt, fraction, gmtoff_fraction = _strptime(data_string, format)
      File "/usr/local/lib/python3.9/_strptime.py", line 453, in _strptime
        if z[3] == ':':
    IndexError: string index out of range

    I expect a ValueError (or some other useful error), as follows:
    ValueError: time data 'z' does not match format '%z'

    This happens because '%z' is compiled to a pattern that contains 'Z' (for UTC) and is matched with the IGNORECASE flag, so a lowercase 'z' is accepted by the regexp. The code then accesses z[3] without accounting for that case.
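
The mechanism can be reproduced outside `_strptime`. The following is a minimal sketch, not the CPython source: `TZ_PATTERN` is a simplified, assumed stand-in for the regex that '%z' compiles to, and `parse_offset` is a hypothetical helper. It shows how a length check before indexing turns the IndexError into the ValueError requested above.

```python
import re

# Simplified stand-in for the regex that '%z' compiles to (an assumption,
# not the exact CPython pattern): a numeric offset, or a literal 'Z' for UTC.
TZ_PATTERN = re.compile(r"(?P<z>[+-]\d\d:?[0-5]\d|Z)", re.IGNORECASE)


def parse_offset(text):
    """Hypothetical helper: return the UTC offset in minutes."""
    match = TZ_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"time data {text!r} does not match format '%z'")
    z = match.group("z")
    if z == "Z":
        return 0
    # Because of IGNORECASE, a lowercase 'z' also reaches this point.
    # Checking the length before indexing avoids the IndexError.
    if len(z) < 5:
        raise ValueError(f"time data {text!r} does not match format '%z'")
    if z[3] == ":":
        z = z[:3] + z[4:]  # drop the optional colon, e.g. '+05:30' -> '+0530'
    sign = -1 if z[0] == "-" else 1
    return sign * (int(z[1:3]) * 60 + int(z[3:5]))
```

With this sketch, `parse_offset("+05:30")` and `parse_offset("Z")` return an offset, while `parse_offset("z")` raises ValueError even though the pattern itself matches.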

    @itchyny itchyny mannequin added the labels 3.9 (only security fixes), stdlib (Python modules in the Lib dir), and type-bug (an unexpected behavior, bug, or error) on Feb 22, 2021
    @itchyny
    Mannequin Author

    itchyny mannequin commented Feb 23, 2021

    I noticed another unexpected effect of the IGNORECASE flag: it lets some non-ASCII characters match ASCII letters.

    >>> from datetime import datetime
    >>> datetime.strptime("Apr\u0130l", "%B")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python3.9/_strptime.py", line 568, in _strptime_datetime
        tt, fraction, gmtoff_fraction = _strptime(data_string, format)
      File "/usr/local/lib/python3.9/_strptime.py", line 391, in _strptime
        month = locale_time.f_month.index(found_dict['B'].lower())
    ValueError: 'apri̇l' is not in list

    I expect a "time data does not match format" error here. The ASCII flag would prevent unexpected Unicode characters from matching.
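
The effect of the ASCII flag can be checked with the `re` module directly. This sketch uses a literal `april` pattern as an assumed stand-in for the month-name alternation that '%B' compiles to (the real pattern depends on the locale).

```python
import re

text = "Apr\u0130l"  # U+0130 is LATIN CAPITAL LETTER I WITH DOT ABOVE

# Unicode case-insensitive matching treats U+0130 as a case variant of 'i'.
unicode_match = re.fullmatch("april", text, re.IGNORECASE)

# With re.ASCII, case-insensitive matching is limited to ASCII letters.
ascii_match = re.fullmatch("april", text, re.IGNORECASE | re.ASCII)

print(unicode_match is not None)
print(ascii_match is not None)
```

Only the first call finds a match, which is why the string gets past the regexp and fails later in the month lookup.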

    @noormichael
    Mannequin

    noormichael mannequin commented Feb 23, 2021

    I will address the original issue regarding '%z'. The second issue, however, has to do with the Unicode representation of Turkish characters. In Turkish, the letter I ('\u0049') is the capital of ı ('\u0131') and the letter İ ('\u0130') is the capital of i ('\u0069'). In Python, however, the lowercase of I is i, as in English.

    >>> '\u0049'.lower()
    'i'
    >>> '\u0130'.lower()
    'i̇'

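    The lowercase form of I is i, consistent with English. The lowercase form of İ is i followed by a combining dot above (U+0307), which follows the Turkish pairing but is not the same string as a plain 'i', so 'apri̇l' is not found in the list of month names.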
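
A short check of the code points produced by `str.lower()` shows what each result contains. `describe_lower` is a hypothetical helper introduced only for this illustration; it uses the standard `unicodedata` module.

```python
import unicodedata


def describe_lower(ch):
    """Return the Unicode names of the code points in ch.lower()."""
    return [unicodedata.name(c) for c in ch.lower()]


print(describe_lower("\u0049"))  # one code point
print(describe_lower("\u0130"))  # two code points
```

The second result has two code points, so its length differs from that of a plain 'i'.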

    @itchyny
    Mannequin Author

    itchyny mannequin commented Feb 26, 2021

    @noormichael Thank you for submitting a patch. I confirmed that the original issue is fixed, and I'm fine with this ticket being closed. Regarding the second issue, I learned that it is a Turkish character (thanks!). Since the error raised is still a ValueError, it should not cause a critical problem.

    @miss-islington
    Contributor

    New changeset 04f6fbb by Noor Michael in branch 'master':
    bpo-43295: Fix error handling of datetime.strptime format string '%z' (GH-24627)
    04f6fbb

    @pganssle
    Member

    New changeset c87b81d by Miss Islington (bot) in branch '3.9':
    bpo-43295: Fix error handling of datetime.strptime format string '%z' (GH-24627) (GH-25695)
    c87b81d

    @vstinner
    Member

    It seems the issue is fixed, so I am closing it.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022