Date parsing helpers in email module incorrectly raise IndexError for some malformed inputs #89164

Closed
wbolster mannequin opened this issue Aug 25, 2021 · 9 comments
Labels
  • 3.7 (EOL): end of life
  • 3.8: only security fixes
  • 3.9: only security fixes
  • 3.10: only security fixes
  • 3.11: only security fixes
  • topic-email
  • type-crash: a hard crash of the interpreter, possibly with a core dump

Comments

@wbolster
Mannequin

wbolster mannequin commented Aug 25, 2021

BPO 45001
Nosy @warsaw, @ned-deily, @bitdancer, @ambv, @wbolster, @miss-islington
PRs
  • bpo-45001: Make email date parsing more robust against malformed input #27946
  • [3.10] bpo-45001: Make email date parsing more robust against malformed input (GH-27946) #27972
  • [3.9] bpo-45001: Make email date parsing more robust against malformed input (GH-27946) #27973
  • [3.8] bpo-45001: Make email date parsing more robust against malformed input (GH-27946) #27974
  • [3.7] bpo-45001: Make email date parsing more robust against malformed input (GH-27946) #27975
  • [3.6] bpo-45001: Make email date parsing more robust against malformed input (GH-27946) #27976
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-08-30.18:58:53.311>
    created_at = <Date 2021-08-25.13:23:27.104>
    labels = ['3.8', 'expert-email', '3.10', '3.11', '3.7', 'type-crash', '3.9']
    title = 'Date parsing helpers in email module incorrectly raise IndexError for some malformed inputs'
    updated_at = <Date 2021-08-30.18:58:53.311>
    user = 'https://github.com/wbolster'

    bugs.python.org fields:

    activity = <Date 2021-08-30.18:58:53.311>
    actor = 'ned.deily'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-08-30.18:58:53.311>
    closer = 'ned.deily'
    components = ['email']
    creation = <Date 2021-08-25.13:23:27.104>
    creator = 'wbolster'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 45001
    keywords = ['patch']
    message_count = 9.0
    messages = ['400261', '400262', '400351', '400356', '400357', '400358', '400655', '400657', '400658']
    nosy_count = 6.0
    nosy_names = ['barry', 'ned.deily', 'r.david.murray', 'lukasz.langa', 'wbolster', 'miss-islington']
    pr_nums = ['27946', '27972', '27973', '27974', '27975', '27976']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue45001'
    versions = ['Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9', 'Python 3.10', 'Python 3.11']

    @wbolster
    Mannequin Author

    wbolster mannequin commented Aug 25, 2021

    Various date parsing utilities in the email module, such as email.utils.parsedate(), are supposed to gracefully handle invalid input, typically by raising an appropriate exception or by returning None.

    The internal email._parseaddr._parsedate_tz() helper used by some of these date parsing routines tries to be robust against malformed input, but unfortunately it can still crash ungracefully when a non-empty but whitespace-only input is passed. This manifests as an unexpected IndexError.

    In practice, this can happen when parsing an email with only a newline inside a ‘Date:’ header, which unfortunately happens occasionally in the real world.

    Here's a minimal example:

    $ python
    Python 3.9.6 (default, Jun 30 2021, 10:22:16) 
    [GCC 11.1.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import email.utils
    >>> email.utils.parsedate('foo')
    >>> email.utils.parsedate(' ')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.9/email/_parseaddr.py", line 176, in parsedate
        t = parsedate_tz(data)
      File "/usr/lib/python3.9/email/_parseaddr.py", line 50, in parsedate_tz
        res = _parsedate_tz(data)
      File "/usr/lib/python3.9/email/_parseaddr.py", line 72, in _parsedate_tz
        if data[0].endswith(',') or data[0].lower() in _daynames:
    IndexError: list index out of range

    The fix is rather straightforward; I will open a pull request shortly.
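
The failure mode in the traceback above can be sketched in isolation: calling `str.split()` with no arguments on a whitespace-only string returns an empty list, so indexing element 0 raises `IndexError`. The function below is a simplified, hypothetical stand-in for the first step of `_parsedate_tz()`, not the code merged in the pull request; the names `_DAYNAMES` and `starts_with_day_name` are assumptions made for illustration.

```python
# Abbreviated day names, mirroring the kind of lookup the parser performs.
_DAYNAMES = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def starts_with_day_name(data):
    """Return True/False for tokenizable input, or None for whitespace-only input.

    Hypothetical stand-in for the first step of date parsing.
    """
    tokens = data.split()
    if not tokens:
        # Whitespace-only input yields no tokens, so there is nothing to
        # index; report failure instead of raising IndexError.
        return None
    first = tokens[0]
    return first.endswith(",") or first.lower() in _DAYNAMES


print(" ".split())                               # []
print(starts_with_day_name(" "))                 # None
print(starts_with_day_name("Wed, 25 Aug 2021"))  # True
```

Checking for an empty token list before indexing keeps the "return None for malformed input" contract that callers of `parsedate()` rely on.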

    @wbolster (mannequin) added the 3.9, 3.10, 3.11, topic-email, and type-crash labels on Aug 25, 2021
    @wbolster
    Mannequin Author

    wbolster mannequin commented Aug 25, 2021

    Pull request with the fix: #27946

    @ambv
    Contributor

    ambv commented Aug 26, 2021

    New changeset 989f6a3 by wouter bolsterlee in branch 'main':
    bpo-45001: Make email date parsing more robust against malformed input (GH-27946)
    989f6a3

    @miss-islington
    Contributor

    New changeset 9a79242 by Miss Islington (bot) in branch '3.10':
    bpo-45001: Make email date parsing more robust against malformed input (GH-27946)
    9a79242

    @ambv
    Contributor

    ambv commented Aug 26, 2021

    New changeset 2cdbd3b by Miss Islington (bot) in branch '3.9':
    bpo-45001: Make email date parsing more robust against malformed input (GH-27946) (GH-27973)
    2cdbd3b

    @ambv
    Contributor

    ambv commented Aug 26, 2021

    New changeset 81148c6 by Miss Islington (bot) in branch '3.8':
    bpo-45001: Make email date parsing more robust against malformed input (GH-27946) (GH-27974)
    81148c6

    @ambv added the 3.7 (EOL) and 3.8 labels on Aug 26, 2021
    @ned-deily
    Member

    New changeset e9b85af by Miss Islington (bot) in branch '3.7':
    bpo-45001: Make email date parsing more robust against malformed input (GH-27946) (GH-27975)
    e9b85af

    @ned-deily
    Member

    New changeset da9d6c5 by Miss Islington (bot) in branch '3.6':
    bpo-45001: Make email date parsing more robust against malformed input (GH-27946) (GH-27976)
    da9d6c5

    @ned-deily
    Member

    Thanks for the PR!

    wbolster added a commit to wbolster/cpython that referenced this issue Jul 22, 2022
    Similar to bpo-45001 (pythonGH-89164), this makes email date parsing more
    robust against malformed input. parsedate_tz() is supposed to return
    None for malformed input, but could crash on certain inputs, e.g.
    
        >>> email.utils.parsedate_tz('17 June , 2022')
        IndexError: string index out of range
    
    Fixes pythongh-95087.
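
For code that must also run on interpreters without these fixes, one option is a defensive wrapper around `email.utils.parsedate_tz()`. The sketch below is an assumption-laden illustration, not part of CPython: the helper name `parsedate_tz_or_none` is hypothetical, and the set of caught exceptions is a guess at what unpatched versions may raise for malformed input.

```python
import email.utils


def parsedate_tz_or_none(value):
    """Hypothetical helper: parse a date header value, returning None on failure."""
    if not isinstance(value, str) or not value.strip():
        # Missing, empty, or whitespace-only header value.
        return None
    try:
        return email.utils.parsedate_tz(value)
    except (IndexError, ValueError):
        # Unpatched interpreters may raise here instead of returning None.
        return None


print(parsedate_tz_or_none(" "))               # None
print(parsedate_tz_or_none("17 June , 2022"))  # None
```

On a patched interpreter the `except` branch should never trigger for these inputs, so the wrapper only adds the up-front whitespace check.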