Exception parsing invalid email address headers starting or ending with dot #75171

Closed
timb07 mannequin opened this issue Jul 22, 2017 · 5 comments
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes stdlib Python modules in the Lib dir topic-email type-bug An unexpected behavior, bug, or error

Comments

@timb07

timb07 mannequin commented Jul 22, 2017

BPO 30988
Nosy @warsaw, @ncoghlan, @kfogel, @bitdancer, @timb07, @cnicodeme, @malvidin, @iritkatriel
PRs
  • gh-75171: Fix parsing address headers with dots start/end display name #2811
  • gh-75171: Fix parsing invalid email address headers starting or ending with a dot #15600
  • bpo-30988: Add InvalidHeaderDefect for Trailing Periods #18687
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2017-07-22.04:50:02.706>
    labels = ['3.11', 'type-bug', 'expert-email', '3.10', '3.9']
    title = 'Exception parsing invalid email address headers starting or ending with dot'
    updated_at = <Date 2021-12-06.11:29:24.562>
    user = 'https://github.com/timb07'

    bugs.python.org fields:

    activity = <Date 2021-12-06.11:29:24.562>
    actor = 'iritkatriel'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['email']
    creation = <Date 2017-07-22.04:50:02.706>
    creator = 'timb07'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 30988
    keywords = ['patch']
    message_count = 4.0
    messages = ['298836', '357638', '383564', '407789']
    nosy_count = 8.0
    nosy_names = ['barry', 'ncoghlan', 'kfogel', 'r.david.murray', 'timb07', 'cnicodeme', 'Steven Hilton', 'iritkatriel']
    pr_nums = ['2811', '15600', '18687']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue30988'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']


    Email addresses whose display name starts with a dot ("."), or ends with a dot with no whitespace before the angle bracket, trigger an exception when the header is accessed on a message object created with the "default" policy.

    For example:

    >>> import email
    >>> from email.policy import default
    >>> email.message_from_bytes(b'To: . Doe <jxd@example.com>')['to']
    '. Doe <jxd@example.com>'
    >>> email.message_from_bytes(b'To: . Doe <jxd@example.com>', policy=default)['to']
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/bhat/git/cpython/Lib/email/message.py", line 391, in __getitem__
        return self.get(name)
      File "/Users/bhat/git/cpython/Lib/email/message.py", line 471, in get
        return self.policy.header_fetch_parse(k, v)
      File "/Users/bhat/git/cpython/Lib/email/policy.py", line 162, in header_fetch_parse
        return self.header_factory(name, value)
      File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 586, in __call__
        return self[name](name, value)
      File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 197, in __new__
        cls.parse(value, kwds)
      File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 344, in parse
        for mb in addr.all_mailboxes]))
      File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 344, in <listcomp>
        for mb in addr.all_mailboxes]))
      File "/Users/bhat/git/cpython/Lib/email/_header_value_parser.py", line 834, in display_name
        return self[0].display_name
      File "/Users/bhat/git/cpython/Lib/email/_header_value_parser.py", line 768, in display_name
        return self[0].display_name
      File "/Users/bhat/git/cpython/Lib/email/_header_value_parser.py", line 931, in display_name
        if res[0][0].token_type == 'cfws':
    AttributeError: 'str' object has no attribute 'token_type'
    >>>
    >>> email.message_from_bytes(b'To: John X.<jxd@example.com>')['to']
    'John X.<jxd@example.com>'
    >>> email.message_from_bytes(b'To: John X.<jxd@example.com>', policy=default)['to']
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/bhat/git/cpython/Lib/email/message.py", line 391, in __getitem__
        return self.get(name)
      File "/Users/bhat/git/cpython/Lib/email/message.py", line 471, in get
        return self.policy.header_fetch_parse(k, v)
      File "/Users/bhat/git/cpython/Lib/email/policy.py", line 162, in header_fetch_parse
        return self.header_factory(name, value)
      File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 586, in __call__
        return self[name](name, value)
      File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 197, in __new__
        cls.parse(value, kwds)
      File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 344, in parse
        for mb in addr.all_mailboxes]))
      File "/Users/bhat/git/cpython/Lib/email/headerregistry.py", line 344, in <listcomp>
        for mb in addr.all_mailboxes]))
      File "/Users/bhat/git/cpython/Lib/email/_header_value_parser.py", line 834, in display_name
        return self[0].display_name
      File "/Users/bhat/git/cpython/Lib/email/_header_value_parser.py", line 768, in display_name
        return self[0].display_name
      File "/Users/bhat/git/cpython/Lib/email/_header_value_parser.py", line 936, in display_name
        if res[-1][-1].token_type == 'cfws':
    AttributeError: 'str' object has no attribute 'token_type'
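The two cases above can be checked in one script. This is a minimal sketch and not part of the original report: `fetch_to` is a hypothetical helper, and it reports whichever outcome the running interpreter produces, since an interpreter that already includes a fix may return a value instead of raising.

```python
import email
from email.policy import compat32, default

SAMPLES = [
    b'To: . Doe <jxd@example.com>',   # display name starts with a dot
    b'To: John X.<jxd@example.com>',  # display name ends with a dot, no space before "<"
]


def fetch_to(raw, policy):
    """Return the To header as a string, or a description of the AttributeError."""
    msg = email.message_from_bytes(raw, policy=policy)
    try:
        return str(msg['to'])
    except AttributeError as exc:
        # This is the failure shown in the tracebacks above.
        return 'AttributeError: %s' % exc


for raw in SAMPLES:
    # compat32 is the policy used when none is passed explicitly.
    print('compat32:', fetch_to(raw, compat32))
    print('default: ', fetch_to(raw, default))
```

With the compat32 policy the header comes back as the plain string from the message, so only the `default` lines are expected to differ between interpreters.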

    @timb07 timb07 mannequin added 3.7 (EOL) end of life topic-email type-bug An unexpected behavior, bug, or error labels Jul 22, 2017
    @cnicodeme

    cnicodeme mannequin commented Nov 29, 2019

    Hi!

    I confirm this problem too, also with the SMTPUTF8 policy.

    I was able to reproduce this error on my end (Python v3.7.5).

    Note that when calling message_from_bytes without a policy, there are no errors.
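Because the lookup only fails when the policy parses the header into a structured object, one possible stopgap on an affected interpreter is to fall back to the unparsed value. This is a sketch under that assumption: `safe_get` is a hypothetical helper, not an email-package API. It relies on `Message.raw_items()`, which yields the stored (name, value) pairs without passing them through the policy's header parser.

```python
import email
from email.policy import SMTPUTF8, default


def safe_get(msg, name):
    """Return msg[name] as a string, falling back to the unparsed header value."""
    try:
        value = msg[name]
        return None if value is None else str(value)
    except AttributeError:
        # Structured parsing failed; return the value as stored on the message.
        for key, raw in msg.raw_items():
            if key.lower() == name.lower():
                return str(raw)
        raise


for policy in (default, SMTPUTF8):
    msg = email.message_from_bytes(b'To: . Doe <jxd@example.com>', policy=policy)
    print(safe_get(msg, 'to'))
```

The fallback returns the raw text only, so callers that need `addresses` or `display_name` still have to handle the unparsed case themselves.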

    @kfogel

    kfogel mannequin commented Dec 22, 2020

    I can also confirm this bug, with Python 3.9.1 on Debian GNU/Linux ('testing' distro up-to-date as of 2020-12-21).

    1. Create a parser p with p = email.parser.HeaderParser(policy=email.policy.default).

    2. Parse a single problematic (as described below) message: msg_headers = p.parsestr(msg_str)

    3. Try to get the To field with msg_headers.get_all('to', []); this raises AttributeError: 'str' object has no attribute 'token_type' (with the same stack trace the OP shows).

    Here is a minimal problematic message you can reproduce this with (i.e., just make msg_str have this string value):

    From nobody Mon Dec 21 12:00:00  2020
    From: sender@example.com
    To: . <jrandom@example.com>
    Date: Mon, 21 Dec 2020 12:00:00 -0000
    Message-ID: <87ab5rvds7.fsf@example.com>
    Subject: This is the Subject header.
    
    Here is the body of the message.
    

    Note that *any* number of dots for the recipient's name would also result in an error. The above example uses just ".", but it could be "..", "...", ".................", etc.
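The three steps and the sample message can be combined into one script that tries several dot counts. This is a sketch only: `MESSAGE_TEMPLATE` and `check_display_name` are names made up for illustration, and the script prints whatever the running interpreter does rather than assuming the error still occurs.

```python
import email.parser
import email.policy

# The sample message from above, with the display name left as a placeholder.
MESSAGE_TEMPLATE = (
    "From nobody Mon Dec 21 12:00:00  2020\n"
    "From: sender@example.com\n"
    "To: {name} <jrandom@example.com>\n"
    "Date: Mon, 21 Dec 2020 12:00:00 -0000\n"
    "Message-ID: <87ab5rvds7.fsf@example.com>\n"
    "Subject: This is the Subject header.\n"
    "\n"
    "Here is the body of the message.\n"
)


def check_display_name(name):
    """Run steps 1-3 for one display name; return 'ok' or the error text."""
    # Step 1: create the parser with the default policy.
    parser = email.parser.HeaderParser(policy=email.policy.default)
    # Step 2: parse the message text.
    msg_headers = parser.parsestr(MESSAGE_TEMPLATE.format(name=name))
    # Step 3: fetch the To field, which is where the parsing happens.
    try:
        msg_headers.get_all('to', [])
    except AttributeError as exc:
        return 'AttributeError: %s' % exc
    return 'ok'


for dots in ('.', '..', '...', '.' * 17):
    print(repr(dots), '->', check_display_name(dots))
```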

    @iritkatriel
    Member

    Reproduced on 3.11.

    @iritkatriel iritkatriel added 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes and removed 3.7 (EOL) end of life labels Dec 6, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @wmfidelis

    I had the same issue and converting it to a tuple solved the issue for me.

    def sanitize_address(addr, encoding):
        """
        Format a pair of (name, address) or an email address string.
        """
        address = None
        if not isinstance(addr, tuple):
            addr = force_str(addr)
            try:
                token, rest = parser.get_mailbox(addr)
            except (HeaderParseError, ValueError, IndexError):
                raise ValueError('Invalid address "%s"' % addr)
            else:
                if rest:
                    # The entire email address must be parsed.
                    raise ValueError(
                        'Invalid address; only %s could be parsed from "%s"' % (token, addr)
                    )
                nm = token.display_name or ''
                localpart = token.local_part
                domain = token.domain or ''
        else:
            nm, address = addr
            localpart, domain = address.rsplit('@', 1)

    This function in Django let me pass the name and address as a tuple, which avoided the errors I was getting with special characters such as a dot, a comma, or an @ sign.
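The standard library has a rough analogue of the tuple approach for headers you build yourself: `email.utils.formataddr` takes a (name, address) pair and wraps the display name in double quotes when it contains special characters such as a dot. This sketch is not the Django code quoted above and only illustrates the same idea; it does not help with headers that arrive already malformed.

```python
from email.message import EmailMessage
from email.utils import formataddr

# The name and the address are kept apart, so nothing has to be parsed
# out of a combined string.
pair = ('. Doe', 'jxd@example.com')

# formataddr quotes the display name because it contains a dot.
header_value = formataddr(pair)
print(header_value)

msg = EmailMessage()        # EmailMessage uses the "default" policy
msg['To'] = header_value    # stored and parsed as a structured address header
print(msg['To'].addresses[0].addr_spec)
```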
    

    @iritkatriel iritkatriel added the stdlib Python modules in the Lib dir label Nov 23, 2023
    serhiy-storchaka added a commit that referenced this issue Apr 17, 2024
    …g with a dot (GH-15600)
    
    Co-authored-by: Tim Bell <timothybell@gmail.com>
    Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Apr 17, 2024
    … ending with a dot (pythonGH-15600)
    
    (cherry picked from commit 8cc9adb)
    
    Co-authored-by: tsufeki <tsufeki@ymail.com>
    Co-authored-by: Tim Bell <timothybell@gmail.com>
    Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
    serhiy-storchaka added a commit that referenced this issue Apr 17, 2024
    …r ending with a dot (GH-15600) (GH-117964)
    
    (cherry picked from commit 8cc9adb)
    
    Co-authored-by: tsufeki <tsufeki@ymail.com>
    Co-authored-by: Tim Bell <timothybell@gmail.com>
    Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
    diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
    … ending with a dot (pythonGH-15600)
    
    Co-authored-by: Tim Bell <timothybell@gmail.com>
    Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>