Exception when parsing an email using email.parser.BytesParser #67933

Closed
Elmer mannequin opened this issue Mar 23, 2015 · 3 comments
Labels
topic-email, type-bug

Comments

@Elmer
Mannequin

Elmer mannequin commented Mar 23, 2015

BPO 23745
Nosy @warsaw, @bitdancer
Files
  • testmail.eml: email source that triggers exception
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2015-03-30.01:57:00.478>
    created_at = <Date 2015-03-23.07:43:43.175>
    labels = ['type-bug', 'expert-email']
    title = 'Exception when parsing an email using email.parser.BytesParser'
    updated_at = <Date 2015-03-30.01:57:00.477>
    user = 'https://bugs.python.org/Elmer'

    bugs.python.org fields:

    activity = <Date 2015-03-30.01:57:00.477>
    actor = 'r.david.murray'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-03-30.01:57:00.478>
    closer = 'r.david.murray'
    components = ['email']
    creation = <Date 2015-03-23.07:43:43.175>
    creator = 'Elmer'
    dependencies = []
    files = ['38647']
    hgrepos = []
    issue_num = 23745
    keywords = []
    message_count = 3.0
    messages = ['238987', '239555', '239557']
    nosy_count = 4.0
    nosy_names = ['barry', 'r.david.murray', 'python-dev', 'Elmer']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue23745'
    versions = ['Python 3.4']

    @Elmer
    Mannequin Author

    Elmer mannequin commented Mar 23, 2015

    I am working with a large dataset of emails and loading one of them resulted in an exception: "TypeError: unorderable types: ValueTerminal() < CFWSList()"

    I have attached the (anonymised and minimised) email source of the email that triggered the exception.

    $ python
    Python 3.4.2 (default, Nov 12 2014, 18:23:59) 
    [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.54)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import email
    >>> from email import parser, policy
    >>> 
    >>> f = open("testmail.eml",'rb')
    >>> src = f.read()
    >>> f.close()
    >>> 
    >>> msg = email.parser.BytesParser(_class=email.message.EmailMessage, policy=email.policy.default).parsebytes(src)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/parser.py", line 124, in parsebytes
        return self.parser.parsestr(text, headersonly)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/parser.py", line 68, in parsestr
        return self.parse(StringIO(text), headersonly=headersonly)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/parser.py", line 57, in parse
        feedparser.feed(data)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/feedparser.py", line 178, in feed
        self._call_parse()
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/feedparser.py", line 182, in _call_parse
        self._parse()
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/feedparser.py", line 384, in _parsegen
        for retval in self._parsegen():
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/feedparser.py", line 255, in _parsegen
        if self._cur.get_content_type() == 'message/delivery-status':
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/message.py", line 579, in get_content_type
        value = self.get('content-type', missing)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/message.py", line 472, in get
        return self.policy.header_fetch_parse(k, v)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/policy.py", line 145, in header_fetch_parse
        return self.header_factory(name, ''.join(value.splitlines()))
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/headerregistry.py", line 583, in __call__
        return self[name](name, value)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/headerregistry.py", line 194, in __new__
        cls.parse(value, kwds)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/headerregistry.py", line 441, in parse
        kwds['decoded'] = str(parse_tree)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/_header_value_parser.py", line 195, in __str__
        return ''.join(str(x) for x in self)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/_header_value_parser.py", line 195, in <genexpr>
        return ''.join(str(x) for x in self)
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/_header_value_parser.py", line 1136, in __str__
        for name, value in self.params:
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/_header_value_parser.py", line 1101, in params
        parts = sorted(parts)
    TypeError: unorderable types: ValueTerminal() < CFWSList()

    Elmer mannequin added the topic-email and type-bug labels Mar 23, 2015
    @python-dev
    Mannequin

    python-dev mannequin commented Mar 30, 2015

    New changeset dc10c52c6539 by R David Murray in branch '3.4':
    bpo-23745: handle duplicate MIME parameter names in new parser.
    https://hg.python.org/cpython/rev/dc10c52c6539

    New changeset fe9a578d5f38 by R David Murray in branch 'default':
    Merge: bpo-23745: handle duplicate MIME parameter names in new parser.
    https://hg.python.org/cpython/rev/fe9a578d5f38

    @bitdancer
    Member

    The issue arose from the duplicated parameter name. I fixed it by (mostly) copying the error recovery used by the older API (get_param).
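    For anyone who wants to check their own interpreter, here is a minimal sketch of the kind of input involved. The attached testmail.eml is not reproduced here, so the header below is a hypothetical stand-in that simply repeats the charset parameter; the helper name parse_content_type is also just for illustration. On a Python version that includes the fix, the call should return instead of raising the TypeError shown in the traceback.

```python
from email import policy
from email.parser import BytesParser

# Hypothetical minimal message (not the reporter's testmail.eml):
# the "charset" parameter name appears twice in Content-Type.
RAW = (
    b'Content-Type: text/plain; charset="us-ascii"; charset="utf-8"\r\n'
    b"\r\n"
    b"body\r\n"
)


def parse_content_type(raw):
    """Parse raw bytes with the new-style policy and return the
    content type plus the parsed Content-Type parameters."""
    msg = BytesParser(policy=policy.default).parsebytes(raw)
    # Accessing the header forces the structured header parser to run,
    # which is where the exception in this report was raised.
    return msg.get_content_type(), dict(msg["Content-Type"].params)


content_type, params = parse_content_type(RAW)
print(content_type, params)
```

    Which of the duplicated values survives is an error-recovery detail of the parser, so the sketch only prints the result rather than relying on a particular value.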

    Note that you don't need to specify both policy and _class. If you use the new policies (such as default), it automatically uses EmailMessage for the _class.
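    A short sketch of that simplification, assuming a current Python 3 release (the sample message bytes are made up for illustration): passing only policy=email.policy.default should already yield an EmailMessage, so the _class argument can be dropped.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser

# Made-up sample message, used only to show the returned type.
SAMPLE = b"Subject: hello\r\n\r\nbody\r\n"

# No _class argument: the policy decides which message class is created.
msg = BytesParser(policy=policy.default).parsebytes(SAMPLE)

print(type(msg).__name__)
print(isinstance(msg, EmailMessage))
```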

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022