Exception when parsing an email using email.parser.BytesParser #67933
I am working with a large dataset of emails, and loading one of them raised an exception: "TypeError: unorderable types: ValueTerminal() < CFWSList()". I have attached the (anonymised and minimised) source of the email that triggered the exception.
$ python
Python 3.4.2 (default, Nov 12 2014, 18:23:59)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.54)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import email
>>> from email import parser, policy
>>>
>>> f = open("testmail.eml",'rb')
>>> src = f.read()
>>> f.close()
>>>
>>> msg = email.parser.BytesParser(_class=email.message.EmailMessage, policy=email.policy.default).parsebytes(src)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/parser.py", line 124, in parsebytes
    return self.parser.parsestr(text, headersonly)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/parser.py", line 68, in parsestr
    return self.parse(StringIO(text), headersonly=headersonly)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/parser.py", line 57, in parse
    feedparser.feed(data)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/feedparser.py", line 178, in feed
    self._call_parse()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/feedparser.py", line 182, in _call_parse
    self._parse()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/feedparser.py", line 384, in _parsegen
    for retval in self._parsegen():
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/feedparser.py", line 255, in _parsegen
    if self._cur.get_content_type() == 'message/delivery-status':
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/message.py", line 579, in get_content_type
    value = self.get('content-type', missing)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/message.py", line 472, in get
    return self.policy.header_fetch_parse(k, v)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/policy.py", line 145, in header_fetch_parse
    return self.header_factory(name, ''.join(value.splitlines()))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/headerregistry.py", line 583, in __call__
    return self[name](name, value)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/headerregistry.py", line 194, in __new__
    cls.parse(value, kwds)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/headerregistry.py", line 441, in parse
    kwds['decoded'] = str(parse_tree)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/_header_value_parser.py", line 195, in __str__
    return ''.join(str(x) for x in self)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/_header_value_parser.py", line 195, in <genexpr>
    return ''.join(str(x) for x in self)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/_header_value_parser.py", line 1136, in __str__
    for name, value in self.params:
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/email/_header_value_parser.py", line 1101, in params
    parts = sorted(parts)
TypeError: unorderable types: ValueTerminal() < CFWSList()
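The last frame fails inside `sorted(parts)`. As a rough illustration of that failure mode, here is a minimal sketch that assumes `parts` holds tuples whose first elements can tie, so that Python falls back to comparing the second elements. `Token` is a hypothetical stand-in for the parser's token classes, not the stdlib type, and sorting on the first element only is just one way to avoid the comparison, not necessarily what the library does.

```python
from operator import itemgetter


class Token:
    """Hypothetical stand-in for a parser token that defines no ordering."""

    def __init__(self, text):
        self.text = text


# Two entries with the same leading number: tuple comparison ties on the
# first element and then tries to compare the Token objects themselves.
parts = [(0, Token("first")), (0, Token("second"))]

try:
    sorted(parts)
    plain_sort_failed = False
except TypeError:
    plain_sort_failed = True

# Sorting on the first element only never compares the tokens; the sort is
# stable, so tied entries keep their original order.
ordered = sorted(parts, key=itemgetter(0))
ordered_texts = [token.text for _, token in ordered]
```

With a key function the tokens are never compared, which is why the tie no longer raises.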
New changeset dc10c52c6539 by R David Murray in branch '3.4'.
New changeset fe9a578d5f38 by R David Murray in branch 'default'.
The issue arose from the duplicated parameter name. I fixed it by (mostly) copying the error recovery used by the older API (get_param). Note that you don't need to specify both policy and _class: if you use one of the new policies (such as default), the parser automatically uses EmailMessage for _class.
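A minimal sketch of both points, intended for a Python version that includes the fix. The message bytes are hypothetical: the attached email is not reproduced here, so the duplicated `charset` parameter is only an assumption about what a "duplicated parameter name" might look like.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser

# Hypothetical minimal message with a duplicated Content-Type parameter.
raw = (
    b"From: sender@example.com\r\n"
    b"Subject: duplicate parameter demo\r\n"
    b"Content-Type: text/plain; charset=us-ascii; charset=utf-8\r\n"
    b"\r\n"
    b"Hello\r\n"
)

# No _class argument: with policy.default the parser picks EmailMessage.
msg = BytesParser(policy=policy.default).parsebytes(raw)

# Fetching the header goes through the same code path as in the traceback;
# on a fixed interpreter it should return instead of raising TypeError.
content_type = msg.get_content_type()
params = msg["Content-Type"].params

print(type(msg).__name__, content_type, dict(params))
```

On an unfixed 3.4 interpreter the `parsebytes` call itself would be expected to raise, since the parser fetches the Content-Type header while parsing.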