Message308353
This is related to https://bugs.python.org/issue27321 but a different exception is thrown for a different reason. This is caused by a defective spam message. I don't actually have the offending message from the wild, but the attached bad_email_2.eml illustrates the problem.
The defect is that the message declares its content charset as us-ascii, but the body contains non-ASCII characters. When the message is parsed into an email.message.Message object and the object's as_string() method is called, UnicodeEncodeError is raised as follows:
>>> import email
>>> with open('bad_email_2.eml', 'rb') as fp:
...     msg = email.message_from_binary_file(fp)
...
>>> msg.as_string()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/email/message.py", line 159, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/usr/lib/python3.5/email/generator.py", line 115, in flatten
    self._write(msg)
  File "/usr/lib/python3.5/email/generator.py", line 181, in _write
    self._dispatch(msg)
  File "/usr/lib/python3.5/email/generator.py", line 214, in _dispatch
    meth(msg)
  File "/usr/lib/python3.5/email/generator.py", line 243, in _handle_text
    msg.set_payload(payload, charset)
  File "/usr/lib/python3.5/email/message.py", line 316, in set_payload
    payload = payload.encode(charset.output_charset)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 31-33: ordinal not in range(128)
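For anyone without the attachment, the following is a minimal sketch of how a similar message might be built in memory and flattened defensively. The RAW bytes are an invented stand-in, not the contents of bad_email_2.eml, and safe_as_string is a hypothetical helper name, not part of the email package. It assumes that as_bytes() can still flatten such a message under the default policy; whether as_string() itself raises may depend on the Python version.

```python
import email

# Hypothetical stand-in for bad_email_2.eml: the header declares us-ascii,
# but the body carries raw non-ASCII bytes.
RAW = (
    b"From: sender@example.com\r\n"
    b"To: rcpt@example.com\r\n"
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: text/plain; charset="us-ascii"\r\n'
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"Body with non-ASCII bytes: \xe2\x80\x9c here\r\n"
)


def safe_as_string(msg):
    """Return the flattened message as str, tolerating a bad charset.

    Hypothetical helper: try as_string() first; if it raises
    UnicodeEncodeError, fall back to as_bytes() and decode as ASCII,
    substituting U+FFFD for each undecodable byte.
    """
    try:
        return msg.as_string()
    except UnicodeEncodeError:
        return msg.as_bytes().decode("ascii", errors="replace")


msg = email.message_from_bytes(RAW)
text = safe_as_string(msg)
print(text)
```

The fallback loses the original non-ASCII bytes in the returned string, so it is only suitable where a lossy text rendering is acceptable; callers that need the exact bytes could use as_bytes() directly.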
Date | User | Action | Args
2017-12-15 00:25:27 | msapiro | set | recipients: + msapiro, barry, r.david.murray
2017-12-15 00:25:27 | msapiro | set | messageid: <1513297527.86.0.213398074469.issue32330@psf.upfronthosting.co.za>
2017-12-15 00:25:27 | msapiro | link | issue32330 messages
2017-12-15 00:25:27 | msapiro | create |