Issue34220
Created on 2018-07-25 12:19 by altvod, last changed 2018-07-25 14:51 by r.david.murray. This issue is now closed.
Messages (3) | |||
---|---|---|---|
msg322348 - (view) | Author: Grigory Statsenko (altvod) | Date: 2018-07-25 12:19 | |
I have the following code that creates a simple email message with a) a pure-ASCII subject, b) a non-ASCII subject (I made it into a unittest):

    import email.generator
    import email.policy
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText
    from unittest import TestCase


    def create_message(subject, sender, recipients, body):
        msg = MIMEMultipart()
        msg.set_charset('utf-8')
        msg.policy = email.policy.SMTP
        msg.attach(MIMEText(body, 'html'))
        msg['Subject'] = subject
        msg['From'] = sender
        msg['To'] = ';'.join(recipients)
        return msg


    class TestEmailMessage(TestCase):
        def _make_message(self, subject):
            return create_message(
                subject=subject,
                sender='me@site.com',
                recipients=['me@site.com'],
                body='Some text',
            )

        def test_ascii_message_no_len_limit(self):
            # very long subject consisting of a single word
            subject = 'Q' * 100
            msg = self._make_message(subject)
            self.assertTrue(str(msg))

        def test_non_ascii_message_no_len_limit(self):
            # very long subject consisting of a single word
            subject = 'Ц' * 100
            msg = self._make_message(subject)
            self.assertTrue(str(msg))

The ASCII one passes, while the non-ASCII version fails with the following exception:

    Traceback (most recent call last):
      File "/home/grigory/.pyenv/versions/3.6.4/lib/python3.6/unittest/case.py", line 59, in testPartExecutor
        yield
      File "/home/grigory/.pyenv/versions/3.6.4/lib/python3.6/unittest/case.py", line 605, in run
        testMethod()
      File "/home/grigory/PycharmProjects/smtptest/test_message.py", line 36, in test_non_ascii_message_no_len_limit
        self.assertTrue(str(msg))
      File "/home/grigory/.pyenv/versions/3.6.4/lib/python3.6/email/message.py", line 135, in __str__
        return self.as_string()
      File "/home/grigory/.pyenv/versions/3.6.4/lib/python3.6/email/message.py", line 158, in as_string
        g.flatten(self, unixfrom=unixfrom)
      File "/home/grigory/.pyenv/versions/3.6.4/lib/python3.6/email/generator.py", line 116, in flatten
        self._write(msg)
      File "/home/grigory/.pyenv/versions/3.6.4/lib/python3.6/email/generator.py", line 195, in _write
        self._write_headers(msg)
      File "/home/grigory/.pyenv/versions/3.6.4/lib/python3.6/email/generator.py", line 222, in _write_headers
        self.write(self.policy.fold(h, v))
      File "/home/grigory/.pyenv/versions/3.6.4/lib/python3.6/email/policy.py", line 183, in fold
        return self._fold(name, value, refold_binary=True)
      File "/home/grigory/.pyenv/versions/3.6.4/lib/python3.6/email/policy.py", line 205, in _fold
        return value.fold(policy=self)
      File "/home/grigory/.pyenv/versions/3.6.4/lib/python3.6/email/headerregistry.py", line 258, in fold
        return header.fold(policy=policy)
      File "/home/grigory/.pyenv/versions/3.6.4/lib/python3.6/email/_header_value_parser.py", line 144, in fold
        return _refold_parse_tree(self, policy=policy)
      File "/home/grigory/.pyenv/versions/3.6.4/lib/python3.6/email/_header_value_parser.py", line 2645, in _refold_parse_tree
        part.ew_combine_allowed, charset)
      File "/home/grigory/.pyenv/versions/3.6.4/lib/python3.6/email/_header_value_parser.py", line 2722, in _fold_as_ew
        first_part = to_encode[:text_space]
    TypeError: slice indices must be integers or None or have an __index__ method

The problem is that _fold_as_ew treats maxlen as an integer, but it can also have inf and None as valid values. In my case it's inf, but None can also get there if the HTTP email policy is used and its max_line_length value is not overridden when serializing.

I suppose that the correct behavior in both of these cases should be no wrapping at all. And/or maybe one of these (inf & None) should be converted to the other at some point, so only one special case has to be handled in the low-level code.

This behavior is new in Python 3.6. It works in 3.5. It also fails in 3.7 and 3.8. |
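The failing line in the traceback is a slice whose bound is derived from maxlen, so the error can be reproduced without the email package at all. The sketch below shows that, plus one possible way to collapse the "no limit" values into a single integer before slicing. `normalize_maxlen` is a hypothetical helper written for illustration; it is not a function in the email package and is not necessarily how CPython resolved the issue.

```python
import math
import sys

text = 'Ц' * 100

# A float slice bound raises TypeError even when the float is infinite.
try:
    text[:float('inf')]
except TypeError as exc:
    error = str(exc)


def normalize_maxlen(max_line_length):
    """Hypothetical helper: map the 'no limit' values to a large int.

    None, 0 and inf are all treated as 'do not wrap'; any other value is
    returned as an int so that it is always safe to use as a slice bound.
    """
    if not max_line_length or max_line_length == math.inf:
        return sys.maxsize
    return int(max_line_length)


# With an integer bound the slice simply returns the whole string.
unwrapped = text[:normalize_maxlen(float('inf'))]
```

Using a very large integer instead of inf keeps the "no wrapping" meaning while leaving only one special case for the low-level code to handle.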
|||
msg322356 - (view) | Author: Karthikeyan Singaravelan (xtreak) * | Date: 2018-07-25 14:22 | |
I took all the commits made to Lib/email from 3.5 to the latest 3.6 branch with `git log --oneline --format="%h" upstream/3.5..upstream/3.6 Lib/email > commits.txt`

I could see that the test fails with a87ba60 and passes with d94ef8f. It probably has something to do with a87ba60fe56ae2ebe80ab9ada6d280a6a1f3d552, which rewrote the email header folding algorithm, as I can see from the issue https://bugs.python.org/issue27240

    cpython git:(master) ✗ ./python
    Python 3.8.0a0 (heads/bpo34193-dirty:bfdde5a, Jul 25 2018, 07:51:50)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>>

    # commit a87ba60
    cpython git:(master) $ git checkout a87ba60 Lib/email && ./python -m unittest bpo34220.py && git reset --quiet HEAD . && git checkout .
    .E
    ======================================================================
    ERROR: test_non_ascii_message_no_len_limit (bpo34220.TestEmailMessage)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/cpython/bpo34220.py", line 35, in test_non_ascii_message_no_len_limit
        self.assertTrue(str(msg))
      File "/home/cpython/Lib/email/message.py", line 135, in __str__
        return self.as_string()
      File "/home/cpython/Lib/email/message.py", line 158, in as_string
        g.flatten(self, unixfrom=unixfrom)
      File "/home/cpython/Lib/email/generator.py", line 116, in flatten
        self._write(msg)
      File "/home/cpython/Lib/email/generator.py", line 195, in _write
        self._write_headers(msg)
      File "/home/cpython/Lib/email/generator.py", line 222, in _write_headers
        self.write(self.policy.fold(h, v))
      File "/home/cpython/Lib/email/policy.py", line 183, in fold
        return self._fold(name, value, refold_binary=True)
      File "/home/cpython/Lib/email/policy.py", line 205, in _fold
        return value.fold(policy=self)
      File "/home/cpython/Lib/email/headerregistry.py", line 258, in fold
        return header.fold(policy=policy)
      File "/home/cpython/Lib/email/_header_value_parser.py", line 144, in fold
        return _refold_parse_tree(self, policy=policy)
      File "/home/cpython/Lib/email/_header_value_parser.py", line 2645, in _refold_parse_tree
        part.ew_combine_allowed, charset)
      File "/home/cpython/Lib/email/_header_value_parser.py", line 2722, in _fold_as_ew
        first_part = to_encode[:text_space]
    TypeError: slice indices must be integers or None or have an __index__ method

    ----------------------------------------------------------------------
    Ran 2 tests in 0.022s

    FAILED (errors=1)

    # commit d94ef8f
    cpython git:(master) $ git checkout d94ef8f Lib/email && ./python -m unittest bpo34220.py && git reset --quiet HEAD . && git checkout .
    ..
    ----------------------------------------------------------------------
    Ran 2 tests in 0.017s

    OK

I hope the above approach is correct and that there are no C code related changes that would require recompiling Python. Thanks |
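The per-commit check above can also be reduced to a single function that reports whether serialization succeeds, which is easier to run repeatedly than the full unittest file. This is only a sketch based on the reporter's test case; `subject_folds` is a hypothetical name, and the sender and recipient headers are left out on the assumption that only the Subject header matters for the failure.

```python
import email.policy
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def subject_folds(subject):
    """Return True if a message with this subject serializes without TypeError."""
    msg = MIMEMultipart()
    msg.policy = email.policy.SMTP
    msg.attach(MIMEText('Some text', 'html'))
    msg['Subject'] = subject
    try:
        # str(msg) goes through as_string(), the call that fails in the traceback.
        str(msg)
    except TypeError:
        return False
    return True


ascii_ok = subject_folds('Q' * 100)
non_ascii_ok = subject_folds('Ц' * 100)
```

A script that exits with status 0 when `non_ascii_ok` is true and 1 otherwise could be passed to `git bisect run` instead of checking out each commit by hand, although that was not the approach used here.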
|||
msg322360 - (view) | Author: R. David Murray (r.david.murray) * | Date: 2018-07-25 14:51 | |
Thanks for the report. This is a duplicate of #33524. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2018-07-25 14:51:04 | r.david.murray | set | status: open -> closed superseder: non-ascii characters in headers causes TypeError on email.policy.Policy.fold messages: + msg322360 resolution: duplicate stage: resolved |
2018-07-25 14:22:17 | xtreak | set | messages: + msg322356 |
2018-07-25 12:56:19 | xtreak | set | nosy: + xtreak |
2018-07-25 12:19:20 | altvod | create |