Issue34222
Created on 2018-07-25 12:56 by altvod, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 8990 | closed | python-dev, 2018-08-29 11:06 |
Messages (4) | |||
---|---|---|---|
msg322351 - (view) | Author: Grigory Statsenko (altvod) | Date: 2018-07-25 12:56 | |
(Discovered together with https://bugs.python.org/msg322348)

Email message serialization (in function `_fold_as_ew`) enters an infinite loop when folding non-ASCII headers whose words (after encoding) are longer than the given maxlen. Besides being stuck in an infinite loop, it keeps appending to the `lines` list, so memory usage also grows without bound. The code keeps appending encoded empty strings to the list like this:

```
lines: [
    'Subject: =?utf-8?q??=',
    ' =?utf-8?q??=',
    ' =?utf-8?q??=',
    ' =?utf-8?q??=',
    ' =?utf-8?q??=',
    ' =?utf-8?q??=',
    ' '
]
```

(and it keeps on growing)

Here is my code that can reproduce this issue (as a unittest):

```python
import email.generator
import email.policy
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from unittest import TestCase


def create_message(subject, sender, recipients, body):
    msg = MIMEMultipart()
    msg.set_charset('utf-8')
    msg.policy = email.policy.SMTP
    msg.attach(MIMEText(body, 'html'))
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = ';'.join(recipients)
    return msg


class TestEmailMessage(TestCase):
    def _make_message(self, subject):
        return create_message(
            subject=subject,
            sender='me@site.com',
            recipients=['me@site.com'],
            body='Some text',
        )

    def test_ascii_message_with_len_limit(self):
        # very long subject consisting of a single word
        subject = 'Q' * 100
        msg = self._make_message(subject)
        self.assertTrue(msg.as_string(maxheaderlen=76))

    def test_non_ascii_message_with_len_limit(self):
        # very long subject consisting of a single word
        subject = 'Ц' * 100
        msg = self._make_message(subject)
        self.assertTrue(msg.as_string(maxheaderlen=76))
```

The ASCII test passes, but the non-ASCII one never finishes.

From what I can tell, the problem is in line 2728 of email/_header_value_parser.py:

```python
first_part = first_part[:-excess]
```

Here `excess` is calculated from the encoded string (which is several times longer than the original one), but it is used to truncate the original, non-encoded string. The problem arises when `excess` is greater than the length of `first_part`: the slice then yields an empty string, so the code appends an empty encoded word (' =?utf-8?q??=') to the list and attempts to encode the exact same part of the header again, failing the same way in every iteration.

What this amounts to is that it is now practically impossible to send emails with non-ASCII subjects without either disregarding the RFC recommendations and requirements for line length or risking hangs and unbounded memory growth.

Just like in https://bugs.python.org/msg322348, this behavior is new in Python 3.6. It is also broken in 3.7 and 3.8. |
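To illustrate the mechanism described in this report, here is a minimal sketch. It is not the CPython code and not the fix that was eventually merged. The helper names `q_encoded_len`, `naive_truncate`, and `shrink_to_fit` are hypothetical, and the length estimate assumes a simplified q-encoding in which every byte that is not an ASCII letter or digit becomes three characters (`=XX`).

```python
def q_encoded_len(text, charset='utf-8'):
    """Estimate the length of text as one q-encoded word (simplified model)."""
    overhead = len(f'=?{charset}?q??=')  # the =?charset?q? ... ?= wrapper
    body = sum(
        1 if (b < 128 and chr(b).isalnum()) else 3  # other bytes become =XX
        for b in text.encode(charset)
    )
    return overhead + body


def naive_truncate(first_part, maxlen):
    """Mimic the reported pattern: excess is measured on the encoded form
    but applied as a slice to the unencoded string."""
    excess = q_encoded_len(first_part) - maxlen
    if excess > 0:
        # If excess exceeds len(first_part), this slice is the empty string.
        return first_part[:-excess]
    return first_part


def shrink_to_fit(first_part, maxlen):
    """One possible alternative (an assumption, not the merged fix):
    drop one character at a time until the encoded form fits."""
    while first_part and q_encoded_len(first_part) > maxlen:
        first_part = first_part[:-1]
    return first_part
```

With a subject of `'Ц' * 100` and a limit of 60, `naive_truncate` returns an empty string, which matches the empty encoded words seen in `lines`, whereas `shrink_to_fit` keeps a non-empty prefix so each iteration makes progress.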
|||
msg334564 - (view) | Author: Ivan Krivosheev (ikrivosheev) * | Date: 2019-01-30 15:47 | |
Hello Grigory. I am using your patch in my project, and I have run into some problems with your fix.

Source text:

```
Subject: test Венесуэла собирается пересмотреть стоимость заключенных с Россией контрактов на поставку вооружений, а также отношения с Москвой в целом. Об этом заявил назначенный оппозицией специальный представитель Венесуэлы при Организации американских государств (ОАГ) Густаво Тарре Брисеньо на выступлении в вашингтонском Центре стратегических и международных исследований, передает
```

Text encoded by Thunderbird:

```
Subject: =?UTF-8?B?dGVzdCDQktC10L3QtdGB0YPRjdC70LAg0YHQvtCx0LjRgNCw0LXRgtGB?=
 =?UTF-8?B?0Y8g0L/QtdGA0LXRgdC80L7RgtGA0LXRgtGMINGB0YLQvtC40LzQvtGB0YLRjCA=?=
 =?UTF-8?B?0LfQsNC60LvRjtGH0LXQvdC90YvRhSDRgSDQoNC+0YHRgdC40LXQuSDQutC+0L0=?=
 =?UTF-8?B?0YLRgNCw0LrRgtC+0LIg0L3QsCDQv9C+0YHRgtCw0LLQutGDINCy0L7QvtGA0YM=?=
 =?UTF-8?B?0LbQtdC90LjQuSwg0LAg0YLQsNC60LbQtSDQvtGC0L3QvtGI0LXQvdC40Y8g0YEg?=
 =?UTF-8?B?0JzQvtGB0LrQstC+0Lkg0LIg0YbQtdC70L7QvC4g0J7QsSDRjdGC0L7QvCDQt9Cw?=
 =?UTF-8?B?0Y/QstC40Lsg0L3QsNC30L3QsNGH0LXQvdC90YvQuSDQvtC/0L/QvtC30LjRhtC4?=
 =?UTF-8?B?0LXQuSDRgdC/0LXRhtC40LDQu9GM0L3Ri9C5INC/0YDQtdC00YHRgtCw0LLQuNGC?=
 =?UTF-8?B?0LXQu9GMINCS0LXQvdC10YHRg9GN0LvRiyDQv9GA0Lgg0J7RgNCz0LDQvdC40Lc=?=
 =?UTF-8?B?0LDRhtC40Lgg0LDQvNC10YDQuNC60LDQvdGB0LrQuNGFINCz0L7RgdGD0LTQsNGA?=
 =?UTF-8?B?0YHRgtCyICjQntCQ0JMpINCT0YPRgdGC0LDQstC+INCi0LDRgNGA0LUg0JHRgNC4?=
 =?UTF-8?B?0YHQtdC90YzQviDQvdCwINCy0YvRgdGC0YPQv9C70LXQvdC40Lgg0LIg0LLQsNGI?=
 =?UTF-8?B?0LjQvdCz0YLQvtC90YHQutC+0Lwg0KbQtdC90YLRgNC1INGB0YLRgNCw0YLQtdCz?=
 =?UTF-8?B?0LjRh9C10YHQutC40YUg0Lgg0LzQtdC20LTRg9C90LDRgNC+0LTQvdGL0YUg0Lg=?=
 =?UTF-8?B?0YHRgdC70LXQtNC+0LLQsNC90LjQuSwg0L/QtdGA0LXQtNCw0LXRgg==?=
```

Text after decoding and encoding in Python with your patch:

```
Subject: test =?utf-8?b?0JLQtdC90LXRgdGD0Y3Qu9CwINGB0L7QsdC40YDQsNC10YLRgdGP?=
 =?utf-8?b?0L/=?utf-8?q?QtdGA0LXRgdC80L7RgtGA0LXRgtGM=3F=3D_=D1=81=D1=82?=
 =?utf-8?b?0L7QuNC80L7RgdGC0Ywg0LfQsNC60LvRjtGH0LXQvdC90YvRhSDRgSDQoNC+0YE=?=
 =?utf-8?b?0YHQuNC10Lkg0LrQvtC90YLRgNCw0LrRgtC+0LIg0L3QsCDQv9C+0YHRgtCw0LI=?=
 =?utf-8?b?0LrRgyDQstC+0L7RgNGD0LbQtdC90LjQuSwg0LAg0YLQsNC60LbQtSDQvtGC0L0=?=
 =?utf-8?b?0L7RiNC10L3QuNGPINGBINCc0L7RgdC60LLQvtC5INCyINGG0LXQu9C+0LwuINCe?=
 =?utf-8?b?0LEg0Y3RgtC+0Lwg0LfQsNGP0LLQuNC7INC90LDQt9C90LDRh9C10L3QvdGL0Lkg?=
 =?utf-8?b?0L7Qv9C/0L7Qt9C40YbQuNC10Lkg0YHQv9C10YbQuNCw0LvRjNC90YvQuSDQv9GA?=
 =?utf-8?b?0LXQtNGB0YLQsNCy0LjRgtC10LvRjCDQktC10L3QtdGB0YPRjdC70Ysg0L/RgNC4?=
 =?utf-8?b?0J7RgNCz0LDQvdC40LfQsNGG0LjQuCDQsNC80LXRgNC40LrQsNC90YHQutC40YU=?=
 =?utf-8?b?0LPQvtGB0YPQtNCw0YDRgdGC0LIgKNCe0JDQkykg0JPRg9GB0YLQsNCy0L4g0KI=?=
 =?utf-8?b?0LDRgNGA0LUg0JHRgNC40YHQtdC90YzQviDQvdCwINCy0YvRgdGC0YPQv9C70LU=?=
 =?utf-8?b?0L3QuNC4INCyINCy0LDRiNC40L3Qs9GC0L7QvdGB0LrQvtC8INCm0LXQvdGC0YA=?=
 =?utf-8?b?0LUg0YHRgtGA0LDRgtC10LPQuNGH0LXRgdC60LjRhSDQuCDQvNC10LbQtNGD0L0=?=
 =?utf-8?b?0LDRgNC+0LTQvdGL0YUg0LjRgdGB0LvQtdC00L7QstCw0L3QuNC5LCDQv9C10YA=?=
 =?utf-8?b?0LXQtNCw0LXRgg==?=
```

Resulting text:

```
Subject: test Венесуэла собирается =?utf-8?b?0L/QtdGA0LXRgdC80L7RgtGA0LXRgtGM?= стоимость заключенных с Россией контрактов на поставку вооружений, а также отношения с Москвой в целом. Об этом заявил назначенный оппозицией специальный представитель Венесуэлы приОрганизации американскихгосударств (ОАГ) Густаво Тарре Брисеньо на выступлении в вашингтонском Центре стратегических и международных исследований, передает
```

If needed, I can write simple code to reproduce the bug. |
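A round-trip check of the kind described in this comment could be sketched as follows. This is an assumption about how such a reproduction might look, not the commenter's actual code. The helper name `roundtrip_subject` is hypothetical; it uses only the standard `email` package. On interpreter versions affected by this issue the call may hang for long non-ASCII subjects.

```python
from email import message_from_string, policy
from email.message import EmailMessage


def roundtrip_subject(subject, max_line_length=78):
    """Serialize a message with the given subject, parse the result back,
    and return the decoded Subject header as a plain string."""
    # SMTP policy: headers are folded and non-ASCII text is sent as encoded words.
    msg = EmailMessage(policy=policy.SMTP.clone(max_line_length=max_line_length))
    msg['Subject'] = subject
    msg.set_content('Some text')
    raw = msg.as_string()
    # Parse with the modern policy so encoded words are decoded on access.
    parsed = message_from_string(raw, policy=policy.default)
    return str(parsed['Subject'])
```

Comparing `roundtrip_subject(s)` with `s` for the source text quoted in this comment would show whether folding preserves the header, including the spaces between words.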
|||
msg343695 - (view) | Author: Abhilash Raj (maxking) * | Date: 2019-05-27 22:54 | |
IMO, this is a duplicate of https://bugs.python.org/issue33529 (which was reported before this one was). I have tested that the fix for bpo-33529 does indeed fix the test case which has been provided above. The Pull Request for bpo-33529 has been merged, so I believe this issue and associated PR can be closed. |
|||
msg344743 - (view) | Author: Cheryl Sabella (cheryl.sabella) * | Date: 2019-06-05 16:17 | |
Thanks for the report, @maxking. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:03 | admin | set | github: 78403 |
2019-06-05 16:17:36 | cheryl.sabella | set | nosy: + cheryl.sabella; messages: + msg344743 |
2019-06-05 16:17:07 | cheryl.sabella | set | status: open -> closed; superseder: [security] Infinite loop on folding email (_fold_as_ew()) if an header has no spaces; resolution: duplicate; stage: patch review -> resolved |
2019-05-27 22:54:22 | maxking | set | nosy: + maxking; messages: + msg343695 |
2019-01-30 15:47:56 | ikrivosheev | set | nosy: + ikrivosheev; messages: + msg334564 |
2018-08-29 11:06:52 | python-dev | set | keywords: + patch; stage: patch review; pull_requests: + pull_request8463 |
2018-07-25 12:56:44 | altvod | create |