Also support SMTPUTF8 in smtplib's send_message method. #68406

bitdancer · 2015-05-17T16:22:00Z

BPO	24218
Nosy	@warsaw, @bitdancer, @soltysh, @matrixise, @jenstroeger
Files	send_message_smtputf8.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2015-05-17.23:45:06.566>
created_at = <Date 2015-05-17.16:21:59.940>
labels = ['type-feature', 'expert-email']
title = "Also support SMTPUTF8 in smtplib's send_message method."
updated_at = <Date 2018-08-17.23:03:08.448>
user = 'https://github.com/bitdancer'

bugs.python.org fields:

activity = <Date 2018-08-17.23:03:08.448>
actor = '_savage'
assignee = 'none'
closed = True
closed_date = <Date 2015-05-17.23:45:06.566>
closer = 'r.david.murray'
components = ['email']
creation = <Date 2015-05-17.16:21:59.940>
creator = 'r.david.murray'
dependencies = []
files = ['39409']
hgrepos = []
issue_num = 24218
keywords = ['patch']
message_count = 17.0
messages = ['243413', '243425', '243426', '243433', '243435', '322761', '322767', '322772', '322773', '322789', '323044', '323216', '323559', '323565', '323634', '323662', '323687']
nosy_count = 6.0
nosy_names = ['barry', 'r.david.murray', 'python-dev', 'maciej.szulik', 'matrixise', '_savage']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue24218'
versions = ['Python 3.5']

bitdancer · 2015-05-17T16:21:59Z

Now that I've committed bpo-24211, we can also add SMTPUTF8 support to smptlib's send_message command. See attached patch.

soltysh · 2015-05-17T19:03:34Z

David one small comment regarding typo in smtplib.py, but most importantly I'd suggest adding additional test case to cover the path (the newly added one) where you get a UnicodeEncodeError upon encoding from or to with UTF8 and then failing to find SMTPUTF8 on the server side. I see we already have test case to cover SMTPNotSupportedError but this covers just the case where the server does not have SMTPUTF8.

bitdancer · 2015-05-17T19:05:16Z

Oh, right, that's what I get for doing this at the end of a long chain of patch reviews :). I added that code after I'd written the test and forgot to go back and test it.

python-dev · 2015-05-17T23:38:16Z

New changeset 30795a477f85 by R David Murray in branch 'default':
bpo-24218: Add SMTPUTF8 support to send_message.
https://hg.python.org/cpython/rev/30795a477f85

bitdancer · 2015-05-17T23:45:07Z

Thanks Maciej.

jenstroeger · 2018-07-31T12:21:43Z

I was about to open an issue when I found this one.

Consider an email message with the following:

message = EmailMessage()
message["From"] = Address(addr_spec="bar@foo.com", display_name="Jens Troeger")
message["To"] = Address(addr_spec="foo@bar.com", display_name="Martín Córdoba")

It’s important here that the email itself is ascii encodable, but the names are not. With that combination, send_message() falsely assumes plain text addresses (see https://github.com/python/cpython/blob/master/Lib/smtplib.py#L949 where it checks only email addresses, not display names!) and therefore the international flag stays False.

As a result of that, flattening the email object (https://github.com/python/cpython/blob/master/Lib/smtplib.py#L964) incorrectly inserts multiple linefeeds, thus breaking the email header, thus mangling the entire email:

flatmsg: b'From: Jens Troeger <jens@talaera.com>\r\nTo: Fernando =?utf-8?q?Mart=C3=ADn_C=C3=B3rdoba?= <foo@bar.com>\r\r\r\r\r\nSubject:\r\n Confirmation: …\r\n…'

I think a proper fix would be in line 949, where email addresses and display names should be checked for encoding.

The comment to that function should also be adjusted to mention display names?

Note also that the attached patch does not test the above scenario, and should probably be extended as well.

jenstroeger · 2018-07-31T12:52:31Z

(continuing the previous message msg322761)

…unless the addresses should be checked separately from the display names, in which case the BytesGenerator’s flatten() function should be fixed.

Without reading the RFC, please let me know how to continue from here.

bitdancer · 2018-07-31T13:18:19Z

Well, posting on a closed issue is generally not the best way :)

The current behavior with regards to the SMTPUTF8 flag is correct (it only matters for *addresses*, display names can already be transmitted if they contain non-ascii using non SMTPUTF8 methods).

The multiple carriage returns is a bug, and there is an open issue for it, though I'm not finding it at the moment.

matrixise · 2018-07-31T13:20:21Z

Hi David,

What is the related issue with the new lines?

On 31 Jul 2018, at 15:18, R. David Murray <report@bugs.python.org> wrote:

R. David Murray <rdmurray@bitdance.com> added the comment:

Well, posting on a closed issue is generally not the best way :)

The current behavior with regards to the SMTPUTF8 flag is correct (it only matters for *addresses*, display names can already be transmitted if they contain non-ascii using non SMTPUTF8 methods).

The multiple carriage returns is a bug, and there is an open issue for it, though I'm not finding it at the moment.

----------

Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue24218\>

Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/stephane%40wirtel.be

jenstroeger · 2018-07-31T15:36:03Z

Well, posting on a closed issue is generally not the best way :)

Fair enough ;)

The multiple carriage returns is a bug, and there is an open issue for it, though I'm not finding it at the moment.

Oh good, yes that should be fixed!

My current workaround is setting international = True always, which prevents the multiple CRs. Not pretty but it works…

jenstroeger · 2018-08-03T10:54:08Z

So that’s interesting. I thought that setting international = True (see line https://github.com/python/cpython/blob/master/Lib/smtplib.py#L947) would be a neat workaround, but the opposite.

When delivering those emails to Gmail I started seeing

Failed to send email: (555, b'5.5.2 Syntax error, goodbye. s53-v6sm1864855qts.5 - gsmtp', 'foo@bar.com')

and it turns out (according to the IETF message linter, https://tools.ietf.org/tools/msglint/) that:

-----------
UNKNOWN: unknown header 'User-Agent' at line 4
ERROR: missing mandatory header 'date' lines 1-7
ERROR: missing mandatory header 'return-path' lines 1-7
OK: found part text/plain line 9
WARNING: line 13 too long (109 chars); text/plain shouldn't need folding (RFC
2046-4.1.1)
WARNING: line 15 too long (124 chars); text/plain shouldn't need folding (RFC
2046-4.1.1)
WARNING: Character set mislabelled as 'utf-8' when 'us-ascii' suffices, body
part ending line 22
-----------

It seems that now “Date” and “Return-Path” header entries are missing when the email is generated.

I reverted the initial change. Any updates on the multiple CR problem when flattening?

jenstroeger · 2018-08-06T18:51:49Z

David, I tried to find the mentioned '\r\r…\n' issue but I could not find it here. However, from an initial investigation into the BytesGenerator, here is what’s happening.

Flattening the body and attachments of the EmailMessage object works, and eventually _write_headers() is called to flatten the headers which happens entry by entry (https://github.com/python/cpython/blob/master/Lib/email/generator.py#L417-L418). Flattening a header entry is a recursive process over the parse tree of the entry, which builds the flattened and encoded final string by descending into the parse tree and encoding & concatenating the individual “parts” (tokens of the header entry).

Given the parse tree for a header entry like "Martín Córdoba <foo@bar.com>" eventually results in the correct flattened string:

'=?utf-8?q?Mart=C3=ADn_C=C3=B3rdoba?= <foo@bar.com>\r\n'

at the bottom of the recursion for this “Mailbox” part. The recursive callstack is then:

_refold_parse_tree _header_value_parser.py:2687
fold [Mailbox] _header_value_parser.py:144
_refold_parse_tree _header_value_parser.py:2630
fold [Address] _header_value_parser.py:144
_refold_parse_tree _header_value_parser.py:2630
fold [AddressList] _header_value_parser.py:144
_refold_parse_tree _header_value_parser.py:2630
fold [Header] _header_value_parser.py:144
fold [_UniqueAddressHeader] headerregistry.py:258
_fold [EmailPolicy] policy.py:205
fold_binary [EmailPolicy] policy.py:199
_write_headers [BytesGenerator] generator.py:418
_write [BytesGenerator] generator.py:195

The problem now arises from the interplay of

    # https://github.com/python/cpython/blob/master/Lib/email/_header_value_parser.py#L2629
    encoded_part = part.fold(policy=policy)[:-1] # strip nl

which strips the '\n' from the returned string, and

# https://github.com/python/cpython/blob/master/Lib/email/_header_value_parser.py#L2686
return policy.linesep.join(lines) + policy.linesep

which adds the policy’s line separation string linesep="\r\n" to the end of the flattened string upon unrolling the recursion.

I am not sure about a proper fix here, but considering that the linesep policy can be any string length (in this case len("\r\n") == 2) a fixed truncation of one character [:-1] seems wrong. Instead, using:

    encoded_part = part.fold(policy=policy)[:-len(policy.linesep)] # strip nl

seems to work for entries with and without Unicode characters in their display names.

David, please advise on how to proceed from here.

jenstroeger · 2018-08-15T09:05:57Z

@david, any thoughts on this?

bitdancer · 2018-08-15T13:19:31Z

Sorry, I haven't had time to look at it yet :( Not sure when I will, things are more than a bit busy for me right now. Ping me again in two weeks if I haven't responded, please. The proposed solution sounds reasonable, though, so you could also propose a PR with tests and that would take me less time to review and I might get to it sooner.

jenstroeger · 2018-08-17T04:19:47Z

Thanks David: PR on Github (which is R/O) or where should I submit to?

bitdancer · 2018-08-17T17:04:57Z

check out https://devguide.python.org. (Basically, banch and generate a PR on github). And please open a new issue for this.

jenstroeger · 2018-08-17T23:03:08Z

New issue: https://bugs.python.org/issue34424

bitdancer added type-feature A feature request or enhancement topic-email labels May 17, 2015

bitdancer closed this as completed May 17, 2015

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Also support SMTPUTF8 in smtplib's send_message method. #68406

Also support SMTPUTF8 in smtplib's send_message method. #68406

bitdancer commented May 17, 2015

bitdancer commented May 17, 2015

soltysh commented May 17, 2015

bitdancer commented May 17, 2015

python-dev mannequin commented May 17, 2015

bitdancer commented May 17, 2015

jenstroeger mannequin commented Jul 31, 2018

jenstroeger mannequin commented Jul 31, 2018

bitdancer commented Jul 31, 2018

matrixise commented Jul 31, 2018

jenstroeger mannequin commented Jul 31, 2018

jenstroeger mannequin commented Aug 3, 2018

jenstroeger mannequin commented Aug 6, 2018

jenstroeger mannequin commented Aug 15, 2018

bitdancer commented Aug 15, 2018

jenstroeger mannequin commented Aug 17, 2018

bitdancer commented Aug 17, 2018

jenstroeger mannequin commented Aug 17, 2018

Also support SMTPUTF8 in smtplib's send_message method. #68406

Also support SMTPUTF8 in smtplib's send_message method. #68406

Comments

bitdancer commented May 17, 2015

bitdancer commented May 17, 2015

soltysh commented May 17, 2015

bitdancer commented May 17, 2015

python-dev mannequin commented May 17, 2015

bitdancer commented May 17, 2015

jenstroeger mannequin commented Jul 31, 2018

jenstroeger mannequin commented Jul 31, 2018

bitdancer commented Jul 31, 2018

matrixise commented Jul 31, 2018

jenstroeger mannequin commented Jul 31, 2018

jenstroeger mannequin commented Aug 3, 2018

jenstroeger mannequin commented Aug 6, 2018

jenstroeger mannequin commented Aug 15, 2018

bitdancer commented Aug 15, 2018

jenstroeger mannequin commented Aug 17, 2018

bitdancer commented Aug 17, 2018

jenstroeger mannequin commented Aug 17, 2018