Author r.david.murray
Recipients barry, dracos, r.david.murray
Date 2019-05-12.13:10:20
Message-id <1557666620.49.0.23923232036.issue36893@roundup.psfhosted.org>
In-reply-to
Content
In order to legitimately have a non-ASCII local part, you *must* be using RFC 6532 and RFC 6531.  In the email package you do this by using policy=SMTPUTF8 (that is, email.policy.SMTPUTF8), or by setting utf8=True in your custom Policy.  In smtplib you do this by specifying 'SMTPUTF8' in the mail_options list passed to sendmail, or by passing a message with a policy that has utf8=True to send_message.
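
A minimal sketch of the setup described above, assuming a server that advertises the SMTPUTF8 extension. The helper names build_message and send, the addresses, and the host and port values are placeholders for illustration; they are not part of the email or smtplib APIs.

```python
import smtplib
from email.message import EmailMessage
from email.policy import SMTPUTF8


def build_message(sender, recipient):
    """Build a message whose policy allows raw UTF-8 in headers (RFC 6532)."""
    # SMTPUTF8 is the stock SMTP policy with utf8=True.
    msg = EmailMessage(policy=SMTPUTF8)
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Test"
    msg.set_content("Hello")
    return msg


def send(msg, host="localhost", port=25):
    """Hand the message to smtplib (hypothetical host and port)."""
    with smtplib.SMTP(host, port) as smtp:
        # send_message takes the envelope addresses from the headers.
        # The lower-level alternative is to pass the option explicitly:
        #   smtp.sendmail(from_addr, to_addrs, msg.as_bytes(),
        #                 mail_options=["SMTPUTF8"])
        smtp.send_message(msg)


msg = build_message("sender@example.com", "jürgen@example.com")
# With utf8=True the address is serialized as UTF-8 bytes,
# not as an RFC 2047 encoded word.
wire = msg.as_bytes()
```

Nothing is sent unless send(msg) is called, so the serialization can be inspected without a server.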

I notice in answering this report that this is not really documented clearly.  The information is there, but only if you already know how the RFCs work.  Some variation of the text above should be added to the smtplib documentation, and an example of using SMTPUTF8 should be added to the email examples chapter.

However, you are correct, there are a couple of bugs here.

The rendering done by as_string (and as_bytes) is the best that we can do without raising an error.  However, rather than using the incorrect RFC 2047 encoding, we should probably be raising an error if the rendering policy does not have utf8=True and we don't have an "original source line" from parsing a message (which is the case here).

The second bug, the one you are reporting, is that we apparently missed the constructor of Address when we were adding RFC 6532 support.  If you look at the comment above that code, it is purposefully trying to raise an error if the addr_spec is invalid and it was provided by the *application* (as opposed to email.parser).  But with RFC 6532 support, it should be valid to have a non-ASCII local part in an Address, and the error, as I noted above, should be raised only at serialization time and only when we don't have an original source string.  So that raise should be modified to explicitly ignore the NonASCIILocalPartDefect.  (Really, Address should take a policy argument.  That's a bigger change, but it would be the "right way" to fix this.)
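
Until the constructor is changed, an application can sidestep the check by passing username and domain separately. The sketch below is a workaround, not the fix proposed above; build_address is a hypothetical helper, and the split on the last "@" is an assumption that does not handle quoted local parts.

```python
from email.errors import NonASCIILocalPartDefect
from email.headerregistry import Address


def build_address(display_name, addr_spec):
    """Return an Address, tolerating a non-ASCII local part."""
    try:
        # On affected versions this raises the defect for a non-ASCII
        # local part supplied through addr_spec.
        return Address(display_name, addr_spec=addr_spec)
    except NonASCIILocalPartDefect:
        # Naive split: assumes the last "@" separates local part and domain.
        username, _, domain = addr_spec.rpartition("@")
        return Address(display_name, username=username, domain=domain)


addr = build_address("Jürgen", "jürgen@example.com")
```

The helper returns an equivalent Address whether or not the running Python version still raises from the addr_spec path.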

Raising the error on serialization could cause some breakage if existing programs are "getting away" with specifying non-ascii local parts but not doing it via addr_spec.  It is breakage that should happen, I think, but we may want to only do it in a feature release.
History
Date                 User            Action  Args
2019-05-12 13:10:20  r.david.murray  set     recipients: + r.david.murray, barry, dracos
2019-05-12 13:10:20  r.david.murray  set     messageid: <1557666620.49.0.23923232036.issue36893@roundup.psfhosted.org>
2019-05-12 13:10:20  r.david.murray  link    issue36893 messages
2019-05-12 13:10:20  r.david.murray  create