classification
Title: Implement RFC 6855 (IMAP Support for UTF-8) in imaplib.
Type: enhancement Stage: resolved
Components: email Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: BreamoreBoy, Håkan Lövdahl, barry, eric.smith, jesstess, maciej.szulik, pitrou, python-dev, r.david.murray, zvyn
Priority: normal Keywords: patch

Created on 2014-06-17 23:28 by zvyn, last changed 2015-05-16 18:06 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description
imaplib_utf8_no_doc.patch zvyn, 2014-07-01 00:50 Proposed solution.
imaplib_utf8_issue21800.patch r.david.murray, 2015-05-04 00:35
issue21800.patch maciej.szulik, 2015-05-08 21:01
Messages (10)
msg222000 - (view) Author: Milan Oberkirch (zvyn) * Date: 2014-07-01 00:50
I made a patch implementing the following changes to the IMAP4 class:
- add a method 'enable_UTF8_accept()' sending "ENABLE UTF8=ACCEPT" to the server and setting internal encoding to UTF-8
- use the UTF8 extension in the 'append()' method if the internal encoding is UTF-8
- add a keyword argument 'enable_UTF8=False' to the init method to trigger 'enable_UTF8_accept()' as soon as the authentication is done
- always use UTF-8 for encoding credentials in authentication (before encoding it to base64)

Does this look like a good idea to you? (I'll make a patch including docs when we agree on the API.)
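
The last point in the list above (UTF-8 first, then base64) can be sketched as follows. This is only an illustration, not code from the patch: the helper name encode_plain_credentials is hypothetical, and the SASL PLAIN layout of the string is an assumption chosen to make the example concrete.

```python
import base64


def encode_plain_credentials(user: str, password: str) -> bytes:
    """Hypothetical helper: encode credentials as UTF-8, then base64.

    The "\\0user\\0password" layout is the SASL PLAIN form, assumed here
    purely for illustration; the thread does not name a mechanism.
    """
    raw = "\0{}\0{}".format(user, password).encode("utf-8")
    return base64.b64encode(raw)


# Non-ASCII credentials survive the round trip because the text is
# converted to UTF-8 bytes before the base64 step.
token = encode_plain_credentials("jürgen", "pässword")
assert base64.b64decode(token).decode("utf-8") == "\0jürgen\0pässword"
```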
msg236713 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2015-02-26 22:54
The patch contains code changes and new tests, so can we have a formal review, please?
msg242538 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-05-04 00:35
Here is an updated patch based on Milan's work, including docs.  I've tweaked the API slightly: no dedicated method for doing the enable (instead it is inlined in authenticate), I added 'enable' to the exposed API (with a doc caveat about not using it for UTF8=ACCEPT), added None as a valid value for enable_UTF8 with the meaning "enable if possible" (True is now "must enable"), and exposed a utf8_enabled attribute so the program can easily tell what mode to use when specifying enable_UTF8=None.  Someday, None should become the default.  "None" is not the best choice for value, especially when it is not the default, so perhaps someone could suggest better values for that keyword.

It would be great if Milan or Maciej could give me a review (or anyone else who feels like it).  I want to get this in before the beta deadline.
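
The three-valued enable_UTF8 flag proposed in this message could behave roughly like the sketch below. It is an illustration only: should_enable_utf8 is a hypothetical function, not part of imaplib, and checking for both ENABLE and UTF8=ACCEPT in the capability list is an assumption about what "possible" means here.

```python
from typing import Iterable, Optional


def should_enable_utf8(enable_utf8: Optional[bool],
                       capabilities: Iterable[str]) -> bool:
    """Hypothetical decision logic for the proposed tri-state flag.

    False -> never enable
    None  -> enable if the server appears to support it
    True  -> must enable; fail if the server does not support it
    """
    caps = {c.upper() for c in capabilities}
    # Assumption: "supported" means both capabilities are advertised.
    supported = "ENABLE" in caps and "UTF8=ACCEPT" in caps
    if enable_utf8 is False:
        return False
    if enable_utf8 is None:
        return supported
    if not supported:
        raise ValueError("server does not advertise ENABLE and UTF8=ACCEPT")
    return True
```

With enable_UTF8=None the caller cannot know the outcome in advance, which is why the message also proposes a utf8_enabled attribute to inspect afterwards.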
msg242620 - (view) Author: Maciej Szulik (maciej.szulik) * Date: 2015-05-05 21:17
David, I did the review, and two things worry me the most:
1. Changing the usual meaning of None in the IMAP __init__ method: here None has the same meaning as True, whereas I think it should be the opposite.
2. I'm not sure we want UTF8 enabled based on a flag passed to __init__. I've always seen our IMAP library as a wrapper around the protocol itself, where the user must be aware of the steps required to proceed. In this case, enabling UTF8 support is just another command that the client can, but doesn't have to, send, and only in the AUTH state.
msg242622 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-05-05 22:27
Well, the problem with that is that we then have to parse the capability to see if it is utf8 that is being enabled. I don't like that as an API; it feels fragile. Since capabilities cannot later be disabled, there's no functional reason to keep it separate. However, it would solve the problem of values to use in the init flag, and would remove the caveat that you shouldn't use UTF8=ACCEPT in an explicit enable call, so perhaps it is best after all.

Do you have any interest in updating the patch?  I won't be able to get back to it until this weekend.
msg242648 - (view) Author: Maciej Szulik (maciej.szulik) * Date: 2015-05-06 07:39
Yes, I can update that (the IMAP testing bug, http://bugs.python.org/issue22137, is taking me longer than I expected ;)).
I just want to make sure I understand you correctly: what needs to be done is removing the utf8_enable code from init; we will use ASCII by default, and only an explicit call to the enable method will turn UTF-8 on. Am I missing something?
msg242662 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-05-06 12:57
An explicit call to enable with an argument string that contains 'UTF8=ACCEPT'.  I did not go far enough through the BNF to determine if enable can be passed more than one capability at a time, but I suspect it can.  In which case we should factor out the capability parsing such that we can reuse it for parsing the enable argument.
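
A minimal sketch of the check described here, assuming the enable argument is a space-separated list of capability names (the message leaves open whether more than one is allowed). The function name requests_utf8_accept is hypothetical and is not taken from the patch.

```python
def requests_utf8_accept(enable_argument: str) -> bool:
    """Hypothetical check: does this ENABLE argument ask for UTF8=ACCEPT?

    Assumes capabilities are separated by whitespace and compared
    case-insensitively; both points are assumptions for illustration.
    """
    return "UTF8=ACCEPT" in enable_argument.upper().split()


# Splitting into whole tokens avoids matching a longer, different name.
assert requests_utf8_accept("CONDSTORE UTF8=ACCEPT")
assert not requests_utf8_accept("UTF8=ACCEPTED")
```

Comparing whole tokens rather than searching for a substring is a design choice in this sketch; it keeps an unrelated capability whose name merely contains the same text from switching the encoding.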
msg242778 - (view) Author: Maciej Szulik (maciej.szulik) * Date: 2015-05-08 21:01
David, I've changed the patch according to your suggestion; I'd appreciate a review.
msg242873 - (view) Author: Roundup Robot (python-dev) Date: 2015-05-10 23:24
New changeset 195343b5e64f by R David Murray in branch 'default':
#21800: Add RFC 6855 support to imaplib.
https://hg.python.org/cpython/rev/195343b5e64f
msg242876 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-05-11 00:56
Thanks, Maciek (and Milan).

I tweaked your patch slightly (mostly doc changes...I moved the discussion of the utf8 RFC into the enable method only, and added back the docs for utf8_enabled).  Just FYI, I also left review comments on the changes I made before commit, other than that doc reorg.
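
For readers of this thread, a hedged sketch of how the enable method and the utf8_enabled attribute mentioned above might be used together. The wrapper try_enable_utf8 is hypothetical; it only relies on the connection object having capabilities, enable() and utf8_enabled, and the capability check before calling enable() is an assumption about sensible client behaviour, not something the thread prescribes.

```python
def try_enable_utf8(conn) -> bool:
    """Hypothetical wrapper around an authenticated IMAP4-like object.

    Requests UTF8=ACCEPT only when the server advertises both ENABLE
    and UTF8=ACCEPT, then reports the resulting conn.utf8_enabled.
    """
    caps = conn.capabilities
    if "ENABLE" in caps and "UTF8=ACCEPT" in caps:
        # enable() returns the usual (type, data) pair; the attribute
        # below is what tells the caller whether UTF-8 mode is on.
        conn.enable("UTF8=ACCEPT")
    return conn.utf8_enabled
```

With a real server this would be called after authentication, since the thread notes the command is only valid in the AUTH state.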
History
Date User Action Args
2015-05-16 18:06:45 r.david.murray set status: open -> closed
resolution: fixed
stage: patch review -> resolved
2015-05-11 00:56:40 r.david.murray set messages: + msg242876
2015-05-10 23:24:51 python-dev set nosy: + python-dev
messages: + msg242873
2015-05-08 21:01:58 maciej.szulik set files: + issue21800.patch
messages: + msg242778
2015-05-06 12:57:52 r.david.murray set messages: + msg242662
2015-05-06 07:39:59 maciej.szulik set messages: + msg242648
2015-05-05 22:27:32 r.david.murray set messages: + msg242622
2015-05-05 21:17:35 maciej.szulik set messages: + msg242620
2015-05-04 00:35:08 r.david.murray set files: + imaplib_utf8_issue21800.patch
nosy: + eric.smith, maciej.szulik
messages: + msg242538
stage: needs patch -> patch review
2015-03-07 15:42:26 Håkan Lövdahl set nosy: + Håkan Lövdahl
2015-02-26 22:54:23 BreamoreBoy set nosy: + BreamoreBoy
messages: + msg236713
2014-07-01 00:50:15 zvyn set files: + imaplib_utf8_no_doc.patch
keywords: + patch
messages: + msg222000
2014-06-23 13:14:42 Jim.Jewett set type: enhancement
stage: needs patch
2014-06-17 23:28:22 zvyn create