msg82408 - (view) |
Author: James Henstridge (jamesh) |
Date: 2009-02-18 05:36 |
The IMAP4rev1 specification allows for non-ASCII mailbox names using a
modified UTF-7 encoding (section 5.1.3 of RFC 2060 or 3501). However,
the imaplib routines taking a mailbox name just pass the string straight
through without any encoding.
It would be useful if Python provided an encoder/decoder for the
modified UTF-7 encoding, and optionally if imaplib would perform the
encoding and decoding at the appropriate points.
|
msg82411 - (view) |
Author: Martin v. Löwis (loewis) * |
Date: 2009-02-18 06:28 |
Can you provide a patch?
|
msg82510 - (view) |
Author: James Henstridge (jamesh) |
Date: 2009-02-20 03:00 |
I'll have a go at implementing the algorithm. It looks like the
modifications to UTF-7 are large enough that you can't do a search and
replace on the output of the existing UTF-7 codec, so it'll probably
require new code.
Would String2Mailbox and Mailbox2String utility functions be appropriate
here?
|
msg82529 - (view) |
Author: Jean-Paul Calderone (exarkun) * |
Date: 2009-02-20 12:58 |
IMAP4 UTF-7 is implemented in Twisted -
<http://twistedmatrix.com/trac/browser/trunk/twisted/mail/imap4.py#L5385>,
<http://twistedmatrix.com/trac/browser/trunk/twisted/mail/test/test_imap.py#L58>.
Feel free to re-use any of that code that would be helpful.
|
msg82539 - (view) |
Author: Martin v. Löwis (loewis) * |
Date: 2009-02-20 17:22 |
I don't have a good understanding of imaplib; if you think it's
appropriate to provide the conversion through two functions, I trust you.
|
msg82795 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2009-02-27 00:04 |
> The IMAP4rev1 specification allows for non-ASCII mailbox
> names using a modified UTF-7 encoding
UTF-7 already sounds horrible to me, but a *modified* UTF-7 encoding
seems even stranger. Why not reuse UTF-7 directly?
(Sorry, this is an off-topic dummy question.)
|
msg82797 - (view) |
Author: Jean-Paul Calderone (exarkun) * |
Date: 2009-02-27 00:14 |
> UTF-7 already sounds horrible to me, but a *modified* UTF-7 encoding
> seems even stranger. Why not reuse UTF-7 directly?
UTF-7 wasn't horrible for its time, but its time has very likely passed.
Alas, changing a standard like IMAP4 is so difficult that this mistake
will be with us for a long time to come.
As for why IMAP4 uses a modified form of UTF-7, the RFC addresses this:
    The purpose of these modifications is to correct the following
    problems with UTF-7:
      1) UTF-7 uses the "+" character for shifting; this conflicts with
         the common use of "+" in mailbox names, in particular USENET
         newsgroup names.
      2) UTF-7's encoding is BASE64 which uses the "/" character; this
         conflicts with the use of "/" as a popular hierarchy delimiter.
      3) UTF-7 prohibits the unencoded usage of "\"; this conflicts with
         the use of "\" as a popular hierarchy delimiter.
      4) UTF-7 prohibits the unencoded usage of "~"; this conflicts with
         the use of "~" in some servers as a home directory indicator.
      5) UTF-7 permits multiple alternate forms to represent the same
         string; in particular, printable US-ASCII characters can be
         represented in encoded form.
Whether you are convinced by these arguments or not is, of course,
entirely up to you. Note also, however, that the modified UTF-7 is not
mandated by the RFC:
    By convention, international mailbox names in IMAP4rev1 are specified
    using a modified version of the UTF-7 encoding described in [UTF-7].
    Modified UTF-7 may also be usable in servers that implement an
    earlier version of this protocol.
However, it seems stupid to say that the choice of encoding is only a
convention, since there is no other way to communicate the choice of
encoding between client and server.
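
To make the RFC's modifications concrete, here is a minimal sketch of an encoder/decoder pair, written against the rules in section 5.1.3 of RFC 3501 as I read them: "&" is the shift character (with "&-" standing for a literal "&"), "," replaces "/" in the base64 alphabet, padding is dropped, and every printable US-ASCII character other than "&" is sent as-is. The names encode_mutf7 and decode_mutf7 are hypothetical; they are not part of imaplib or of any patch attached to this issue.

```python
import base64
import re


def encode_mutf7(text: str) -> bytes:
    """Encode a mailbox name with IMAP's modified UTF-7 (sketch)."""
    out = bytearray()
    pending = []  # run of characters that must go through base64

    def flush() -> None:
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            # altchars swaps "/" for ","; padding is stripped per the RFC.
            b64 = base64.b64encode(raw, altchars=b"+,").rstrip(b"=")
            out.extend(b"&" + b64 + b"-")
            pending.clear()

    for ch in text:
        if " " <= ch <= "~":  # printable US-ASCII, 0x20 through 0x7e
            flush()
            out.extend(b"&-" if ch == "&" else ch.encode("ascii"))
        else:  # includes "\t", "\r", "\n" and all non-ASCII
            pending.append(ch)
    flush()
    return bytes(out)


def decode_mutf7(data: bytes) -> str:
    """Decode a modified UTF-7 mailbox name (sketch, no strict validation)."""

    def unshift(match: "re.Match[str]") -> str:
        chunk = match.group(1)
        if not chunk:  # "&-" is a literal ampersand
            return "&"
        padded = chunk + "=" * (-len(chunk) % 4)  # restore stripped padding
        return base64.b64decode(padded, altchars=b"+,").decode("utf-16-be")

    return re.sub(r"&([^-]*)-", unshift, data.decode("ascii"))


print(encode_mutf7("Hello, \N{SNOWMAN}"))  # b'Hello, &JgM-'
print(decode_mutf7(b"~peter/mail/&U,BTFw-/&ZeVnLIqe-"))
```

The decoder accepts some inputs that a strict implementation should reject (for example, printable ASCII that was needlessly base64-encoded), so treat it as an illustration of the format rather than a validating codec.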
|
msg127176 - (view) |
Author: Hiroaki Kawai (Hiroaki.Kawai) |
Date: 2011-01-27 10:54 |
twisted's code does not work well for "\t", "\r", "\n"; those characters must be encoded in modified base64 form according to RFC 3501.
|
msg132013 - (view) |
Author: Александр Цамутали (astsmtl) |
Date: 2011-03-24 18:35 |
So no one is working on this issue ATM?
|
msg148224 - (view) |
Author: Babak M (BabakM) |
Date: 2011-11-24 02:39 |
There's a working implementation of this in PloneMailList.
http://svn.plone.org/svn/collective/mxmImapClient/trunk/imapUTF7.py
|
msg151859 - (view) |
Author: C Fraire (cfraire) |
Date: 2012-01-23 22:50 |
I've used the PloneMailList implementation in another project. It works well for adding 'imap4-utf-7' as a codec.
The twisted imap implementation seems to have been updated to properly support non-printable ASCII, but the twisted imap API is problematic for imaplib because twisted seems to expect its arguments to already be Python unicode.
So can we be specific about what kind of API change would satisfy this issue:
1) a number of API methods take one or more mailbox arguments. Of course, imaplib currently expects these to be ASCII, but what kind of argument should the methods take? UTF? Unicode? So would the library need a class property to describe an optional specified input encoding? Would it be expected to take Python unicode?
2) some methods, such as list and lsub, return mailbox names UTF-7 encoded and embedded in larger ASCII strings. Would imaplib be expected to alter the contents of these larger strings and transform them into another encoding (when a switch as described in 1) is active)?
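
To illustrate point 2, here is a rough sketch of what a caller currently has to do by hand with one line of a LIST/LSUB response: split off the flags and hierarchy delimiter, unquote the mailbox name, and only then decode the modified UTF-7 part. The helper names (parse_list_line, decode_mutf7) and the regular expression are hypothetical, not an imaplib API. The sketch assumes the simple `(flags) "delim" name` shape and does not handle names sent as IMAP literals.

```python
import base64
import re

# Assumed shape of one LIST/LSUB data item: (flags) "delim"|NIL name
LIST_LINE = re.compile(
    rb'\((?P<flags>[^)]*)\) (?P<delim>"(?:\\.|[^"\\])*"|NIL) (?P<name>.+)'
)


def decode_mutf7(data: bytes) -> str:
    """Decode modified UTF-7 (RFC 3501, section 5.1.3); sketch only."""

    def unshift(match):
        chunk = match.group(1)
        if not chunk:  # "&-" is a literal ampersand
            return "&"
        padded = chunk + "=" * (-len(chunk) % 4)
        return base64.b64decode(padded, altchars=b"+,").decode("utf-16-be")

    return re.sub(r"&([^-]*)-", unshift, data.decode("ascii"))


def unquote(token: bytes) -> bytes:
    """Strip IMAP quoting and backslash escapes from a quoted string."""
    if len(token) >= 2 and token.startswith(b'"') and token.endswith(b'"'):
        return re.sub(rb"\\(.)", rb"\1", token[1:-1])
    return token


def parse_list_line(line: bytes):
    """Return (flags, delimiter, decoded mailbox name) for one line."""
    m = LIST_LINE.fullmatch(line)
    if m is None:
        raise ValueError("unrecognised LIST response: %r" % (line,))
    flags = m["flags"].decode("ascii").split()
    delim = None if m["delim"] == b"NIL" else unquote(m["delim"]).decode("ascii")
    return flags, delim, decode_mutf7(unquote(m["name"]))


print(parse_list_line(b'(\\HasNoChildren) "/" "Entw&APw-rfe"'))
```

Returning a tuple rather than rewriting the original string sidesteps the question of re-encoding the whole response, which seems close to what twisted does.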
|
msg215115 - (view) |
Author: Jesús Cea Avión (jcea) * |
Date: 2014-03-29 05:45 |
Being bitten by this today.
|
msg215116 - (view) |
Author: Jesús Cea Avión (jcea) * |
Date: 2014-03-29 05:48 |
Point 2 of cfraire's message is a big issue.
What about leaving this problem to the library user and simply providing two helper functions in the module to encode/decode mUTF-7?
|
msg215117 - (view) |
Author: Jesús Cea Avión (jcea) * |
Date: 2014-03-29 06:24 |
Or a new encoder/decoder in "codecs" module.
|
msg228939 - (view) |
Author: Jean-Paul Calderone (exarkun) * |
Date: 2014-10-10 01:32 |
> the twisted imap API is problematic for imaplib because twisted seems to expect its arguments to already be Python unicode.
Could you elaborate on this? As far as I can tell, it works fine:
>>> import twisted.mail.imap4
>>> print u"Hello, \N{SNOWMAN}".encode('imap4-utf-7')
Hello, &JgM-
>>> print b'Hello, &JgM-'.decode('imap4-utf-7')
Hello, ☃
>>>
What would you expect to work differently?
|
msg228949 - (view) |
Author: Hiroaki Kawai (Hiroaki.Kawai) |
Date: 2014-10-10 04:15 |
>> the twisted imap API is problematic for imaplib because twisted seems to expect its arguments to already be Python unicode.
> Could you elaborate on this? As far as I can tell, it works fine:
twisted's imap4-utf-7 seems to have improved in the last 2 years. :-)
|
msg228980 - (view) |
Author: Jesús Cea Avión (jcea) * |
Date: 2014-10-10 10:28 |
The first step is to provide mUTF-7 in Python 3.5. Then we can try to update imaplib. I am especially worried about the points cfraire raises in http://bugs.python.org/issue5305#msg151859. Let's see.
|
msg229056 - (view) |
Author: C Fraire (cfraire) |
Date: 2014-10-11 03:20 |
>> the twisted imap API is problematic for imaplib because twisted seems to expect its arguments to already be Python unicode.
>Could you elaborate on this? As far as I can tell, it works fine:
I wasn't addressing encode/decode specifically. Both twisted and PloneMailList offer implementations with the same encoding name, "imap4-utf-7".
I meant that it's difficult for the twisted API to inform what might be done for imaplib, since twisted takes full unicode but imaplib expects only the ASCII subset of unicode.
The first part of jamesh's original issue is just the encoder/decoder, so either twisted or PloneMailList would seem to suffice. I was addressing jamesh's second part, namely whether "optionally if imaplib would perform the encoding and decoding at the appropriate points."
Point 2 of my response seems the more difficult. imaplib list and lsub return str instances with ASCII + utf-7 stuffed together. (twisted avoids this by returning tuples of unicode, if I understand correctly).
|
msg248173 - (view) |
Author: Jesús Cea Avión (jcea) * |
Date: 2015-08-07 04:40 |
Ping.
|
msg315014 - (view) |
Author: Alexander Harkness (bearbin) * |
Date: 2018-04-06 09:44 |
ssu
|
Date | User | Action | Args
2022-04-11 14:56:45 | admin | set | github: 49555
2018-04-21 09:01:04 | mcepl | set | nosy: + mcepl
2018-04-06 09:44:45 | bearbin | set | nosy: + bearbin; messages: + msg315014
2015-08-07 04:40:56 | jcea | set | messages: + msg248173
2015-07-22 16:34:30 | astsmtl | set | type: enhancement
2015-04-22 01:16:46 | Jean-Paul Calderone | set | nosy: - exarkun
2014-10-11 03:20:45 | cfraire | set | messages: + msg229056
2014-10-10 10:28:40 | jcea | set | dependencies: + Add mUTF-7 codec (UTF-7 modified for IMAP); messages: + msg228980
2014-10-10 04:15:21 | Hiroaki.Kawai | set | messages: + msg228949
2014-10-10 01:32:31 | exarkun | set | messages: + msg228939
2014-03-29 06:24:45 | jcea | set | messages: + msg215117
2014-03-29 05:48:08 | jcea | set | messages: + msg215116
2014-03-29 05:45:53 | jcea | set | messages: + msg215115
2014-03-29 05:45:31 | jcea | set | nosy: + jcea; versions: + Python 3.5, - Python 3.1, Python 2.7
2013-12-12 10:44:58 | dveeden | set | nosy: + dveeden
2012-01-23 22:50:13 | cfraire | set | nosy: + cfraire; messages: + msg151859
2011-11-24 02:39:41 | BabakM | set | nosy: + BabakM; messages: + msg148224
2011-03-24 18:35:27 | astsmtl | set | nosy: + astsmtl; messages: + msg132013
2011-01-27 10:54:02 | Hiroaki.Kawai | set | nosy: + Hiroaki.Kawai; messages: + msg127176
2009-02-27 00:14:06 | exarkun | set | messages: + msg82797
2009-02-27 00:04:47 | vstinner | set | nosy: + vstinner; messages: + msg82795
2009-02-20 17:22:04 | loewis | set | messages: + msg82539
2009-02-20 12:58:55 | exarkun | set | nosy: + exarkun; messages: + msg82529
2009-02-20 03:00:08 | jamesh | set | messages: + msg82510
2009-02-18 06:28:41 | loewis | set | nosy: + loewis; messages: + msg82411
2009-02-18 05:36:09 | jamesh | create |