msg56154 - (view) |
Author: Robert T McQuaid (rtmq) |
Date: 2007-09-27 05:49 |
imaplib does not run under Python 3.
The following two-line python program, named testimap.py,
works when run from a Windows XP system shell prompt
using Python 2.5.1, but fails with Python 3.0. It
appears that the logic does not follow the distinction
between characters and bytes in Python 3.
import imaplib
mail=imaplib.IMAP4("mail.rtmq.infosathse.com")
e:\python25\python testimap.py
e:\python30\python testimap.py 2>f:syserr
The last line produced the trace:
Traceback (most recent call last):
File "testimap.py", line 10, in <module>
mail=imaplib.IMAP4("mail.rtmq.infosathse.com")
File "e:\python30\lib\imaplib.py", line 184, in __init__
self.welcome = self._get_response()
File "e:\python30\lib\imaplib.py", line 962, in _get_response
self._append_untagged(typ, dat)
File "e:\python30\lib\imaplib.py", line 800, in _append_untagged
if typ in ur:
TypeError: unhashable type: 'bytes'
|
msg56156 - (view) |
Author: Martin v. Löwis (loewis) * |
Date: 2007-09-27 06:10 |
Would you like to work on a patch?
|
msg56163 - (view) |
Author: Raghuram Devarakonda (draghuram) |
Date: 2007-09-27 14:39 |
To understand the issue further, I set "imaplib.Debug=5". Here is the
output preceding the exception stack trace (I replaced the real
IMAP server name):
***************
20:19.52 imaplib version 2.58
20:19.52 new IMAP4 connection, tag=LOLD
20:19.52 < * OK Microsoft Exchange Server 2003 IMAP4rev1 server
version 6.5.7638.1 (imapserver.com) ready.
20:19.52 matched r'\* (?P<type>[A-Z-]+)( (?P<data>.*))?' =>
(b'OK', b' Microsoft Exchange Server 2003 IMAP4rev1 server version
6.5.7638.1 (imapserver.com) ready.', b'Microsoft Exchange Server 2003
IMAP4rev1 server version 6.5.7638.1 (imapserver.com) ready.')
***************
So it appears that the response is of type "bytes" which in turn is due
to reading the socket in binary mode (self.file =
self.sock.makefile('rb')).
I would like to see how the problem can be fixed but any pointers are
appreciated.
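
The effect of the mode passed to makefile() can be seen without an IMAP server. The following is a minimal sketch, assuming a local socket pair stands in for the server connection; the helper name first_line and the greeting text are made up for illustration.

```python
import socket

def first_line(mode, **kwargs):
    """Send one fake greeting over a local socket pair and read it back."""
    server, client = socket.socketpair()
    with server, client:
        server.sendall(b"* OK example server ready\r\n")
        # makefile() wraps the socket in a file object; the mode decides
        # whether readline() returns bytes or str.
        with client.makefile(mode, **kwargs) as f:
            return f.readline()

binary_line = first_line("rb")
text_line = first_line("r", encoding="ascii", newline="")

print(type(binary_line).__name__, repr(binary_line))
print(type(text_line).__name__, repr(text_line))
```

With 'rb' every later step (regex matching, dictionary keys, debug output) receives bytes, which is what the trace above trips over.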
|
msg56193 - (view) |
Author: Raghuram Devarakonda (draghuram) |
Date: 2007-09-28 18:41 |
I have gone through the python-3000 discussions about similar problems
in other stdlib modules (email, imghdr, sndhdr etc) and found PEP 3137
(Immutable Bytes and Mutable Buffer). Since that work is in progress, I
don't think it is worthwhile to fix this problem at this point.
|
msg57242 - (view) |
Author: Christian Heimes (christian.heimes) * |
Date: 2007-11-08 13:53 |
The transition is done. Can you work on a patch and maybe add some
tests, too? It helps when you start Python with the -bb flag:
$ ./python -bb -c 'import imaplib; imaplib.Debug=5;
imaplib.IMAP4("mail.rtmq.infosathse.com")'
52:01.86 imaplib version 2.58
52:01.86 new IMAP4 connection, tag=PNFO
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/heimes/dev/python/py3k/Lib/imaplib.py", line 184, in __init__
self.welcome = self._get_response()
File "/home/heimes/dev/python/py3k/Lib/imaplib.py", line 907, in
_get_response
resp = self._get_line()
File "/home/heimes/dev/python/py3k/Lib/imaplib.py", line 1009, in
_get_line
self._mesg('< %s' % line)
File "/home/heimes/dev/python/py3k/Lib/warnings.py", line 62, in warn
globals)
File "/home/heimes/dev/python/py3k/Lib/warnings.py", line 102, in
warn_explicit
raise message
BytesWarning: str() on a bytes instance
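
The warning is raised because the debug message formats a bytes object with '%s'. One way to avoid it is to decode explicitly before formatting. This is only a sketch; format_debug is a hypothetical helper, not a function in imaplib.

```python
def format_debug(line):
    """Build a log string for a raw protocol line without calling str() on bytes."""
    if isinstance(line, bytes):
        # 'replace' keeps logging from failing on non-ASCII bytes.
        line = line.decode("ascii", "replace")
    return "< %s" % line.rstrip("\r\n")

debug_text = format_debug(b"* OK example server ready\r\n")
print(debug_text)
```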
|
msg57254 - (view) |
Author: Raghuram Devarakonda (draghuram) |
Date: 2007-11-08 14:59 |
I will see what I can do but it may take a while.
|
msg57430 - (view) |
Author: Raghuram Devarakonda (draghuram) |
Date: 2007-11-12 21:42 |
Index: Lib/imaplib.py
===================================================================
--- Lib/imaplib.py (revision 58956)
+++ Lib/imaplib.py (working copy)
@@ -228,7 +228,7 @@
self.port = port
self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
self.sock.connect((host, port))
- self.file = self.sock.makefile('rb')
+ self.file = self.sock.makefile('r', encoding='ASCII', newline='')
def read(self, size):
-------------
This patch fixes the issue but I am not entirely sure that it is
correct. I quickly looked at the IMAP RFC and there does seem to be a spec
for CHARSET, in which case that will have to be used instead of ASCII. It
requires more research and IMAP knowledge, which I can't claim.
As for the tests, we need an IMAP server to connect to. Perhaps Google
wouldn't mind being used for this purpose?
|
msg59609 - (view) |
Author: Jean-Paul Calderone (exarkun) * |
Date: 2008-01-09 16:17 |
You're correct in pointing out that IMAP4 supports arbitrary encodings,
so simply hard-coding ASCII is not correct. The encoding isn't
connection-level, but applies to particular sequences of bytes in the
connection stream. To correctly interpret the bytes as characters,
decoding must be integrated with the rest of the protocol implementation.
|
msg61918 - (view) |
Author: Bill Janssen (janssen) * |
Date: 2008-01-31 18:03 |
IMAP doesn't really support multiple charsets (just looked at RFC 3501).
There are two places where character sets other than ASCII are used.
One is in the SEARCH command; there's an optional parameter which can
indicate that the search strings are in a non-ASCII character set. The
other is in transmission of message literals (email messages) back and
forth.
So probably setting the default encoding at this level isn't quite
right, as you should definitely be reading raw bytes from the socket,
not characters, but it isn't too far off. Looks like _command() needs a
bit of work (it shouldn't try to quote bytes, only strings), and the
documentation needs to be improved, to say that non-ASCII search strings
and message bodies should be passed as bytes encoded according to the
specified CHARSET, but with those fixes it should work. Assuming that
bytes are hashable in Python 3K.
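
The split between strings and bytes for command arguments could look roughly like this. It is a sketch under stated assumptions: encode_arg and build_command are hypothetical names, not imaplib functions, and the sketch ignores IMAP quoting and literals, which a real client needs for non-ASCII data.

```python
def encode_arg(arg):
    """Hypothetical helper: turn one command argument into bytes for the wire."""
    if isinstance(arg, bytes):
        # Already encoded by the caller (for example per the SEARCH CHARSET).
        return arg
    if isinstance(arg, str):
        # Protocol-level text is ASCII; non-ASCII str raises UnicodeEncodeError.
        return arg.encode("ascii")
    raise TypeError("argument must be str or bytes, not %s" % type(arg).__name__)

def build_command(tag, name, *args):
    """Join tag, command name and arguments into one CRLF-terminated line."""
    parts = [tag.encode("ascii"), name.encode("ascii")]
    parts.extend(encode_arg(a) for a in args)
    return b" ".join(parts) + b"\r\n"

command_line = build_command(
    "A001", "SEARCH", "CHARSET", "UTF-8", "SUBJECT", "café".encode("utf-8")
)
print(command_line)
```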
|
msg71894 - (view) |
Author: Neal Norwitz (nnorwitz) * |
Date: 2008-08-24 22:22 |
Is this still a problem?
|
msg71989 - (view) |
Author: Ismail Donmez (donmez) * |
Date: 2008-08-26 17:50 |
Still fails with beta2:
>>> import imaplib
>>> mail=imaplib.IMAP4("mail.rtmq.infosathse.com")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.0/imaplib.py", line 185, in __init__
self.welcome = self._get_response()
File "/usr/local/lib/python3.0/imaplib.py", line 912, in _get_response
if self._match(self.tagre, resp):
File "/usr/local/lib/python3.0/imaplib.py", line 1021, in _match
self.mo = cre.match(s)
TypeError: can't use a string pattern on a bytes-like object
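
This error comes from matching a str pattern against bytes read from the socket. A small sketch, reusing the untagged-response pattern from the debug output earlier in this thread (the greeting line itself is made up):

```python
import re

line = b"* OK example server ready"

# A str pattern cannot be applied to bytes.
str_pattern = re.compile(r"\* (?P<type>[A-Z-]+)( (?P<data>.*))?")
try:
    str_pattern.match(line)
    mixed_ok = True
except TypeError:
    mixed_ok = False

# A bytes pattern matches, and its groups are bytes too.
bytes_pattern = re.compile(br"\* (?P<type>[A-Z-]+)( (?P<data>.*))?")
mo = bytes_pattern.match(line)
typ = mo.group("type")
dat = mo.group("data")

# The response type is plain ASCII, so it can be decoded for use as a dict key.
key = typ.decode("ascii")
print(mixed_ok, typ, dat, key)
```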
|
msg71992 - (view) |
Author: Neal Norwitz (nnorwitz) * |
Date: 2008-08-26 18:37 |
This may not be a real release blocker, but I want to raise the
priority. It is a regression and we should try to fix it, especially if
it's easy.
|
msg72459 - (view) |
Author: Barry A. Warsaw (barry) * |
Date: 2008-09-04 02:12 |
This should be fixed but it's not a release blocker.
|
msg72479 - (view) |
Author: Bill Janssen (janssen) * |
Date: 2008-09-04 04:58 |
Take a look at the thread here:
http://mailman2.u.washington.edu/mailman/htdig/imap-protocol/2008-February/000811.html
I think the summary is, arbitrary bytes may occur in some places, but
they're likely to be UTF-8. Otherwise, it's mainly ASCII, but purposely
left vague to see what convention developed.
|
msg74731 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2008-10-14 11:27 |
Here is a patch for imaplib:
- add encoding attribute to IMAP4 class (as ftplib and see also issue
3727 for my poplib patch)
- use makefile('r', encoding=self.encoding) instead of a binary file
(mode='rb')
- remove duplicate code in IMAP4_SSL
I chose ISO-8859-1 as the default charset. I tested the library on
my local IMAP4 server using IMAP4 and IMAP4_SSL classes. But the
library needs more unit tests as done for poplib.
|
msg74752 - (view) |
Author: Bill Janssen (janssen) * |
Date: 2008-10-14 15:57 |
Victor, what kind of content have you tried this with? For instance, have
you passed unencoded (Content-Transfer-Encoding: binary) binary data through
it, by mailing a JPEG, for instance? These things are strings really only
at the application level; the data is still bytes. In addition, the use of
Latin-1 goes against the explicit directives of the IMAP group, doesn't it?
They're pushing UTF-8.
Bill
|
msg74760 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2008-10-14 18:14 |
IMAP4_stream is also broken because it uses os.popen2(), which has
been deprecated for a long time and is now replaced by subprocess.
Here is a patch replacing os.popen2() by subprocess, but also using
transparent conversion from/to unicode using io.TextIOWrapper().
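
For reference, talking to a child process through subprocess looks roughly like the following. This is a sketch, not the patch itself: a tiny Python child stands in for a real IMAP server command, and the pipes are left in bytes; wrapping them in io.TextIOWrapper would add the text conversion described above.

```python
import subprocess
import sys

# Stand-in for an IMAP server started as a child process (an assumption for
# this sketch): it prints a greeting, waits for one line, then exits.
child_code = (
    "import sys\n"
    "sys.stdout.buffer.write(b'* OK fake server ready\\r\\n')\n"
    "sys.stdout.buffer.flush()\n"
    "sys.stdin.buffer.readline()\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)

greeting = proc.stdout.readline()      # bytes, as with a socket in 'rb' mode
proc.stdin.write(b"A001 LOGOUT\r\n")   # commands are written as bytes
proc.stdin.close()                     # close() flushes the pending write
proc.stdout.close()
returncode = proc.wait()
print(greeting, returncode)
```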
|
msg74761 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2008-10-14 18:21 |
> what kind of content have you tried this with?
I only tried the most basic commands like capability(). I retried with
search() and... hey, search() has a charset argument!? It should reuse
self.encoding. Same for sort().
Then I tried to get the content of an email but fetch(num, '(RFC822)')
fails with "imaplib.abort: command: FETCH => unexpected
response: 'Return-Path: <example@example.com'". RFC822 is not
supported by imaplib? The test also fails with Python 2.5.
|
msg74767 - (view) |
Author: Bill Janssen (janssen) * |
Date: 2008-10-14 19:31 |
Maybe the first thing to do is to expand the Lib/test/test_imaplib.py
file, which right now is pretty darn minimal. We really need an IMAP
server somewhere to test against, with a standard library of varied
messages.
Perhaps Python.org is running an IMAP server?
|
msg74775 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2008-10-14 22:14 |
The server can send raw 8 bits email in any charset (charset is
specified in the email headers). That's why I think that it's better
to keep bytes instead of the unicode conversion using a fixed charset.
Each email can use a different charset.
Types used in my new patch:
- unicode:
* IMAP commands (charset=ASCII)
* untagged_responses keys (charset=ASCII)
- bytes:
* answer
* regex
* tagre attribute
* untagged_responses values
I chose to keep unicode for some variables to minimize the changes
in imaplib library and to keep readable code.
Patch TODO:
- Remove the assert (added for quicker debugging)
- Test more functions
- Restore _checkquote() in the _command() method or use
_quote()/_checkquote() in the methods that need it. login() already quotes
the password (but why not the login?)
I also wrote a patch for a "pure bytes string" version, but the patch
is complex, long and the resulting module source code is hard to read.
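
The reason for keeping message data as bytes can be illustrated with the email package: the charset is a property of each message, so decoding has to happen after the headers are parsed. A minimal sketch with two made-up messages (the helper name body_text is an assumption):

```python
import email

def body_text(raw):
    """Parse raw message bytes and decode the body with the message's own charset."""
    msg = email.message_from_bytes(raw)
    charset = msg.get_content_charset() or "ascii"
    # decode=True returns the payload as bytes after undoing the transfer encoding.
    return msg.get_payload(decode=True).decode(charset).strip()

latin1_mail = (
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"caf\xe9\r\n"
)
utf8_mail = (
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"caf\xc3\xa9\r\n"
)

latin1_body = body_text(latin1_mail)
utf8_body = body_text(utf8_mail)
print(latin1_body, utf8_body)
```

Both bodies decode to the same text even though the bytes on the wire differ, which a single connection-wide encoding could not handle.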
|
msg74778 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2008-10-14 22:34 |
New version of my bytes patch:
- fix IMAP4_stream: use subprocess.Popen() as my previous
imap_stream.patch but use bytes instead of characters
- fix IMAP4_SSL: sslobj wasn't set in IMAP4_SSL.open() but used, for
example, in read() method; remove duplicate method (simplify the code)
- IMAP4.read(): call file.read() multiple times if the result is
smaller than size (needed especially for the SSL version); FIXME: does
this function raise an error on EOF or just loop forever? Should we
stop the loop if data is b''?
|
msg74779 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2008-10-14 22:43 |
Oops, my previous patch didn't include changes to the documentation.
New patch changes:
- fix the documentation: os.popen2() => subprocess.Popen(); no more
ssl() method: use socket()
- use a buffer of 4096 bytes in read() method (as suggested in socket
documentation)
- break read() loop if read() returns an empty bytes string
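
The read loop described in these notes can be sketched as follows. This is an illustration of the idea, not the code from the patch; read_exact and ShortReader are hypothetical names, and ShortReader only simulates a stream that returns fewer bytes than requested.

```python
import io

def read_exact(read, size, chunk_size=4096):
    """Call read() until size bytes are collected or the stream ends."""
    chunks = []
    remaining = size
    while remaining > 0:
        data = read(min(remaining, chunk_size))
        if not data:
            break  # empty bytes means EOF: stop instead of looping forever
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)

class ShortReader:
    """Simulates a stream that returns at most max_read bytes per call."""
    def __init__(self, payload, max_read):
        self._buf = io.BytesIO(payload)
        self._max_read = max_read

    def read(self, n):
        return self._buf.read(min(n, self._max_read))

reader = ShortReader(b"x" * 10000, max_read=1500)
first = read_exact(reader.read, 8000)   # needs several short reads
rest = read_exact(reader.read, 5000)    # only 2000 bytes are left before EOF
print(len(first), len(rest))
```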
|
msg75282 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2008-10-28 15:02 |
Can anyone review my last patch (imaplib_bytes-3.patch)?
|
msg75479 - (view) |
Author: Barry A. Warsaw (barry) * |
Date: 2008-11-03 23:58 |
The assertion on line 813 is indented incorrectly. Please fix that.
I'm concerned we really need better test coverage for this code, but
it's doubtful we'll get that before 3.0 final is released. I think this
is the best we're going to do, and nothing else about the code jumps out
at me.
Go ahead and land it after that minor fix.
|
msg75501 - (view) |
Author: STINNER Victor (vstinner) * |
Date: 2008-11-04 18:34 |
On Tuesday 04 November 2008 00:59:02, Barry A. Warsaw wrote:
> The assertion on line 813 is indented incorrectly. Please fix that.
Ooops. I'm using the following command because my editor is configured to
remove the trailing spaces:
svn diff --diff-cmd="/usr/bin/diff" -x "-ub"
Line 813 was an assertion. I added many assertions to check types (for
easier debugging) but they are not needed anymore (my code is bug-free, haha,
no, that's a joke). The new attached patch has no more assertions.
|
msg75527 - (view) |
Author: Christian Heimes (christian.heimes) * |
Date: 2008-11-05 19:40 |
Committed in r67107
|
|
Date |
User |
Action |
Args |
2022-04-11 14:56:27 | admin | set | github: 45551 |
2008-11-05 19:40:11 | christian.heimes | set | status: open -> closed resolution: accepted -> fixed messages:
+ msg75527 |
2008-11-04 18:34:03 | vstinner | set | files:
+ imaplib_bytes-4.patch messages:
+ msg75501 |
2008-11-04 18:30:11 | draghuram | set | nosy:
- draghuram |
2008-11-04 18:29:07 | exarkun | set | nosy:
- exarkun |
2008-11-04 18:27:16 | vstinner | set | files:
- imaplib_bytes-3.patch |
2008-11-04 02:41:35 | benjamin.peterson | set | assignee: benjamin.peterson nosy:
+ benjamin.peterson |
2008-11-03 23:59:01 | barry | set | keywords:
- needs review messages:
+ msg75479 |
2008-10-28 15:02:02 | vstinner | set | keywords:
+ needs review messages:
+ msg75282 |
2008-10-14 22:43:31 | vstinner | set | files:
+ imaplib_bytes-3.patch messages:
+ msg74779 |
2008-10-14 22:41:32 | vstinner | set | files:
- imaplib_bytes-2.patch |
2008-10-14 22:34:57 | vstinner | set | files:
- imaplib_bytes.patch |
2008-10-14 22:34:51 | vstinner | set | files:
+ imaplib_bytes-2.patch messages:
+ msg74778 |
2008-10-14 22:20:33 | vstinner | set | files:
- imaplib_stream.patch |
2008-10-14 22:14:05 | vstinner | set | files:
+ imaplib_bytes.patch messages:
+ msg74775 |
2008-10-14 21:55:01 | vstinner | set | files:
- imaplib_unicode.patch |
2008-10-14 19:31:11 | janssen | set | messages:
+ msg74767 |
2008-10-14 18:21:13 | vstinner | set | messages:
+ msg74761 |
2008-10-14 18:14:21 | vstinner | set | files:
+ imaplib_stream.patch messages:
+ msg74760 |
2008-10-14 17:36:08 | vstinner | set | files:
- unnamed |
2008-10-14 15:57:11 | janssen | set | files:
+ unnamed messages:
+ msg74752 |
2008-10-14 11:27:46 | vstinner | set | files:
+ imaplib_unicode.patch nosy:
+ vstinner messages:
+ msg74731 keywords:
+ patch |
2008-10-02 12:54:03 | barry | set | priority: deferred blocker -> release blocker |
2008-09-26 22:18:07 | barry | set | priority: release blocker -> deferred blocker |
2008-09-18 05:42:32 | barry | set | priority: deferred blocker -> release blocker |
2008-09-04 04:58:21 | janssen | set | messages:
+ msg72479 |
2008-09-04 02:12:11 | barry | set | priority: release blocker -> deferred blocker nosy:
+ barry messages:
+ msg72459 |
2008-08-26 18:37:30 | nnorwitz | set | priority: normal -> release blocker messages:
+ msg71992 |
2008-08-26 17:50:39 | donmez | set | nosy:
+ donmez messages:
+ msg71989 |
2008-08-24 22:22:34 | nnorwitz | set | nosy:
+ nnorwitz type: crash -> behavior messages:
+ msg71894 |
2008-01-31 18:03:17 | janssen | set | nosy:
+ janssen messages:
+ msg61918 |
2008-01-09 16:17:32 | exarkun | set | nosy:
+ exarkun messages:
+ msg59609 |
2008-01-06 22:29:45 | admin | set | keywords:
- py3k versions:
Python 3.0 |
2007-11-12 21:42:33 | draghuram | set | messages:
+ msg57430 |
2007-11-08 14:59:57 | draghuram | set | messages:
+ msg57254 |
2007-11-08 13:53:28 | christian.heimes | set | nosy:
+ christian.heimes messages:
+ msg57242 |
2007-11-04 13:49:32 | christian.heimes | set | priority: normal keywords:
+ py3k resolution: accepted |
2007-09-28 18:41:35 | draghuram | set | messages:
+ msg56193 |
2007-09-27 14:39:47 | draghuram | set | messages:
+ msg56163 |
2007-09-27 14:22:43 | draghuram | set | nosy:
+ draghuram |
2007-09-27 06:10:00 | loewis | set | nosy:
+ loewis messages:
+ msg56156 |
2007-09-27 05:49:34 | rtmq | create | |