classification
Title: imaplib: incorrect quoting in commands
Type: behavior Stage: patch review
Components: email, Library (Lib) Versions: Python 3.3
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Mauro.Cicognini, anadelonbrin, barry, bjshan, dmbaggett, dveeden, maciej.szulik, meatballhat, r.david.murray
Priority: low Keywords: easy, patch

Created on 2004-03-16 06:36 by anadelonbrin, last changed 2015-04-13 17:34 by maciej.szulik.

Files
File name Uploaded Description Edit
adjust-mustquote-re-in-imaplib.patch meatballhat, 2010-08-04 03:08 exactly what dmbaggett described
Messages (10)
msg20244 - (view) Author: Tony Meyer (anadelonbrin) Date: 2004-03-16 06:36
imaplib incorrectly chooses to quote some arguments.

In particular, doing "UID FETCH # BODY.PEEK[]" results
in the BODY.PEEK[] being quoted, and it should not
(according to the RFC), which means the command fails.
 This is demonstrated below.  It's possible (and
likely) that other UID FETCH arguments are incorrectly
quoted.

This occurs with anon cvs python of 16/3/04, and 2.3.3.
 Windows XP SP1.

I'm happy to provide more info if required, just let me
know.  I could try and work up a patch, but it would be
better from someone really familiar with imaplib so
that I don't screw up legitimate quoting.

>>> import imaplib
>>> i = imaplib.IMAP4("server")
>>> i.login("username", "password")
('OK', ['LOGIN Ok.'])
>>> i.select()
('OK', ['38'])
>>> i.debug = 4
>>> i.uid("FETCH", "96", "BODY")
  29:14.23 > GKGP7 UID FETCH 96 BODY
  29:14.40 < * 31 FETCH (UID 96 BODY (("text" "plain"
("charset" "iso-8859-1") NIL NIL "quoted-printable" 32
0)("text" "html" ("charset" "iso-8859-1") NIL NIL
"quoted-printable" 368 10) "alternative"))
  29:14.40 < GKGP7 OK FETCH completed.
('OK', ['31 (UID 96 BODY (("text" "plain" ("charset"
"iso-8859-1") NIL NIL "quoted-printable" 32 0)("text"
"html" ("charset" "iso-8859-1") NIL NIL
"quoted-printable" 368 10) "alternative"))'])
>>> i.uid("FETCH", "96", "BODY.PEEK[]")
  29:17.04 > GKGP8 UID FETCH 96 "BODY.PEEK[]"
  29:17.21 < GKGP8 NO Error in IMAP command received by
server.
  29:17.21 NO response: Error in IMAP command received
by server.
('NO', ['Error in IMAP command received by server.'])
>>> i.logout()
  29:31.26 > GKGP9 LOGOUT
  29:31.42 < * BYE Courier-IMAP server shutting down
  29:31.42 BYE response: Courier-IMAP server shutting down
  29:31.42 < GKGP9 OK LOGOUT completed
('BYE', ['Courier-IMAP server shutting down'])
>>> 
msg20245 - (view) Author: Tony Meyer (anadelonbrin) Date: 2004-03-16 06:48
Logged In: YES 
user_id=552329

Sorry, I missed the bit in the docs that points out that
stuff is always quoted and that using () avoids it.  Still,
it does seem that imaplib would be doing it's job better if
it followed correct quoting, rather than always quoting.  It
would certainly be easier to use for people familiar with
IMAP, but unfamiliar with imaplib.
msg87655 - (view) Author: Dave Baggett (dmbaggett) Date: 2009-05-12 19:03
I'm not sure this causes the behavior reported here, but I believe there
really is a bug in imaplib.

In particular, it seems wrong to me that this line:

mustquote = re.compile(r"[^\w!#$%&'*+,.:;<=>?^`|~-]")

has \w in it. Should that be \s?

I found this when I noticed that SELECT commands on mailboxes with
spaces in their names failed.
msg87656 - (view) Author: Dave Baggett (dmbaggett) Date: 2009-05-12 20:00
OK, I missed the initial caret in the regex. The mustquote regex is
listing everything that needn't be quoted, and then negating. I still
think it's wrong, though. According to BNF given in the Formal Syntax
section of RFC 3501, you must must quote atom-specials, which are
defined thus:

atom-specials   = "(" / ")" / "{" / SP / CTL / list-wildcards /
                  quoted-specials / resp-specials
list-wildcards  = "%" / "*"
quoted-specials = DQUOTE / "\"
resp-specials   = "]"

So I think this regex should do it:

mustquote = re.compile(r'[()\s%*"]|"{"|"\\"|"\]"')

Changing status to bug.
msg112741 - (view) Author: Dan Buch (meatballhat) Date: 2010-08-04 03:08
I'm attaching a patch which does exactly what dmbaggett recommended w.r.t. the mustquote regex.  All current tests pass, but I'm not sure if the current tests even cover this code (how is coverage measured in the stdlib tests?)

On a related note, the `_checkquote` method which uses the `mustquote` regex is dead code in Python 3.2+, AFAICT.
msg112976 - (view) Author: Dave Baggett (dmbaggett) Date: 2010-08-05 12:57
Piers Lauder, author of imaplib, emailed me the following comment about this bug:
------------
The regex for "mustquote_cre" looks bizarre, and I regret to say I can
no longer remember its genesis.

Note however, that the term CTL in the RFC definition for "atom-specials" means "any ASCII control character and DEL, 0x00 - 0x1f, 0x7f", and so maybe defining what is NOT an atom-special was considered easier.

The suggested replacement regex may not match these...?
-------------

It seems like we need to enumerate the control characters in the regex to be absolutely correct here.
msg114330 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2010-08-19 06:49
I agree with Dan's comment in msg112741 about dead code so this only applies to 2.7.  I'll raise a new issue to get the dead code removed from py3k.
msg181950 - (view) Author: Mauro Cicognini (Mauro.Cicognini) Date: 2013-02-12 10:42
The removal of the dead code causes imaplib under py3k to lose the quoting functionality that is described in documentation (except for passwords, that do get always quoted as stated).

I submit that we give at least a temporary warning in the docs, and that the _quote() function is exposed to let the programmer do their own quoting when they feel it necessary.
msg181957 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-02-12 14:00
I don't understand what you mean by removing dead code leading to loss of functionality, unless you mean that the removal of the call to the quoting code in Python3 led to a loss of functionality relative to Python2, in which case I agree.  It also led to the behavior being out of sync with the documentation.

Reviewing the history of the changes (see issue 1210), it appears to me that the removal of the call to _checkquote was in fact unintentional and is a bug (it existed in the port-to-python3 patch submitted by Victor, but was commented out, which probably means he had it commented out for testing and did not notice he had not restored it).  This makes the later removal of _checkquote incorrect as well.

This error is primarily a consequence of imaplib having very few tests.

This module needs a lot of love in Python3 from someone, and it is not an easy topic to wrap ones head around.  It's on my list of things to look at, but there are a bunch of things ahead of it.

For the immediate issue, it is working as documented in Python2.7, so there is nothing to do there.  For Python3 we haven't had the quoting since the start, so we have the opportunity to consider changing the quoting rules if we wish...and we may have have no choice, since the new behavior has been in released versions for several Python3 versions now and starting to quote like Python2.7 did might break otherwise working code.  I don't have an opinion on how to fix this this yet, since while I know more about the IMAP protocol than I did a year ago, I still don't know enough to even write the tests....
msg181977 - (view) Author: Mauro Cicognini (Mauro.Cicognini) Date: 2013-02-12 21:59
David, that is exactly what I meant: functionality for Python 3 is less than the functionality available for Python 2, and behavior is completely out of sync with the documentation.

Bug or not, and independent of the root cause (I don't know if anyone will ever come up with more or better tests), I agree that this behavior for Python 3 has been around for a long time. I don't think it will break any significant code, but that's just my personal opinion; the main point is that re-introducing quoting functionality in imaplib would be a significant effort, also because reading the other relevant threads it is apparent that correct implementation is not there in Python 2 either.

I think that we have an easy way: 1) fix the documentation for Python 3 so that it reflects behavior and 2) expose the _quote() private method with a new static function that the user will be able to call when they deem it necessary.
They get to come up with a good regex or other ways to detect strings that need quoting, but still it's better than relying (like I did) on the advertised functionality and having to dig into the bowels of the module to understand the source of some bizarre bugs...
History
Date User Action Args
2015-04-13 17:34:22maciej.szuliksetnosy: + maciej.szulik
2015-03-16 15:16:53r.david.murraysetnosy: + bjshan
2015-03-16 15:16:36r.david.murraylinkissue23678 superseder
2014-02-03 17:01:18BreamoreBoysetnosy: - BreamoreBoy
2013-09-06 14:02:13dveedensetnosy: + dveeden
2013-02-12 21:59:45Mauro.Cicogninisetmessages: + msg181977
2013-02-12 14:00:16r.david.murraysetnosy: + barry
messages: + msg181957
components: + email
2013-02-12 10:42:11Mauro.Cicogninisetnosy: + Mauro.Cicognini

messages: + msg181950
versions: + Python 3.3, - Python 2.7
2010-08-19 06:49:18BreamoreBoysetnosy: + BreamoreBoy

messages: + msg114330
stage: test needed -> patch review
2010-08-05 12:57:37dmbaggettsetmessages: + msg112976
2010-08-04 03:08:10meatballhatsetfiles: + adjust-mustquote-re-in-imaplib.patch

nosy: + r.david.murray, meatballhat
messages: + msg112741

keywords: + patch
2009-05-12 20:00:43dmbaggettsettype: enhancement -> behavior
messages: + msg87656
2009-05-12 19:03:03dmbaggettsetnosy: + dmbaggett
messages: + msg87655
2009-02-14 12:31:39ajaksu2setstage: test needed
versions: + Python 2.7, - Python 2.6
2008-01-20 18:49:10christian.heimessetkeywords: + easy
type: enhancement
versions: + Python 2.6, - Python 2.3
2004-03-16 06:36:58anadelonbrincreate