classification
Title: imaplib.IMAP4_stream subprocess is opened unbuffered but ignores short reads
Type: Stage: resolved
Components: Library (Lib) Versions: Python 3.4, Python 3.2, Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: detrout, gregory.p.smith, pitrou, python-dev, r.david.murray
Priority: normal Keywords: patch

Created on 2013-03-17 05:27 by gregory.p.smith, last changed 2013-03-19 18:03 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description
imaplib-buff.patch detrout, 2013-03-18 03:35 small patch to force buffering imaplib popen streams
imaplib-bufsize.patch detrout, 2013-03-19 04:50
Messages (10)
msg184363 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-03-17 05:31
imaplib.IMAP4_stream opens its subprocess unbuffered but ignores short reads when reading the message body. Depending on timing, message body size, kernel pipe buffer size, the phase of the moon, and whether you're debugging the thing or not, it can fail to read the entire message body, wrongly assume that it has, and then attempt to read the terminating b')\r\n' of the IMAP protocol.

Bug discovered during a debugging session at the PyCon 2013 Python 3 Porting Clinic BOF.
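
A minimal sketch of the failure mode described above, not the actual imaplib code. `TricklingRaw` and `read_exactly` are hypothetical names introduced for illustration; `TricklingRaw` stands in for an unbuffered pipe that hands back at most a few bytes per read. A single read(n) comes back short, so the next readline() returns the rest of the body instead of the terminating b')\r\n'. Looping until n bytes or EOF keeps the two in sync.

```python
import io


class TricklingRaw(io.RawIOBase):
    """Hypothetical stand-in for an unbuffered pipe.

    Each underlying read delivers at most `chunk` bytes, the way a pipe
    may deliver only what the kernel currently has available.
    """

    def __init__(self, data, chunk):
        super().__init__()
        self._data = data
        self._pos = 0
        self._chunk = chunk

    def readable(self):
        return True

    def readinto(self, buf):
        n = min(len(buf), self._chunk, len(self._data) - self._pos)
        buf[:n] = self._data[self._pos:self._pos + n]
        self._pos += n
        return n  # 0 signals EOF


def read_exactly(raw, size):
    """Keep reading until `size` bytes have arrived or EOF is hit."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = raw.read(remaining)
        if not chunk:  # EOF before the literal was complete
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# An 11-byte message body followed by the closing b")\r\n".
WIRE = b"hello world)\r\n"

# Naive: trust one read(11) to return 11 bytes.
naive_raw = TricklingRaw(WIRE, chunk=4)
naive_body = naive_raw.read(11)      # short read: only 4 bytes
naive_tail = naive_raw.readline()    # rest of the body, not b")\r\n"

# Careful: loop until the whole body has been read.
careful_raw = TricklingRaw(WIRE, chunk=4)
careful_body = read_exactly(careful_raw, 11)
careful_tail = careful_raw.readline()

print(naive_body, naive_tail)
print(careful_body, careful_tail)
```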
msg184364 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-03-17 05:37
The error does not happen when running the same code under 2.7, despite the same default bufsize=0 subprocess behavior.  This is likely due to differences between the Python 2.x old-style io library, where os.fdopen(fd, 'rb', bufsize) is used, and 3.x, where io.open(fd, 'rb', bufsize) is used for Popen.stdout.

One workaround is to add a non-zero bufsize to the subprocess.Popen call in imaplib.IMAP4_stream.

I'm not sure if subprocess should be updated or if subprocess's docs on what it means for a pipe to be unbuffered (read(n) is a single syscall rather than a loop until n bytes or EOF) should be updated.
msg184372 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-03-17 14:06
os.fdopen() in 2.x would always create a FILE*, and therefore inherit fread()'s semantics even in "unbuffered" mode. In 3.x, unbuffered I/O instead calls read() directly, and happily returns partial reads; this is by design.

So, I guess imaplib should be fixed  :-)
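
To illustrate the 3.x behaviour on a real pipe, here is a small sketch using os.pipe() in place of the subprocess pipe (an assumption made to keep the example self-contained). Opening the read end with buffering=0 yields an io.FileIO object, and a read(10) returns only the bytes currently in the pipe.

```python
import io
import os

# os.pipe() stands in for the pipe to the child process.
read_fd, write_fd = os.pipe()

# buffering=0 gives a raw io.FileIO object, as with an unbuffered Popen pipe.
raw = open(read_fd, "rb", buffering=0)
is_raw = isinstance(raw, io.FileIO)

os.write(write_fd, b"hello")
first = raw.read(10)    # asks for 10 bytes, gets the 5 that are available

os.write(write_fd, b"world")
os.close(write_fd)
second = raw.read(10)   # the remaining 5 bytes
at_eof = raw.read(10)   # writer closed and pipe drained: empty bytes

raw.close()
print(is_raw, first, second, at_eof)
```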
msg184373 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-03-17 14:08
I don't think there's any reason to open the subprocess in unbuffered mode (you aren't sharing the stdio streams with anyone else). Just be careful to call flush() on stdin before attempting to read any response from stdout.
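
A short sketch of why the flush() matters once stdin is buffered. An io.BytesIO object is used here as a hypothetical stand-in for the child's stdin pipe so that the buffered bytes can be inspected; the command string is only an example.

```python
import io

# Hypothetical stand-in for the pipe connected to the child's stdin.
fake_stdin_pipe = io.BytesIO()
writer = io.BufferedWriter(fake_stdin_pipe)

writer.write(b"a001 NOOP\r\n")
before_flush = fake_stdin_pipe.getvalue()   # nothing has reached the "pipe" yet

writer.flush()
after_flush = fake_stdin_pipe.getvalue()    # now the command has been sent

print(before_flush, after_flush)
```

Without the flush, the child never sees the command, and a subsequent blocking read of its stdout would wait forever.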
msg184388 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-03-17 17:07
Yes, imaplib can be fixed pretty easily and should use buffered IO regardless.

I'm pondering if the default behavior of subprocess needs fixing as
existing python 2.x code being ported to 3 doesn't expect this changed
behavior of the PIPE file objects.  It probably does.

Thankfully my subprocess32 backport on 2.x doesn't suffer from the
problem as it still uses os.fdopen.
msg184417 - (view) Author: Diane Trout (detrout) Date: 2013-03-18 03:35
As a first stab at fixing this, I modified imaplib to wrap process.stdin / process.stdout with io.BufferedWriter / io.BufferedReader. I didn't use TextIOWrapper because imaplib wants to work with the raw \r\n.

The change seems to have fixed the problem I was having. I also checked out 82724:ef8ea052bcc4 and tried running "./python -m test -j3 " before and after the buffer wrapping, and it didn't seem to trigger any test case failures.
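
The wrapping approach can be sketched roughly as follows. This is not the attached patch: os.pipe() and a writer thread stand in for the subprocess (an assumption for self-containment), and `slow_writer` is a hypothetical helper that delivers the body in two pieces. Wrapping the raw reader in io.BufferedReader makes read(11) keep reading until all 11 bytes have arrived.

```python
import io
import os
import threading
import time


def slow_writer(fd):
    """Hypothetical peer that sends the message body in two pieces."""
    os.write(fd, b"hello ")
    time.sleep(0.1)
    os.write(fd, b"world)\r\n")
    os.close(fd)


read_fd, write_fd = os.pipe()
thread = threading.Thread(target=slow_writer, args=(write_fd,))
thread.start()

raw = open(read_fd, "rb", buffering=0)   # unbuffered, like the original pipe
buffered = io.BufferedReader(raw)        # the wrapper under discussion

body = buffered.read(11)      # blocks until 11 bytes or EOF
tail = buffered.readline()    # the terminating b")\r\n"

thread.join()
buffered.close()
print(body, tail)
```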
msg184594 - (view) Author: Diane Trout (detrout) Date: 2013-03-19 04:50
After bumping into r.david.murray in the elevator I got the impression setting the bufsize argument to the Popen call would be a better idea.

I found that BufferedReader/Writer use a DEFAULT_BUFFER_SIZE set somewhere in the C part of io. To cut down on magic numbers, this imaplib patch imports that constant and passes it to the Popen call.

It doesn't seem to introduce test failures and still fixes the imap desynchronization problem seen at the porting clinic.
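
A rough sketch of the bufsize approach, not the committed patch. The child process here is a small Python script (an assumption for illustration) that emits an 11-byte body in two flushes, followed by the closing b")\r\n". With bufsize=DEFAULT_BUFFER_SIZE, Popen.stdout is an io.BufferedReader, so read(11) returns the whole body.

```python
import io
import subprocess
import sys
from io import DEFAULT_BUFFER_SIZE

# Hypothetical child: writes the body in two pieces with a pause in between.
CHILD = (
    "import sys, time\n"
    "out = sys.stdout.buffer\n"
    "out.write(b'hello '); out.flush()\n"
    "time.sleep(0.1)\n"
    "out.write(b'world)\\r\\n'); out.flush()\n"
)

with subprocess.Popen(
    [sys.executable, "-c", CHILD],
    bufsize=DEFAULT_BUFFER_SIZE,   # buffered pipes instead of bufsize=0
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
) as proc:
    stdout_is_buffered = isinstance(proc.stdout, io.BufferedReader)
    body = proc.stdout.read(11)     # loops internally until 11 bytes or EOF
    tail = proc.stdout.readline()   # the terminating b")\r\n"

print(stdout_is_buffered, body, tail)
```

Because stdin is now buffered as well, anything written to proc.stdin has to be flushed before waiting on a response.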
msg184600 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2013-03-19 06:43
that patch looks good for imaplib.

i'll follow up on the subprocess side of things to see if the default
behavior should be changed to better match what happened in 2.7 (or if not:
to make sure the change in behavior is sufficiently documented and not
relied on elsewhere in the stdlib)
msg184651 - (view) Author: Roundup Robot (python-dev) Date: 2013-03-19 17:57
New changeset c5aacf9d1cdc by R David Murray in branch '3.2':
#17443: Fix buffering in IMAP4_stream.
http://hg.python.org/cpython/rev/c5aacf9d1cdc

New changeset 0baa65b3ef76 by R David Murray in branch '3.3':
Merge: #17443: Fix buffering in IMAP4_stream.
http://hg.python.org/cpython/rev/0baa65b3ef76

New changeset 4c6463b96a2c by R David Murray in branch 'default':
Merge: #17443: Fix buffering in IMAP4_stream.
http://hg.python.org/cpython/rev/4c6463b96a2c
msg184652 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-03-19 18:03
Thanks, Diane, and especially thanks for finding this and helping us track down the cause.

We need better test infrastructure for imap. Because this occurs only during string literal reads, I decided that making a test for this with our current imap test infrastructure just wasn't worth it.
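
For context on why only literal reads are affected: an IMAP response line that ends in {N} announces that exactly N bytes of literal data follow, and that is the one place a reader asks for a fixed byte count rather than a line. The sketch below is a simplified illustration, not imaplib's parser; `LITERAL_MARKER` and `read_response` are hypothetical names, and io.BytesIO stands in for a buffered stream.

```python
import io
import re

# Hypothetical pattern: a line ending in "{N}\r\n" announces an N-byte literal.
LITERAL_MARKER = re.compile(rb"\{(\d+)\}\r\n$")


def read_response(stream):
    """Read one response, returning its lines and literals in order."""
    line = stream.readline()
    parts = [line]
    match = LITERAL_MARKER.search(line)
    while match:
        size = int(match.group(1))
        literal = stream.read(size)   # must return all `size` bytes
        if len(literal) != size:
            raise EOFError("stream ended inside a literal")
        parts.append(literal)
        line = stream.readline()      # continuation after the literal
        parts.append(line)
        match = LITERAL_MARKER.search(line)
    return parts


stream = io.BytesIO(b"* 1 FETCH (BODY[] {11}\r\nhello world)\r\n")
parts = read_response(stream)
print(parts)
```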
History
Date User Action Args
2013-03-19 18:03:46 r.david.murray set status: open -> closed; resolution: fixed; messages: + msg184652; stage: needs patch -> resolved
2013-03-19 17:57:51 python-dev set nosy: + python-dev; messages: + msg184651
2013-03-19 06:43:02 gregory.p.smith set messages: + msg184600
2013-03-19 04:50:58 detrout set files: + imaplib-bufsize.patch; messages: + msg184594
2013-03-18 03:35:46 detrout set files: + imaplib-buff.patch; keywords: + patch; messages: + msg184417
2013-03-17 17:07:34 gregory.p.smith set messages: + msg184388
2013-03-17 14:08:26 pitrou set messages: + msg184373
2013-03-17 14:06:02 pitrou set nosy: + pitrou; messages: + msg184372; components: + Library (Lib); stage: needs patch
2013-03-17 05:46:20 detrout set nosy: + detrout
2013-03-17 05:37:15 gregory.p.smith set messages: + msg184364
2013-03-17 05:31:01 gregory.p.smith set nosy: + r.david.murray; messages: + msg184363; versions: + Python 3.2, Python 3.3, Python 3.4
2013-03-17 05:27:19 gregory.p.smith create