classification
Title: email.FeedParser.BufferedSubFile improperly handles "\r\n"
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: email parser incorrectly breaks headers with a CRLF at 8192 (issue 1555570)
Assigned To: barry
Nosy List: Red HamsterX, ajaksu2, barry, r.david.murray, syeberman
Priority: normal
Keywords: easy, patch

Created on 2007-05-19 16:06 by syeberman, last changed 2010-01-12 22:12 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description
BufferedSubFileBug.py syeberman, 2007-05-19 16:09 Script that demonstrates the bug
1721862[2.6].diff Red HamsterX, 2009-07-27 17:56 Unit test and solution for issue 1721862, Python 2.6, r74225
1721862[2.7].diff Red HamsterX, 2009-07-27 17:57 Unit test and solution for issue 1721862, Python 2.7, r74225
1721862[3.1].diff Red HamsterX, 2009-07-27 17:58 Unit test and solution for issue 1721862, Python 3.1, r74225
1721862[3.2].diff Red HamsterX, 2009-07-27 17:59 Unit test and solution for issue 1721862, Python 3.2, r74225
Messages (10)
msg32081 - Author: Sye van der Veen (syeberman) * Date: 2007-05-19 16:06
When email.FeedParser.BufferedSubFile sees "\r" at the end of the pushed-in data, it assumes that it is a Macintosh-style line terminator.  Instead, it should request more data to check whether the next character is "\n", which would make the pair a Windows-style line terminator.  This affects email.message_from_file, which reads the data in 8192-byte chunks.  The following code demonstrates this:

====================================
from StringIO import StringIO
from email.FeedParser import BufferedSubFile, NeedMoreData

# CRLF-terminated lines; reading 3 bytes at a time makes some chunks end in "\r".
fp = StringIO("1\r\n10\r\n100\r\n1000\r\n10000\r\n")
bsf = BufferedSubFile()
while True:
    data = fp.read(3)
    if not data:
        break
    bsf.push(data)
    # Drain every line that is available so far.
    for line in bsf:
        if line is NeedMoreData:
            break
        print repr(line)
bsf.close()
====================================

The output is:
====================================
'1\r\n'
'10\r'
'\n'
'100\r\n'
'1000\r\n'
'10000\r'
'\n'
====================================
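
For illustration only, here is a minimal Python 3 sketch of the behavior requested above: a trailing "\r" is held back until the next chunk shows whether a "\n" follows. `LineBuffer` and `feed_in_chunks` are hypothetical names invented for this example. They are not part of the email package, and this is not the patch attached to this issue.

```python
import re

# One complete line (ending in \r\n, \r, or \n) or a trailing unterminated fragment.
_LINE_RE = re.compile(r"[^\r\n]*(?:\r\n|\r|\n)|[^\r\n]+")


class LineBuffer:
    """Hypothetical stand-in for BufferedSubFile (illustration only)."""

    def __init__(self):
        self._partial = ""  # text whose line ending is not yet confirmed
        self.lines = []     # complete lines, line endings preserved

    def push(self, data):
        data = self._partial + data
        self._partial = ""
        held = ""
        # A trailing "\r" may be the first half of "\r\n": hold it back.
        if data.endswith("\r"):
            data, held = data[:-1], "\r"
        parts = _LINE_RE.findall(data)
        # An unterminated last part is a partial line; keep it for later.
        if parts and not parts[-1].endswith(("\r", "\n")):
            self._partial = parts.pop()
        self._partial += held
        self.lines.extend(parts)

    def close(self):
        # No more data is coming, so whatever was held back is the last line.
        if self._partial:
            self.lines.append(self._partial)
            self._partial = ""


def feed_in_chunks(text, size):
    """Push text through a LineBuffer in fixed-size chunks and return the lines."""
    buf = LineBuffer()
    for start in range(0, len(text), size):
        buf.push(text[start:start + size])
    buf.close()
    return buf.lines


lines = feed_in_chunks("1\r\n10\r\n100\r\n1000\r\n10000\r\n", 3)
print(lines)
```

With the same input and chunk size as the script above, each "\r\n" stays attached to its line instead of being split into a "\r" line and a separate "\n" line.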


msg32082 - Author: Sye van der Veen (syeberman) * Date: 2007-05-19 16:09
File Added: BufferedSubFileBug.py
msg84718 - Author: Daniel Diniz (ajaksu2) (Python triager) Date: 2009-03-30 23:38
Confirmed on trunk.
msg90957 - Author: Neil Tallim (Red HamsterX) Date: 2009-07-26 20:29
Attached a patch containing a unit test based on Sye van der Veen's
example and a solution for this problem.

Written against Python 2.6 (trunk), r74214, which was current at the
time of submission.

This is my first patch, so please let me know if I did something wrong
or overstepped bounds by not noticing that this was assigned until after
writing this fix.
msg90988 - Author: Neil Tallim (Red HamsterX) Date: 2009-07-27 17:50
Confirmed in trunk and all current branches (r74225: 2.6, 2.7, 3.1, 3.2).

Patches for all four active versions will be added momentarily.

Note: my submission yesterday was mistagged, claiming to be for trunk
while it was really for 2.6, which is what this bug was actively marked
with at the time.
msg90990 - Author: Neil Tallim (Red HamsterX) Date: 2009-07-27 17:56
Attached a patch containing a unit test based on Sye van der Veen's
example and a solution for issue 1721862.

Written against Python 2.6, r74225.
msg90991 - Author: Neil Tallim (Red HamsterX) Date: 2009-07-27 17:57
Attached a patch containing a unit test based on Sye van der Veen's
example and a solution for issue 1721862.

Written against Python 2.7, r74225.
msg90992 - Author: Neil Tallim (Red HamsterX) Date: 2009-07-27 17:58
Attached a patch containing a unit test based on Sye van der Veen's
example and a solution for issue 1721862.

Written against Python 3.1, r74225.
msg90993 - Author: Neil Tallim (Red HamsterX) Date: 2009-07-27 17:59
Attached a patch containing a unit test based on Sye van der Veen's
example and a solution for issue 1721862.

Written against Python 3.2, r74225.
msg97665 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-01-12 22:12
This seems to be a duplicate of issue 1555570, which has a simpler patch.
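
As a rough way to check an interpreter for this class of bug, the sketch below feeds a message to `email.feedparser.FeedParser` in two pieces, split between the "\r" and the "\n" of the first header line, and compares the headers with a one-shot parse. It assumes Python 3, where the module is named `email.feedparser`. `RAW` and `parse_split` are made-up names for this example, and the sample message is invented.

```python
from email.feedparser import FeedParser
from email.parser import Parser

# Invented sample message with CRLF line endings.
RAW = "Subject: hello\r\nTo: a@example.com\r\n\r\nbody\r\n"


def parse_split(raw, split_at):
    """Feed raw to a FeedParser in two pieces, cut at index split_at."""
    parser = FeedParser()
    parser.feed(raw[:split_at])
    parser.feed(raw[split_at:])
    return parser.close()


# Cut right after the first "\r", so the first chunk ends in a bare CR.
split_at = RAW.index("\r") + 1
msg = parse_split(RAW, split_at)
whole = Parser().parsestr(RAW)

# Both parses should yield the same headers if the split CRLF is handled.
print(msg.items() == whole.items())
```

On an interpreter affected by the bug, the bare "\n" at the start of the second piece would presumably be read as the blank line that ends the headers, so the To header would end up in the body and the comparison would fail.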
History
Date User Action Args
2010-01-12 22:12:15 r.david.murray set status: open -> closed; superseder: email parser incorrectly breaks headers with a CRLF at 8192; nosy: + r.david.murray; messages: + msg97665; resolution: duplicate; stage: test needed -> resolved
2009-07-27 17:59:04 Red HamsterX set files: + 1721862[3.2].diff; messages: + msg90993
2009-07-27 17:58:16 Red HamsterX set files: + 1721862[3.1].diff; messages: + msg90992
2009-07-27 17:57:26 Red HamsterX set files: + 1721862[2.7].diff; messages: + msg90991
2009-07-27 17:56:40 Red HamsterX set files: + 1721862[2.6].diff; messages: + msg90990
2009-07-27 17:50:38 Red HamsterX set messages: + msg90988; versions: + Python 3.1, Python 2.7, Python 3.2
2009-07-27 17:36:49 Red HamsterX set files: - 1721862.diff
2009-07-26 20:29:49 Red HamsterX set files: + 1721862.diff; nosy: + Red HamsterX; messages: + msg90957; keywords: + patch
2009-04-22 05:08:16 ajaksu2 set keywords: + easy
2009-03-30 23:38:18 ajaksu2 set versions: + Python 2.6, - Python 2.4; nosy: + ajaksu2; messages: + msg84718; type: behavior; stage: test needed
2007-05-19 16:06:25 syeberman create