classification
Title: Split email headers near a space
Type: enhancement Stage: test needed
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution:
Dependencies: Superseder:
Assigned To: r.david.murray Nosy List: BreamoreBoy, barry, dstanek, noamr, r.david.murray
Priority: normal Keywords: easy, patch

Created on 2005-05-29 08:43 by noamr, last changed 2010-10-02 23:48 by r.david.murray. This issue is now closed.

Messages (4)
msg48392 Author: Noam Raphael (noamr) * Date: 2005-05-29 08:43
Hello,

I recently used Python to automatically send messages
to my gmail account. I was surprised to find out that
some of the words in the subjects of messages were
split by a space character which came from nowhere.

It turns out that the international (Hebrew) subject
was split into multiple lines by the email package,
sometimes in the middle of words. Gmail treats these
line breaks as spaces, so words get cut in two. I've
checked, and there are email clients which ignore the
line breaks, so the subject looks ok.

I added four lines to the _binsplit function of
email.Header, so that if there is a space character in
the string, it will be split there. This fixes the
problem, and subjects look fine again. These four lines
(plus a comment which I wrote) are:

    # Try to find a place in splittable[:i] which is near a space,
    # and split there, so that clients which interpret the line break
    # as a separator won't insert a space in the middle of a word.
    if splittable[i:i+1] != ' ':
        spacepos = splittable.rfind(' ', 0, i)
        if spacepos != -1:
            i = spacepos + 1

These lines should be added before the last three lines
of _binsplit. Sorry about not attaching a diff file - I
currently don't have diff at hand.

Thank you,
Noam Raphael
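The proposed adjustment can be tried in isolation. The following is a minimal sketch, not the actual _binsplit code: the helper name split_index_near_space and the sample string are made up for illustration, and only the four-line body comes from the message above.

```python
def split_index_near_space(splittable, i):
    """Move a candidate split index back to just after the nearest space.

    Hypothetical standalone helper; it only mirrors the four lines
    proposed in the report and is not part of the email package.
    """
    if splittable[i:i+1] != ' ':
        # Look for the last space before the candidate split point.
        spacepos = splittable.rfind(' ', 0, i)
        if spacepos != -1:
            # Split right after that space, so no word is cut in half.
            i = spacepos + 1
    return i


text = "hello wonderful world"
i = split_index_near_space(text, 9)  # index 9 falls inside "wonderful"
first, last = text[:i], text[i:]
print(repr(first), repr(last))
```

If the string contains no space before the candidate index, the index is returned unchanged, so the original splitting behaviour is kept in that case.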
msg110020 Author: Mark Lawrence (BreamoreBoy) * Date: 2010-07-11 14:45
I'd have thought patching something that splits words into two should be treated as a bug, not a feature request, hence could this go into Python 2.7, 3.1 and 3.2?
msg113227 Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-08-08 01:19
Yes, if there's a bug here and it can be fixed without a major behavior change then it could be backported.

I'm not clear on what the bug is, though, since there is no example given.  If the Hebrew is encoded as encoded words, it can and will be split in the middle of words, but the RFC 2047 reassembly process removes those spaces (i.e., this may be, or may have been, a gmail bug).

Without a test case we can't be sure.
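As a starting point for such a test case, here is a minimal sketch of the reassembly behaviour described above. It assumes Python 3's email.header module. The helpers encoded_word and decode_folded_subject and the sample subject are made up for illustration, and the sketch does not reproduce the original report: it only checks that two adjacent encoded words separated by a folded line break decode without an inserted space.

```python
import base64
from email.header import decode_header, make_header


def encoded_word(text, charset="utf-8"):
    """Wrap text in a single RFC 2047 base64 encoded word (hypothetical helper)."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return "=?%s?b?%s?=" % (charset, payload)


def decode_folded_subject(raw_value):
    """Decode a (possibly folded) header value into a str (hypothetical helper)."""
    return str(make_header(decode_header(raw_value)))


subject = "שלום עולם"
# Deliberately fold in the middle of the first word, as the report describes.
folded = encoded_word(subject[:2]) + "\n " + encoded_word(subject[2:])
# A compliant decoder drops the whitespace between adjacent encoded words.
print(decode_folded_subject(folded) == subject)
```

A client that shows a space at the fold point is not following this reassembly rule, which is why a concrete failing header would be needed to tell a library bug from a client bug.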
msg117901 Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-10-02 23:48
Since no test case has been provided I am closing this issue.
History
Date User Action Args
2010-10-02 23:48:49 r.david.murray set status: open -> closed; messages: + msg117901
2010-08-18 00:14:06 dstanek set nosy: + dstanek
2010-08-08 01:19:09 r.david.murray set messages: + msg113227
2010-07-11 14:45:28 BreamoreBoy set nosy: + BreamoreBoy; messages: + msg110020
2010-05-05 13:44:34 barry set assignee: barry -> r.david.murray; nosy: + r.david.murray
2009-04-22 14:43:50 ajaksu2 set keywords: + easy
2009-02-16 01:02:07 ajaksu2 set stage: test needed; type: enhancement; versions: + Python 2.7
2005-05-29 08:43:15 noamr create