classification
Title: email.Generator is not idempotent
Type: behavior Stage: resolved
Components: Documentation, email Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: r.david.murray Nosy List: BreamoreBoy, barry, python-dev, r.david.murray, rnortman
Priority: normal Keywords: easy, patch

Created on 2006-02-28 17:11 by rnortman, last changed 2012-05-16 02:15 by r.david.murray. This issue is now closed.

Files
File name Uploaded Description
gen_not_quite_idem.patch r.david.murray, 2011-03-14 15:38
Messages (5)
msg27634 - Author: Randall Nortman (rnortman) Date: 2006-02-28 17:11
The documentation for the email.Generator module claims
that the flatten() method is idempotent (i.e., output
identical to the input, if email.Parser.Parser was used
on the input), but it is not in all cases.  The most
obvious example is that you need to disable mangle_from_
and set maxheaderlen=0 to disable header wrapping.
This could be considered common sense, but the
documentation should mention it, as both are enabled by
default.  (unixfrom can also create differences between
input and output, but is disabled by default.)  More
importantly, whitespace is not preserved in headers: if
there are extra spaces between the header name and the
header contents, they will be collapsed to a single space.
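
The effect of the two defaults can be seen with a minimal sketch. It assumes the Python 3 module names (email.parser and email.generator, rather than the email.Parser and email.Generator spellings used in this report) and the parser's default policy. The sample message and the helper name `flatten` are made up for illustration.

```python
import io
from email.generator import Generator
from email.parser import Parser

# Hypothetical sample: one header longer than 78 characters and a body
# line that starts with "From ".
ORIGINAL = (
    "Subject: " + " ".join(["word"] * 30) + "\n"
    "\n"
    "From here on, this is body text.\n"
)


def flatten(msg, **generator_kwargs):
    """Flatten msg to a string using the given Generator options."""
    buf = io.StringIO()
    Generator(buf, **generator_kwargs).flatten(msg, unixfrom=False)
    return buf.getvalue()


msg = Parser().parsestr(ORIGINAL)

# Defaults: mangle_from_ is enabled and long headers are wrapped.
default_out = flatten(msg)

# Both features switched off, as described above.
strict_out = flatten(msg, mangle_from_=False, maxheaderlen=0)

print("defaults round-trip:", default_out == ORIGINAL)
print("strict round-trip:  ", strict_out == ORIGINAL)
```

With the defaults, the body line gains a leading ">" and the long Subject header is folded, so the output differs from the input. This particular sample only survives the strict settings because it has exactly one space after each field name.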

This little snippet will demonstrate the problem:

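    import sys
    import email.Parser
    import email.Generator

    # Parse one message from stdin (Python 2 module names).
    parser = email.Parser.Parser()
    msg = parser.parse(sys.stdin)
    print msg
    # Flatten with From-mangling and header wrapping both disabled.
    gen = email.Generator.Generator(sys.stdout, mangle_from_=False,
                                    maxheaderlen=0)
    gen.flatten(msg, unixfrom=False)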

Feed it a single message with extra spaces between field
name and field contents in one or more fields, and diff
the input and the output.
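
The same experiment can be sketched as a self-contained script that needs no stdin input. This version assumes the Python 3 module names (email.parser, email.generator) and the parser's default policy; the sample message is hypothetical.

```python
import io
from email.generator import Generator
from email.parser import Parser

# Hypothetical input with extra spaces after the "Subject:" field name.
ORIGINAL = "From: a@example.com\nSubject:    hello\n\nbody\n"

msg = Parser().parsestr(ORIGINAL)

buf = io.StringIO()
gen = Generator(buf, mangle_from_=False, maxheaderlen=0)
gen.flatten(msg, unixfrom=False)
roundtrip = buf.getvalue()

# The parser has already dropped the leading whitespace from the value,
# so the generator has nothing to restore it from.
print(repr(msg["Subject"]))
print("round-trip identical:", roundtrip == ORIGINAL)
```

Even with mangling and wrapping disabled, the flattened text differs from the input in the Subject line, which is the behavior this report describes.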

It is probably not worth actually making these routines
idempotent, as preserving whitespace is not important
in most applications and would require extra
bookkeeping.  However, as long as the documentation
claims the routines are idempotent, it is a bug that they
are not.  In my particular application, it was important to
be truly idempotent, so this was a problem.  Had the
documentation not made false claims, I would have known
from the start that I needed to write my own versions
of the routines in the email module.
msg81576 - Author: Barry A. Warsaw (barry) * (Python committer) Date: 2009-02-10 18:36
I think we should update the documentation to mention these exceptions.
msg115008 - Author: Mark Lawrence (BreamoreBoy) * Date: 2010-08-26 16:44
Does this belong with RDM or docs@python?
msg130830 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-03-14 15:38
Here is a patch that adds a footnote explaining the issue.
msg160796 - Author: Roundup Robot (python-dev) Date: 2012-05-16 02:14
New changeset 8bd30967bc4b by R David Murray in branch '3.2':
#1440472: Explain that email parser/generator isn't *quite* "idempotent"
http://hg.python.org/cpython/rev/8bd30967bc4b

New changeset f534e6363bfb by R David Murray in branch 'default':
merge #1440472: Explain that email parser/generator isn't *quite* "idempotent"
http://hg.python.org/cpython/rev/f534e6363bfb

New changeset 9d99273c2f74 by R David Murray in branch '2.7':
#1440472: Explain that email parser/generator isn't *quite* "idempotent"
http://hg.python.org/cpython/rev/9d99273c2f74

New changeset 180d16af22e9 by R David Murray in branch '2.7':
#1440472: reflow
http://hg.python.org/cpython/rev/180d16af22e9

New changeset d1fbfd9af5c5 by R David Murray in branch '3.2':
#1440472: reflow
http://hg.python.org/cpython/rev/d1fbfd9af5c5

New changeset 4ec95207281c by R David Murray in branch 'default':
merge #1440472: reflow
http://hg.python.org/cpython/rev/4ec95207281c
History
Date  User  Action  Args
2012-05-16 02:15:07  r.david.murray  set  status: open -> closed; stage: needs patch -> resolved; resolution: fixed; components: + email; versions: - Python 3.1
2012-05-16 02:14:10  python-dev  set  nosy: + python-dev; messages: + msg160796
2011-03-14 15:38:26  r.david.murray  set  files: + gen_not_quite_idem.patch; messages: + msg130830; keywords: + patch; nosy: barry, rnortman, r.david.murray, BreamoreBoy
2011-03-13 22:53:32  r.david.murray  set  nosy: barry, rnortman, r.david.murray, BreamoreBoy; versions: + Python 3.3
2010-08-26 16:44:50  BreamoreBoy  set  nosy: + BreamoreBoy; messages: + msg115008; versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6, Python 3.0
2010-05-05 13:42:08  barry  set  assignee: barry -> r.david.murray; nosy: + r.david.murray
2009-04-22 18:50:10  ajaksu2  set  keywords: + easy; stage: needs patch
2009-02-10 18:36:15  barry  set  messages: + msg81576
2009-02-10 18:09:59  ajaksu2  set  type: behavior; components: + Documentation, - Library (Lib); versions: + Python 2.6, Python 3.0
2006-02-28 17:11:45  rnortman  create