This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: Duplicated Content-Transfer-Encoding header when applying email.encoders
Type: behavior Stage: resolved
Components: Documentation Versions: Python 3.8, Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: barry, cancel, cheryl.sabella, docs@python, miss-islington, r.david.murray
Priority: normal Keywords: patch

Created on 2012-06-20 10:38 by cancel, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Pull Requests
URL Status Linked
PR 5354 merged cheryl.sabella, 2018-01-26 22:41
PR 13709 merged miss-islington, 2019-05-31 20:20
Messages (11)
msg163266 - Author: Sergei Stolyarov (cancel) Date: 2012-06-20 10:38
Here is the test script:

--------------
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
from email import encoders

msg = MIMEMultipart()
msg['Subject'] = 'Bug test'

text_part = MIMEText('actual content doesnt matter')
text_part.set_charset('utf-8')
msg.attach(text_part)

xml_part = MIMEText(b'<xml>aaa</xml>')
xml_part.set_type('text/xml')
xml_part.set_charset('utf-8')
encoders.encode_base64(xml_part)
msg.attach(xml_part)

print(msg.as_string())
--------------------------

It prints the following:
--------------------------
Content-Type: multipart/mixed; boundary="===============2584752675366770986=="
MIME-Version: 1.0
Subject: Bug test

--===============2584752675366770986==
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"

actual content doesnt matter
--===============2584752675366770986==
Content-Transfer-Encoding: 7bit
MIME-Version: 1.0
Content-Type: text/xml; charset="utf-8"
Content-Transfer-Encoding: base64

PHhtbD5hYWE8L3htbD4=

--===============2584752675366770986==--
--------------------------

And that's incorrect: the "Content-Transfer-Encoding" header is set twice. As a workaround you can use:

del xml_part['Content-Transfer-Encoding']
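
For readers on a current Python 3, the duplication and the workaround above can be sketched as follows. This is an adaptation, not the reporter's script: it assumes a Python 3 version in which encoders.encode_base64 accepts a part whose payload was given as a str, and the helper name build_xml_part is hypothetical.

```python
from email import encoders
from email.mime.text import MIMEText


def build_xml_part(drop_existing_cte):
    """Hypothetical helper: build the XML part with or without the workaround."""
    # A str payload is used because MIMEText in Python 3 expects text.
    # MIMEText already sets a Content-Transfer-Encoding header here.
    part = MIMEText('<xml>aaa</xml>', 'xml')
    if drop_existing_cte:
        # Workaround from the report: remove the header MIMEText set.
        del part['Content-Transfer-Encoding']
    # encode_base64 re-encodes the payload and adds its own CTE header.
    encoders.encode_base64(part)
    return part


duplicated = build_xml_part(drop_existing_cte=False)
fixed = build_xml_part(drop_existing_cte=True)

# Compare how many Content-Transfer-Encoding headers each part carries.
print(duplicated.get_all('Content-Transfer-Encoding'))
print(fixed.get_all('Content-Transfer-Encoding'))
```

Deleting the header before calling the encoder leaves only the header that the encoder adds.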
msg163275 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-06-20 13:51
First of all, I presume you are running this in Python 2.7, since it doesn't work in Python 3.  In Python 3, MIMEText takes a string as input, not a bytes object, unless you pass it a _charset parameter.  If you do that, the encoding is done when the object is created, and the explicit call to encoders that you make fails.  If, on the other hand, you pass in a string, the explicit call to encoders also fails.

In 2.7, the encoder call does not fail, but as you report this results in an extra header.  This is because MIMEText sets a content type and CTE header using the default values when it is first created.

The explicit call to encoders is not needed.  What you want to do instead is to pass the charset and type when you make the MIMEText call:

  MIMEText('<xml>aaa</xml>', 'xml', 'utf-8')

This is the correct way to do it, and the portable way.  So, you get the right output by using the API the way it was designed.  
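
A minimal sketch of the original test script rewritten along these lines, assuming Python 3 and the legacy MIME classes. The subject and body strings are taken from the report; passing 'plain' and 'utf-8' for the first part is an adaptation of its set_charset call.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

msg = MIMEMultipart()
msg['Subject'] = 'Bug test'

# Subtype and charset go to the constructor; no encoders call is made.
msg.attach(MIMEText('actual content doesnt matter', 'plain', 'utf-8'))
xml_part = MIMEText('<xml>aaa</xml>', 'xml', 'utf-8')
msg.attach(xml_part)

# Each part should carry exactly one Content-Transfer-Encoding header.
for part in msg.get_payload():
    print(part.get_content_type(), part.get_all('Content-Transfer-Encoding'))
```

Because the charset is known when the part is created, MIMEText picks the transfer encoding itself and there is nothing left for an explicit encoder call to do.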

That leaves the question of whether or not we should add some documentation (such as: *never* call the functions from the encoders module directly :).

Note that I don't *like* that the current API is that calling set_charset does the body encode if and only if there are no existing headers, but that is the way it has always worked, and there are programs out there that depend on it.  

In theory we could fix the encoders functions to check for existing headers and do the right thing in that case, but it is not something that would rate very high on my priority list.  I'll happily look at a patch if someone wants to propose one, but since the right way to do this exists, I'm going to treat this issue as documentation-only.  

If someone wants to propose a patch for this, please open a new issue.
msg163276 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-06-20 13:58
Barry: I think we should documentationally deprecate the encoders module.  I can't see any utility in a new program calling those functions explicitly, especially if the program ever wants to port to Python3.  Or maybe the Python2 docs would say deprecated in Python3.  

What do you think?
msg163278 - Author: Barry A. Warsaw (barry) * (Python committer) Date: 2012-06-20 14:12
On Jun 20, 2012, at 01:58 PM, R. David Murray wrote:

>Barry: I think we should documentationally deprecate the encoders module.  I
>can't see any utility in a new program calling those functions explicitly,
>especially if the program ever wants to port to Python3.  Or maybe the
>Python2 docs would say deprecated in Python3.

I agree that we should document them as deprecated, as long as we include text
explaining why, or providing alternatives (e.g. "you think you need this but
you don't because...")

I think it does make sense to include text in the Py2 docs about this.
msg163279 - Author: Barry A. Warsaw (barry) * (Python committer) Date: 2012-06-20 14:13
On Jun 20, 2012, at 01:51 PM, R. David Murray wrote:

>Note that I don't *like* that the current API is that calling set_charset
>does the body encode if and only if there are no existing headers, but that
>is the way it has always worked, and there are programs out there that depend
>on it.

Can we nuke that for email6?
msg163284 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-06-20 14:58
I think so, yes.  When we have the mimeregistry equivalent of the headerregistry, the new mime Message classes can have a set_charset with a different implementation.  I'll want to talk about the API details on email-sig before I do anything, though.
msg309937 - Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2018-01-14 22:07
Barry/David,

Is this still a change you wanted to include in the documentation?


Thanks!
msg310013 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2018-01-15 19:43
I believe so.  For python3 I think it should only apply to the legacy API docs (you would use set_content (directly or indirectly) in python3, not set_payload).  I've updated the versions.
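
As an illustration of the set_content route mentioned here, the following sketch builds a similar two-part message with email.message.EmailMessage. It assumes Python 3.6 or later. Using add_attachment for the XML part and asking for cte='base64' are choices made for this example, not something prescribed in this thread.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg['Subject'] = 'Bug test'

# Direct use: set_content chooses the headers for the text body.
msg.set_content('actual content doesnt matter')

# Indirect use: add_attachment calls set_content on a new part and
# converts the message to multipart/mixed.
msg.add_attachment('<xml>aaa</xml>', subtype='xml', cte='base64')

# Each part should carry exactly one Content-Transfer-Encoding header.
for part in msg.iter_parts():
    print(part.get_content_type(), part.get_all('Content-Transfer-Encoding'))
```

Neither call touches email.encoders or set_payload; the content manager sets the transfer encoding as part of storing the content.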
msg310816 - Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2018-01-26 22:42
Hi David, 

I've made a pull request for the way I think you wanted this documented.  Please take a look and let me know if it's even close to what you were suggesting.  Thanks!   :-)
msg344117 - Author: Cheryl Sabella (cheryl.sabella) * (Python committer) Date: 2019-05-31 20:19
New changeset a747c3a5edf21fa5670bc30f5e1d804de89ebf62 by Cheryl Sabella in branch 'master':
bpo-15115: Document deprecation of email.encoders in Python 3 (GH-5354)
https://github.com/python/cpython/commit/a747c3a5edf21fa5670bc30f5e1d804de89ebf62
msg344119 - Author: miss-islington (miss-islington) Date: 2019-05-31 20:26
New changeset 464c1ec65af2c1c1d849d50d9726fa453804e70e by Miss Islington (bot) in branch '3.7':
bpo-15115: Document deprecation of email.encoders in Python 3 (GH-5354)
https://github.com/python/cpython/commit/464c1ec65af2c1c1d849d50d9726fa453804e70e
History
Date User Action Args
2022-04-11 14:57:31 admin set github: 59320
2019-05-31 20:26:09 miss-islington set nosy: + miss-islington; messages: + msg344119
2019-05-31 20:20:22 cheryl.sabella set status: open -> closed; stage: patch review -> resolved; resolution: fixed; versions: + Python 3.8, - Python 2.7, Python 3.6
2019-05-31 20:20:02 miss-islington set stage: needs patch -> patch review; pull_requests: + pull_request13596
2019-05-31 20:19:01 cheryl.sabella set messages: + msg344117
2018-01-26 22:42:44 cheryl.sabella set messages: + msg310816; stage: patch review -> needs patch
2018-01-26 22:41:06 cheryl.sabella set keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request5200
2018-01-15 19:43:31 r.david.murray set messages: + msg310013; versions: + Python 3.6, Python 3.7, - Python 3.2, Python 3.3
2018-01-14 22:07:38 cheryl.sabella set nosy: + cheryl.sabella; messages: + msg309937
2012-06-20 14:58:24 r.david.murray set messages: + msg163284
2012-06-20 14:13:43 barry set messages: + msg163279
2012-06-20 14:12:08 barry set messages: + msg163278
2012-06-20 13:58:24 r.david.murray set messages: + msg163276
2012-06-20 13:51:20 r.david.murray set assignee: docs@python; type: behavior; components: + Documentation, - email; versions: + Python 3.3, - Python 3.1; nosy: + docs@python; messages: + msg163275; stage: needs patch
2012-06-20 10:38:31 cancel create