Author hniksic
Recipients JelleZijlstra, barry, docs@python, hniksic, r.david.murray, vidhya
Date 2022-03-04.08:27:09
Content
> Any suggestions on what needs to be done for current revisions?

Hi! I'm the person who submitted this issue back in 2013. Let's take a look at how things are in Python 3.10:

Python 3.10.2 (main, Jan 13 2022, 19:06:22) [GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import email
>>> msg = email.message_from_string('Subject: =?gb2312?b?1eLKx9bQzsSy4srUo6E=?=\n\nfoo\n')
>>> msg['Subject']
'=?gb2312?b?1eLKx9bQzsSy4srUo6E=?='

So the headers are still not decoded by default. The `unicode()` invocation in the original description was just an attempt to get a Unicode string out of a byte string (assuming it was correctly decoded from MIME, which it wasn't). Since Python 3 strings are Unicode already, I'd expect to just get the decoded subject - but that still doesn't happen.

The correct way to make it happen is to specify `policy=email.policy.default`:

>>> msg = email.message_from_string('Subject: =?gb2312?b?1eLKx9bQzsSy4srUo6E=?=\n\nfoo\n', policy=email.policy.default)
>>> msg['Subject']
'这是中文测试！'
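
For anyone who wants to reproduce the difference outside the REPL, here is a minimal sketch that parses the same message under both policies. The names `RAW`, `legacy` and `modern` are only illustrative, and `email.policy` is imported explicitly rather than relying on `import email` alone.

```python
import email
import email.policy

# Same sample message as in the sessions above.
RAW = 'Subject: =?gb2312?b?1eLKx9bQzsSy4srUo6E=?=\n\nfoo\n'

# No policy argument: the compat32 policy is used and the header stays encoded.
legacy = email.message_from_string(RAW)
legacy_subject = legacy['Subject']

# Explicit "default" policy: the RFC 2047 encoded word is decoded on access.
modern = email.message_from_string(RAW, policy=email.policy.default)
modern_subject = modern['Subject']

print(type(legacy).__name__, repr(legacy_subject))
print(type(modern).__name__, repr(modern_subject))
```

Note that the two calls also return different classes (`Message` versus `EmailMessage`), which is another thing the docs could mention.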

The docs should point out that you really _want_ to specify the "default" policy (strangely named, since it's not in fact the default). The current docs only say that `message_from_string()` is "equivalent to Parser().parsestr(s)" and that `policy` is interpreted "as with the Parser class constructor". The docs of the Parser constructor don't document `policy` at all, except to note the version in which it was added.

So, if you want to work on this, my suggestion would be to improve the docs in the following ways:

* in the message_from_string() docs, explain that `policy=email.policy.default` is what you want to pass to get the headers decoded.

* in the Parser docs, explain what the _class and policy arguments do in the constructor, which policies are possible, etc. (These things seem to be explained in the BytesFeedParser docs, so you might want to just link to that, or include a shortened version.)
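
As a rough illustration of what such a Parser example might look like (a sketch only, not proposed doc wording), the snippet below passes `policy` to the `Parser` constructor and compares the result with `message_from_string()`. The variable names are made up for the example, and it assumes the parsed message exposes the policy it was created with as `.policy`.

```python
import email
import email.policy
from email.parser import Parser

RAW = 'Subject: =?gb2312?b?1eLKx9bQzsSy4srUo6E=?=\n\nfoo\n'

# Passing the policy to the Parser constructor...
via_parser = Parser(policy=email.policy.default).parsestr(RAW)
# ...should behave like passing it to the convenience function.
via_helper = email.message_from_string(RAW, policy=email.policy.default)
same_subject = via_parser['Subject'] == via_helper['Subject']

# Without a policy argument, the parser falls back to compat32.
compat_msg = Parser().parsestr(RAW)
uses_compat32 = compat_msg.policy is email.policy.compat32

print(same_subject, uses_compat32)
```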
History
Date User Action Args
2022-03-04 08:27:10  hniksic  set  recipients: + hniksic, barry, r.david.murray, docs@python, JelleZijlstra, vidhya
2022-03-04 08:27:09  hniksic  set  messageid: <1646382429.95.0.524811453247.issue17505@roundup.psfhosted.org>
2022-03-04 08:27:09  hniksic  link  issue17505 messages
2022-03-04 08:27:09  hniksic  create