Message 191534 - Python tracker

Author	mlalic
Recipients	barry, mlalic, r.david.murray, serhiy.storchaka
Date	2013-06-20.19:15:23
Message-id	<1371755723.91.0.402566650656.issue18271@psf.upfronthosting.co.za>

Content
That works only as long as the characters are actually Latin. We cannot forget the rest of the Unicode character planes. Consider::

>>> from email import message_from_string
>>> message = message_from_string("""MIME-Version: 1.0
... Content-Type: text/plain; charset=utf-8
... Content-Disposition: inline
... Content-Transfer-Encoding: 8bit
... 
... 한글ᥡ╥ສए""")
>>> message.get_payload(decode=True).decode('latin1')
'\\ud55c\\uae00\\u1961\\u2565\\u0eaa\\u090f'
>>> message.get_payload(decode=True).decode('raw-unicode-escape')
'한글ᥡ╥ສए'

However, even if latin1 did work, the main point stands: decoding the bytes to a Unicode string requires an encoding other than the one the message specifies.
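For illustration, here is a minimal sketch of the codec behaviour behind the transcript above, without going through the email package. It assumes, as the output above suggests, that the bytes returned by get_payload(decode=True) are the raw-unicode-escape encoding of the body; the variable names are mine and purely illustrative::

```python
# Sketch only: assumes the payload bytes are the raw-unicode-escape
# encoding of the body, as the transcript's output suggests.
text = "한글ᥡ╥ສए"

# Every character here is above U+00FF, so each becomes an ASCII \uXXXX escape.
raw = text.encode("raw-unicode-escape")

# The bytes are pure ASCII, so latin1 and the declared charset (utf-8)
# both "succeed" but only hand back the escape sequences as text.
as_latin1 = raw.decode("latin1")
as_utf8 = raw.decode("utf-8")

# Only the codec that produced the bytes recovers the original characters.
round_trip = raw.decode("raw-unicode-escape")

print(as_latin1)
print(as_utf8 == as_latin1)
print(round_trip == text)
```

Under that assumption, the charset declared in the message (utf-8) is no better than latin1 here: neither recovers the original text.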

History
Date	User	Action	Args
2013-06-20 19:15:23	mlalic	set	recipients: + mlalic, barry, r.david.murray, serhiy.storchaka
2013-06-20 19:15:23	mlalic	set	messageid: <1371755723.91.0.402566650656.issue18271@psf.upfronthosting.co.za>
2013-06-20 19:15:23	mlalic	link	issue18271 messages
2013-06-20 19:15:23	mlalic	create