Author t2y
Recipients ishimoto, methane, t2y
Date 2014-12-14.14:34:41
Message-id <1418567689.35.0.572357268777.issue23050@psf.upfronthosting.co.za>
In-reply-to
Content
This patch adds the following Japanese legacy encodings:
https://bitbucket.org/t2y/cpython/branches/compare/japanese-legacy-encoding..default

* eucjp_ms (euc-jp compatible with cp932)
* iso2022_jp_ms (yet another iso-2022-jp compatible with cp932, similar to cp50220)
* cp50220 (http://www.iana.org/assignments/charset-reg/CP50220)
* cp50221 (a variant of cp50220)
* cp50222 (a variant of cp50220)
* cp51932 (http://www.iana.org/assignments/charset-reg/CP51932)
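
To see which of these names the running interpreter already knows, a minimal sketch like the following can help. It assumes nothing beyond the standard `codecs` module; the `PROPOSED` and `EXISTING` lists and the `codec_available` helper are illustrative names, not part of the patch. On an interpreter without the patch, the proposed names would be expected to show up as missing, but the result depends on the Python version and platform.

```python
import codecs

# Encoding names proposed in this patch; an unpatched interpreter may not know them.
PROPOSED = ["eucjp_ms", "iso2022_jp_ms", "cp50220", "cp50221", "cp50222", "cp51932"]
# Japanese codecs that already ship with CPython, listed for comparison.
EXISTING = ["cp932", "euc_jp", "iso2022_jp"]


def codec_available(name):
    """Return True if codecs.lookup() can find a codec registered under name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


# Map every name to whether the running interpreter provides it.
availability = {name: codec_available(name) for name in PROPOSED + EXISTING}

for name, ok in availability.items():
    print(f"{name:15} {'available' if ok else 'missing'}")
```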

The patch for these character encodings was originally created by Masayuki Moriyama in 2005 as the result of an IPA project. The results were contributed to several communities: libiconv, glibc, perl, PHP, Ruby, PostgreSQL, MySQL, and nkf. He also made a patch for Python 2.4.3 at that time, but for some reason no one worked on integrating it. That's a crying shame.

These character encodings are legacy, but they are still in use. Many end users don't care about character encodings. Unfortunately, for historical reasons, e-mails on Japanese Windows platforms are encoded with these legacy encodings. In fact, a customer of mine recently reported mojibake because their e-mail data appears to be encoded with cp50220 (iso-2022-jp-ms).
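
The kind of failure described above can be sketched with the codecs that already ship with CPython. This is only an illustration under stated assumptions: the sample character (U+2460, CIRCLED DIGIT ONE, a vendor extension character in cp932) and the `can_encode` helper are chosen here for demonstration and are not taken from the customer's data.

```python
def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec without error."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Assumption: U+2460 stands in for the cp932 vendor extension characters
# that show up in mail written on Japanese Windows.
sample = "\u2460"

# cp932 has a mapping for the character; the stock iso2022_jp codec does not,
# so mail containing it cannot be round-tripped through iso2022_jp.
print("cp932:     ", can_encode(sample, "cp932"))
print("iso2022_jp:", can_encode(sample, "iso2022_jp"))

# Mojibake: bytes written with one encoding and read back with another.
original = "日本語"
garbled = original.encode("cp932").decode("euc_jp", errors="replace")
print(repr(garbled))
```

Decoding with `errors="replace"` hides the exception but not the damage, which is roughly what an end user sees as mojibake.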

References:

* About IPA: http://www.ipa.go.jp/english/about/summary.html
* Mojibake: http://en.wikipedia.org/wiki/Mojibake
* Java encoding names: http://docs.oracle.com/javase/8/docs/technotes/guides/intl/encoding.doc.html

References in Japanese:

* Japanese Legacy Encoding Project: http://legacy-encoding.sourceforge.jp/wiki/
* Project details: http://www.ipa.go.jp/about/jigyoseika/05fy-pro/open/2005-1467d.pdf