
Author malin
Recipients ezio.melotti, hyeshik.chang, lemburg, loewis, malin, serhiy.storchaka, vstinner
Date 2015-05-18.09:07:56
Message-id <1431940077.47.0.522248182793.issue24117@psf.upfronthosting.co.za>
In-reply-to
Content
>> I examined all Chinese codecs
I said that above, but I forgot that Taiwan and Hong Kong use Chinese as well.

The BIG5 and CP950 codecs use a wrong conversion table. Try this:
>>> u = b'\xC6\xA1'.decode('big5')
>>> hex(ord(u))
'0x30fe'

This should not happen: 0xC6A1 is in neither BIG5 nor CP950.
In BIG5-2003 and HKSCS-2008, 0xC6A1 is mapped to U+2460.
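
To repeat the check across the related codecs in one go, something like the following minimal sketch may help. The helper name `probe` is hypothetical (it is not part of Python), and the result depends on the Python version and its mapping tables, so no expected output is shown here.

```python
def probe(raw, codecs=("big5", "cp950", "big5hkscs")):
    """Decode `raw` with each codec and report the resulting code points.

    `probe` is a hypothetical helper for this report, not a stdlib function.
    A value of None means the codec rejected the byte sequence.
    """
    results = {}
    for name in codecs:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            # The codec has no mapping for this byte sequence.
            results[name] = None
        else:
            # Show every decoded character as U+XXXX.
            results[name] = " ".join("U+%04X" % ord(ch) for ch in text)
    return results


for name, value in probe(b"\xC6\xA1").items():
    print(name, value or "not decodable")
```

If the tables followed the original BIG5 and CP950 definitions as described above, the first two codecs would be expected to reject the sequence rather than return a character.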

I only had a rough look, so please check further.
I won't check the Hong Kong codec any further; I suggest that someone check it as well.
History
Date User Action Args
2015-05-18 09:07:57 malin set recipients: + malin, lemburg, loewis, hyeshik.chang, vstinner, ezio.melotti, serhiy.storchaka
2015-05-18 09:07:57 malin set messageid: <1431940077.47.0.522248182793.issue24117@psf.upfronthosting.co.za>
2015-05-18 09:07:57 malin link issue24117 messages
2015-05-18 09:07:56 malin create