Author hyeshik.chang
Recipients hyeshik.chang
Date 2008-02-11.11:58:53
Message-id <1202731134.77.0.201279568783.issue2066@psf.upfronthosting.co.za>
Content
This patch adds CNS11643 support to Python's Unicode codecs.
CNS11643 is a huge character set that is used in EUC-TW and ISO-2022-CN.
CJKCodecs has supported CNS11643 for at least 4 years, but I dropped
that support when integrating CJKCodecs into Python because of its size.
EUC-TW and ISO-2022-CN are not widely used, but they are still
regarded as major encodings.

With my patch, CNS11643 charset support can be disabled on
resource-constrained platforms by adding -DNO_CNS11643 to CFLAGS.
The mapping source code for the charset is 900 KB, and it adds about
350 KB to _codecs_tw.so (on POSIX) or python26.dll (on Win32).
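
As a rough illustration of how the build flag would show up at runtime, the sketch below probes whether an interpreter can look up the affected codecs. The codec names "euc_tw" and "iso2022_cn" are assumptions made for this example; the message does not say under which names the patch registers them, and a build without the patch (or with -DNO_CNS11643) would be expected to report them as missing.

```python
import codecs

# Hypothetical codec names, assumed for illustration only; the message
# does not state the names under which the patch registers the codecs.
CANDIDATE_CODECS = ("euc_tw", "iso2022_cn")


def codec_available(name):
    """Return True if this interpreter can look up the named codec."""
    try:
        codecs.lookup(name)
    except LookupError:
        # codecs.lookup raises LookupError for unknown encodings.
        return False
    return True


def cns11643_codecs_present(names=CANDIDATE_CODECS):
    """Map each candidate codec name to whether it is available here."""
    return {name: codec_available(name) for name in names}


if __name__ == "__main__":
    for name, present in cns11643_codecs_present().items():
        print(name, "available" if present else "missing")
```

Code that depends on these encodings could use such a check to fall back gracefully on builds where the charset was compiled out.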

What do you think about adding this code?
History
Date                 User           Action  Args
2008-02-11 11:58:55  hyeshik.chang  set     spambayes_score: 0.182588 -> 0.18258823; recipients: + hyeshik.chang
2008-02-11 11:58:54  hyeshik.chang  set     spambayes_score: 0.182588 -> 0.182588; messageid: <1202731134.77.0.201279568783.issue2066@psf.upfronthosting.co.za>
2008-02-11 11:58:54  hyeshik.chang  link    issue2066 messages
2008-02-11 11:58:53  hyeshik.chang  create