classification
Title: The cjkcodecs integration
Type: Stage:
Components: Library (Lib) Versions: Python 2.4
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: hyeshik.chang Nosy List: hyeshik.chang, loewis
Priority: normal Keywords: patch

Created on 2004-01-09 07:55 by hyeshik.chang, last changed 2004-01-17 14:47 by hyeshik.chang. This issue is now closed.

Messages (5)
msg45224 - (view) Author: Hyeshik Chang (hyeshik.chang) * (Python committer) Date: 2004-01-09 07:55
(finally :)

CJKCodecs includes support for many East Asian legacy
encodings:

* Chinese (PRC): gb2312 gbk gb18030 hz
* Chinese (ROC): big5 cp950
* Japanese: cp932 shift-jis shift-jisx0213 euc-jp
euc-jisx0213 iso-2022-jp iso-2022-jp-1 iso-2022-jp-2
iso-2022-jp-3 iso-2022-jp-ext
* Korean: cp949 euc-kr johab iso-2022-kr
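As a minimal sketch of what these codecs provide, assuming a Python 3 interpreter that ships them, text can be round-tripped through a few of the encodings listed above. The sample strings and the helper name `round_trip` are illustrative assumptions, not taken from the patch:

```python
# Round-trip sample text through a few of the CJK codecs listed above.
# The sample strings are illustrative assumptions, not test data from the patch.
samples = {
    "gb2312": "中文",
    "big5": "中文",
    "shift_jis": "日本語",
    "euc_jp": "日本語",
    "euc_kr": "한국어",
    "iso2022_kr": "한국어",
}


def round_trip(text, encoding):
    """Encode text with the named codec, then decode the bytes back."""
    encoded = text.encode(encoding)
    return encoded, encoded.decode(encoding)


for encoding, text in samples.items():
    encoded, decoded = round_trip(text, encoding)
    # Print the codec name, the encoded length in bytes, and whether
    # decoding recovered the original string.
    print(encoding, len(encoded), decoded == text)
```

The same `str.encode` / `bytes.decode` calls work for any of the other codec names in the list.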

Integrating CJKCodecs into the main Python distribution will
make CJK users more comfortable with the default installation
package.

And it's not as big as you might guess. :)

It bloats the source by only 2% in size:

% du -d0 -k python
37714   python
% du -d0 -k python+cjkcodecs
38504   python+cjkcodecs
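
The 2% figure can be checked from the two `du` numbers above. This is a small sketch of the arithmetic only; the function name `percent_growth` is an assumption made for illustration:

```python
# Sizes in kilobytes, copied from the `du -d0 -k` output above.
base_kb = 37714      # python
patched_kb = 38504   # python+cjkcodecs


def percent_growth(before, after):
    """Return the relative increase from before to after, in percent."""
    return (after - before) * 100.0 / before


growth = percent_growth(base_kb, patched_kb)
print(f"{growth:.2f}%")
```

The result is a little over 2%, which matches the rounded figure quoted here.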

And it bloats the source by only 4% in lines:

% echo `find python.cjkcodecs -type f -exec cat {} \;|wc -l` "*100/" `find python -type f -exec cat {} \;|wc -l` "-100" | bc
4
msg45225 - (view) Author: Hyeshik Chang (hyeshik.chang) * (Python committer) Date: 2004-01-09 08:00
Hmm. SF does not seem to accept big patches (385KB).
I uploaded the patch to
http://people.freebsd.org/~perky/pythoncjkcodecs.diff.bz2 
msg45226 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-01-09 21:10
Can you please make that server report the file type as
application/octet-stream?
msg45227 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-01-09 21:24
These changes look good to me, please apply them.

As for the regrtest modification, please change the tests to
provide a skip_expected setting, which is computed depending
on the presence of the test data - see test_normalization.py
for an example.
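
The idea of skipping a test when its data is absent can be sketched as follows. This is a hedged illustration that uses the `unittest` skip decorators from current Python rather than the `regrtest` mechanism discussed here; the file name `EUC-KR.TXT`, the helper `have_test_data`, and the test class are hypothetical:

```python
import os
import unittest

# Hypothetical name of an external mapping file; not a path from the patch.
MAPPING_FILE = "EUC-KR.TXT"


def have_test_data(path):
    """Return True if the external test data file is present."""
    return os.path.isfile(path)


class MappingFileTest(unittest.TestCase):
    # The skip condition is computed from the presence of the data file,
    # so a missing file is reported as a skip rather than a failure.
    @unittest.skipUnless(have_test_data(MAPPING_FILE),
                         "mapping data not available")
    def test_mapping_file_is_not_empty(self):
        self.assertGreater(os.path.getsize(MAPPING_FILE), 0)
```

Run with `python -m unittest` against the containing module, the test is skipped when the file is missing and executed when it is present.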

It would be good if the header files containing large tables
contained an indication of how these tables were created
(e.g. what data sources were used, and what modifications
were applied after the tables were created from the
sources).
msg45228 - (view) Author: Hyeshik Chang (hyeshik.chang) * (Python committer) Date: 2004-01-17 14:47
Okay. Committed as:

Modified files:

Doc/lib/libcodecs.tex 1.27
Lib/email/test/test_email_codecs.py 1.5
Lib/encodings/aliases.py 1.21
Modules/Setup.dist 1.43
Lib/test/regrtest.py 1.151
setup.py 1.181


Added files:

Lib/encodings/big5.py
Lib/encodings/cp932.py
Lib/encodings/cp949.py
Lib/encodings/cp950.py
Lib/encodings/euc_jisx0213.py
Lib/encodings/euc_jp.py
Lib/encodings/euc_kr.py
Lib/encodings/gb18030.py
Lib/encodings/gb2312.py
Lib/encodings/gbk.py
Lib/encodings/iso2022_jp.py
Lib/encodings/iso2022_jp_1.py
Lib/encodings/iso2022_jp_2.py
Lib/encodings/iso2022_jp_3.py
Lib/encodings/iso2022_jp_ext.py
Lib/encodings/iso2022_kr.py
Lib/encodings/johab.py
Lib/encodings/shift_jis.py
Lib/encodings/shift_jisx0213.py
Lib/test/cjkencodings_test.py
Lib/test/test_codecencodings_cn.py
Lib/test/test_codecencodings_jp.py
Lib/test/test_codecencodings_kr.py
Lib/test/test_codecencodings_tw.py
Lib/test/test_codecmaps_cn.py
Lib/test/test_codecmaps_jp.py
Lib/test/test_codecmaps_kr.py
Lib/test/test_codecmaps_tw.py
Lib/test/test_multibytecodec.py
Lib/test/test_multibytecodec_support.py
Modules/cjkcodecs/README
Modules/cjkcodecs/_big5.c
Modules/cjkcodecs/_cp932.c
Modules/cjkcodecs/_cp949.c
Modules/cjkcodecs/_cp950.c
Modules/cjkcodecs/_euc_jisx0213.c
Modules/cjkcodecs/_euc_jp.c
Modules/cjkcodecs/_euc_kr.c
Modules/cjkcodecs/_gb18030.c
Modules/cjkcodecs/_gb2312.c
Modules/cjkcodecs/_gbk.c
Modules/cjkcodecs/_hz.c
Modules/cjkcodecs/_iso2022_jp.c
Modules/cjkcodecs/_iso2022_jp_1.c
Modules/cjkcodecs/_iso2022_jp_2.c
Modules/cjkcodecs/_iso2022_jp_3.c
Modules/cjkcodecs/_iso2022_jp_ext.c
Modules/cjkcodecs/_iso2022_kr.c
Modules/cjkcodecs/_johab.c
Modules/cjkcodecs/_shift_jis.c
Modules/cjkcodecs/_shift_jisx0213.c
Modules/cjkcodecs/alg_iso8859_1.h
Modules/cjkcodecs/alg_iso8859_7.h
Modules/cjkcodecs/alg_jisx0201.h
Modules/cjkcodecs/cjkcommon.h
Modules/cjkcodecs/codeccommon.h
Modules/cjkcodecs/codecentry.h
Modules/cjkcodecs/iso2022common.h
Modules/cjkcodecs/map_big5.h
Modules/cjkcodecs/map_cp932ext.h
Modules/cjkcodecs/map_cp949.h
Modules/cjkcodecs/map_cp949ext.h
Modules/cjkcodecs/map_cp950ext.h
Modules/cjkcodecs/map_gb18030ext.h
Modules/cjkcodecs/map_gb18030uni.h
Modules/cjkcodecs/map_gb2312.h
Modules/cjkcodecs/map_gbcommon.h
Modules/cjkcodecs/map_gbkext.h
Modules/cjkcodecs/map_jisx0208.h
Modules/cjkcodecs/map_jisx0212.h
Modules/cjkcodecs/map_jisx0213.h
Modules/cjkcodecs/map_jisx0213_pairs.h
Modules/cjkcodecs/map_jisxcommon.h
Modules/cjkcodecs/map_ksx1001.h
Modules/cjkcodecs/mapdata_ja_JP.c
Modules/cjkcodecs/mapdata_ko_KR.c
Modules/cjkcodecs/mapdata_zh_CN.c
Modules/cjkcodecs/mapdata_zh_TW.c
Modules/cjkcodecs/multibytecodec.c
Modules/cjkcodecs/multibytecodec.h
Modules/cjkcodecs/tweak_gbk.h

Thank you! :-)
History
Date                 User           Action  Args
2004-01-09 07:55:32  hyeshik.chang  create