Author lemburg
Recipients ezio.melotti, lemburg, progfou, vstinner
Date 2014-03-28.10:09:55
Message-id <1396001396.48.0.248793171238.issue21081@psf.upfronthosting.co.za>
In-reply-to
Content
Some comments:

* Please provide some background information on how widely the encoding is used. I get fewer than 1000 hits on Google when searching for "TCVN 5712:1993". The encoding was a standard in Vietnam, but it was updated in 1999 to TCVN 5712:1999. There's also an encoding called VSCII.

* In the file you write "kind of TCVN 5712:1993 VN3 with CP1252 additions". This won't work, since we can only accept codecs that are based on established standards. It would be better to provide a link to an official Unicode character set mapping table and then run the gencodec.py script on that table.

* For Vietnamese, Python already provides cp1258. How widely is that encoding used compared to, e.g., TCVN 5712:1993?
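
To illustrate the mapping-table approach, here is a minimal sketch of how a charmap codec can be derived from a table in the Unicode mapping-file layout and checked against the existing cp1258 codec. It is not the output of gencodec.py. The two-line SAMPLE_TABLE is a hypothetical excerpt (its values match cp1258), and the helper names build_decoding_table, decode and encode are made up for this example. The ASCII identity default is likewise an assumption of the sketch.

```python
import codecs

# Hypothetical two-line excerpt in the Unicode mapping-table layout
# (byte, code point, comment). The values match Windows code page 1258.
SAMPLE_TABLE = (
    "0xD0\t0x0110\t# LATIN CAPITAL LETTER D WITH STROKE\n"
    "0xF0\t0x0111\t# LATIN SMALL LETTER D WITH STROKE\n"
)

def build_decoding_table(text):
    # Assumption for this sketch: ASCII maps to itself and every other
    # byte is undefined (U+FFFE marks "undefined" for charmap codecs).
    table = [chr(i) for i in range(128)] + ["\ufffe"] * 128
    for line in text.splitlines():
        # Drop the trailing comment, then read the two hex fields.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        table[int(fields[0], 16)] = chr(int(fields[1], 16))
    return "".join(table)

decoding_table = build_decoding_table(SAMPLE_TABLE)
encoding_table = codecs.charmap_build(decoding_table)

def decode(data, errors="strict"):
    # charmap_decode returns (text, bytes_consumed); keep only the text.
    return codecs.charmap_decode(data, errors, decoding_table)[0]

def encode(text, errors="strict"):
    # charmap_encode returns (bytes, chars_consumed); keep only the bytes.
    return codecs.charmap_encode(text, errors, encoding_table)[0]

# Compare the sketch with the cp1258 codec that ships with Python.
for byte in (b"\xd0", b"\xf0"):
    assert decode(byte) == byte.decode("cp1258")
```

A codec generated from an official mapping table can be compared against cp1258 in the same way, byte by byte, to see where the two encodings differ.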

Resources:

 * Vietnamese encodings: http://www.panl10n.net/english/outputs/Survey/Vietnamese.pdf

 * East Asian encodings: http://www.unicode.org/iuc/iuc15/tb1/slides.pdf
History
Date User Action Args
2014-03-28 10:09:56 lemburg set recipients: + lemburg, vstinner, ezio.melotti, progfou
2014-03-28 10:09:56 lemburg set messageid: <1396001396.48.0.248793171238.issue21081@psf.upfronthosting.co.za>
2014-03-28 10:09:56 lemburg link issue21081 messages
2014-03-28 10:09:55 lemburg create