Author ztane
Recipients ezio.melotti, jwilk, lemburg, matorban, progfou, serhiy.storchaka, vstinner, ztane
Date 2016-10-21.20:03:22
Message-id <1477080203.39.0.062821452022.issue21081@psf.upfronthosting.co.za>
In-reply-to
Content
I found the full text of the standard (TCVN 5712:1993) on SlideShare: http://www.slideshare.net/sacobat/tcvn-5712-1993-cng-ngh-thng-tin-b-m-chun-8bit-k-t-vit-dng-trong-trao-i-thng-tin

As far as I can tell, VN1, VN2 and VN3 are "subsets" of each other only in the sense that VN1 has the widest mapping of characters. That mapping also partially overlaps the C0 and C1 control-character ranges of the ISO code pages: there are 139 additional characters!

VN2 then lets C0 and C1 retain their ISO-8859 meanings by sacrificing some capital vowels (Ezio perhaps remembers that Italy is Ý in Vietnamese; sorry, can't write it in upper case in VN2). VN3 then sacrifices even more characters to leave more positions free, possibly for application-specific uses (the standard is very vague about that).

The text of the standard can be copied and pasted from http://luatvn.net/tieu-chuan-viet-nam/tieu-chuan-viet-nam-tcvn5712_1993.2.171673.html - however, that copy lacks some of the tables.

The standard additionally gives both UCS-2 mappings and Unicode names for the characters, but only as images, so it would be preferable to derive the mapping from iconv output, for example.
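
A minimal sketch of how such a mapping could be pulled out of iconv, one byte at a time. The helper names (`decode_with_iconv`, `build_decoding_table`, `remapped_controls`) are made up for illustration, and the charset name `TCVN5712-1` is an assumption: it depends on the iconv implementation installed, so check `iconv -l` first. Which of VN1/VN2/VN3 a given iconv actually implements would still need to be verified against the standard's tables.

```python
import subprocess

# Assumption: the name the local iconv uses for this encoding.
# Run `iconv -l` to see what is actually available.
ICONV_CHARSET = "TCVN5712-1"


def decode_with_iconv(data, charset=ICONV_CHARSET):
    """Decode bytes by piping them through the external iconv tool."""
    result = subprocess.run(
        ["iconv", "-f", charset, "-t", "UTF-8"],
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        check=True,  # a non-zero exit raises CalledProcessError
    )
    return result.stdout.decode("utf-8")


def build_decoding_table(decode_byte):
    """Map each byte value 0..255 to its decoded text.

    Bytes the decoder rejects are simply left out of the table.
    """
    table = {}
    for value in range(256):
        try:
            text = decode_byte(bytes([value]))
        except (subprocess.CalledProcessError, UnicodeError):
            continue
        if text:
            table[value] = text
    return table


def remapped_controls(table):
    """Return the C0/C1 byte values that do not decode to the
    same-numbered control code, i.e. positions reused for characters."""
    control_bytes = list(range(0x00, 0x20)) + list(range(0x80, 0xA0))
    return [b for b in control_bytes if b in table and table[b] != chr(b)]
```

Calling `build_decoding_table(decode_with_iconv)` should give a table that can be compared with the pictures in the standard, and `remapped_controls` on that table lists the C0/C1 positions that the encoding reuses for letters. The decoder is passed in as a parameter so the same loop can be tried with any other one-byte decoder.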