
Author loewis
Recipients Arfrever, Nicholas.Cole, ezio.melotti, inigoserna, loewis, poq, tchrist, vstinner, zeha
Date 2012-03-11.03:14:51
Message-id <1331435693.42.0.0207181032933.issue12568@psf.upfronthosting.co.za>
In-reply-to
Content
Tom: I don't think Unicode::GCString implements UAX#11 correctly (though that is really out of scope for this issue). In particular, it makes an ad-hoc decision to introduce an EA_Z East Asian width class, which UAX#11 doesn't define.

In most cases, it's probably reasonable to introduce this EA_Z feature. However, there are some significant deviations from UAX#11 here:
- Combining characters are given EA_Z in sombok/data/custom.pl, even though UAX#11 assigns them A or N. UAX#11 points out that the advance width depends on whether the terminal performs character combination. It's not clear whether Unicode::GCString aims for "strict" UAX#11 width or for "advance width".
- Control characters are also given EA_Z, even though UAX#11 gives them EA_N. In this case, the result is neither the UAX#11 width nor the advance width, since control characters have various effects on the terminal (the tab character in particular).
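
To illustrate the two deviations, here is a minimal Python sketch that contrasts the East Asian Width property recorded in the Unicode database with an "advance width" guess. The helper names uax11_class and advance_width are hypothetical, and the advance-width rules are only an assumption about what a terminal that combines characters might do; this is not how Unicode::GCString or sombok computes widths.

```python
import unicodedata


def uax11_class(ch):
    """Return the East Asian Width property from the Unicode database."""
    return unicodedata.east_asian_width(ch)


def advance_width(ch, ambiguous_wide=False):
    """Hypothetical advance-width heuristic (an assumption, not UAX#11).

    Assumes a terminal that combines characters. Returns None for
    control characters, because their effect depends on the terminal.
    """
    category = unicodedata.category(ch)
    if category == "Cc":
        # Control characters (tab, for example) have no fixed width.
        return None
    if unicodedata.combining(ch) != 0 or category in ("Mn", "Me"):
        # Combining marks are drawn over the preceding character.
        return 0
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        return 2
    if eaw == "A" and ambiguous_wide:
        # Ambiguous characters are wide only in East Asian contexts.
        return 2
    return 1


# Combining acute accent, tab, Latin letter, CJK ideograph.
for ch in ("\u0301", "\t", "a", "\u4e2d"):
    print("U+%04X" % ord(ch), uax11_class(ch), advance_width(ch))
```

With this sketch, the combining acute accent (U+0301) keeps its database class A but gets an advance width of 0, while the tab character keeps class N and gets no width at all. Neither case needs an extra class in the property itself.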
History
Date                 User    Action  Args
2012-03-11 03:14:53  loewis  set     recipients: + loewis, vstinner, ezio.melotti, Arfrever, inigoserna, zeha, poq, Nicholas.Cole, tchrist
2012-03-11 03:14:53  loewis  set     messageid: <1331435693.42.0.0207181032933.issue12568@psf.upfronthosting.co.za>
2012-03-11 03:14:52  loewis  link    issue12568 messages
2012-03-11 03:14:51  loewis  create