Message 296505 - Python tracker

Author	steven.daprano
Recipients	Guillaume Sanchez, steven.daprano
Date	2017-06-21.01:34:07
Message-id	<1498008848.16.0.236080187603.issue30717@psf.upfronthosting.co.za>
In-reply-to

Content
http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries

talks about *grapheme clusters*, not "graphemes" alone, and it seems clear to me that they are language-dependent. For example, it says:

The Unicode Standard provides default algorithms for determining grapheme cluster boundaries, with two variants: legacy grapheme clusters and extended grapheme clusters. The most appropriate variant depends on the language and operation involved. ... These algorithms can be adapted to produce tailored grapheme clusters for specific locales...


Nevertheless, even just a basic API to either the *legacy grapheme cluster* or the *extended grapheme cluster* algorithms would be a good start.

Can I suggest that the unicodedata module might be the right place for it?
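To illustrate the gap such an API would fill, here is a rough sketch built only on what unicodedata offers today. It is not the TR29 algorithm for either the legacy or the extended variant: it simply attaches nonspacing and enclosing marks (general categories Mn and Me) to the preceding code point, and it ignores Hangul syllables, emoji sequences, regional indicators and any locale tailoring. The function name approx_grapheme_clusters is hypothetical and does not exist in the standard library.

```python
import unicodedata


def approx_grapheme_clusters(text):
    """Group each code point with the Mn/Me marks that follow it.

    A deliberately crude approximation, NOT the TR29 boundary rules:
    only nonspacing (Mn) and enclosing (Me) marks are treated as
    extending the previous cluster.
    """
    clusters = []
    for ch in text:
        # A mark extends the current cluster, if there is one.
        if clusters and unicodedata.category(ch) in ("Mn", "Me"):
            clusters[-1] += ch
        else:
            # Anything else (or a leading mark) starts a new cluster.
            clusters.append(ch)
    return clusters


# 'e' + U+0301 COMBINING ACUTE ACCENT + 'a'
sample = "e\u0301a"
code_point_count = len(sample)               # counts code points
clusters = approx_grapheme_clusters(sample)  # groups the accent with 'e'
```

For the sample string, len() reports three code points while the sketch yields two clusters, which is the kind of difference a real grapheme cluster API would expose correctly for all scripts.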

And thank you for volunteering to do the work on this!

History
Date	User	Action	Args
2017-06-21 01:34:08	steven.daprano	set	recipients: + steven.daprano, Guillaume Sanchez
2017-06-21 01:34:08	steven.daprano	set	messageid: <1498008848.16.0.236080187603.issue30717@psf.upfronthosting.co.za>
2017-06-21 01:34:08	steven.daprano	link	issue30717 messages
2017-06-21 01:34:07	steven.daprano	create