Message 229101 - Python tracker



Author	flying sheep
Recipients	flying sheep
Date	2014-10-11.17:58:44
Message-id	<1413050325.25.0.295627364762.issue22612@psf.upfronthosting.co.za>

Content

See also #6331.

The repo https://github.com/nagisa/unicodeblocks contains pretty much the functionality I’d like to see: a way to access information about all blocks, and a way to get the name of the block a character is in.

I propose to include something very similar to those two APIs in unicodedata:

unicodedata.Block: a class with start, end, and name properties.

Its __contains__ should work for single-character strings (testing whether that character is in the block) and for ints (testing whether that code point is in the block).

Maybe make it iterable over its characters?

unicodedata.blocks: an OrderedDict mapping str (block name) → Block object, ordered by Block.start.

Then blocks.keys() would yield the names in order, and blocks.values() the Block objects in order.

unicodedata.block_of(chr, name_only=False): returns the Block object for which “chr in block” is True, or only its name if name_only is True.
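
A minimal sketch of how the three pieces above might fit together. Nothing here exists in unicodedata; it is plain Python written for illustration. The _SAMPLE table is a four-entry excerpt standing in for the full block list, and returning None for a character outside every listed block is an assumption, since the proposal does not say what should happen in that case.

```python
from collections import OrderedDict


class Block:
    """One Unicode block: an inclusive range of code points plus a name."""

    def __init__(self, start, end, name):
        self.start = start
        self.end = end
        self.name = name

    def __contains__(self, item):
        # Accept a single-character string or an int code point.
        if isinstance(item, str):
            if len(item) != 1:
                raise TypeError("expected a single character or an int")
            item = ord(item)
        return self.start <= item <= self.end

    def __iter__(self):
        # The "maybe" from the proposal: iterate over the block's characters.
        return (chr(cp) for cp in range(self.start, self.end + 1))

    def __repr__(self):
        return "Block(0x{:04X}, 0x{:04X}, {!r})".format(
            self.start, self.end, self.name)


# Illustrative excerpt only; a real implementation would cover every block.
_SAMPLE = [
    (0x0400, 0x04FF, "Cyrillic"),
    (0x0000, 0x007F, "Basic Latin"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
]

# Sorting the tuples orders them by start, so the dict is ordered by Block.start.
blocks = OrderedDict(
    (name, Block(start, end, name)) for start, end, name in sorted(_SAMPLE)
)


def block_of(char, name_only=False):
    """Return the Block containing char, or only its name if name_only is True.

    Assumption: returns None when no listed block contains the character.
    """
    for block in blocks.values():
        if char in block:
            return block.name if name_only else block
    return None
```

With this sketch, "a" in blocks["Basic Latin"] and 0x3B1 in blocks["Greek and Coptic"] are both True, and block_of("я", name_only=True) gives "Cyrillic". The linear scan in block_of keeps the example short; a bisect over the sorted start values would be the obvious refinement for the full table.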

---

Alternative: make the Block class a plain namedtuple without a custom __contains__ method.
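
A rough sketch of what that alternative could look like. The in_block helper is hypothetical and only there to show where the range check would have to live once Block stops providing it.

```python
from collections import namedtuple

# Plain record type: no behaviour beyond what tuple already provides.
Block = namedtuple("Block", ["start", "end", "name"])

greek = Block(0x0370, 0x03FF, "Greek and Coptic")


def in_block(char, block):
    """Hypothetical helper: range check done outside the Block type."""
    return block.start <= ord(char) <= block.end
```

One thing to keep in mind with this variant: a namedtuple still inherits tuple's __contains__, which compares against the three fields. So "α" in greek is False even though in_block("α", greek) is True, and 0x0370 in greek is True only because it happens to equal the start field.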

---

Together with #18234, fixing this issue would complete UnicodeData support in Python, I guess.

History
Date	User	Action	Args
2014-10-11 17:58:45	flying sheep	set	recipients: + flying sheep
2014-10-11 17:58:45	flying sheep	set	messageid: <1413050325.25.0.295627364762.issue22612@psf.upfronthosting.co.za>
2014-10-11 17:58:45	flying sheep	link	issue22612 messages
2014-10-11 17:58:44	flying sheep	create