Author flying sheep
Recipients flying sheep
Date 2014-10-11.17:58:44
Message-id <1413050325.25.0.295627364762.issue22612@psf.upfronthosting.co.za>
Content
See also #6331.

The repo https://github.com/nagisa/unicodeblocks contains pretty much the functionality I’d like to see: a way to access information about all blocks, and a way to get the name of the block a character is in.

I propose to include something very similar to those two APIs in unicodedata:

unicodedata.Block: class with start, end, and name properties.

Its __contains__ should work for single-character strings (testing whether that character is in the block) and for ints (testing whether that code point is in the block).

Maybe make it iterable over its characters?

unicodedata.blocks: OrderedDict mapping str (block name) → Block object, ordered by Block.start.

Then blocks.keys() would yield the names in order, and blocks.values() the Block objects in order.

unicodedata.block_of(chr, name_only=False): returns the Block object for which “chr in block” is True, or only its name if name_only is True.
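
To make the proposal concrete, here is a rough pure-Python sketch of the three pieces. Nothing here exists in unicodedata today. The four-entry table is only an excerpt for illustration (a real implementation would be generated from Blocks.txt), and returning None for a code point outside every listed block is my assumption, since the proposal above doesn't say what should happen there.

```python
from bisect import bisect_right
from collections import OrderedDict


class Block:
    """A named, contiguous range of code points (hypothetical sketch)."""

    def __init__(self, start, end, name):
        self.start = start  # first code point, inclusive
        self.end = end      # last code point, inclusive
        self.name = name

    def __contains__(self, item):
        # Accept either a single-character string or an int code point.
        if isinstance(item, str):
            item = ord(item)  # raises TypeError for strings of length != 1
        return self.start <= item <= self.end

    def __iter__(self):
        # Iterate over the characters of the block, in code point order.
        return (chr(cp) for cp in range(self.start, self.end + 1))

    def __repr__(self):
        return "Block(0x{:04X}, 0x{:04X}, {!r})".format(
            self.start, self.end, self.name)


# Illustrative excerpt only, not the full Unicode block list.
_BLOCK_DATA = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x0400, 0x04FF, "Cyrillic"),
]

# Block name -> Block, ordered by Block.start.
blocks = OrderedDict(
    (name, Block(start, end, name))
    for start, end, name in sorted(_BLOCK_DATA)
)

# Helper lists for the binary search in block_of().
_ordered = list(blocks.values())
_starts = [block.start for block in _ordered]


def block_of(char, name_only=False):
    """Return the Block containing char (or its name), else None."""
    cp = ord(char) if isinstance(char, str) else char
    # Find the last block whose start is <= cp.
    i = bisect_right(_starts, cp) - 1
    if i < 0 or cp not in _ordered[i]:
        return None  # assumption: no block covers this code point
    block = _ordered[i]
    return block.name if name_only else block
```

With that, "a" in blocks["Basic Latin"] and 0x3B1 in blocks["Greek and Coptic"] are both True, and block_of("é", name_only=True) gives "Latin-1 Supplement". The binary search relies on blocks being sorted by start and not overlapping.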

---

Alternative: make the Block class a plain namedtuple without a custom __contains__ method.
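
A quick sketch of what that alternative might look like. BlockTuple and in_block are made-up names for illustration only. One thing to keep in mind: a namedtuple still inherits tuple's __contains__, which checks the field values rather than the code point range, so the range test has to live in a separate function.

```python
from collections import namedtuple

# Hypothetical plain-tuple variant of the proposed Block class.
BlockTuple = namedtuple("Block", ["start", "end", "name"])

greek = BlockTuple(0x0370, 0x03FF, "Greek and Coptic")


def in_block(block, char):
    """Range test as a free function, since the tuple has no custom __contains__."""
    cp = ord(char) if isinstance(char, str) else char
    return block.start <= cp <= block.end


# Inherited tuple membership compares against the fields (start, end, name),
# so "in" does not answer the question "is this character in the block?".
alpha = chr(0x3B1)                 # GREEK SMALL LETTER ALPHA
range_hit = in_block(greek, alpha)  # range test
tuple_hit = alpha in greek          # field membership test
```

Here range_hit is True but tuple_hit is False, which is the main usability cost of the namedtuple variant.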

---

Together with #18234, fixing this bug will complete UnicodeData support in Python, I guess.
History
Date                 User          Action  Args
2014-10-11 17:58:45  flying sheep  set     recipients: + flying sheep
2014-10-11 17:58:45  flying sheep  set     messageid: <1413050325.25.0.295627364762.issue22612@psf.upfronthosting.co.za>
2014-10-11 17:58:45  flying sheep  link    issue22612 messages
2014-10-11 17:58:44  flying sheep  create