Issue 36486: Bugs and inconsistencies in unicodedata

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/80667

classification

Title:	Bugs and inconsistencies in unicodedata
Type:	behavior	Stage:	needs patch
Components:	Documentation, Unicode	Versions:	Python 3.8, Python 3.7

process

Status:	open	Resolution:
Dependencies:		Superseder:
Assigned To:	docs@python	Nosy List:	docs@python, dscorbett, ezio.melotti, lemburg, vstinner
Priority:	normal	Keywords:

Created on 2019-03-30 14:41 by dscorbett, last changed 2022-04-11 14:59 by admin.

Messages (1)
msg339203

Author: David Corbett (dscorbett)

Date: 2019-03-30 14:41

In `unicodedata`, the functions `lookup` and `name` have some bugs and inconsistencies.

`lookup` matches case-insensitively, except for the algorithmic names of Hangul syllables and CJK unified ideographs, which must be in all caps. The documentation does not explain how character names are fuzzily matched.
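A minimal probe for the case handling described above. The helper `try_lookup` is a hypothetical name introduced here for illustration; it only wraps `unicodedata.lookup` and turns a `KeyError` into `None`. The result for the lowercase algorithmic names depends on the Python version being run, so the script prints what it observes rather than assuming an outcome.

```python
import unicodedata


def try_lookup(name):
    """Return the character for `name`, or None if lookup raises KeyError.

    Hypothetical helper for illustration; not part of unicodedata.
    """
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return None


# Pairs of (uppercase, lowercase) spellings of the same name.
# The first pair is an ordinary table-driven name; the other two are
# algorithmic names (Hangul syllable, CJK unified ideograph).
PAIRS = [
    ("LATIN SMALL LETTER A", "latin small letter a"),
    ("HANGUL SYLLABLE GA", "hangul syllable ga"),
    ("CJK UNIFIED IDEOGRAPH-4E00", "cjk unified ideograph-4e00"),
]

for upper, lower in PAIRS:
    # ascii() keeps the output printable regardless of terminal encoding.
    print(f"{upper!r}: {ascii(try_lookup(upper))}")
    print(f"{lower!r}: {ascii(try_lookup(lower))}")
```

On the versions this report was filed against, the lowercase algorithmic names would be expected to print `None` while the lowercase table-driven name resolves normally.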

`lookup` accepts names like “CJK UNIFIED IDEOGRAPH-04E00”, where the code point has a leading zero.
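One way a caller could guard against the leading-zero form is sketched below. This is not how `unicodedata` works internally and not a proposed patch; `strict_cjk_lookup` is a hypothetical helper that parses the hexadecimal suffix itself and then accepts the result only if `unicodedata.name` round-trips to exactly the same string.

```python
import re
import unicodedata

# Assumption for this sketch: the name is already uppercase and the
# suffix is four or five uppercase hexadecimal digits.
_CJK_NAME = re.compile(r"CJK UNIFIED IDEOGRAPH-([0-9A-F]{4,5})")


def strict_cjk_lookup(name):
    """Return the ideograph named `name`, or None if the name is not canonical.

    Hypothetical helper: a name is treated as canonical only when
    unicodedata.name() of the parsed code point gives back the same string,
    which rejects padded suffixes such as "04E00".
    """
    match = _CJK_NAME.fullmatch(name)
    if match is None:
        return None
    char = chr(int(match.group(1), 16))
    # name() takes a default that is returned for unnamed code points.
    if unicodedata.name(char, "") != name:
        return None
    return char


print(ascii(strict_cjk_lookup("CJK UNIFIED IDEOGRAPH-4E00")))
print(ascii(strict_cjk_lookup("CJK UNIFIED IDEOGRAPH-04E00")))
```

The round trip also rejects suffixes that parse to a code point outside the unified ideograph ranges, since those code points have a different name or none.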

`lookup` and `name` don’t implement rule NR2, defined in chapter 4 of the Unicode Standard, for Tangut ideographs’ names.
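For reference, rule NR2 derives a name by appending the code point, written as uppercase hexadecimal with at least four digits, to a fixed prefix. The sketch below builds the NR2 name for U+17000, the first Tangut ideograph, and prints what the running interpreter reports for comparison. `nr2_name` is a hypothetical helper; whether `name` and `lookup` agree with it depends on the Python version in use.

```python
import unicodedata


def nr2_name(prefix, code_point):
    """Build an NR2-style name: prefix plus the code point in uppercase hex.

    Hypothetical helper for illustration; :04X pads to at least four digits
    and never truncates longer values.
    """
    return f"{prefix}{code_point:04X}"


TANGUT_PREFIX = "TANGUT IDEOGRAPH-"
code_point = 0x17000  # first code point of the Tangut block

expected = nr2_name(TANGUT_PREFIX, code_point)
# Pass a default so a missing name yields None instead of ValueError.
actual = unicodedata.name(chr(code_point), None)

try:
    found = unicodedata.lookup(expected)
except KeyError:
    found = None

print("NR2 name:          ", expected)
print("unicodedata.name:  ", actual)
print("unicodedata.lookup:", ascii(found))
```

If the last two lines print `None`, the interpreter shows the behavior reported here.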

History
Date	User	Action	Args
2022-04-11 14:59:13	admin	set	github: 80667
2019-04-05 18:28:55	terry.reedy	set	stage: needs patch versions: + Python 3.8
2019-03-30 14:48:43	xtreak	set	nosy: + lemburg
2019-03-30 14:41:13	dscorbett	create