Message 191771 - Python tracker


Author	belopolsky
Recipients	belopolsky, benjamin.peterson, ezio.melotti, lemburg, loewis, serhiy.storchaka
Date	2013-06-24.14:35:14
Message-id	<1372084514.95.0.693019342702.issue18234@psf.upfronthosting.co.za>
In-reply-to

Content

MAL> Please leave the function as it is, i.e. a 1-1 mapping to the
MAL> official, non-changing Unicode name reference (including
MAL> spelling errors, etc). Same with code points that have no name.

Since we have code points with no name, it is not a 1-1 mapping but a 1-to-0-or-1 mapping.
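To illustrate the point, here is a minimal sketch of how the existing unicodedata.name() behaves for a named character and for a code point that has no name (the choice of "A" and "\n" as examples is mine):

```python
import unicodedata

# A named character: name() returns its official Unicode name.
named = unicodedata.name("A")
print(named)

# A control character has no character name, so name() raises
# ValueError unless a default is supplied.
try:
    unicodedata.name("\n")
    raised = False
except ValueError:
    raised = True
print("ValueError raised:", raised)

# With a default, the "0" side of the 1-to-0-or-1 mapping becomes visible.
unnamed = unicodedata.name("\n", None)
print(unnamed)
```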

The Unicode Standard recommends using "Code Point Labels" "To provide unique, meaningful labels for code points that do not have character names." (Section 4.9.)

These labels are not very useful:

Control: control-NNNN
Reserved: reserved-NNNN
Noncharacter: noncharacter-NNNN
Private-Use: private-use-NNNN
Surrogate: surrogate-NNNN
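For reference, labels of this shape could be produced along the following lines. This is only a sketch: code_point_label() is a hypothetical helper, not part of unicodedata, and it assumes the code point's type can be derived from its general category plus the fixed noncharacter ranges.

```python
import unicodedata

def code_point_label(ch):
    """Return a code point label such as 'control-000A', or None.

    Hypothetical helper: None is returned for code points that are not
    in one of the five unnamed classes listed above.
    """
    cp = ord(ch)
    cat = unicodedata.category(ch)
    # Noncharacters have category Cn, so they must be tested before
    # "reserved": U+FDD0..U+FDEF and the last two code points of each plane.
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
        kind = "noncharacter"
    elif cat == "Cc":
        kind = "control"
    elif cat == "Cs":
        kind = "surrogate"
    elif cat == "Co":
        kind = "private-use"
    elif cat == "Cn":
        kind = "reserved"
    else:
        return None
    # At least four uppercase hex digits, as in U+NNNN notation.
    return "%s-%04X" % (kind, cp)

print(code_point_label("\n"))
print(code_point_label("\ufffe"))
print(code_point_label("A"))
```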

According to the description in NameAliases.txt:

# The formal name aliases are part of the Unicode character namespace, which
# includes the character names and the names of named character sequences.

I believe this means that formal name aliases are as official as the character names.

If we don't change the default, what is the downside of adding an optional type argument to unicodedata.name()? After all, according to the standard, aliases *are* names, just a different *type* of name.
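A rough sketch of what such an optional argument might look like, written as a wrapper rather than a change to unicodedata itself. Everything here is an assumption for illustration: the keyword name "type", the wrapper function, and the three-entry alias table, which is a hand-copied excerpt in the style of NameAliases.txt rather than the full data.

```python
import unicodedata

# Illustrative excerpt only: (code point, alias type) -> alias.
# A real implementation would use the complete NameAliases.txt data.
_ALIASES = {
    (0x0000, "control"): "NULL",
    (0x01A2, "correction"): "LATIN CAPITAL LETTER GHA",
    (0xFEFF, "alternate"): "BYTE ORDER MARK",
}

_MISSING = object()  # sentinel so that None can be passed as a default

def name(ch, default=_MISSING, *, type=None):
    """Hypothetical wrapper around unicodedata.name().

    With type=None the behaviour is unchanged; otherwise the alias of
    the requested type is looked up in the excerpt table above.
    """
    if type is None:
        if default is _MISSING:
            return unicodedata.name(ch)
        return unicodedata.name(ch, default)
    try:
        return _ALIASES[(ord(ch), type)]
    except KeyError:
        if default is _MISSING:
            raise ValueError("no such name") from None
        return default

# Default call: the official name, including its known spelling error.
print(name("\u01a2"))
# Opt-in call: the formal alias that corrects it.
print(name("\u01a2", type="correction"))
```

Because the default is left alone, existing callers would see no difference; only code that passes the extra argument gets an alias.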
