Author steven.daprano
Recipients snoopyjc, steven.daprano
Date 2022-03-07.23:15:42
The behaviour is technically correct, but confusing and unfortunate, and I don't think we can fix it.

Unicode does not define names for the ASCII control characters, but it does define aliases for them, based on the C0 control character standard.

unicodedata.lookup() looks for aliases as well as names (since version 3.3).
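
A quick way to see the asymmetry, as a minimal sketch assuming Python 3.3 or later (the variable names are only for illustration):

```python
import unicodedata

# lookup() resolves the full alias and the abbreviation to the same character.
by_alias = unicodedata.lookup('NULL')
by_abbreviation = unicodedata.lookup('NUL')
print(by_alias == by_abbreviation == '\x00')

# name() consults only official names, and U+0000 has none.
try:
    unicodedata.name('\x00')
    name_failed = False
except ValueError:
    name_failed = True
print(name_failed)
```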

It is unfortunate that we have only a single function for looking up a unicode code point by name, alias, alias-abbreviation, and named-sequence. That keeps the API simple, but in corner cases like this it leads to confusion.

The obvious "fix" is to make name() return the alias if there is no official name to return, but that is a change in behaviour. I have code that assumes that C0 and C1 control characters have no name, and relies on name() raising an exception for them.
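
To illustrate the kind of code that would be affected, here is a hypothetical helper (made up for this example, not taken from any real code base) that treats "name() has nothing to return" as part of its test for control characters:

```python
import unicodedata

def is_unnamed_control(ch):
    """Return True for category Cc characters that name() cannot name."""
    # name(ch, default) returns the default instead of raising ValueError.
    has_name = unicodedata.name(ch, None) is not None
    return unicodedata.category(ch) == 'Cc' and not has_name

print(is_unnamed_control('\x00'))  # a C0 control character
print(is_unnamed_control('\x85'))  # a C1 control character
print(is_unnamed_control('a'))     # an ordinary named character
```

If name() started returning aliases, a helper like this would silently report False for every control character.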

Even if we changed the behaviour to return the alias, which alias should be returned, the full alias or the abbreviation?

Whichever alias is chosen, it doesn't fix the problem that name() and lookup() aren't inverses of each other:

lookup('NUL') -> '\0'  # using the abbreviated alias
name('\0') -> 'NULL'  # returns the full alias (or vice versa)
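
The round-trip failure can be sketched with a stand-in for the proposed behaviour. Both `FULL_ALIASES` and `name_with_alias` are hypothetical: unicodedata exposes no alias table, so a one-entry table is written by hand here.

```python
import unicodedata

# Hypothetical fallback table; only one entry is filled in for the demo.
FULL_ALIASES = {'\x00': 'NULL'}

def name_with_alias(ch):
    """Like name(), but fall back to a full alias when there is no name."""
    try:
        return unicodedata.name(ch)
    except ValueError:
        if ch in FULL_ALIASES:
            return FULL_ALIASES[ch]
        raise

def round_trips(label):
    """True if name_with_alias(lookup(label)) gives back the same label."""
    return name_with_alias(unicodedata.lookup(label)) == label

print(round_trips('NULL'))  # full alias survives the round trip
print(round_trips('NUL'))   # abbreviation comes back as 'NULL'
```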

It gets worse with named sequences:

>>> name(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: name() argument 1 must be a unicode character, not str
>>> len(c)

So we cannot possibly make name() and lookup() inverses of each other.
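
For a self-contained version of the same effect, one named sequence that lookup() is known to resolve is 'LATIN SMALL LETTER R WITH TILDE'. The choice of sequence is an assumption made for this example and may not be the one used in the session above.

```python
import unicodedata

# This label is a named sequence, not the name of a single character.
seq = unicodedata.lookup('LATIN SMALL LETTER R WITH TILDE')
print(len(seq))
print([unicodedata.name(ch) for ch in seq])

# name() accepts only a single character, so the sequence cannot be mapped back.
try:
    unicodedata.name(seq)
    sequence_rejected = False
except TypeError:
    sequence_rejected = True
print(sequence_rejected)
```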

What we really should have had is separate functions for name and alias lookups or, better still, the raw Unicode tables exposed as mappings so that people can create their own higher-level interfaces.
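
Until something like that exists, the distinction can be approximated on top of the current API. This is only a rough sketch: `classify_label` is a hypothetical helper, and it cannot tell a full alias from an abbreviation because that information is not exposed.

```python
import unicodedata

def classify_label(label):
    """Report whether lookup() resolved label as a name, alias or sequence."""
    result = unicodedata.lookup(label)  # raises KeyError for unknown labels
    if len(result) > 1:
        return 'named sequence'
    # lookup() ignores case, so compare against the upper-cased label.
    if unicodedata.name(result, None) == label.upper():
        return 'name'
    return 'alias'

for label in ('LATIN SMALL LETTER A', 'NULL', 'NUL',
              'LATIN SMALL LETTER R WITH TILDE'):
    print(label, '->', classify_label(label))
```

A name-only lookup could then be written as a thin wrapper that raises KeyError unless the classification is 'name'.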
Date                 User            Action  Args
2022-03-07 23:15:43  steven.daprano  set     recipients: + steven.daprano, snoopyjc
2022-03-07 23:15:43  steven.daprano  set     messageid: <>
2022-03-07 23:15:43  steven.daprano  link    issue46947 messages
2022-03-07 23:15:43  steven.daprano  create