Title: unicode identifiers not accessible or assignable through globals()
Type: behavior
Stage:
Components: Unicode
Versions: Python 3.9, Python 3.8, Python 3.7, Python 3.6
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: eric.smith, ezio.melotti, outofculture, steven.daprano
Priority: normal
Keywords:

Created on 2020-12-19 00:27 by outofculture, last changed 2022-04-11 14:59 by admin.

Messages (5)
msg383336 - Author: Martin Chase (outofculture) Date: 2020-12-19 00:27
This behavior is best described by the code below:

>>> meow = 1
>>> 'meow' in globals()
True
>>> µmeow = 1e-6
>>> 'µmeow' in globals()
False
>>> globals()['woof'] = 1
>>> woof
1
>>> globals()['µwoof'] = 1e-6
>>> µwoof
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'μwoof' is not defined
>>> import sys
>>> sys.getdefaultencoding()
'utf-8'
>>> [(k, bytes(k, 'utf-8')) for k in globals()]
[..., ('μmeow', b'\xce\xbcmeow'), ('µwoof', b'\xc2\xb5woof')]
>>> 'µ'.encode('utf-8')
b'\xc2\xb5'

Testing was done on Linux and Windows, variously using 3.6.12, 3.7.6, 3.8.6 and 3.9.0+.
msg383338 - Author: Martin Chase (outofculture) Date: 2020-12-19 00:33
Oh, I just gave `locals()` a cursory try, and the same misbehavior is present.

A workaround, for anyone needing to assign or access unicode globals, is to use `exec`, e.g. `exec("µmeow = 1e-6", globals())`.
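
A minimal sketch of that workaround, using a stand-alone dict (the hypothetical name `ns`) in place of `globals()` so it can be run in isolation. Both `exec` and `eval` send the name through the compiler, so they agree with each other, while a direct key lookup with the same spelling still misses:

```python
# `ns` is a hypothetical stand-alone namespace standing in for globals().
ns = {}
source_name = "\u00b5meow"  # spelled with the micro sign, U+00B5

# Assignment through exec: the compiler handles the identifier.
exec(source_name + " = 1e-6", ns)

# Reading back through eval goes through the compiler as well, so it succeeds.
value = eval(source_name, ns)

# A plain dict lookup with the same spelling does not find the key.
direct_hit = source_name in ns

print(value, direct_hit)
```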
msg383340 - Author: Steven D'Aprano (steven.daprano) (Python committer) Date: 2020-12-19 00:42
I'm pretty sure this is not a bug, but is working as designed.

The interpreter normalises unicode identifiers, but key lookup in the dict does not.

Sorry, I don't have time right now to give a more detailed answer, but there are two distinct mu characters:

    μ U+03BC
    µ U+00B5

and my prediction is that the identifier is normalised to the first, the actual Greek mu, but you are looking up the second, the micro sign.
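
One way to check that prediction, as a rough sketch using only the standard `unicodedata` module and the built-in `compile` (the variable names are illustrative, not part of any API):

```python
import unicodedata

micro = "\u00b5"     # the micro sign
greek_mu = "\u03bc"  # the Greek letter mu

# The two characters render alike but have different names and code points.
micro_name = unicodedata.name(micro)
greek_name = unicodedata.name(greek_mu)

# NFKC normalization maps the micro sign onto Greek mu.
folded = unicodedata.normalize("NFKC", micro)

# Compiling an expression that uses the micro-sign spelling shows which
# name the compiler actually stores in the code object.
stored_names = compile(micro + "meow", "<demo>", "eval").co_names

print(micro_name, "|", greek_name)
print(folded == greek_mu)
print(stored_names == (greek_mu + "meow",))
```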

(I'll be able to give a longer response in a couple of hours, if still needed.)
msg383348 - Author: Martin Chase (outofculture) Date: 2020-12-19 01:11
Ah! So then the proper code for me would be e.g.:

>>> import unicodedata
>>> globals()[unicodedata.normalize("NFKC", "µmeow")]

Yes, it's clear when I read that the normalization is going to happen. Is it also worth adding a note in the documentation for `globals()` and `locals()`? That's the first place I looked to try to find out what was wrong.
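
For anyone doing this repeatedly, the normalization could be wrapped in small helpers. This is only a sketch: `normalized_name`, `set_name` and `get_name` are hypothetical names, and it assumes NFKC is the only transformation needed to match what the compiler stores:

```python
import unicodedata


def normalized_name(name: str) -> str:
    """Return the NFKC-normalized spelling of an identifier."""
    return unicodedata.normalize("NFKC", name)


def set_name(namespace: dict, name: str, value) -> None:
    """Store value under the normalized spelling of name."""
    namespace[normalized_name(name)] = value


def get_name(namespace: dict, name: str):
    """Look up name after normalizing it the same way."""
    return namespace[normalized_name(name)]


# Stand-alone dict used here instead of globals() so the sketch is isolated.
ns = {}
set_name(ns, "\u00b5woof", 1e-6)        # micro-sign spelling
via_helper = get_name(ns, "\u00b5woof")  # found through the helper
via_eval = eval("\u00b5woof", ns)        # found through the compiler too
print(via_helper, via_eval)
```

Passing `globals()` as the namespace should behave the same way, since it is an ordinary dict.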
msg383462 - Author: Eric V. Smith (eric.smith) (Python committer) Date: 2020-12-20 21:48
I think documenting this with globals() and locals() is a good idea. That's the place this is most likely to trip someone up.
History
Date                 User            Action  Args
2022-04-11 14:59:39  admin           set     github: 86846
2021-01-20 10:46:27  vstinner        set     nosy: - vstinner
2020-12-20 21:48:01  eric.smith      set     nosy: + eric.smith; messages: + msg383462
2020-12-19 01:11:25  outofculture    set     messages: + msg383348
2020-12-19 00:42:26  steven.daprano  set     nosy: + steven.daprano; messages: + msg383340
2020-12-19 00:33:21  outofculture    set     messages: + msg383338
2020-12-19 00:27:18  outofculture    create