classification
Title: unpacked keyword arguments are not unicode normalized
Type:
Stage:
Components: Unicode
Versions: Python 3.4
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: benjamin.peterson, ezio.melotti, r.david.murray, sheppard, vstinner
Priority: normal
Keywords:

Created on 2014-12-20 02:37 by sheppard, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (5)
msg232956 - Author: S. Andrew Sheppard (sheppard) Date: 2014-12-20 02:37
I came across unexpected behavior while unpacking keyword arguments in Python 3.  It appears to be related to the automatic NFKC normalization of Unicode identifiers (PEP 3131), which converts e.g. MICRO SIGN to GREEK SMALL LETTER MU.  This conversion is applied to regular keyword arguments but not to arguments unpacked via **.
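
For reference, the NFKC mapping between these two characters can be checked directly with the standard unicodedata module. This is only a small sketch; the variable names are illustrative and not part of the original report.

```python
import unicodedata

micro = chr(181)  # U+00B5 MICRO SIGN
mu = chr(956)     # U+03BC GREEK SMALL LETTER MU

# The two characters look alike but are distinct code points.
print(unicodedata.name(micro))
print(unicodedata.name(mu))

# NFKC maps the compatibility character MICRO SIGN to the Greek letter.
normalized = unicodedata.normalize("NFKC", micro)
print(normalized == mu)
```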

This issue arose while I was working with some automatically generated namedtuple classes, but I was able to reproduce it with a simple function:

def test(μ):
    print(μ)

>>> test(µ="test1") # chr(181)
test1

>>> test(μ="test2") # chr(956)
test2

>>> test(**{'μ': "test3"}) # chr(956)
test3

>>> test(**{'µ': "test4"}) # chr(181)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: test() got an unexpected keyword argument 'µ'


I can obviously work around this, but wanted to bring it up in case it's a bug.  My naive expectation would be that unpacked ** keys should be treated exactly like normal keyword arguments.
msg232958 - Author: S. Andrew Sheppard (sheppard) Date: 2014-12-20 02:51
Here's a simple namedtuple example for good measure.

from collections import namedtuple
Test = namedtuple("Test", [chr(181)])

>>> Test(**{chr(956): "test1"})
Test(µ='test1')

>>> Test(**{chr(181): "test1"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __new__() got an unexpected keyword argument 'µ'
msg232959 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2014-12-20 03:27
I suspect that the normalization is happening in the parsing phase.  That is, the keyword argument gets normalized when the Python source is compiled, but the dictionary key is, of course, *not* normalized, since it is a literal string.  If I'm right, I think this is not a bug.
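
One way to check this suspicion is to parse a call expression with the ast module, without executing it, and compare the keyword name with the string key. The sketch below assumes Python 3.8 or later, where string literals appear as ast.Constant nodes; the variable names are illustrative.

```python
import ast

# Source text that uses MICRO SIGN (U+00B5) both as a keyword name
# and as a string key in an unpacked dict. It is parsed, never run.
source = "test(\u00b5=1, **{'\u00b5': 2})"

call = ast.parse(source, mode="eval").body  # an ast.Call node

# First keyword: the syntactic keyword argument, an identifier.
keyword_name = call.keywords[0].arg
# Second keyword: the ** unpacking (arg is None); its value is the dict literal.
dict_key = call.keywords[1].value.keys[0].value

print(hex(ord(keyword_name)))  # code point of the identifier after parsing
print(hex(ord(dict_key)))      # code point of the string literal key
```

If the identifier comes back as U+03BC while the string key stays U+00B5, the normalization is happening at parse time and only to identifiers.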
msg232960 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2014-12-20 04:08
Yeah, kwarg dicts do not have the same checks applied to them as syntactic keyword args. It would be weird if, for example, dict(**mydict) normalized the keys of mydict.
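
A minimal illustration of that point: keys passed through ** arrive unchanged, both in dict() and in a function that accepts **kwargs. The helper name collect is hypothetical and exists only for this sketch.

```python
mydict = {chr(181): "value"}  # key is U+00B5 MICRO SIGN

# dict(**mydict) copies the keys as they are; nothing is normalized.
copied = dict(**mydict)
print([hex(ord(key)) for key in copied])

def collect(**kwargs):
    """Return the keyword arguments exactly as received."""
    return kwargs

# The same holds for any function taking **kwargs.
received = collect(**mydict)
print([hex(ord(key)) for key in received])
```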
msg232963 - Author: S. Andrew Sheppard (sheppard) Date: 2014-12-20 05:10
Fair enough.  For future reference by anyone coming across this issue, here's a simplified version of the workaround I used:

from unicodedata import normalize
def normalize_keys(data):
    return {
        normalize('NFKC', key): value
        for key, value in data.items()
    }

def test(μ):
    print(μ)

>>> test(**normalize_keys({'µ': "test4"}))
test4
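
One caveat with the workaround: if the input dict contains two keys that normalize to the same string, the later one silently overwrites the earlier one. A possible variant that reports the collision instead is sketched below. The name normalize_keys_strict is hypothetical, and raising ValueError is an assumption about the desired behavior, not something proposed in this thread.

```python
from unicodedata import normalize

def normalize_keys_strict(data):
    """Return a copy of data with NFKC-normalized keys.

    Raises ValueError if two distinct keys normalize to the same string.
    """
    result = {}
    for key, value in data.items():
        new_key = normalize("NFKC", key)
        if new_key in result:
            raise ValueError(
                f"keys collide after NFKC normalization: {new_key!r}"
            )
        result[new_key] = value
    return result

# A single MICRO SIGN key is converted to GREEK SMALL LETTER MU.
print(normalize_keys_strict({chr(181): "test4"}) == {chr(956): "test4"})
```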
History
Date                 User               Action  Args
2022-04-11 14:58:11  admin              set     github: 67280
2014-12-20 05:10:10  sheppard           set     messages: + msg232963
2014-12-20 04:08:48  benjamin.peterson  set     status: open -> closed; nosy: + benjamin.peterson; messages: + msg232960; resolution: not a bug
2014-12-20 03:27:57  r.david.murray     set     nosy: + r.david.murray; messages: + msg232959
2014-12-20 02:51:03  sheppard           set     messages: + msg232958
2014-12-20 02:37:18  sheppard           create