This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: json dump silently converts int keys to string
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.6

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: My-Tien Nguyen, Stub2, eric.smith, facundobatista, steve.dower, stub, xtreak
Priority: normal
Keywords:

Created on 2018-10-13 10:16 by My-Tien Nguyen, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (9)
msg327649 - Author: My-Tien Nguyen (My-Tien Nguyen) Date: 2018-10-13 10:24
When int keys are silently converted to strings on json serialization, the user needs to remember to convert them back to int on loading.
I think a warning should at least be shown.

In my case I serialize a dict with int keys to json, later load it back into a dict (which now has string keys), and test for the existence of an int key in that dict. The test then unexpectedly returns False.

I am aware that json does not support int keys, but this can be easily forgotten.
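
The round trip described above can be reproduced with a short sketch. The dict contents are made up for illustration; only `json.dumps` and `json.loads` from the standard library are used.

```python
import json

original = {1: "one", 2: "two"}

# dumps() coerces the int keys to strings, because JSON object names must be strings.
encoded = json.dumps(original)

# loads() cannot know the keys used to be ints, so they come back as str.
restored = json.loads(encoded)

print(encoded)               # {"1": "one", "2": "two"}
print(1 in restored)         # False
print("1" in restored)       # True
print(restored == original)  # False
```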
msg327650 - Author: Karthikeyan Singaravelan (xtreak) * (Python committer) Date: 2018-10-13 11:48
Thanks for the report. There was a related issue a few days back: issue32816. I think this is documented behavior; see https://docs.python.org/3.8/library/json.html#json.dumps . Adding a warning might break code, and I don't know of a safe way to introduce it as a code-level warning, given that this is documented behavior in both Python 2 and 3. I think other languages behave the same way; JavaScript itself converts int keys to strings without a warning, in keeping with the JSON standard. Correct me if I am wrong or if other languages do warn about this.

> Note: Keys in key/value pairs of JSON are always of the type str. When a dictionary is converted into JSON, all the keys of the dictionary are coerced to strings. As a result of this, if a dictionary is converted into JSON and then back into a dictionary, the dictionary may not equal the original one. That is, loads(dumps(x)) != x if x has non-string keys

You can try json.loads(data, parse_int=int), but parse_int is only applied to integer values, not to keys, so the keys stay strings:

>>> json.loads(json.dumps({1:'1'}), parse_int=int)
{'1': '1'}
>>> json.loads(json.dumps({1:1}), parse_int=int) 
{'1': 1}


Thanks
msg327654 - Author: My-Tien Nguyen (My-Tien Nguyen) Date: 2018-10-13 14:56
I don’t think “other languages do that too” is a good argument here. It would apply if behaving differently broke user expectations. But here we would do nothing more than explicitly inform the user of a relevant operation. If they already expected that behaviour, they can disregard the warning.

I don’t see how `parse_int` would help me here. I would need a `parse_str=int`, but then it would try to parse every string, and I don’t see the use case for that.
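
For readers who do want int keys back on load, one possible workaround is the `object_hook` parameter of `json.loads`, which is called with every decoded JSON object. The sketch below is only an illustration: `int_keys_hook` is a hypothetical helper, not part of the json module, and it converts every key that `int()` accepts, including keys that were strings such as "1" before serialization.

```python
import json

def int_keys_hook(obj):
    """Hypothetical helper: turn keys that parse as integers back into int."""
    restored = {}
    for key, value in obj.items():
        try:
            restored[int(key)] = value
        except ValueError:
            # Not an integer literal, keep the key as a string.
            restored[key] = value
    return restored

encoded = json.dumps({1: "a", 2: {3: "b"}, "name": "c"})

# object_hook runs for nested objects too, so the inner dict is converted as well.
data = json.loads(encoded, object_hook=int_keys_hook)
print(data)  # {1: 'a', 2: {3: 'b'}, 'name': 'c'}
```

This is lossy by design: the decoder cannot tell an original `1` from an original `"1"`, so it is only appropriate when all integer-looking keys are known to have been ints.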

I would suggest a warning similar to this:

--- json/encoder.py
+++ json/encoder.py
@@ -1,6 +1,7 @@
 """Implementation of JSONEncoder
 """
 import re
+import warnings
 
 try:
     from _json import encode_basestring_ascii as c_encode_basestring_ascii
@@ -353,7 +354,9 @@
             items = sorted(dct.items(), key=lambda kv: kv[0])
         else:
             items = dct.items()
+        non_str_key = False
         for key, value in items:
+            non_str_key = non_str_key or not isinstance(key, str)
             if isinstance(key, str):
                 pass
             # JavaScript is weakly typed for these, so it makes sense to
@@ -403,6 +406,8 @@
                 else:
                     chunks = _iterencode(value, _current_indent_level)
                 yield from chunks
+        if non_str_key:
+            warnings.warn("Encountered non-string key(s), converted to string.", RuntimeWarning)
         if newline_indent is not None:
             _current_indent_level -= 1
             yield '\n' + _indent * _current_indent_level
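
The same warning can be approximated in user code without patching json/encoder.py. The following is a minimal sketch, assuming a pre-check of the object is acceptable; `has_non_str_key` and `dumps_warn` are hypothetical names, and objects handled through a custom `default` function are not inspected.

```python
import json
import warnings

def has_non_str_key(obj):
    """Return True if any dict nested in obj has a key that is not a str."""
    if isinstance(obj, dict):
        return any(
            not isinstance(key, str) or has_non_str_key(value)
            for key, value in obj.items()
        )
    if isinstance(obj, (list, tuple)):
        return any(has_non_str_key(item) for item in obj)
    return False

def dumps_warn(obj, **kwargs):
    """Hypothetical wrapper around json.dumps that warns about key coercion."""
    if has_non_str_key(obj):
        warnings.warn(
            "Encountered non-string key(s), converted to string.",
            RuntimeWarning,
            stacklevel=2,  # attribute the warning to the caller
        )
    return json.dumps(obj, **kwargs)
```

Calling `dumps_warn({1: "a"})` would emit a RuntimeWarning and still return `'{"1": "a"}'`, while `dumps_warn({"a": 1})` stays silent.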
msg327684 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-10-14 00:25
I can't think of another place where we issue a warning for anything similar. I'm opposed to any changes here: it's clearly documented behavior.

It's like being surprised that .ini files convert values to strings: it's just how that format works.
msg327688 - Author: Steve Dower (steve.dower) * (Python committer) Date: 2018-10-14 03:12
Agreed with Eric. json.dump needs to produce valid JSON, which requires keys to be strings.

Try using pickle if you need to preserve full Python semantics.
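
A short sketch of the pickle suggestion, with made-up dict contents. Note that pickle is Python-specific and that unpickling data from an untrusted source is unsafe, so it is not a drop-in replacement for JSON as an interchange format.

```python
import pickle

original = {1: "int key", 2.5: "float key", (3, 4): "tuple key"}

# pickle records the key types, so the round trip preserves them.
restored = pickle.loads(pickle.dumps(original))

print(restored == original)  # True
print(1 in restored)         # True
```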
msg327703 - Author: My-Tien Nguyen (My-Tien Nguyen) Date: 2018-10-14 11:32
Sure, I can do that, but I wanted to propose this regardless. I guess this is a disagreement on a language design level.
As a proponent of strong typing, I wouldn’t have allowed non-string keys in the first place, and if they are allowed, I would warn about the conversion. This is also more aligned with the “explicit is better than implicit” principle.
msg361302 - Author: Facundo Batista (facundobatista) * (Python committer) Date: 2020-02-03 14:42
I understand (and agree with) the merits of automatically converting the int to str when dumping to a string.

However, this result really surprised me:

>>> json.dumps({1:2, "1":3})
'{"1": 2, "1": 3}'

Is this valid JSON?
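
For context: RFC 8259 says that names within an object SHOULD be unique, so the output parses but its interpretation is left to the receiver. Python's own decoder keeps the last value for a repeated name. The sketch below shows that behavior and one way to detect duplicates on load; `reject_duplicates` is a hypothetical helper passed through the real `object_pairs_hook` parameter.

```python
import json

encoded = json.dumps({1: 2, "1": 3})
print(encoded)  # {"1": 2, "1": 3}

# By default the decoder keeps only the last value for a repeated name.
print(json.loads(encoded))  # {'1': 3}

def reject_duplicates(pairs):
    """Hypothetical object_pairs_hook that raises on repeated names."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result

try:
    json.loads(encoded, object_pairs_hook=reject_duplicates)
except ValueError as exc:
    print(exc)  # duplicate key: '1'
```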
msg365838 - Author: Stub (Stub2) Date: 2020-04-06 06:03
Similarly, keys can be lost entirely:

>>> json.dumps({1:2, 1.0:3})
'{"1": 3}'
msg365840 - Author: Stuart Bishop (stub) Date: 2020-04-06 06:14
(Sorry, my example is just normal Python behavior: {1:1, 1.0:2} == {1:2} and {1.0:1} == {1:1}.)
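
The point of the correction can be checked directly: `1` and `1.0` compare equal and hash equally, so the dict literal already holds a single entry before json is involved. A small sketch:

```python
import json

# Equal keys with equal hashes occupy the same dict slot.
print(1 == 1.0, hash(1) == hash(1.0))  # True True

# The first key object (the int 1) is kept; the value is overwritten by the later one.
d = {1: 2, 1.0: 3}
print(d)              # {1: 3}
print(json.dumps(d))  # {"1": 3}
```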
History
Date                 User            Action  Args
2022-04-11 14:59:07  admin           set     github: 79153
2020-04-06 06:14:52  stub            set     nosy: + stub; messages: + msg365840
2020-04-06 06:03:29  Stub2           set     nosy: + Stub2; messages: + msg365838
2020-02-03 14:42:30  facundobatista  set     nosy: + facundobatista; messages: + msg361302
2018-10-14 11:46:08  eric.smith      set     resolution: not a bug
2018-10-14 11:33:13  My-Tien Nguyen  set     status: open -> closed
2018-10-14 11:32:16  My-Tien Nguyen  set     status: closed -> open; resolution: not a bug -> (no value); messages: + msg327703
2018-10-14 03:12:57  steve.dower     set     status: open -> closed; nosy: + steve.dower; messages: + msg327688; resolution: not a bug; stage: resolved
2018-10-14 00:25:04  eric.smith      set     nosy: + eric.smith; messages: + msg327684
2018-10-13 14:56:17  My-Tien Nguyen  set     messages: + msg327654
2018-10-13 11:48:46  xtreak          set     nosy: + xtreak; messages: + msg327650
2018-10-13 10:24:23  My-Tien Nguyen  set     type: behavior; messages: + msg327649; components: + Library (Lib); versions: + Python 3.6
2018-10-13 10:16:46  My-Tien Nguyen  create