This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer's Guide.

classification
Title: json module should issue warning about duplicate keys
Type: enhancement
Stage:
Components: Library (Lib)
Versions: Python 3.11
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: bob.ippolito
Nosy List: Zeturic, andrei.avk, bob.ippolito, corona10, rhettinger
Priority: normal
Keywords:

Created on 2021-08-31 00:30 by Zeturic, last changed 2022-04-11 14:59 by admin.

Messages (5)
msg400678 - Author: Kevin Mills (Zeturic) Date: 2021-08-31 00:30
The json module will allow the following without complaint:

import json
d1 = {1: "fromstring", "1": "fromnumber"}
string = json.dumps(d1)
print(string)
d2 = json.loads(string)
print(d2)

And it prints:

{"1": "fromstring", "1": "fromnumber"}
{'1': 'fromnumber'}

This would be extremely confusing to anyone who doesn't already know that JSON keys have to be strings. Not only does `d1 != d2` (which the documentation does mention as a possibility after a round trip through JSON), but `len(d1) != len(d2)` and `d1['1'] != d2['1']`, even though '1' is in both.

I suggest that if json.dump or json.dumps notices that it is producing a JSON document with duplicate keys, it should issue a warning. Similarly, if json.load or json.loads notices that it is reading a JSON document with duplicate keys, it should also issue a warning.
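
For the loading side, something close to the suggested warning can already be approximated in user code with the existing `object_pairs_hook` parameter of `json.loads`. The following is only a minimal sketch of that idea, not a proposed patch; the hook name `warn_on_duplicate_keys` and the warning text are made up for illustration.

```python
import json
import warnings


def warn_on_duplicate_keys(pairs):
    """Hypothetical object_pairs_hook: warn when a JSON object repeats a key."""
    result = {}
    for key, value in pairs:
        if key in result:
            warnings.warn(f"duplicate JSON key: {key!r}")
        result[key] = value  # last value wins, same as the default behaviour
    return result


document = '{"1": "fromstring", "1": "fromnumber"}'

# Record warnings so the example can show them instead of writing to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    d2 = json.loads(document, object_pairs_hook=warn_on_duplicate_keys)

print(d2)
print([str(w.message) for w in caught])
```

The hook runs once per JSON object, so nested objects are checked as well; the extra pass over each object's pairs is the kind of overhead a built-in check would also have to pay.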
msg400696 - Author: Kevin Mills (Zeturic) Date: 2021-08-31 07:08
Sorry to the people I'm pinging, but I just noticed the initial dictionary in my example code is wrong. I figured I should fix it before anybody tested it and got confused about it not matching up with my description of the results.

It should've been:

import json
d1 = {"1": "fromstring", 1: "fromnumber"}
string = json.dumps(d1)
print(string)
d2 = json.loads(string)
print(d2)
msg406181 - Author: Andrei Kulakov (andrei.avk) (Python triager) Date: 2021-11-12 00:14
In general this sounds reasonable, but a couple of thoughts / comments:

- If you have a dict with mixed numbers in str format and in number format (i.e. ints as numbers and ints as strings in your case), you are creating problems in many potential places. The core of the problem is logically inconsistent keys rather than the step of conversion to JSON. So the most useful place for warning would be when adding a new key, but that wouldn't be practical.

- Even if something is to be done at conversion to JSON, it's not clear whether it should be a warning (would that be enough when the conversion is a logical bug?) or some kind of strict=True mode that raises a ValueError.
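
To make the "raise a ValueError" alternative concrete, here is a rough sketch of what a strict dump could look like in user code. It assumes a hypothetical wrapper named `dumps_strict` (not part of the json module) and detects collisions by re-parsing the output with an `object_pairs_hook`, which is simple but doubles the work; a real implementation inside the encoder would presumably check keys as it writes them.

```python
import json


def _reject_duplicates(pairs):
    """object_pairs_hook that raises if a JSON object contains a repeated key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key after JSON encoding: {key!r}")
        seen[key] = value
    return seen


def dumps_strict(obj, **kwargs):
    """Hypothetical helper: json.dumps that raises ValueError on key collisions."""
    text = json.dumps(obj, **kwargs)
    # Re-parse the output only to check for keys that collided once stringified.
    json.loads(text, object_pairs_hook=_reject_duplicates)
    return text


try:
    dumps_strict({"1": "fromstring", 1: "fromnumber"})
except ValueError as exc:
    error = str(exc)
else:
    error = None

print(error)
print(dumps_strict({"1": "fromstring", "2": "fromnumber"}))
```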
msg406182 - Author: Andrei Kulakov (andrei.avk) (Python triager) Date: 2021-11-12 01:06
Another good option would be to use a typed dict like `mydict: dict[int, str] = {}` and use typed values when populating the dict; this way a type checker will warn you of inconsistent key types.
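
A small illustration of the typed-dict suggestion, assuming Python 3.9+ and an external static checker such as mypy (the checker is not run by the snippet itself):

```python
# Annotating the key type lets a static checker flag mixed key types.
mydict: dict[int, str] = {}

mydict[1] = "fromnumber"    # OK: int key matches the annotation
mydict["1"] = "fromstring"  # a type checker is expected to reject this str key

# Annotations are not enforced at runtime, so both keys end up in the dict.
print(len(mydict))
```

The annotation only helps if a checker is actually run over the code; at runtime the dict still accepts both keys, which is the situation the original report describes.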
msg406302 - Author: Raymond Hettinger (rhettinger) (Python committer) Date: 2021-11-14 01:10
-0 on doing this. The suggested warning/error adds overhead that everyone would pay for but would almost never be of benefit. I haven't seen this particular problem arise in practice. The likely reasons it doesn't come up are 1) that generated data doesn't normally produce mixed-type keys, 2) that mixed-type keys don't round-trip, and 3) that even using numeric keys only (not mixed) is uncommon because it results in poor outcomes that fail round-trip invariants.

Andrei Kulakov is right in saying that such data suggests deeper problems with the design and that static typing would be beneficial.

One last thought:  Even with regular dicts, we don't normally warn about encountering duplicate keys:

    >>> dict([(1, 'run'), (1, 'zoo'), (3, 'tree')])
    {1: 'zoo', 3: 'tree'}
History
Date                 User        Action  Args
2022-04-11 14:59:49  admin       set     github: 89217
2021-11-14 01:49:22  rhettinger  set     assignee: bob.ippolito
2021-11-14 01:10:43  rhettinger  set     nosy: + rhettinger; messages: + msg406302
2021-11-12 01:06:23  andrei.avk  set     messages: + msg406182
2021-11-12 00:14:55  andrei.avk  set     nosy: + andrei.avk; messages: + msg406181
2021-08-31 07:08:50  Zeturic     set     messages: + msg400696
2021-08-31 05:58:32  corona10    set     nosy: + corona10
2021-08-31 01:36:47  rhettinger  set     nosy: + bob.ippolito
2021-08-31 00:30:37  Zeturic     create