msg212215 - (view) |
Author: Chris Adams (acdha) |
Date: 2014-02-25 21:15 |
Currently the stdlib json module requires a custom serializer to avoid throwing a TypeError on collections.deque instances:
Python 3.3.4 (default, Feb 12 2014, 09:35:54)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.2.79)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from collections import deque
>>> import json
>>> d = deque(range(0, 10))
>>> json.dumps(d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python3/3.3.4/Frameworks/Python.framework/Versions/3.3/lib/python3.3/json/__init__.py", line 233, in dumps
    return _default_encoder.encode(obj)
  File "/usr/local/Cellar/python3/3.3.4/Frameworks/Python.framework/Versions/3.3/lib/python3.3/json/encoder.py", line 191, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/Cellar/python3/3.3.4/Frameworks/Python.framework/Versions/3.3/lib/python3.3/json/encoder.py", line 249, in iterencode
    return _iterencode(o, 0)
  File "/usr/local/Cellar/python3/3.3.4/Frameworks/Python.framework/Versions/3.3/lib/python3.3/json/encoder.py", line 173, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: deque([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) is not JSON serializable
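For reference, the custom serializer mentioned above can be quite small. The following is a minimal sketch using the `default` hook of `json.dumps`; the function name `encode_deque` is made up for illustration and does not come from the report.

```python
import json
from collections import deque


def encode_deque(obj):
    """Fallback hook: json calls this only for objects it cannot encode natively."""
    if isinstance(obj, deque):
        # A deque becomes a JSON array; its maxlen is not preserved.
        return list(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


d = deque(range(0, 10))
encoded = json.dumps(d, default=encode_deque)
print(encoded)
```

The hook has to be passed on every call (or wrapped in a `JSONEncoder` subclass), which is the inconvenience this request is about.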
|
msg212235 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2014-02-26 02:22 |
json is only designed to serialize standard data types out of the box. Anything else is an extension. I presume you are asking for this because a deque looks more-or-less like a list. I'm not sure that's reason enough, but we'll see what others think.
|
msg212264 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2014-02-26 15:34 |
The problem is that it would be deserialized as a list; this breaks the general expectation that serialization formats should round-trip.
(yes, tuple already does this; but I think it is less of a problem for tuples, since the list API is a superset of the tuple API except for hashing)
So, perhaps we could ship an optional serializer (under which form?) accepting any sequence type (and perhaps any mapping type?), but it shouldn't be the default.
|
msg212275 - (view) |
Author: Gareth Rees (gdr@garethrees.org) *  |
Date: 2014-02-26 16:43 |
The JSON implementation uses these tests to determine how to serialize a Python object:
isinstance(o, (list, tuple))
isinstance(o, dict)
So any subclasses of list and tuple are serialized as a list, and any subclass of dict is serialized as an object. For example:
>>> json.dumps(collections.defaultdict())
'{}'
>>> json.dumps(collections.OrderedDict())
'{}'
>>> json.dumps(collections.namedtuple('mytuple', ())())
'[]'
When deserialized, you'll get back a plain dictionary or list, so there's no round-trip property here.
The tests could perhaps be changed to:
isinstance(o, collections.abc.Sequence)
isinstance(o, collections.abc.Mapping)
I'm not a JSON expert, so I have no informed opinion on whether this is a good idea or not, but in any case, this change wouldn't help with deques, as a deque is not a Sequence. That's because deques don't have an index method (see issue10059 and issue12543).
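The loss of type information on a round trip can be checked directly. This is a small sketch; the `Point` namedtuple is an illustrative type, not one taken from the thread.

```python
import json
from collections import OrderedDict, defaultdict, namedtuple

# Illustrative type, not from the thread.
Point = namedtuple("Point", ["x", "y"])

samples = [
    OrderedDict(a=1),        # dict subclass
    defaultdict(int, a=1),   # dict subclass
    Point(1, 2),             # tuple subclass
    (1, 2),                  # plain tuple
]

# Encode each object and decode it again with the default settings.
round_tripped = [json.loads(json.dumps(obj)) for obj in samples]

for before, after in zip(samples, round_tripped):
    print(type(before).__name__, "->", type(after).__name__)
```

Each dict subclass comes back as a plain `dict`, and both the namedtuple and the tuple come back as a plain `list`.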
|
msg288996 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2017-03-05 02:45 |
See also, the same feature request from Tarek Ziadé, http://bugs.python.org/issue29663 , "collections.deque could be serialized in JSON as a simple array. The only thing we can lose in the process is the maxlen value, but I think it's a decent behaviour to ignore it when encoding and to set it to None when decoding."
+1 from me as well. This isn't really different than how we handle tuples and I can see that it would be useful to be able to dump a deque into JSON. I concur that it is reasonable to ignore maxlen because that is primarily a convenience feature (auto-popping on overflow) rather than something that is intrinsic to the semantics of the data itself.
For now, just adding deque support is reasonable. We can't just do all sequences because string/bytearray like objects would need to be excluded.
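As a rough user-side sketch of the trade-off described here, an encoder subclass could accept deques and other sequences while refusing bytes-like objects. The class name `SequenceEncoder` is hypothetical, and this is not the change proposed for the json module itself. The deque check is explicit so the sketch does not depend on whether deque is registered as a Sequence in the running Python version.

```python
import json
from collections import deque
from collections.abc import Sequence


class SequenceEncoder(json.JSONEncoder):
    """Hypothetical encoder: sequences become JSON arrays, bytes-like objects do not."""

    def default(self, o):
        # str never reaches default(); bytes and bytearray do, and must be rejected
        # explicitly or they would be emitted as arrays of integers.
        if isinstance(o, (bytes, bytearray)):
            return super().default(o)  # raises TypeError
        if isinstance(o, (deque, Sequence)):
            return list(o)
        return super().default(o)


bounded = deque([1, 2], maxlen=5)
encoded = json.dumps(bounded, cls=SequenceEncoder)

# Decoding yields a list; rebuilding a deque from it leaves maxlen unset.
restored = deque(json.loads(encoded))
print(encoded, restored.maxlen)
```

The decoded value carries no trace of the original `maxlen`, which matches the behaviour described in the quoted request.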
|
msg289000 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-05 06:01 |
See issue27362 for a more general approach.
|
msg289005 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2017-03-05 08:50 |
For now, just hardcoding deque support is fine.
Support for a __json__ attribute or JSON array registry is a topic for another day. Even then, I don't think that within the standard library support for JSONification should have its responsibility shifted outside of the json module itself.
|
msg289013 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2017-03-05 14:58 |
I disagree; I think a __json__ protocol is sensible. But this is why it needs to be discussed on python-dev or python-ideas first :) In the meantime, adding deque support like we added enum support is reasonable, but IMO we shouldn't go too crazy adding support for non-base types before talking about a __json__ protocol.
|
msg289022 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2017-03-05 16:35 |
There is a difference. An __json__ attribute would have to convert to a list first. Adding support directly to the json module would allow the deque to be read directly.
I think you all are leaning towards premature generalization and making this harder than it needs to be. Chris and Tarek's proposal is reasonable and straightforward, but it is now being pushed towards PEP territory, and I think Guido would need to opine on whether to enshrine yet another dunder method that would infest the library and privilege the json serialization format over all other formats.
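To make the disagreement concrete, here is a sketch of how a `__json__` protocol could be emulated today with the `default` hook. The json module does not recognise `__json__`; the hook `json_protocol_default` and the subclass `JsonDeque` are hypothetical names used only for illustration.

```python
import json
from collections import deque


def json_protocol_default(obj):
    """Hypothetical hook: defer to a __json__ method when the object's type defines one."""
    method = getattr(type(obj), "__json__", None)
    if method is None:
        raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")
    return method(obj)


class JsonDeque(deque):
    """Illustrative deque subclass that opts in to the hypothetical protocol."""

    def __json__(self):
        # The deque is copied into a list before the encoder sees it.
        return list(self)


encoded = json.dumps({"queue": JsonDeque([1, 2, 3])}, default=json_protocol_default)
print(encoded)
```

As the sketch shows, the protocol route goes through an intermediate list, which is the conversion step described in the message above.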
|
msg289024 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2017-03-05 16:52 |
FWIW, one of the design goals for deques was to make them easily substitutable for lists when needed. This feature request is a nice-to-have that moves us a little closer.
That said, I think a __json__ attribute is too big of a hammer for this simple proposal.
Also, please add Bob Ippolito to all JSON issues. He has excellent design sensibilities and considerable contact with users of the json module.
|
msg290555 - (view) |
Author: Lisa Roach (lisroach) *  |
Date: 2017-03-27 01:43 |
I made PR 830 for this issue; in my opinion it is a nice feature to have.
Let me know if I should add some unit tests :)
|
msg290561 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2017-03-27 03:24 |
Thanks Lisa.
|
msg290565 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2017-03-27 06:54 |
It seems there are reference leaks. I'm also afraid that importing a module for every serialized object can significantly hurt performance. Can you run some benchmarks?
> An __json__ attribute would have to convert to a list first. Adding support directly to the json module would allow the deque to be read directly.
With PR 830 the deque is converted to a list by json encoder.
|
msg337279 - (view) |
Author: Lisa Roach (lisroach) *  |
Date: 2019-03-06 05:21 |
Serhiy might be right; the benchmarks look significantly worse:
lisroach$ python3 -m timeit "import json; json.dumps(['test'])"
100000 loops, best of 3: 2.73 usec per loop
lisroach$ ./python.exe -m timeit "import json; json.dumps(['test'])"
10000 loops, best of 5: 21.2 usec per loop
lisroach$ python3 -m timeit "import json; json.dumps(10000)"
100000 loops, best of 3: 2.49 usec per loop
lisroach$ ./python.exe -m timeit "import json; json.dumps(10000)"
20000 loops, best of 5: 16.3 usec per loop
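Note that the commands above time `import json` as part of each loop and compare two different interpreters, whose versions and build options the thread does not state. A sketch of a measurement that moves the import into the setup step and can be run unchanged under each interpreter follows; the repeat and loop counts are arbitrary choices.

```python
import timeit

LOOPS = 10_000  # arbitrary loop count, small enough to finish quickly

results = {}
for stmt in ("json.dumps(['test'])", "json.dumps(10000)"):
    # The import happens once in setup, so only the dumps() call is timed.
    timer = timeit.Timer(stmt=stmt, setup="import json")
    best = min(timer.repeat(repeat=5, number=LOOPS))
    results[stmt] = best / LOOPS * 1e6  # microseconds per loop
    print(f"{stmt}: {results[stmt]:.2f} usec per loop")
```

Running the same script on both builds keeps the loop count and setup identical, so any remaining difference is easier to attribute to the change being tested.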
|
msg380417 - (view) |
Author: Ken Jin (kj) *  |
Date: 2020-11-05 15:47 |
Sorry to butt into this conversation, but I wanted to add that I have an interest in this feature: deques are the fourth most common container type I use in Python. Is there any way I can help to get this PR across the finish line?
So far I've forked the PR, rebased it, then applied some changes (docs, news, and performance) to try to lessen the impact of checking for deque:
(Python/master branch)
>>> timeit.timeit(stmt="json.dumps(['test'])", setup="import json", number=1_000_000)
2.2583862999999997
>>> timeit.timeit(stmt="json.dumps(10000)", setup="import json", number=1_000_000)
1.9845121999999975
(Python/pr_830 branch)
>>> timeit.timeit(stmt="json.dumps(['test'])", setup="import json", number=1_000_000)
2.324303399999991
>>> timeit.timeit(stmt="json.dumps(10000)", setup="import json", number=1_000_000)
1.9680711999999971
The PR branch is here https://github.com/Fidget-Spinner/cpython/tree/pr_830.
I'm not a Git wizard, so I don't know what the best next step is. Do I
a. Make a PR against Lisa's PR (or)
b. Make a brand new PR against cpython master ?
If the core devs here feel that after 6 years, this change might be unneeded after all, I don't mind closing the branch either. Thanks for reading.
|
|
Date | User | Action | Args
2022-04-11 14:57:59 | admin | set | github: 64973
2020-11-05 15:47:24 | kj | set | nosy: + kj; messages: + msg380417
2019-03-06 05:21:12 | lisroach | set | messages: + msg337279
2017-03-27 06:54:25 | serhiy.storchaka | set | messages: + msg290565
2017-03-27 03:24:55 | rhettinger | set | messages: + msg290561
2017-03-27 01:43:22 | lisroach | set | nosy: + lisroach; messages: + msg290555; pull_requests: + pull_request734
2017-03-05 16:52:47 | rhettinger | set | nosy: + bob.ippolito; messages: + msg289024
2017-03-05 16:35:54 | rhettinger | set | messages: + msg289022
2017-03-05 14:58:55 | r.david.murray | set | messages: + msg289013
2017-03-05 08:50:10 | rhettinger | set | messages: + msg289005
2017-03-05 06:01:16 | serhiy.storchaka | set | messages: + msg289000
2017-03-05 02:45:50 | rhettinger | set | messages: + msg288996; versions: + Python 3.7, - Python 3.5
2017-02-27 11:08:31 | serhiy.storchaka | link | issue29663 superseder
2014-02-26 16:43:49 | gdr@garethrees.org | set | nosy: + gdr@garethrees.org; messages: + msg212275
2014-02-26 15:34:48 | pitrou | set | nosy: + serhiy.storchaka
2014-02-26 15:34:43 | pitrou | set | versions: + Python 3.5, - Python 2.7, Python 3.3; nosy: + rhettinger, pitrou, ezio.melotti; messages: + msg212264; type: enhancement
2014-02-25 21:15:50 | acdha | create |