classification
Title: json iterencode can not handle general iterators
Type: enhancement Stage: needs patch
Components: Library (Lib) Versions: Python 3.5
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Aaron.Staley, Zectbumo, eric.araujo, ezio.melotti, pitrou, rhettinger
Priority: normal Keywords:

Created on 2012-04-13 22:10 by Aaron.Staley, last changed 2014-10-02 06:46 by Zectbumo.

Messages (4)
msg158239 - (view) Author: Aaron Staley (Aaron.Staley) Date: 2012-04-13 22:10
The json library's encoder includes a method called 'iterencode'.  iterencode allows encoding to be streamed: tokens are yielded as they are produced. The encoded object can therefore be streamed to a file, over a socket, etc. without holding all of it in memory.

Unfortunately, iterencode cannot encode general iterators.  This significantly limits the usefulness of the function.  For my use case I wish to convert a large stream (iterator) of objects into JSON.  Currently I have to do one of the following:

A. Bring all the objects into memory by wrapping the iterator in a list()
B. Use a hack where I subclass list and make that object's __iter__ method return my desired iterator.
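
Workaround B can be sketched roughly as follows. This is a minimal illustration, not code from the report: the names `IteratorAsList` and `stream_json` are hypothetical, and the trick relies on an implementation detail of the pure-Python encoder that iterencode uses by default, which tests the list's truthiness before iterating. The wrapper therefore also overrides `__bool__`; an iterator that turns out to be empty may still encode incorrectly.

```python
import json


class IteratorAsList(list):
    """Hypothetical wrapper: passes the isinstance(list) check, iterates lazily."""

    def __init__(self, iterable):
        super().__init__()  # the underlying list itself stays empty
        self._source = iterable

    def __iter__(self):
        return iter(self._source)

    def __bool__(self):
        # A falsy list is encoded as "[]" without being iterated,
        # so claim to be non-empty.
        return True


def stream_json(iterable):
    """Return a generator of JSON chunks for the items of `iterable`."""
    return json.JSONEncoder().iterencode(IteratorAsList(iterable))


# The squares are produced one at a time while the chunks are consumed.
chunks = stream_json(n * n for n in range(5))
encoded = "".join(chunks)
```

In real use the chunks would be written to a file or socket one by one instead of being joined.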

The problem is that the json library explicitly checks for something being a list:

                if isinstance(value, (list, tuple)):
                    chunks = _iterencode_list(value, _current_indent_level)

It would work just as well (and be more Pythonic) to see if the value supports the iterator protocol:

                if isinstance(value, collections.Iterable):
                    chunks = _iterencode_list(value, _current_indent_level)
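
One caveat with a bare Iterable check: strings and dicts are iterable too, so the order of the type checks matters. The sketch below shows one way such a check could exclude them. The helper name `is_array_like` is hypothetical, and `collections.abc.Iterable` is the current spelling of `collections.Iterable`.

```python
from collections.abc import Iterable, Mapping


def is_array_like(value):
    """Hypothetical check: iterable, but not a type JSON encodes differently."""
    if isinstance(value, (str, bytes, Mapping)):
        return False
    return isinstance(value, Iterable)


# Strings and dicts satisfy Iterable, so they must be ruled out first.
string_is_iterable = isinstance("abc", Iterable)
dict_is_iterable = isinstance({"a": 1}, Iterable)
generator_is_array_like = is_array_like(n for n in range(3))
```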


Erroring example:

>>> import json
>>> e = json.JSONEncoder()
>>> r = xrange(20)
>>> gen = e.iterencode(r)
<generator object _iterencode at 0x14a5460>
>>> next(gen)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.2/json/encoder.py", line 419, in _iterencode
    o = _default(o)
  File "/usr/lib/python3.2/json/encoder.py", line 170, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: xrange(0, 20) is not JSON serializable
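
A rough Python 3 equivalent of the failing session, using a generator expression in place of xrange. The exact wording of the error message varies between versions, so this sketch only captures the exception type.

```python
import json

encoder = json.JSONEncoder()
# Creating the generator does not fail; nothing is encoded until it is advanced.
gen = encoder.iterencode(n for n in range(20))

try:
    next(gen)
except TypeError as exc:
    error = exc  # the generator object is rejected as not serializable
else:
    error = None
```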
msg158334 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-04-15 14:21
That's more of a feature request than a bug. By definition JSON can only represent a small subset of Python's types.
Also, if you encode an iterator as a JSON list, you will get back a Python list when decoding the JSON representation, so it won't round-trip.
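
For comparison, the same loss of type information can be seen with tuples, which the encoder already accepts. A small sketch:

```python
import json

original = (1, 2, 3)
decoded = json.loads(json.dumps(original))

# The tuple comes back as a list: the values survive, the type does not.
same_values = list(original) == decoded
same_type = type(original) is type(decoded)
```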
msg158906 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2012-04-21 01:49
Agreed with Antoine; I think that if this is added, it should be opt-in, not default.  Also, it is not clear if the request is about iterators or iterables.
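
An opt-in conversion is already possible through the encoder's `default` hook, at the cost of materializing the whole sequence in memory instead of streaming it. A minimal sketch, assuming any iterable the encoder does not handle natively should become a JSON list; the class name `IterableEncoder` is hypothetical.

```python
import json


class IterableEncoder(json.JSONEncoder):
    """Hypothetical opt-in encoder: unsupported iterables become JSON lists."""

    def default(self, o):
        try:
            iterator = iter(o)
        except TypeError:
            # Not iterable: fall back to the standard TypeError.
            return super().default(o)
        # Note: this builds the full list in memory; it does not stream.
        return list(iterator)


text = IterableEncoder().encode({"squares": (n * n for n in range(4))})
```

Callers who do not pass this encoder keep the existing behaviour, which is the opt-in property asked for above.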
msg228166 - (view) Author: Alfred Morgan (Zectbumo) Date: 2014-10-02 06:46
Need a patch? Here you go.

    https://github.com/Zectbumo/cpython/compare/master

How to use it:

    encoder = JSONEncoder(stream=True)

When the encoder is constructed with stream=True, iterencode() encodes iterators as lists and file objects as strings, streaming them as it goes.
History
Date                 User          Action  Args
2014-10-02 06:46:10  Zectbumo      set     nosy: + Zectbumo
                                           messages: + msg228166
                                           versions: + Python 3.5, - Python 3.3
2012-04-21 01:49:13  eric.araujo   set     nosy: + eric.araujo
                                           messages: + msg158906
2012-04-15 14:21:09  pitrou        set     versions: - Python 2.7, Python 3.2
                                           nosy: + pitrou
                                           messages: + msg158334
                                           type: behavior -> enhancement
                                           stage: test needed -> needs patch
2012-04-13 23:26:41  ezio.melotti  set     nosy: + rhettinger, ezio.melotti
                                           stage: test needed
                                           type: behavior
                                           versions: + Python 3.3, - Python 2.6, Python 3.1
2012-04-13 22:10:33  Aaron.Staley  create