
Author matomatical
Recipients matomatical
Date 2019-04-27.01:56:58
Message-id <1556330218.68.0.648450414936.issue36738@roundup.psfhosted.org>
In-reply-to
Content
The json module allows a user to provide an `object_hook` function, which, if provided, is called to transform the dict that is created as a result of parsing a JSON Object.
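For reference, here is a small sketch of how the existing `object_hook` behaves, using only the documented `json.loads` signature. The choice of `SimpleNamespace` as the target type is just for illustration. Note that the hook only ever sees dicts, so lists pass through untouched:

```python
import json
from types import SimpleNamespace

def to_namespace(obj):
    # Called once for every decoded JSON Object, innermost first.
    return SimpleNamespace(**obj)

point = json.loads('{"x": 1, "y": {"z": 2}}', object_hook=to_namespace)
print(point.x, point.y.z)  # prints: 1 2

# JSON Arrays never reach the hook, so they stay plain lists.
arrays = json.loads('[1, [2, 3]]', object_hook=to_namespace)
print(arrays)  # prints: [1, [2, 3]]
```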

It'd be nice if there were something analogous for JSON Arrays: an `array_hook` function to transform the list that is created as a result of parsing a JSON Array.

At the moment transforming JSON Arrays requires one of the following approaches (as far as I can see):

(1) Providing an `object_hook` function that recursively transforms any lists among the values of an Object/dict, including nested lists, AND separately transforming the final result recursively in case the top-level JSON value being parsed is an Array (a top-level Array is never inside a JSON Object, so it never passes through the `object_hook` transformation).
(2) Transforming the entire parsed result after parsing has finished, recursively traversing nested lists AND nested dicts to convert every list.
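To make the two approaches concrete, here is a rough sketch of each, using list-to-tuple conversion as the transformation. All function names here (`tuplify`, `tuplify_lists_only`, `loads_postprocess`, `loads_with_hook`) are made up for this illustration and are not part of the `json` module:

```python
import json

def tuplify(value):
    """Recursively convert lists to tuples, descending into dicts as well."""
    if isinstance(value, list):
        return tuple(tuplify(item) for item in value)
    if isinstance(value, dict):
        return {key: tuplify(item) for key, item in value.items()}
    return value

def loads_postprocess(text):
    # Approach (2): parse first, then walk the entire result.
    return tuplify(json.loads(text))

def tuplify_lists_only(value):
    """Recursively convert nested lists to tuples without descending into dicts."""
    if isinstance(value, list):
        return tuple(tuplify_lists_only(item) for item in value)
    return value

def convert_object(obj):
    # Inner dicts have already been through this hook, so only lists need work.
    return {key: tuplify_lists_only(item) for key, item in obj.items()}

def loads_with_hook(text):
    # Approach (1): convert inside object_hook, plus one extra pass in case
    # the top-level value is an Array that never passes through the hook.
    return tuplify_lists_only(json.loads(text, object_hook=convert_object))

doc = '{"foo": [[1, 2], "spam", [], ["eggs"]]}'
print(loads_postprocess(doc))
print(loads_with_hook(doc))
```

Both calls should give the same nested-tuple result; the point is how much scaffolding is needed around what is essentially a call to `tuple`.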

Providing an `array_hook` would remove the need for either approach: the per-list step of the recursive functions I mentioned could be used as the `array_hook` function directly, without the recursion.


## An example of usage:

Let's say we want JSON Arrays represented using tuples rather than lists, e.g. so that they are hashable straight out-of-the-(json)-box. Before this enhancement, this change requires one of the two methods I mentioned above. These recursive functions are not difficult to implement, but they seem inelegant. After the change, `tuple` could be used as the `array_hook` directly:

```
>>> import json
>>> json.loads('{"foo": [[1, 2], "spam", [], ["eggs"]]}', array_hook=tuple)
{'foo': ((1, 2), 'spam', (), ('eggs',))}
```

In my opinion, this is more elegant than converting via an `object_hook` or traversing the whole structure after parsing.

## The patch:

I am submitting a patch that adds an `array_hook` kwarg to the `json` module's functions `load` and `loads`, and to the `json.decoder` module's `JSONDecoder` class and its `JSONArray` and `JSONObject` functions. I also hooked these together in the `json.scanner` module's `py_make_scanner` function.


It seems that `json.scanner` will prefer the `c_make_scanner` function defined in `Modules/_json.c` when it is available. I am not confident enough in my C skills or my knowledge of the Python C API to dive into this module and make the analogous changes. I assume they will be simple for someone who can read that code, and that they will mirror the changes required to `Lib/json/scanner.py`. I need help to accomplish this part of the patch.


## Testing:

In the meantime, I added a test to `test_json.test_decode`. It's CURRENTLY FAILING because the implementation of the patch is incomplete (I believe this is only due to the missing part of the patch: the required changes to `Modules/_json.c` I identified above).

When I manually reset `json.scanner.make_scanner` to `json.scanner.py_make_scanner` and play around with the new `array_hook` functionality, it seems to work.
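For anyone who wants to experiment without applying the patch, the following is a rough sketch of a workaround that gets similar behaviour from an unpatched interpreter. It is not the patch itself. It assumes CPython's pure-Python decoder internals (the `parse_array` attribute on `JSONDecoder` instances and `json.scanner.py_make_scanner`), which are implementation details rather than documented API, and the class name `TupleArrayDecoder` is made up for this example:

```python
import json
from json import scanner

class TupleArrayDecoder(json.JSONDecoder):
    """Decoder that returns JSON Arrays as tuples (pure-Python scanner only)."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        default_parse_array = self.parse_array

        def parse_array(*array_args, **array_kwargs):
            # The stock parser returns (list_of_values, end_index).
            values, end = default_parse_array(*array_args, **array_kwargs)
            return tuple(values), end

        self.parse_array = parse_array
        # Rebuild the scanner in pure Python so that it picks up the wrapped
        # parse_array; the C scanner would not call it.
        self.scan_once = scanner.py_make_scanner(self)

result = json.loads('{"foo": [[1, 2], "spam", [], ["eggs"]]}',
                    cls=TupleArrayDecoder)
print(result)
```

Because this forces the pure-Python scanner, decoding will be slower than with the C accelerator, which is part of why a proper `array_hook` supported by `Modules/_json.c` would be preferable.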
History
Date | User | Action | Args
2019-04-27 01:56:58 | matomatical | set | recipients: + matomatical
2019-04-27 01:56:58 | matomatical | set | messageid: <1556330218.68.0.648450414936.issue36738@roundup.psfhosted.org>
2019-04-27 01:56:58 | matomatical | link | issue36738 messages
2019-04-27 01:56:58 | matomatical | create |