classification
Title: json.dumps to check for obj.__json__ before raising TypeError
Type: enhancement Stage: needs patch
Components: Library (Lib) Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: bob.ippolito Nosy List: Daniel Ward, Ollie Ford, abarry, berker.peksag, bob.ippolito, r.david.murray, rhettinger, serhiy.storchaka, steven.daprano, ulope
Priority: normal Keywords: patch

Created on 2016-06-21 13:09 by Daniel Ward, last changed 2018-02-01 16:33 by ulope.

Files
File name Uploaded Description Edit
json-customize.patch serhiy.storchaka, 2017-03-05 10:03
Messages (11)
msg268991 - (view) Author: Daniel Ward (Daniel Ward) Date: 2016-06-21 13:09
To help prevent retrospective JSONEncoder overrides when failing to serialize a given object, the intention of this issue is to propose that the JSON encoder checks if a given object has a __json__ attribute, using that rather than raising a TypeError.

This will help in maintaining easier-to-follow code and keeps the responsibility of determining how an object should be represented in JSON objects firmly within the object itself.

The obj.__json__ callable/attribute should behave in the same way as __repr__ or __str__, for example.

I'm happy to look in to contributing this enhancement myself if that's preferred. Any pointers as to how I go about contributing are greatly appreciated.
msg268993 - (view) Author: Anilyka Barry (abarry) * (Python triager) Date: 2016-06-21 13:19
I'm not too familiar with the json package, but what should __json__ return when called?
msg268995 - (view) Author: Daniel Ward (Daniel Ward) Date: 2016-06-21 13:32
Sure, so for example:


=========
import json


class ObjectCounter:

    def __init__(self, name, count):
        self.name = name
        self.count = count

    def __json__(self):
       return '[{name}] {count}'.format(name=self.name, count=self.count)


object_counter = ObjectCounter('DC1', 3789)
my_json_string = json.dumps({'success': True, 'counter': object_counter})

============

In the above example, the value stored in my_json_string would be:

'{"success": true, "counter": "[DC1] 3789"}'

This is an untested and quick example, but I hope it explains what I'm aiming to achieve. Without the __json__ method, the json.dumps call would raise an exception along the lines of the below message, unless we create a new JSONEncoder object and call json.dumps(..., cls=MyJSONEncoder), which becomes difficult to manage and follow on larger projects.

TypeError: <ObjectCounter instance at XXX> is not JSON serializable
msg268999 - (view) Author: Anilyka Barry (abarry) * (Python triager) Date: 2016-06-21 13:53
So __json__ returns a string meant to be serializable. I'm not too keen on using a dunder name (although my word doesn't weigh anything ;) and I'd personally prefer something like as_json_string(). I think the idea in general is good, though. Mind submitting a patch?
msg269002 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2016-06-21 14:43
For starters, dunder names like __json__ are reserved for Python's own use, so you would have to get the core developers to officially bless this use.


But... I'm not really sure that "the responsibility of determining how an object should be represented in JSON objects firmly within the object itself" is a good idea. For a general purpose protocol, I don't think you can trust any object to return valid JSON. What if my object.__json__ returned "}key='c" or some other invalid string? Whose responsibility is it to check that __json__ returns valid JSON?

I don't think there is any need to make this an official protocol. You know your own objects, you know if you can trust them, and you can call any method you like. So your example becomes:

    my_json_string = json.dumps(
        {'success': True, 'counter': object_counter.to_json()})

which is okay because that's clearly *your* responsibility to make sure that your object's to_json method returns a valid string. If you make it an official language wide protocol, it's unclear whose responsibility it is: the object (dangerous!), the caller (difficult), the Python interpreter (unlikely), json.dumps (unlikely).
msg269003 - (view) Author: Daniel Ward (Daniel Ward) Date: 2016-06-21 14:47
I don't think I explained the response very well, effectively the __json__ call would return an object which is JSON-serializable. This would include dict objects containing JSON-serializable objects albeit natively-supporting JSON serialisation or by means of subsequent obj.__json__ calls.

The reason I gave it __json__ is purely for easily-remembered implementation, separating it out from calls which may potentially clash with existing codebases, because let's face it, people don't often get to start again ;)

I'm not adverse to changing the method name at all, but I do believe this is a progressive way to go regarding JSON-serialization.
msg269021 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016-06-21 20:33
This was discussed on python-ideas before:

* https://groups.google.com/forum/#!msg/python-ideas/MbvpK21nUNA/DwYpOFOVuYMJ
* https://groups.google.com/forum/#!msg/python-ideas/JVLuo-L0reU/zLNnBa9THqgJ

I don't think there was an agreement on the idea so I suggest to send your proposal to python-ideas first.
msg269091 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-06-22 22:28
Pretty much any project that makes non-trivial use of json ends up implementing a jsonification protocol, usually by creating either a __json__ method or (more commonly, I think) a to_json method.

But, yeah, this is python-ideas material and would get into the stdlib only as an officially blessed protocol, in which case using __json__ would make sense.  So I'm going to close the issue pending a consensus on python-ideas.  If it gets accepted the issue can be reopened.
msg289001 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-05 06:23
This could fix other issues:

* issue16535 -- for Decimal.
* issue20774 -- for deque.
* issue24313 -- for NumPy numeric types.
* issue26263 -- for array.

Currently the blessed way of JSON encoder customization is to implement the default method in JSONEncoder subclass or pass the default argument to dump(). But that requires changing every JSON serialization call and handling all non-standard types in one function. I think it would be handly to pick the type-specific serialization function from: 1) per-encoder dispatch table, 2) global dispatch type (registry), 3) __json__ method. This can be done after the default function fails or be included in the default default method.

This will add JSON support of standard library types (e.g. collections other than list, tuple and dict, numbers other than int and float) and will help to implement task specific serialization of user classes.
msg289006 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-03-05 08:54
I concur with David Murray that this should be kicked around on python-dev or python-ideas first.  Also, we should ask Bob Ippolito for his thoughts.
msg289007 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-05 10:03
This feature already was proposed for simplejson (https://github.com/simplejson/simplejson/issues/52). Special __json__ method is used in wild in a number of projects for exactly this purpose. It looks to me the main disagreement in the past Python-Idea discussion (https://mail.python.org/pipermail/python-ideas/2010-July/007811.html) was about whether implement the customization as a special method or as a registry. I suggest to implement both. Special methods are good for standard collection and numeric classes, global registry is good for application-wide serialization, local dispatch table or the default method are good for more specific task-specific customization.

Here is a draft implementation. It follows the design of pickle and copy modules.

There are few design questions.

1. What is the order of using different customization methods? Should registries and __json__ be checked before calling the default method, after calling the default method (if it fails), or inside the default implementation of the default method?

2. For Decimal we need to customize raw JSON representation. In the past it was possible to implement an intermediate float or int subclass with __str__ or __repr__ returning raw JSON representation. But this hack no longer works. Needed to add explicit support of special JSON representation objects. Other way -- add yet one special method (__raw_json__ or __json_str__).

3. Do we need the json.registry() function for global registration, or it is enough to expose the json.dispatch_table mapping?
History
Date User Action Args
2018-02-01 16:33:45ulopesetnosy: + ulope
2017-03-05 10:03:59serhiy.storchakasetfiles: + json-customize.patch
keywords: + patch
messages: + msg289007
2017-03-05 08:54:31rhettingersetassignee: bob.ippolito

messages: + msg289006
nosy: + bob.ippolito, rhettinger
2017-03-05 06:23:51serhiy.storchakasetstatus: closed -> open

versions: + Python 3.7, - Python 3.6
nosy: + serhiy.storchaka

messages: + msg289001
resolution: later ->
stage: resolved -> needs patch
2016-06-22 22:28:29r.david.murraysetstatus: open -> closed

nosy: + r.david.murray
messages: + msg269091

resolution: later
stage: resolved
2016-06-21 20:33:12berker.peksagsetnosy: + berker.peksag

messages: + msg269021
stage: needs patch -> (no value)
2016-06-21 14:47:30Daniel Wardsetmessages: + msg269003
2016-06-21 14:43:11steven.dapranosetnosy: + steven.daprano
messages: + msg269002
2016-06-21 13:53:25abarrysetmessages: + msg268999
stage: test needed -> needs patch
2016-06-21 13:42:31Ollie Fordsetnosy: + Ollie Ford
2016-06-21 13:32:49Daniel Wardsetmessages: + msg268995
2016-06-21 13:19:55abarrysetversions: + Python 3.6
nosy: + abarry

messages: + msg268993

stage: test needed
2016-06-21 13:09:18Daniel Wardcreate