classification
Title: json C vs pure-python implementation difference
Type: behavior
Stage: needs patch
Components: Library (Lib)
Versions: Python 3.8, Python 3.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: alexey-smirnov, cvrebert, ezio.melotti, matrixise, pitrou, scoder, socketpair, thomaslee
Priority: normal
Keywords:

Created on 2012-05-23 09:25 by socketpair, last changed 2018-08-17 12:02 by scoder.

Messages (10)
msg161395 - (view) Author: Марк Коренберг (socketpair) * Date: 2012-05-23 09:25
Pure-python implementation:
    if isinstance(o, (list, tuple)):

C implementation:
    if (PyList_Check(obj) || PyTuple_Check(obj))

This makes a real difference (!) in my code.

So, please change the pure-Python implementation to:
    if type(o) in (list, tuple):
Or fix the C implementation to: /* intentionally forgot (only for this example) to check if return value is -1 */
    if (PyObject_IsInstance(obj, (PyObject *)&PyList_Type) || PyObject_IsInstance(obj, (PyObject *)&PyTuple_Type))
msg161399 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-23 11:04
What difference does it make? Are you using __instancecheck__ perhaps?
msg161474 - (view) Author: Марк Коренберг (socketpair) * Date: 2012-05-24 02:29
#!/usr/bin/python2.7

import json

class pseudo_list(object):
    __class__ = list # fake isinstance

    def __init__(self, iterator):
        self._saved_iterator = iterator

    def __iter__(self):
        return self._saved_iterator

class myenc(json.JSONEncoder):
    def default(self, o):
        try:
            return pseudo_list(iter(o))
        except TypeError:
            return super(myenc, self).default(o)

# works (pure-python implementation)
print json.dumps({1:xrange(10), 2:[5,6,7,8]}, cls=myenc, indent=1)

# does not work (C implementation)
print json.dumps({1:xrange(10), 2:[5,6,7,8]}, cls=myenc, indent=None)
msg161496 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-24 10:30
> class pseudo_list(object):
>     __class__ = list # fake isinstance

Why not inherit from list directly?
Setting __class__ to something else isn't widely supported in the Python code base. It may work or may not work, depending on the API, but it's not something we design or test for.
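
For readers following along, here is a minimal Python 3 sketch of the distinction being discussed. The class name `PseudoList` is hypothetical (it mirrors the reporter's `pseudo_list`), and the `issubclass(type(...))` line is only a rough Python-level analogue of what `PyList_Check` tests, not the C code itself.

```python
class PseudoList(object):
    """Hypothetical class that claims to be a list without subclassing it."""
    __class__ = list  # isinstance() falls back to obj.__class__

    def __init__(self, iterable):
        self._items = iter(iterable)

    def __iter__(self):
        return self._items


fake = PseudoList(range(3))

# isinstance() consults __class__ when the real type does not match.
looks_like_list = isinstance(fake, list)

# An exact type check looks only at the real type, which is PseudoList.
is_exact_list = type(fake) is list

# Rough analogue of PyList_Check: is the real type list or a subclass of it?
is_real_list_subtype = issubclass(type(fake), list)

print(looks_like_list, is_exact_list, is_real_list_subtype)
```

Only the first check is satisfied by the `__class__` override, which is why the two encoders can disagree about such an object.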
msg161536 - (view) Author: Марк Коренберг (socketpair) * Date: 2012-05-25 00:00
Well, __class__ = list is my problem, but Python's problem is that it uses different approaches in the C and Python implementations.

P.S.
I don't want to subclass list, as I don't want things like this:
x = pseudo_list(iter(xrange(10)))
x.append('test')
print len(x)
msg161538 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-05-25 00:05
> Well, __class__ = list is my problem, but Python's problem is that it
> uses different approaches in the C and Python implementations.

Well, by construction a C accelerator will use the fastest method
available within what the API's specification allows. The json API
doesn't specify whether isinstance() or a more concrete type check is
used when dispatching over argument types, so I'd classify this as an
implementation detail.
msg161540 - (view) Author: Марк Коренберг (socketpair) * Date: 2012-05-25 00:12
The inconsistency bothers me. If I specify indent in dumps(), I get one set of semantics; otherwise I get another.

Why not fix the pure-Python implementation using "type(o) in (list, tuple)"? I think this is faster, too.
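
The inconsistency can be reproduced on Python 3 with a sketch along these lines. Which encoder `dumps()` picks for a given `indent` is an implementation detail that may vary between versions, so this example does not rely on `indent`; instead it hides the accelerator by patching `json.encoder.c_make_encoder`, which is itself an assumption about the module's internals. `PseudoList`, `dumps_default`, and `dumps_pure_python` are hypothetical names.

```python
import json
import json.encoder
from unittest import mock


class PseudoList(object):
    """Hypothetical stand-in for the reporter's pseudo_list class."""
    __class__ = list  # makes isinstance(obj, list) return True

    def __init__(self, iterable):
        self._saved_iterator = iter(iterable)

    def __iter__(self):
        return self._saved_iterator


def dumps_default(obj):
    """Encode with whatever encoder json.dumps() selects by default."""
    try:
        return json.dumps(obj)
    except TypeError as exc:
        # Expected on builds where the C accelerator handles the call.
        return "TypeError: %s" % exc


def dumps_pure_python(obj):
    """Encode with the accelerator hidden, forcing the pure-Python path."""
    # Assumption: json.encoder looks up its module-level c_make_encoder name.
    with mock.patch.object(json.encoder, "c_make_encoder", None):
        return json.dumps(obj)


pure_result = dumps_pure_python(PseudoList([5, 6, 7, 8]))
default_result = dumps_default(PseudoList([5, 6, 7, 8]))
print(pure_result)
print(default_result)
```

On an interpreter without the C accelerator, both calls take the pure-Python path and the two results should match.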
msg171409 - (view) Author: Thomas Lee (thomaslee) (Python committer) Date: 2012-09-28 06:59
FWIW, I think Mark's right here. I'm +1 on the implementations being consistent.

Seems like a potentially nasty surprise if you move from one implementation to the other and, lacking awareness of this quirk, have designed your algorithm around the semantics of one of them. I think this was Mark's original point.

If the json API doesn't care how the type check is performed, then we get a (probably very small :)) win from using type(o) in (list, tuple) in the Python impl, in addition to bringing consistency to the two implementations.
msg323649 - (view) Author: Stéphane Wirtel (matrixise) * (Python committer) Date: 2018-08-17 11:35
We have received a notification about this bug for 3.5
msg323654 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2018-08-17 12:02
FWIW, the C implementation of the sequence encoder uses PySequence_Fast(), so adding a lower priority instance check that calls the same encoding function would solve this.

https://github.com/python/cpython/blob/cfa797c0681b7fef47cf93955fd06b54ddd09a7f/Modules/_json.c#L1730

Probably not something to fix in Py3.5/6 anymore, though.
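
The dispatch order suggested in this message might look roughly like the following Python sketch. It is an illustration of the idea only, not the code in `_json.c`; `classify` and `PseudoList` are hypothetical names, and the returned strings exist only to show which branch was taken.

```python
class PseudoList(object):
    """Hypothetical object that fakes its class, as in the report."""
    __class__ = list

    def __iter__(self):
        return iter(())


def classify(o):
    """Return which branch a two-tier sequence check would take."""
    # Fast path: the real type is list/tuple or a subclass of one,
    # comparable to PyList_Check / PyTuple_Check in C.
    if issubclass(type(o), (list, tuple)):
        return "sequence (fast path)"
    # Lower-priority fallback: honours __class__ overrides, matching
    # the isinstance() check in the pure-Python encoder.
    if isinstance(o, (list, tuple)):
        return "sequence (isinstance fallback)"
    return "other"


print(classify([1, 2]))
print(classify(PseudoList()))
print(classify(42))
```

Genuine lists and tuples never reach the slower check, so the common case keeps its speed while the two implementations agree on objects that only pass `isinstance()`.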
History
Date                 User            Action  Args
2018-08-17 12:02:57  scoder          set     nosy: + scoder; messages: + msg323654; versions: - Python 3.5, Python 3.6
2018-08-17 11:35:13  matrixise       set     nosy: + matrixise; messages: + msg323649; versions: + Python 3.5, Python 3.6, Python 3.7, Python 3.8, - Python 2.7, Python 3.2, Python 3.3
2012-09-28 06:59:54  thomaslee       set     nosy: + thomaslee; messages: + msg171409
2012-09-26 18:48:58  ezio.melotti    set     stage: needs patch
2012-05-25 00:12:34  socketpair      set     messages: + msg161540
2012-05-25 00:05:28  pitrou          set     messages: + msg161538
2012-05-25 00:00:48  socketpair      set     messages: + msg161536
2012-05-24 10:30:41  pitrou          set     messages: + msg161496
2012-05-24 04:41:12  alexey-smirnov  set     nosy: + alexey-smirnov
2012-05-24 02:49:53  ezio.melotti    set     nosy: + ezio.melotti
2012-05-24 02:29:09  socketpair      set     messages: + msg161474
2012-05-23 11:04:55  pitrou          set     nosy: + pitrou; messages: + msg161399; versions: - Python 3.1, Python 3.4
2012-05-23 10:18:01  cvrebert        set     nosy: + cvrebert
2012-05-23 09:25:54  socketpair      set     type: behavior
2012-05-23 09:25:29  socketpair      create