classification
Title: json float encoding incorrect for dbus.Double
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: mark.dickinson Nosy List: eajames, eric.smith, mark.dickinson, python-dev, r.david.murray, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2016-09-01 20:20 by eajames, last changed 2018-09-25 20:41 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description Edit
json-float-str-default.patch eajames, 2016-09-01 20:20 review
json-float-str-2.7.patch eajames, 2016-09-01 20:23 ported to 2.7 review
json-float-repr-2.7.patch eajames, 2016-09-02 16:02 review
Messages (16)
msg274179 - (view) Author: Eddie James (eajames) * Date: 2016-09-01 20:20
JSON does not correctly encode dbus.Double types, even though all other dbus types are handled fine. I end up with output like this (0.25 is the floating point value): dbus.Double(0.25, variant_level=1)

Found that the encoding uses repr() for float objects but uses str() for integer objects. I propose a change to use str() for float objects as well. This could be ported back to 2.7 as well
msg274226 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-09-02 10:42
> I propose a change to use str() for float objects as well. This could be ported back to 2.7 as well

That would be a behaviour change for 2.7, and an undesirable one. `str` loses precision in Python 2, so e.g., `json.loads(json.dumps(pi)) == pi` would no longer be true. (Indeed, the json library was changed to use `repr` rather than `str` for floats at some point in the past, for exactly this reason.)

I assume that dbus.Double is a subclass of float. Is that correct? If that's the case, I'm a bit confused: `float.__repr__` behaves identically to `float.__str__` in Python versions >= 3.2, so I don't see what your suggested change would achieve.

>>> class MyFloat(float):
...     def __repr__(self): return "MyFloat(something_or_other)"
...     def __str__(self): return "1729.0"
... 
>>> x = MyFloat(2.3)
>>> import json
>>> json.dumps(x)
'2.3'
msg274237 - (view) Author: Eddie James (eajames) * Date: 2016-09-02 14:01
Understood on 2.7, I wasn't aware it would cause any issues.

Dbus.Double is not a subclass of float unfortunately. Problem is that all Dbus types seem to have a custom tp_repr method that returns that strange formatting I mentioned. So repr won't be the same as str, in any version of python.
msg274240 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016-09-02 14:40
Nothing that can/should be done on the stdlib side, then.
msg274242 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-09-02 14:44
> Dbus.Double is not a subclass of float unfortunately.

Okay, now I'm *really* confused. :-)

If it's not a subclass of `float`, then how do you end up in the `floatstr` code? Every path to that code that I can see in the source is via an `isinstance(<value>, float`) check.

I don't know dbus at all, but I just tried installing it under Macports (on OS X 10.9). Here's the package description, so you can tell me whether I'm actually installing the right thing, or something totally unrelated:

    dbus-python35 @1.2.0_2 (devel, python)
        Python bindings for the dbus message bus system.

Once I've done that, I get the following behaviour in Python:

Python 3.5.2 (default, Aug 16 2016, 08:43:53) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dbus
>>> dbus.Double
<class 'dbus.Double'>
>>> dbus.Double.__mro__
(<class 'dbus.Double'>, <class '_dbus_bindings._FloatBase'>, <class 'float'>, <class 'object'>)

So it looks as though at least for this version of dbus, we do have a subclass of `float`. Looking at an instance, I see the following:

>>> x = dbus.Double(4.3)
>>> isinstance(x, float)
True
>>> repr(x)
'dbus.Double(4.3)'
>>> str(x)
'4.3'
>>> float.__repr__(x)
'4.3'
>>> float.__str__(x)
'4.3'
>>> import json
>>> json.dumps(x)
'4.3'

So I'm still struggling to see what difference swapping out float.__repr__ for float.__str__ would make.
msg274243 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-09-02 14:46
> Nothing that can/should be done on the stdlib side, then.

I think there's nothing to do for 3.x: as far as I can tell, everything should be working exactly as desired there.

For 2.7, we may want to consider processing float instances using `float.__repr__` instead of plain old `repr`. I believe that would fix the OP's problem, and also bring the 2.7 behaviour more in line with the 3.x behaviour. It's a backwards incompatible behaviour change, but probably not a particularly disruptive one.
msg274246 - (view) Author: Eddie James (eajames) * Date: 2016-09-02 14:56
Thanks Mark, yes you installed the right package.

OK I didn't dig deep enough in the sub class. And yea, there shouldn't be any difference between float.__repr__ and float.__str__. Obviously repr calls the object's tp_repr method, while float.__repr__ calls the base float method, which is the one we want.

I didn't do any testing for python 3.x as I couldn't figure out how to import dbus into a local build of 3.x. My mistake here.

> For 2.7, we may want to consider processing float instances using `float.__repr__` instead of plain old `repr`

That should work! I'll upload a new patch for 2.7
msg274248 - (view) Author: Eddie James (eajames) * Date: 2016-09-02 15:03
Wait what about the json C code for 2.7? That's still using PyObject_Repr() which will call tp_repr for dbus.Double... Any suggestions?
msg274249 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-09-02 15:38
> That's still using PyObject_Repr() which will call tp_repr for dbus.Double... Any suggestions?

Right, you'll want to replace that with a call to `PyFloat_Type.tp_repr(obj)`, as in the Python 3 code.

We're in a bit of a grey area here: making this change to the 2.7 branch does require making a case that it's a bugfix rather than a new feature (which isn't permitted in 2.7), *and* that it's not going to cause gratuitous breakage in existing json-using applications. I think there *is* a reasonable case there, but others may disagree. One point in its favour is that we're *already* behaving like a regular float for special values:

Python 2.7.12 (default, Jun 29 2016, 12:46:54) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dbus, json
>>> x = dbus.Double(float('inf'))
>>> json.dumps(x)  # not using repr(x) here
'Infinity'
>>> repr(x)
'dbus.Double(inf)'
>>> x = dbus.Double(2.3)
>>> json.dumps(x)  # using repr(x) here
'dbus.Double(2.3)'
>>> repr(x)
'dbus.Double(2.3)'
msg274252 - (view) Author: Eddie James (eajames) * Date: 2016-09-02 16:08
Python 2.7 also already behaves correctly for other dbus types:

Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import dbus
>>> import json
>>> x = dbus.Int32(3)
>>> x
dbus.Int32(3)
>>> json.dumps(x)
'3'
>>> repr(x)
'dbus.Int32(3)'
msg274261 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-09-02 17:17
Yes. I'm fairly convinced about the bug part; it's the gratuitous breakage part that worries me. It wouldn't surprise me at all to find that there's someone out there who wants all their numbers to be written out to JSON with exactly 6 places after the point, and subclasses float just for that purpose. (I'm not for a moment suggesting that this is a good idea, but there's a big difference between the set of Python code that *should* have been written and the set of Python code that *has* been written. :-)

>>> class MyFloat(float):
...     def __repr__(self):
...         return '{:.6f}'.format(self)
... 
>>> import math, json
>>> json.dumps(MyFloat(math.pi))
'3.141593'
msg274267 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2016-09-02 17:54
Note that this will also change the results for JSON output of NumPy float64 values, so at the very least I'd expect this change to break (admittedly poorly written) unit tests / doctests.

Python 2.7.12 (default, Jul 10 2016, 18:28:23) 
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import json
>>> x = np.float64(3.3)
>>> json.dumps(x)
'3.2999999999999998'
>>> repr(x)
'3.2999999999999998'
>>> float.__repr__(x)
'3.3'
msg274310 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-09-03 16:45
New changeset 86d66a627b77 by Mark Dickinson in branch '2.7':
Issue #27934: Use float.__repr__ instead of plain repr when JSON-encoding an instance of a float subclass. Thanks Eddie James.
https://hg.python.org/cpython/rev/86d66a627b77
msg326346 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-09-25 12:01
This change broke a trick used for serializing Decimal to JSON without loss (see msg176158). It was good to break it in a new release, but I have doubts about breaking it in a bugfix release of 2.7.
msg326405 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2018-09-25 20:39
[Serhiy]

> I have doubts about breaking it in a bugfix release of 2.7.

Yes, possibly we shouldn't have changed this (though it's a fairly horrible trick, IMO). I don't think it would be a good idea to revert the change at this point, though.
msg326406 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2018-09-25 20:41
Agreed. It is too late.
History
Date User Action Args
2018-09-25 20:41:08serhiy.storchakasetmessages: + msg326406
2018-09-25 20:39:40mark.dickinsonsetmessages: + msg326405
2018-09-25 12:01:06serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg326346
2016-09-03 16:45:39mark.dickinsonsetstatus: open -> closed
resolution: fixed
stage: needs patch -> resolved
2016-09-03 16:45:16python-devsetnosy: + python-dev
messages: + msg274310
2016-09-03 16:27:00mark.dickinsonsetassignee: mark.dickinson
2016-09-02 17:54:01mark.dickinsonsetmessages: + msg274267
2016-09-02 17:17:10mark.dickinsonsetmessages: + msg274261
2016-09-02 16:08:13eajamessetmessages: + msg274252
2016-09-02 16:02:44eajamessetfiles: + json-float-repr-2.7.patch
2016-09-02 15:38:16mark.dickinsonsetmessages: + msg274249
versions: - Python 3.6
2016-09-02 15:07:59r.david.murraysetstatus: closed -> open
resolution: third party -> (no value)
stage: resolved -> needs patch
2016-09-02 15:03:09eajamessetmessages: + msg274248
2016-09-02 14:56:29eajamessetmessages: + msg274246
2016-09-02 14:46:50mark.dickinsonsetmessages: + msg274243
2016-09-02 14:44:04mark.dickinsonsetmessages: + msg274242
2016-09-02 14:40:00r.david.murraysetstatus: open -> closed

nosy: + r.david.murray
messages: + msg274240

resolution: third party
stage: resolved
2016-09-02 14:01:47eajamessetmessages: + msg274237
2016-09-02 12:20:35eric.smithsetnosy: + eric.smith
2016-09-02 10:42:48mark.dickinsonsetnosy: + mark.dickinson
messages: + msg274226
2016-09-01 20:23:47eajamessetfiles: + json-float-str-2.7.patch
2016-09-01 20:20:23eajamescreate