classification
Title: Using boolean arguments in the _json module
Type: performance Stage: resolved
Components: Extension Modules Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: 30243 Superseder:
Assigned To: serhiy.storchaka Nosy List: bob.ippolito, ezio.melotti, josh.r, rhettinger, serhiy.storchaka
Priority: normal Keywords:

Created on 2017-05-03 08:38 by serhiy.storchaka, last changed 2017-05-28 12:32 by serhiy.storchaka. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 1423 merged serhiy.storchaka, 2017-05-03 08:43
Messages (5)
msg292861 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-03 08:38
Currently PyObject_IsTrue() is called every time when bool attributes (_json.Scanner.strict, _json.Encoder.sort_keys, _json.Encoder.skipkeys) are used in C acceleration of the json module. PyObject_IsTrue() is fast when the argument is a bool, but in any case this isn't efficient and is cumbersome. It is better to convert Python bool to C boolean value only once when create an object, as already is done for _json.Encoder.allow_nan and for most other boolean values in extension modules.

Proposed patch simplifies and optimizes the code by making arguments strict, sort_keys and skipkeys be converted only once at argument parsing time.

The patch changes behavior in the case when the boolean value of the argument is not constant. But this is very bad practice, we may ignore this obscure case (as ignore the case when the boolean value of the container doesn't consistent with its content, see issue27613).
msg292932 - (view) Author: Josh Rosenberg (josh.r) * (Python triager) Date: 2017-05-03 18:48
So, incredibly minor note:

This will prevent a ridiculous use case of passing in a mutable object as the argument (say, a list), and mutating it between truthy and falsy values (appending or clearing) to toggle behaviors in an existing Encoder.

Note: As stated, this is a ridiculous use case, and I don't think we should be handcuffed by compatibility with an insane behavior that no one has likely ever used. The docs never guarantee that those values are examined live, so it's an implementation detail, and not a useful one at that. I just want to mention it so there is a small note in MISC/News or the like to document that slight tweak in observed behavior.

Otherwise, change looks good to me.
msg293132 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-05 17:41
Initially I was going to add explicit bool() calls in Python implementations, so that Python and C implementation behave the same in case of non-constant options. But then I figured out that is behavior is insane and we shouldn't guarantee anything in this case. If we don't bother to satisfy matching the behavior in Python and C implementations, this change is not worth to be mentioned in Misc/NEWS.
msg293202 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2017-05-07 17:38
This looks like a nice improvement.
msg294644 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-05-28 12:31
New changeset ac5bbd43bc7b769c13ae0412cb28a3521f4d4ff1 by Serhiy Storchaka in branch 'master':
bpo-30248: Convert boolean arguments only once in _json. (#1423)
https://github.com/python/cpython/commit/ac5bbd43bc7b769c13ae0412cb28a3521f4d4ff1
History
Date User Action Args
2017-05-28 12:32:50serhiy.storchakasetstatus: open -> closed
assignee: serhiy.storchaka
resolution: fixed
stage: patch review -> resolved
2017-05-28 12:31:51serhiy.storchakasetmessages: + msg294644
2017-05-07 17:38:04rhettingersetmessages: + msg293202
2017-05-05 17:41:48serhiy.storchakasetmessages: + msg293132
2017-05-03 18:48:40josh.rsetnosy: + josh.r
messages: + msg292932
2017-05-03 08:43:52serhiy.storchakasetdependencies: + Core dump when use uninitialized _json objects
2017-05-03 08:43:04serhiy.storchakasetpull_requests: + pull_request1529
2017-05-03 08:38:15serhiy.storchakacreate