classification
Title: Add PyObject_CallOneArg()
Type: performance Stage: resolved
Components: Interpreter Core Versions: Python 3.9
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: corona10, jdemeyer, methane, miss-islington, skrah, vstinner
Priority: normal Keywords: patch

Created on 2019-07-02 12:14 by jdemeyer, last changed 2020-11-01 13:28 by corona10. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 14558 merged jdemeyer, 2019-07-02 13:48
PR 14575 jdemeyer, 2019-07-03 18:35
PR 14599 closed methane, 2019-07-05 09:31
PR 14600 merged jdemeyer, 2019-07-05 09:36
PR 23062 merged corona10, 2020-10-31 15:02
PR 23074 merged miss-islington, 2020-11-01 13:04
Messages (32)
msg347138 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-02 12:14
As discussed in https://mail.python.org/archives/list/capi-sig@python.org/thread/X6HJOEX6RRFLNKAQSQR6HLD7K4QZ4OTZ/ it would be convenient to have a function PyObject_CallOneArg() for making calls with exactly 1 positional argument and no keyword arguments. Such calls are very common. This would be analogous to PyObject_CallNoArgs().
msg347172 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-07-03 00:47
What do you think about macro like this?

  #define _PyObject_CALL_WITH_ARGS(func, ...) \
      _PyObject_Vectorcall(func, (PyObject*[]){NULL, __VA_ARGS__} + 1, \
              sizeof((PyObject*[]){__VA_ARGS__})/sizeof(PyObject*) | PY_VECTORCALL_ARGUMENTS_OFFSET, \
              NULL)

Pros: it can be used for 2 or 3 arguments too.
Cons: readability...
msg347181 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-03 05:48
Variadic macros are not part of C89, so that would require changing PEP 7.
msg347196 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-03 09:21
> Cons: readability...

It's a single relatively short macro. I'm not worried about that. I'm more about compiler support for such macros. I'll make a PR with this idea and see what CI says.
msg347197 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-07-03 09:29
> Variadic macros are not part of C89, so that would require changing PEP 7.

PEP 7 uses C99 since Python 3.6:
https://www.python.org/dev/peps/pep-0007/#c-dialect

> What do you think about macro like this?

Macros cannot be used in all C extensions. For the PEP 587, I was asked to replace macros with functions.

I would prefer a PyObject_CallOneArg() function than yet another ugly and dangerous macro. Sometimes, a regular function is better for stack consumption: it doesn't increase the stack consumption in the call site, stack memory is "released" once PyObject_CallOneArg() exit. In the past, I had to disable inlining in some functions to reduce the stack consumption. (I'm not sure if this optimization is still around.)
msg347198 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-07-03 09:31
AFAIK, gcc, clang, and MSVC support it.

Another cons is, general pitfall of macro:

  _PyObject_CALL_WITH_ARGS(func, PyDict_GetItem(d, key)); // PyDict_GetItem(d, key) is called twice.

If two or more arguments are not common, I prefer _PyObject_CallOneArg to macro.
msg347200 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-03 09:34
> _PyObject_CALL_WITH_ARGS(func, PyDict_GetItem(d, key)); // PyDict_GetItem(d, key) is called twice.

That's pretty bad and in my opinion a good reason to reject this idea.

> If two or more arguments are not common, I prefer _PyObject_CallOneArg to macro.

I posted some counts at https://mail.python.org/archives/list/capi-sig@python.org/message/3I76R27YMFSKB5JQIM3E4NA3BECVTZHP/
msg347202 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-03 09:44
> _PyObject_CALL_WITH_ARGS(func, PyDict_GetItem(d, key)); // PyDict_GetItem(d, key) is called twice.

Actually, it's not a problem: sizeof() is special, it only looks at the type of its argument, it doesn't execute the code.
msg347203 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-03 09:46
> PEP 7 uses C99 since Python 3.6: https://www.python.org/dev/peps/pep-0007/#c-dialect

That's not what the PEP says: "Python versions greater than or equal to 3.6 use C89 with several select C99 features"

"several select C99 features" is not the same of using C99.

Plus, I'm also worried about C++. We also support compiling C++ extensions with the same header files, so it must be C++ compatible also.
msg347205 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-07-03 09:49
> Macros cannot be used in all C extensions. For the PEP 587, I was asked to replace macros with functions.

This is just an helper inline function / macro to ease calling _PyObject_Vectorcall.
Extensions can port this helper inline function / macro to ease calling _PyObject_Vectorcall.

I don't want to make this non-inline function.
msg347206 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-07-03 10:03
The motivation for this PR is "it would be convenient to have this function".

This is probably true, but generally I would not change 47 files at once.

Most of the locations are probably not performance sensitive.
msg347208 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-07-03 10:20
This change doesn't make caller code complicated.  It makes caller little simpler.

Choosing performance sensitive code is much hard than replace all occurrences.
So I'm OK to change all callers except code owner opposed at once.
msg347209 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-03 10:24
Exactly. I see no reason to prefer PyObject_CallFunctionObjArgs(func, arg, NULL) over _PyObject_CallOneArg(func, arg)
msg347210 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-07-03 10:27
It adds yet another special case underscore function that one cannot use in external projects. So I would not say that is simpler.


Has there been any performance measurement at all?
msg347211 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-03 10:29
> It adds yet another special case underscore function that one cannot use in external projects. So I would not say that is simpler.

If you're worried about the underscore, I will make a separate PR to add a non-underscored version, similar to PyObject_CallNoArgs()
msg347213 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-07-03 10:33
> Exactly. I see no reason to prefer PyObject_CallFunctionObjArgs(func, arg, NULL) over _PyObject_CallOneArg(func, arg)

In this case, maybe the whole idea of _PyObject_CallOneArg() is not worth it?

Is there any benchmark showing if it's faster or use less stack?
msg347214 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-07-03 10:36
> It adds yet another special case underscore function that one cannot use in external projects. So I would not say that is simpler.

I don't get what you mean.

Do you care about *adding* API with underscore?  If so, it doesn't make caller code complex.  It can not be a reason to reject changing many files.

Or do you care about *using* API with underscore?  If so, I'm OK to stop changing some callers which are not tightly coupled with Python.  (e.g. sqlite)
msg347216 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2019-07-03 10:52
> Or do you care about *using* API with underscore?  If so, I'm OK to stop changing some callers which are not tightly coupled with Python.

I care about this one.  Indeed I think underscore functions should be used in strategic places inside the core interpreter and not sprinkled across the whole code base by default.


But in general I also care about not having too many new functions to avoid churn.
msg347217 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-03 10:53
> Is there any benchmark showing if it's faster

Here is one example:

class D(dict):
    def __missing__(self, key):
        return None
d = D()

and now benchmark d[0]

**before**: Mean +- std dev: 173 ns +- 1 ns
**after**: Mean +- std dev: 162 ns +- 1 ns

To be precise, I ran: ./python -m perf timeit --duplicate 200 -s 'class D(dict):' -s ' def __missing__(self, key):' -s '  return None' -s 'd = D()' 'd[0]'
msg347220 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-03 11:00
Stefan: I used an underscore by analogy with PyObject_CallNoArgs()/_PyObject_CallNoArg(), where the first is in the limited API and the second is an inline function in the cpython API.

But maybe we could revisit that decision.
msg347222 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-07-03 11:37
I don't want to add many functions to limited/public APIs.
Note that public APIs defined outside of cpython/ are considered mandatory for all Python implementations supporting Python/C API.

When portability is more important than performance, there are many APIs already.
Please keep _PyObject_CallOneArg() as just an easy wrapper of _PyObject_Vectorcall which is CPython specific in this issue.

Candidate of revert: _csv, _decimal, _sqlite, pyexpat, _testbuffer, _xxtestfuzz
They doesn't use much private APIs, or they are tests which is useful for other interpreters.
msg347224 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-03 12:21
Victor, what's your opinion on adding PyObject_CallOneArg() to the limited API?
msg347231 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-03 14:31
Test of stack usage:

from _testcapi import stack_pointer
class D(dict):
    def __missing__(self, key):
        sp = stack_pointer()
        print(f"stack usage = {TOP - sp}")
        return None
d = D()
TOP = stack_pointer()
d[0]


**before**: stack usage = 1296
**after**: stack usage = 976
msg347267 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-07-04 10:31
New changeset 196a530e00d88a138973bf9182e013937e293f97 by Inada Naoki (Jeroen Demeyer) in branch 'master':
bpo-37483: add _PyObject_CallOneArg() function (#14558)
https://github.com/python/cpython/commit/196a530e00d88a138973bf9182e013937e293f97
msg347318 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-07-05 09:12
The following commit introduced a reference leak:

commit 196a530e00d88a138973bf9182e013937e293f97
Author: Jeroen Demeyer <J.Demeyer@UGent.be>
Date:   Thu Jul 4 12:31:34 2019 +0200
 
    bpo-37483: add _PyObject_CallOneArg() function (#14558)

Example:

https://buildbot.python.org/all/#/builders/80/builds/642

test_xml_etree leaked [1, 1, 1] references, sum=3
test_threading leaked [6, 6, 6] references, sum=18
test_atexit leaked [6, 6, 6] references, sum=18
test_compile leaked [1, 1, 1] references, sum=3
test_zipfile leaked [1, 1, 1] references, sum=3
test_tokenize leaked [3, 3, 3] references, sum=9
test_pydoc leaked [1, 1, 1] references, sum=3
test_codecs leaked [43, 43, 43] references, sum=129
test_codecs leaked [11, 11, 11] memory blocks, sum=33
Re-running failed tests in verbose mode
test__xxsubinterpreters leaked [136, 136, 136] references, sum=408
test_getargs2 leaked [4, 4, 4] references, sum=12
test_source_encoding leaked [3, 3, 3] references, sum=9
test_capi leaked [4, 4, 4] references, sum=12
test_email leaked [17, 17, 17] references, sum=51
test_xml_etree leaked [1, 1, 1] references, sum=3
test_threading leaked [6, 6, 6] references, sum=18
test_atexit leaked [6, 6, 6] references, sum=18
test_compile leaked [1, 1, 1] references, sum=3
test_zipfile leaked [1, 1, 1] references, sum=3
test_tokenize leaked [3, 3, 3] references, sum=9
test_pydoc leaked [1, 1, 1] references, sum=3
test_codecs leaked [43, 43, 43] references, sum=129
test_codecs leaked [11, 11, 11] memory blocks, sum=33
(...)
msg347330 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-07-05 10:26
INADA-san, Jeroen: hum, you both proposed a similar fix :-) One fix is enough. See my review on PR 14600.
msg347332 - (view) Author: Jeroen Demeyer (jdemeyer) * (Python triager) Date: 2019-07-05 10:41
> Jeroen: hum, you both proposed a similar fix :-)

It seems that I lost the race ;-)

But seriously: if we both independently came up with the same solution, that's a good sign that the solution is correct.
msg347333 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2019-07-05 10:57
New changeset 6e43d07324ca799118e805751a10a7eff71d5a04 by Inada Naoki (Jeroen Demeyer) in branch 'master':
bpo-37483: fix reference leak in _PyCodec_Lookup (GH-14600)
https://github.com/python/cpython/commit/6e43d07324ca799118e805751a10a7eff71d5a04
msg347335 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-07-05 12:21
Thanks for the quick fix ;-)
msg370685 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2020-06-03 20:58
Oh, PyObject_CallOneArg() is missing in the What's New in Python 3.9:
https://docs.python.org/dev/whatsnew/3.9.html#id1

Can someone please propose a PR to add it? I reopen the issue.
msg380132 - (view) Author: Dong-hee Na (corona10) * (Python committer) Date: 2020-11-01 13:04
New changeset 7feb54a6348f6220b27986777786c812f110b53d by Dong-hee Na in branch 'master':
bpo-37483: Add PyObject_CallOneArg() in the What's New in Python 3.9 (GH-23062)
https://github.com/python/cpython/commit/7feb54a6348f6220b27986777786c812f110b53d
msg380133 - (view) Author: miss-islington (miss-islington) Date: 2020-11-01 13:27
New changeset 0312efcd2bb3d7964dbfe2b4cbd5f5b440aed049 by Miss Skeleton (bot) in branch '3.9':
bpo-37483: Add PyObject_CallOneArg() in the What's New in Python 3.9 (GH-23062)
https://github.com/python/cpython/commit/0312efcd2bb3d7964dbfe2b4cbd5f5b440aed049
History
Date User Action Args
2020-11-01 13:28:37corona10setstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2020-11-01 13:27:09miss-islingtonsetmessages: + msg380133
2020-11-01 13:04:52miss-islingtonsetnosy: + miss-islington
pull_requests: + pull_request21994
2020-11-01 13:04:38corona10setmessages: + msg380132
2020-10-31 15:02:25corona10setnosy: + corona10

pull_requests: + pull_request21981
stage: resolved -> patch review
2020-06-03 20:58:08vstinnersetstatus: closed -> open
resolution: fixed -> (no value)
messages: + msg370685
2019-07-10 10:51:10jdemeyersetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-07-05 12:21:21vstinnersetmessages: + msg347335
2019-07-05 10:57:58methanesetmessages: + msg347333
2019-07-05 10:41:00jdemeyersetmessages: + msg347332
2019-07-05 10:26:12vstinnersetmessages: + msg347330
2019-07-05 09:36:41jdemeyersetpull_requests: + pull_request14415
2019-07-05 09:31:58methanesetpull_requests: + pull_request14414
2019-07-05 09:12:20vstinnersetmessages: + msg347318
2019-07-04 10:31:59methanesetmessages: + msg347267
2019-07-03 18:35:14jdemeyersetpull_requests: + pull_request14394
2019-07-03 14:31:32jdemeyersetmessages: + msg347231
2019-07-03 12:21:11jdemeyersetmessages: + msg347224
2019-07-03 11:37:40methanesetmessages: + msg347222
2019-07-03 11:00:19jdemeyersetmessages: + msg347220
2019-07-03 10:53:54jdemeyersetmessages: + msg347217
2019-07-03 10:52:39skrahsetmessages: + msg347216
2019-07-03 10:36:40methanesetmessages: + msg347214
2019-07-03 10:33:35vstinnersetmessages: + msg347213
2019-07-03 10:29:20jdemeyersetmessages: + msg347211
2019-07-03 10:27:30skrahsetmessages: + msg347210
2019-07-03 10:24:04jdemeyersetmessages: + msg347209
2019-07-03 10:20:38methanesetmessages: + msg347208
2019-07-03 10:03:59skrahsetnosy: + skrah
messages: + msg347206
2019-07-03 09:49:32methanesetmessages: + msg347205
2019-07-03 09:46:01jdemeyersetmessages: + msg347203
2019-07-03 09:44:10jdemeyersetmessages: + msg347202
2019-07-03 09:34:11jdemeyersetmessages: + msg347200
2019-07-03 09:31:07methanesetmessages: + msg347198
2019-07-03 09:29:42vstinnersetmessages: + msg347197
2019-07-03 09:21:07jdemeyersetmessages: + msg347196
2019-07-03 05:48:29jdemeyersetmessages: + msg347181
2019-07-03 00:47:37methanesetnosy: + methane
messages: + msg347172
2019-07-02 13:48:39jdemeyersetkeywords: + patch
stage: patch review
pull_requests: + pull_request14374
2019-07-02 12:14:43jdemeyercreate