classification
Title: Add _PyFunction_FastCallDict(): fast call with keyword arguments as a dict
Type: performance Stage:
Components: Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: python-dev, scoder, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2016-08-20 00:21 by vstinner, last changed 2016-09-01 13:12 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
fastcall_kwargs.patch vstinner, 2016-08-20 00:21 review
fastcall_kwargs-2.patch vstinner, 2016-08-22 11:04 review
fastcall_macros.patch vstinner, 2016-08-22 13:39 review
Messages (19)
msg273170 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-08-20 00:21
_PyObject_FastCall() (added in the issue #27128) already supports keyword arguments for C functions, but not Python functions.

Attached patch adds support for keyword arguments in _PyFunction_FastCall(), to allow to use _PyObject_FastCall() in more cases.

Examples of functions that can be modified to _PyObject_FastCall() with this change:

* builtin_sorted()
* builtin___build_class__()
* init_subclass()
* PyEval_CallObjectWithKeywords(func, NULL, kwargs)
* methoddescr_call(), classmethoddescr_call(), wrapperdescr_call()
* PyFile_GetLine()

Moreover, supporting keywords is required for another more important step: add a new METH_CALL calling convention for C functions.
msg273353 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-08-22 11:04
Patch version 2:

* add a fast-path to fastlocals_dict() when total_args==0
* Extract co_varnames assignement out of the loop (it's a constant)
* Cleanup also the code to make it more readable
msg273354 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-08-22 11:05
@Stefan: Would you mind to review fastcall_kwargs-2.patch please?
msg273367 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-08-22 12:33
Copy of msg273365 from the issue #27128:

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) 	Date: 2016-08-22 12:29

The problem is that passing keyword arguments as a dict is not the most efficient way due to an overhead of creating a dict. For now keyword arguments are pushed on the stack as interlaced array of keyword names and values. It may be more efficient to push values and names as continuous arrays (issue27213). PyArg_ParseTupleAndKeywords() accepts a tuple and a dict, but private function _PyArg_ParseTupleAndKeywordsFast() (issue27574) can be changed to accept positional and keyword arguments as continuous arrays: (int nargs, PyObject **args, int nkwargs, PyObject **kwnames, PyObject **kwargs). Therefore we will be forced either to change the signature of _PyObject_FastCall() and the meaning of METH_FASTCALL, or add new _PyObject_FastCallKw() and METH_FASTCALLKW for support fast passing keyword arguments. Or may be add yet _PyObject_FastCallNoKw() for faster passing only positional arguments without an overhead of _PyObject_FastCall(). And make older _PyObject_FastCall() and METH_FASTCALL obsolete.

There is yet one possibility. Argument Clinic can generate a dict that maps keyword argument names to indices of arguments and tie it to a function. External code should map names to indices using this dictionary and pass arguments as just a continuous array to function with METH_FASTCALL (raising an error if some argument is passed as positional and keyword, or if keyword-only argument is passed as positional, etc). In that case the kwargs parameter of _PyObject_FastCall() becomes obsolete too.
msg273370 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-08-22 13:00
Serhiy Storchaka: "The problem is that passing keyword arguments as a dict is not the most efficient way due to an overhead of creating a dict. For now keyword arguments are pushed on the stack as interlaced array of keyword names and values. It may be more efficient to push values and names as continuous arrays (issue27213)."

You describe one specific kind of function call: from Python to Python. Sure, in this case, we *can* avoid the creation of a dictionary. Such optimization can be implemented in call_function() in Python/ceval.c.

The problem is that in other cases, it's harder to avoid the creation of a dictionary:

* C functions defined with METH_KEYWORDS require a dict
* tp_new, tp_init and tp_call slots require a dict

Many C functions "pass through" keyword arguments. Example with the builtin sorted() function:

    static PyObject *
    builtin_sorted(PyObject *self, PyObject *args, PyObject *kwds)
    {
        ...
        v = PyObject_Call(callable, newargs, kwds);
        ...
    }

Example of a tp_new slot, the type constructor, type_new():

    static PyObject *
    type_new(PyTypeObject *metatype, PyObject *args, PyObject *kwds)
    {
        ...
        if (winner != metatype) {
            if (winner->tp_new != type_new) /* Pass it to the winner */
                return winner->tp_new(winner, args, kwds);
            metatype = winner;
        }
        ...
        if (init_subclass(type, kwds) < 0)
            goto error;
        ...
    }

    static int
    init_subclass(PyTypeObject *type, PyObject *kwds)
    {
        ...
        tmp = PyObject_Call(func, tuple, kwds);
        ...
    }

Well, the question can be: where do these keyword argument come from? It's like that in 99% of cases, these C functions are called from Python functions which use the Python stack for positional and keyword arguments.

The last part of my full fastcall project is to modify tp_new, tp_init and tp_call slots to support a fastcall mode which uses the fastcall calling convention (add new flags to tp_flags: Py_TPFLAGS_FASTNEW, Py_TPFLAGS_FASTINIT, Py_TPFLAGS_FASTCALL).

I'm not sure that this part of my plan is feasible since it is a major backward incompatible change: it breaks all C code calling directly tp_new, tp_init and tp_call, and such code is common (even inside the stdlib). I plan to open a discussion to see if it's worth it. It would require to modify all code calling directly tp_new, tp_init and tp_call to use a new wrapper function, function which would only be available in the C API of Python 3.6...

--

I understand your idea of avoid dictionaries for keyword parameters, but maybe we need an intermediate step and maybe even two functions:

* one function taking a dict: _PyObject_FastCallDict()
* one function taking an array of (key, value) pairs: _PyObject_FastCall()

_PyObject_FastCallDict() would pass directly the Python dict to C functions defined with METH_KEYWORDS, to tp_new, tp_init and tp_call slots.

In a perfect world, _PyObject_FastCallDict() wouldn't be needed. In practice, dictionaries are used *everywhere* in the current C code base, so we need a function for the transition.

_PyObject_FastCallDict() is private, so we can remove it whenever we want.

--

I'm patching more and more code, and I hate having to add an extra ", NULL" when calling the current _PyObject_FastCall().

Maybe we should have a short name, _PyObject_FastCall(), for the common case: no keyword parameter, and use longer name for keywords: _PyObject_FastCallKeywords() (as PyArg_ParseTuple and PyArg_ParseTupleKeywords).

--

To summarize, I propose 3 functions:

* _PyObject_FastCall(PyObject **args, int nargs)
* _PyObject_FastCallKeywords(PyObject **args, int nargs, PyObject **kwargs, int nkwargs): nkwargs is the number of (key, value) pairs
* _PyObject_FastCallDict(PyObject **args, int nargs, PyObject *kwargs): kwargs is a Python dictionary

METH_FASTCALL would use _PyObject_FastCallKeywords format: "PyObject **kwargs, int nkwargs" for keyword parameters.

What do you think?
msg273374 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-08-22 13:23
> _PyObject_FastCallKeywords(PyObject **args, int nargs, PyObject **kwargs, int nkwargs): nkwargs is the number of (key, value) pairs

Hum, I don't know if it's worth to "split" the C array into two parts. Maybe it can be just:

_PyObject_FastCallKeywords(PyObject **args, int nargs, int nkwargs);

As current used in Python/ceval.c.
msg273375 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-08-22 13:39
fastcall_macros.patch:

* Rename _PyObject_FastCall() to _PyObject_FastCallDict()
* Add _PyObject_FastCall(func, args, nargs) macro
* Add _PyObject_CallArg1(func, arg) macro
* Add _PyObject_CallNoArg(func) macro

The 3 new macros call _PyObject_FastCallDict().

Statistics on calls to these functions and macros:

* _PyObject_CallNoArg: 11
* _PyObject_CallArg1: 8
* _PyObject_FastCall: 5
* _PyObject_FastCallDict: 0   (the function doesn't work yet, it's the purpose of this issue, so it's expected that it's unused yet ;-))

_PyObject_CallNoArg() and _PyObject_CallArg1() are much more common and the more complex format _PyObject_FastCall() format (more than 1 argument). I like having macros with a simple prototype.

--

The C API has a wide choice of functions to call a function, but I still want to add  _PyObject_CallNoArg() and _PyObject_CallArg1() to get the best performances (avoid layers before reaching the final _PyObject_FastCallDict) and have a simple and well defined prototype for the most common function calls.

kwargs:

* PyObject_Call(): root of all call functions in Python 3.5
* PyEval_CallObjectWithKeywords(func, args, kwargs) -> PyObject_Call()

args:

* PyEval_CallObject(func, args)
* PyObject_CallObject(func, args) -> PyEval_CallObjectWithKeywords()

vargs:

* PyObject_CallFunctionObjArgs(func, ...) -> PyObject_Call()

format:

* PyObject_CallFunction(func, format, ...) -> PyObject_Call()
* PyObject_CallMethod(func, method, format, ...) -> PyObject_Call()
* _PyObject_CallMethodId(func, method_id, format, ...) -> PyObject_Call()
* PyEval_CallFunction(func, format, ...) -> PyEval_CallObject()
* PyEval_CallMethod(func, method, format, ...) -> PyEval_CallObject()

Fast call:

* _PyObject_CallNoArg(func) -> _PyObject_FastCallDict()
* _PyObject_CallArg1(func, arg) -> _PyObject_FastCallDict()
* _PyObject_FastCall(func, arg) -> _PyObject_FastCallDict()
msg273390 - (view) Author: Stefan Behnel (scoder) * Date: 2016-08-22 18:03
I like the oneArg/noArg etc. macros. We use something similar in Cython. You can even use them to inline the METH_O and METH_NOARGS call cases (although I use inline functions for that in Cython).
msg273391 - (view) Author: Stefan Behnel (scoder) * Date: 2016-08-22 18:11
I just took a quick look at the fastcall_kwargs-2.patch for now. It looks ok in general, but it also adds quite some special code for the dict-to-locals mapping. Is the keyword argument calling case really that important? I mean, it requires creating a dict and filling it, after all. People would rather avoid that in the first place if they care about performance. I wouldn't object, I'm merely asking...
msg273403 - (view) Author: Roundup Robot (python-dev) Date: 2016-08-22 21:03
New changeset 263334a652ac by Victor Stinner in branch 'default':
Rename _PyObject_FastCall() to _PyObject_FastCallDict()
https://hg.python.org/cpython/rev/263334a652ac
msg273404 - (view) Author: Roundup Robot (python-dev) Date: 2016-08-22 21:22
New changeset 1aefb4c4a7b4 by Victor Stinner in branch 'default':
_PyFunction_FastCallDict() supports keyword args
https://hg.python.org/cpython/rev/1aefb4c4a7b4

New changeset 7924eac912fe by Victor Stinner in branch 'default':
Issue #27809: Cleanup _PyEval_EvalCodeWithName()
https://hg.python.org/cpython/rev/7924eac912fe

New changeset 15eab21bf934 by Victor Stinner in branch 'default':
Issue #27809: Use _PyObject_FastCallDict()
https://hg.python.org/cpython/rev/15eab21bf934
msg273406 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-08-22 21:42
Stefan Behnel: "I like the oneArg/noArg etc. macros. We use something similar in Cython."

Oh, nice :-) Since they macros are private, I pushed fastcall_macros.patch. We can still rework them later if needed.


"You can even use them to inline the METH_O and METH_NOARGS call cases (although I use inline functions for that in Cython)."

Hum, I'm not sure that it's worth it.


"I just took a quick look at the fastcall_kwargs-2.patch for now. It looks ok in general, but it also adds quite some special code for the dict-to-locals mapping. Is the keyword argument calling case really that important? I mean, it requires creating a dict and filling it, after all. People would rather avoid that in the first place if they care about performance. I wouldn't object, I'm merely asking..."

Hey, it's nothing new: Python 3.5 does the same in function_call()!? Oh in fact, function_call() code is very short compared to fastcall_kwargs-2.patch and it doesn't require to modify the hot code of Python/ceval.c! So I pushed a very small change (change 1aefb4c4a7b4) which adds support for keyword with the sample function_call() code.

Serhiy already told me that creating a dict is overkill... but please read again my message msg273370. We cannot break all APIs, we need a transition from the existing C APIs using Python dictionaries for keyword arguments.

I reworked my new fastcall APIs to keep the door open for a faster API to pass keyword arguments as pairs of (key, value) in an unique C array.
msg273407 - (view) Author: Roundup Robot (python-dev) Date: 2016-08-22 22:32
New changeset ef0110a52e24 by Victor Stinner in branch 'default':
PyEval_CallObjectWithKeywords() uses fast call with kwargs
https://hg.python.org/cpython/rev/ef0110a52e24

New changeset 5587d0dfab4c by Victor Stinner in branch 'default':
Issue #27809: Use _PyObject_FastCallDict()
https://hg.python.org/cpython/rev/5587d0dfab4c

New changeset 0c65a2089f00 by Victor Stinner in branch 'default':
Add _PyErr_CreateException()
https://hg.python.org/cpython/rev/0c65a2089f00

New changeset c53c532de995 by Victor Stinner in branch 'default':
Issue #27809: PyErr_SetImportError() uses fast call
https://hg.python.org/cpython/rev/c53c532de995

New changeset 614dd914c21e by Victor Stinner in branch 'default':
Issue #27809: tzinfo_reduce() uses fast call
https://hg.python.org/cpython/rev/614dd914c21e

New changeset f48ae71e8a8f by Victor Stinner in branch 'default':
Issue #27809: _csv: _call_dialect() uses fast call
https://hg.python.org/cpython/rev/f48ae71e8a8f

New changeset 60660890459f by Victor Stinner in branch 'default':
Issue #27809: methodcaller_reduce() uses fast call
https://hg.python.org/cpython/rev/60660890459f

New changeset bf2d88cf039a by Victor Stinner in branch 'default':
PyEval_CallObjectWithKeywords() doesn't inc/decref
https://hg.python.org/cpython/rev/bf2d88cf039a
msg273413 - (view) Author: Roundup Robot (python-dev) Date: 2016-08-22 23:36
New changeset 620b2e554be4 by Victor Stinner in branch 'default':
Issue #27809: builtin___build_class__() uses fast call
https://hg.python.org/cpython/rev/620b2e554be4
msg273452 - (view) Author: Roundup Robot (python-dev) Date: 2016-08-23 14:28
New changeset c1a698edfa1b by Victor Stinner in branch 'default':
Issue #27809: partial_call() uses fast call for positional args
https://hg.python.org/cpython/rev/c1a698edfa1b
msg273520 - (view) Author: Roundup Robot (python-dev) Date: 2016-08-23 22:09
New changeset 70f88b097f60 by Victor Stinner in branch 'default':
Issue #27809: map_next() uses fast call
https://hg.python.org/cpython/rev/70f88b097f60

New changeset 0e4f26083bbb by Victor Stinner in branch 'default':
PyObject_CallMethodObjArgs() now uses fast call
https://hg.python.org/cpython/rev/0e4f26083bbb
msg273521 - (view) Author: Roundup Robot (python-dev) Date: 2016-08-23 23:25
New changeset 401f59a7020b by Victor Stinner in branch 'default':
PyObject_CallMethodObjArgs() now uses fast call
https://hg.python.org/cpython/rev/401f59a7020b
msg273522 - (view) Author: Roundup Robot (python-dev) Date: 2016-08-23 23:52
New changeset 1b241e761f8f by Victor Stinner in branch 'default':
Issue #27809: map_next() uses fast call
https://hg.python.org/cpython/rev/1b241e761f8f
msg274122 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-09-01 13:12
The main feature has been merged, so I close the issue.
History
Date User Action Args
2016-09-01 13:12:27vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg274122

title: _PyObject_FastCall(): add support for keyword arguments -> Add _PyFunction_FastCallDict(): fast call with keyword arguments as a dict
2016-08-25 21:35:07vstinnerunlinkissue27810 dependencies
2016-08-23 23:52:07python-devsetmessages: + msg273522
2016-08-23 23:25:32python-devsetmessages: + msg273521
2016-08-23 22:09:10python-devsetmessages: + msg273520
2016-08-23 14:28:09python-devsetmessages: + msg273452
2016-08-22 23:36:33python-devsetmessages: + msg273413
2016-08-22 22:32:50python-devsetmessages: + msg273407
2016-08-22 21:42:34vstinnersetmessages: + msg273406
2016-08-22 21:22:21python-devsetmessages: + msg273404
2016-08-22 21:03:32python-devsetnosy: + python-dev
messages: + msg273403
2016-08-22 18:11:29scodersetmessages: + msg273391
2016-08-22 18:03:49scodersetmessages: + msg273390
2016-08-22 13:39:49vstinnersetfiles: + fastcall_macros.patch

messages: + msg273375
2016-08-22 13:23:45vstinnersetmessages: + msg273374
2016-08-22 13:00:25vstinnersetmessages: + msg273370
2016-08-22 12:33:42vstinnersetmessages: + msg273367
2016-08-22 11:05:38vstinnersetnosy: + scoder
messages: + msg273354
2016-08-22 11:04:32vstinnersetfiles: + fastcall_kwargs-2.patch

messages: + msg273353
2016-08-20 00:32:48vstinnerlinkissue27810 dependencies
2016-08-20 00:21:25vstinnercreate