Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add _PyFunction_FastCallDict(): fast call with keyword arguments as a dict #71996

Closed
vstinner opened this issue Aug 20, 2016 · 19 comments
Closed
Labels
performance Performance or resource usage

Comments

@vstinner
Copy link
Member

BPO 27809
Nosy @scoder, @vstinner, @serhiy-storchaka
Files
  • fastcall_kwargs.patch
  • fastcall_kwargs-2.patch
  • fastcall_macros.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-09-01.13:12:27.132>
    created_at = <Date 2016-08-20.00:21:25.496>
    labels = ['performance']
    title = 'Add _PyFunction_FastCallDict(): fast call with keyword arguments as a dict'
    updated_at = <Date 2016-09-01.13:12:27.131>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2016-09-01.13:12:27.131>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-09-01.13:12:27.132>
    closer = 'vstinner'
    components = []
    creation = <Date 2016-08-20.00:21:25.496>
    creator = 'vstinner'
    dependencies = []
    files = ['44162', '44187', '44188']
    hgrepos = []
    issue_num = 27809
    keywords = ['patch']
    message_count = 19.0
    messages = ['273170', '273353', '273354', '273367', '273370', '273374', '273375', '273390', '273391', '273403', '273404', '273406', '273407', '273413', '273452', '273520', '273521', '273522', '274122']
    nosy_count = 4.0
    nosy_names = ['scoder', 'vstinner', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue27809'
    versions = ['Python 3.6']

    @vstinner
    Copy link
    Member Author

    _PyObject_FastCall() (added in the issue bpo-27128) already supports keyword arguments for C functions, but not Python functions.

    Attached patch adds support for keyword arguments in _PyFunction_FastCall(), to allow to use _PyObject_FastCall() in more cases.

    Examples of functions that can be modified to _PyObject_FastCall() with this change:

    • builtin_sorted()
    • builtin___build_class__()
    • init_subclass()
    • PyEval_CallObjectWithKeywords(func, NULL, kwargs)
    • methoddescr_call(), classmethoddescr_call(), wrapperdescr_call()
    • PyFile_GetLine()

    Moreover, supporting keywords is required for another more important step: add a new METH_CALL calling convention for C functions.

    @vstinner vstinner added the performance Performance or resource usage label Aug 20, 2016
    @vstinner
    Copy link
    Member Author

    Patch version 2:

    • add a fast-path to fastlocals_dict() when total_args==0
    • Extract co_varnames assignement out of the loop (it's a constant)
    • Cleanup also the code to make it more readable

    @vstinner
    Copy link
    Member Author

    @Stefan: Would you mind to review fastcall_kwargs-2.patch please?

    @vstinner
    Copy link
    Member Author

    Copy of msg273365 from the issue bpo-27128:

    Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-08-22 12:29

    The problem is that passing keyword arguments as a dict is not the most efficient way due to an overhead of creating a dict. For now keyword arguments are pushed on the stack as interlaced array of keyword names and values. It may be more efficient to push values and names as continuous arrays (bpo-27213). PyArg_ParseTupleAndKeywords() accepts a tuple and a dict, but private function _PyArg_ParseTupleAndKeywordsFast() (bpo-27574) can be changed to accept positional and keyword arguments as continuous arrays: (int nargs, PyObject **args, int nkwargs, PyObject **kwnames, PyObject **kwargs). Therefore we will be forced either to change the signature of _PyObject_FastCall() and the meaning of METH_FASTCALL, or add new _PyObject_FastCallKw() and METH_FASTCALLKW for support fast passing keyword arguments. Or may be add yet _PyObject_FastCallNoKw() for faster passing only positional arguments without an overhead of _PyObject_FastCall(). And make older _PyObject_FastCall() and METH_FASTCALL obsolete.

    There is yet one possibility. Argument Clinic can generate a dict that maps keyword argument names to indices of arguments and tie it to a function. External code should map names to indices using this dictionary and pass arguments as just a continuous array to function with METH_FASTCALL (raising an error if some argument is passed as positional and keyword, or if keyword-only argument is passed as positional, etc). In that case the kwargs parameter of _PyObject_FastCall() becomes obsolete too.

    @vstinner
    Copy link
    Member Author

    Serhiy Storchaka: "The problem is that passing keyword arguments as a dict is not the most efficient way due to an overhead of creating a dict. For now keyword arguments are pushed on the stack as interlaced array of keyword names and values. It may be more efficient to push values and names as continuous arrays (bpo-27213)."

    You describe one specific kind of function call: from Python to Python. Sure, in this case, we *can* avoid the creation of a dictionary. Such optimization can be implemented in call_function() in Python/ceval.c.

    The problem is that in other cases, it's harder to avoid the creation of a dictionary:

    • C functions defined with METH_KEYWORDS require a dict
    • tp_new, tp_init and tp_call slots require a dict

    Many C functions "pass through" keyword arguments. Example with the builtin sorted() function:

        static PyObject *
        builtin_sorted(PyObject *self, PyObject *args, PyObject *kwds)
        {
            ...
            v = PyObject_Call(callable, newargs, kwds);
            ...
        }

    Example of a tp_new slot, the type constructor, type_new():

        static PyObject *
        type_new(PyTypeObject *metatype, PyObject *args, PyObject *kwds)
        {
            ...
            if (winner != metatype) {
                if (winner->tp_new != type_new) /* Pass it to the winner */
                    return winner->tp_new(winner, args, kwds);
                metatype = winner;
            }
            ...
            if (init_subclass(type, kwds) < 0)
                goto error;
            ...
        }
    
        static int
        init_subclass(PyTypeObject *type, PyObject *kwds)
        {
            ...
            tmp = PyObject_Call(func, tuple, kwds);
            ...
        }

    Well, the question can be: where do these keyword argument come from? It's like that in 99% of cases, these C functions are called from Python functions which use the Python stack for positional and keyword arguments.

    The last part of my full fastcall project is to modify tp_new, tp_init and tp_call slots to support a fastcall mode which uses the fastcall calling convention (add new flags to tp_flags: Py_TPFLAGS_FASTNEW, Py_TPFLAGS_FASTINIT, Py_TPFLAGS_FASTCALL).

    I'm not sure that this part of my plan is feasible since it is a major backward incompatible change: it breaks all C code calling directly tp_new, tp_init and tp_call, and such code is common (even inside the stdlib). I plan to open a discussion to see if it's worth it. It would require to modify all code calling directly tp_new, tp_init and tp_call to use a new wrapper function, function which would only be available in the C API of Python 3.6...

    --

    I understand your idea of avoid dictionaries for keyword parameters, but maybe we need an intermediate step and maybe even two functions:

    • one function taking a dict: _PyObject_FastCallDict()
    • one function taking an array of (key, value) pairs: _PyObject_FastCall()

    _PyObject_FastCallDict() would pass directly the Python dict to C functions defined with METH_KEYWORDS, to tp_new, tp_init and tp_call slots.

    In a perfect world, _PyObject_FastCallDict() wouldn't be needed. In practice, dictionaries are used *everywhere* in the current C code base, so we need a function for the transition.

    _PyObject_FastCallDict() is private, so we can remove it whenever we want.

    --

    I'm patching more and more code, and I hate having to add an extra ", NULL" when calling the current _PyObject_FastCall().

    Maybe we should have a short name, _PyObject_FastCall(), for the common case: no keyword parameter, and use longer name for keywords: _PyObject_FastCallKeywords() (as PyArg_ParseTuple and PyArg_ParseTupleKeywords).

    --

    To summarize, I propose 3 functions:

    • _PyObject_FastCall(PyObject **args, int nargs)
    • _PyObject_FastCallKeywords(PyObject **args, int nargs, PyObject **kwargs, int nkwargs): nkwargs is the number of (key, value) pairs
    • _PyObject_FastCallDict(PyObject **args, int nargs, PyObject *kwargs): kwargs is a Python dictionary

    METH_FASTCALL would use _PyObject_FastCallKeywords format: "PyObject **kwargs, int nkwargs" for keyword parameters.

    What do you think?

    @vstinner
    Copy link
    Member Author

    _PyObject_FastCallKeywords(PyObject **args, int nargs, PyObject **kwargs, int nkwargs): nkwargs is the number of (key, value) pairs

    Hum, I don't know if it's worth to "split" the C array into two parts. Maybe it can be just:

    _PyObject_FastCallKeywords(PyObject **args, int nargs, int nkwargs);

    As current used in Python/ceval.c.

    @vstinner
    Copy link
    Member Author

    fastcall_macros.patch:

    • Rename _PyObject_FastCall() to _PyObject_FastCallDict()
    • Add _PyObject_FastCall(func, args, nargs) macro
    • Add _PyObject_CallArg1(func, arg) macro
    • Add _PyObject_CallNoArg(func) macro

    The 3 new macros call _PyObject_FastCallDict().

    Statistics on calls to these functions and macros:

    • _PyObject_CallNoArg: 11
    • _PyObject_CallArg1: 8
    • _PyObject_FastCall: 5
    • _PyObject_FastCallDict: 0 (the function doesn't work yet, it's the purpose of this issue, so it's expected that it's unused yet ;-))

    _PyObject_CallNoArg() and _PyObject_CallArg1() are much more common and the more complex format _PyObject_FastCall() format (more than 1 argument). I like having macros with a simple prototype.

    --

    The C API has a wide choice of functions to call a function, but I still want to add _PyObject_CallNoArg() and _PyObject_CallArg1() to get the best performances (avoid layers before reaching the final _PyObject_FastCallDict) and have a simple and well defined prototype for the most common function calls.

    kwargs:

    • PyObject_Call(): root of all call functions in Python 3.5
    • PyEval_CallObjectWithKeywords(func, args, kwargs) -> PyObject_Call()

    args:

    • PyEval_CallObject(func, args)
    • PyObject_CallObject(func, args) -> PyEval_CallObjectWithKeywords()

    vargs:

    • PyObject_CallFunctionObjArgs(func, ...) -> PyObject_Call()

    format:

    • PyObject_CallFunction(func, format, ...) -> PyObject_Call()
    • PyObject_CallMethod(func, method, format, ...) -> PyObject_Call()
    • _PyObject_CallMethodId(func, method_id, format, ...) -> PyObject_Call()
    • PyEval_CallFunction(func, format, ...) -> PyEval_CallObject()
    • PyEval_CallMethod(func, method, format, ...) -> PyEval_CallObject()

    Fast call:

    • _PyObject_CallNoArg(func) -> _PyObject_FastCallDict()
    • _PyObject_CallArg1(func, arg) -> _PyObject_FastCallDict()
    • _PyObject_FastCall(func, arg) -> _PyObject_FastCallDict()

    @scoder
    Copy link
    Contributor

    scoder commented Aug 22, 2016

    I like the oneArg/noArg etc. macros. We use something similar in Cython. You can even use them to inline the METH_O and METH_NOARGS call cases (although I use inline functions for that in Cython).

    @scoder
    Copy link
    Contributor

    scoder commented Aug 22, 2016

    I just took a quick look at the fastcall_kwargs-2.patch for now. It looks ok in general, but it also adds quite some special code for the dict-to-locals mapping. Is the keyword argument calling case really that important? I mean, it requires creating a dict and filling it, after all. People would rather avoid that in the first place if they care about performance. I wouldn't object, I'm merely asking...

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 22, 2016

    New changeset 263334a652ac by Victor Stinner in branch 'default':
    Rename _PyObject_FastCall() to _PyObject_FastCallDict()
    https://hg.python.org/cpython/rev/263334a652ac

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 22, 2016

    New changeset 1aefb4c4a7b4 by Victor Stinner in branch 'default':
    _PyFunction_FastCallDict() supports keyword args
    https://hg.python.org/cpython/rev/1aefb4c4a7b4

    New changeset 7924eac912fe by Victor Stinner in branch 'default':
    Issue bpo-27809: Cleanup _PyEval_EvalCodeWithName()
    https://hg.python.org/cpython/rev/7924eac912fe

    New changeset 15eab21bf934 by Victor Stinner in branch 'default':
    Issue bpo-27809: Use _PyObject_FastCallDict()
    https://hg.python.org/cpython/rev/15eab21bf934

    @vstinner
    Copy link
    Member Author

    Stefan Behnel: "I like the oneArg/noArg etc. macros. We use something similar in Cython."

    Oh, nice :-) Since they macros are private, I pushed fastcall_macros.patch. We can still rework them later if needed.

    "You can even use them to inline the METH_O and METH_NOARGS call cases (although I use inline functions for that in Cython)."

    Hum, I'm not sure that it's worth it.

    "I just took a quick look at the fastcall_kwargs-2.patch for now. It looks ok in general, but it also adds quite some special code for the dict-to-locals mapping. Is the keyword argument calling case really that important? I mean, it requires creating a dict and filling it, after all. People would rather avoid that in the first place if they care about performance. I wouldn't object, I'm merely asking..."

    Hey, it's nothing new: Python 3.5 does the same in function_call()!? Oh in fact, function_call() code is very short compared to fastcall_kwargs-2.patch and it doesn't require to modify the hot code of Python/ceval.c! So I pushed a very small change (change 1aefb4c4a7b4) which adds support for keyword with the sample function_call() code.

    Serhiy already told me that creating a dict is overkill... but please read again my message msg273370. We cannot break all APIs, we need a transition from the existing C APIs using Python dictionaries for keyword arguments.

    I reworked my new fastcall APIs to keep the door open for a faster API to pass keyword arguments as pairs of (key, value) in an unique C array.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 22, 2016

    New changeset ef0110a52e24 by Victor Stinner in branch 'default':
    PyEval_CallObjectWithKeywords() uses fast call with kwargs
    https://hg.python.org/cpython/rev/ef0110a52e24

    New changeset 5587d0dfab4c by Victor Stinner in branch 'default':
    Issue bpo-27809: Use _PyObject_FastCallDict()
    https://hg.python.org/cpython/rev/5587d0dfab4c

    New changeset 0c65a2089f00 by Victor Stinner in branch 'default':
    Add _PyErr_CreateException()
    https://hg.python.org/cpython/rev/0c65a2089f00

    New changeset c53c532de995 by Victor Stinner in branch 'default':
    Issue bpo-27809: PyErr_SetImportError() uses fast call
    https://hg.python.org/cpython/rev/c53c532de995

    New changeset 614dd914c21e by Victor Stinner in branch 'default':
    Issue bpo-27809: tzinfo_reduce() uses fast call
    https://hg.python.org/cpython/rev/614dd914c21e

    New changeset f48ae71e8a8f by Victor Stinner in branch 'default':
    Issue bpo-27809: _csv: _call_dialect() uses fast call
    https://hg.python.org/cpython/rev/f48ae71e8a8f

    New changeset 60660890459f by Victor Stinner in branch 'default':
    Issue bpo-27809: methodcaller_reduce() uses fast call
    https://hg.python.org/cpython/rev/60660890459f

    New changeset bf2d88cf039a by Victor Stinner in branch 'default':
    PyEval_CallObjectWithKeywords() doesn't inc/decref
    https://hg.python.org/cpython/rev/bf2d88cf039a

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 22, 2016

    New changeset 620b2e554be4 by Victor Stinner in branch 'default':
    Issue bpo-27809: builtin___build_class__() uses fast call
    https://hg.python.org/cpython/rev/620b2e554be4

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 23, 2016

    New changeset c1a698edfa1b by Victor Stinner in branch 'default':
    Issue bpo-27809: partial_call() uses fast call for positional args
    https://hg.python.org/cpython/rev/c1a698edfa1b

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 23, 2016

    New changeset 70f88b097f60 by Victor Stinner in branch 'default':
    Issue bpo-27809: map_next() uses fast call
    https://hg.python.org/cpython/rev/70f88b097f60

    New changeset 0e4f26083bbb by Victor Stinner in branch 'default':
    PyObject_CallMethodObjArgs() now uses fast call
    https://hg.python.org/cpython/rev/0e4f26083bbb

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 23, 2016

    New changeset 401f59a7020b by Victor Stinner in branch 'default':
    PyObject_CallMethodObjArgs() now uses fast call
    https://hg.python.org/cpython/rev/401f59a7020b

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 23, 2016

    New changeset 1b241e761f8f by Victor Stinner in branch 'default':
    Issue bpo-27809: map_next() uses fast call
    https://hg.python.org/cpython/rev/1b241e761f8f

    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 1, 2016

    The main feature has been merged, so I close the issue.

    @vstinner vstinner closed this as completed Sep 1, 2016
    @vstinner vstinner changed the title _PyObject_FastCall(): add support for keyword arguments Add _PyFunction_FastCallDict(): fast call with keyword arguments as a dict Sep 1, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants