Author haypo
Recipients haypo, larry, rhettinger, serhiy.storchaka, yselivanov
Date 2016-04-22.11:10:16
Changes in my current implementation, ad4a53ed1fbf.diff:

The good thing is that all changes are internal (really?). Even if you don't modify your C extensions (nor your Python code), you should benefit from the new fast call in *a lot* of cases.

IMHO the trickiest part is the changes to PyTypeObject. Is it ok to add a new tp_fastcall slot? Should we add even more slots using the fast call convention, like tp_fastnew and tp_fastinit? How should we handle the inheritance of types with that?

(*) Add 2 new public functions:

PyObject* PyObject_CallNoArg(PyObject *func);
PyObject* PyObject_CallArg1(PyObject *func, PyObject *arg);

(*) Add 1 new private function:

PyObject* _PyObject_FastCall(PyObject *func, PyObject **stack, int na, int nk);

_PyObject_FastCall() is the root of the new feature.
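
To make the calling convention concrete, here is a toy model in plain C. It is only a sketch, not code from the patch: the toy_* names are hypothetical, values are plain longs instead of PyObject pointers, and the array layout (na positional values followed by nk key/value pairs) is an assumption modelled on the interpreter's evaluation stack.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a callable; this is NOT CPython's PyObject. */
typedef struct toy_func {
    /* Fast-call entry point: arguments arrive in a flat array. */
    long (*fastcall)(const long *stack, int na, int nk);
} toy_func;

/* Assumed layout: stack[0..na-1] are the positional arguments, followed
   by nk (key, value) pairs, so the array holds na + 2*nk entries. */
static long toy_fast_call(const toy_func *func, const long *stack,
                          int na, int nk)
{
    assert(na >= 0 && nk >= 0);
    assert(stack != NULL || (na == 0 && nk == 0));
    return func->fastcall(stack, na, nk);
}

/* Zero-argument helper: no array is needed at all. */
static long toy_call_no_arg(const toy_func *func)
{
    return toy_fast_call(func, NULL, 0, 0);
}

/* One-argument helper: the "stack" is just the address of a local. */
static long toy_call_arg1(const toy_func *func, long arg)
{
    return toy_fast_call(func, &arg, 1, 0);
}

/* Example callee: sums the positionals and the value of each pair. */
static long toy_sum(const long *stack, int na, int nk)
{
    long total = 0;
    for (int i = 0; i < na; i++)
        total += stack[i];
    for (int i = 0; i < nk; i++)
        total += stack[na + 2 * i + 1];   /* skip the key, add the value */
    return total;
}

static const toy_func toy_sum_func = { toy_sum };
```

In this model the two helper functions are thin wrappers around the fast call, and neither of them allocates a temporary container for the arguments.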

(*) type: add a new "tp_fastcall" field to the PyTypeObject structure.

It's unclear to me how inheritance is handled here. Maybe it's simply broken, but it's strange because it looks like it works :-) Maybe it's very rare that tp_call is overridden in a child class?

TODO: maybe reuse the "tp_call" field? (risk of major backward incompatibility...)
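
The inheritance worry can be illustrated with a toy model in plain C. This is a sketch under assumptions, not what the patch does: the toy_* types and both inheritance strategies are hypothetical. It shows how a child that overrides only the old-style slot could still pick up the parent's fast slot, and one possible way to avoid that by inheriting the two slots as a pair.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*toy_call_t)(void);

/* Toy type object with an old-style and a new-style call slot. */
typedef struct toy_type {
    const struct toy_type *base;
    toy_call_t call;       /* plays the role of tp_call */
    toy_call_t fastcall;   /* plays the role of tp_fastcall */
} toy_type;

static int parent_call(void)     { return 1; }
static int parent_fastcall(void) { return 1; }
static int child_call(void)      { return 2; }

/* Naive inheritance: every empty slot is copied from the base on its own. */
static void toy_inherit_naive(toy_type *t)
{
    if (t->base == NULL)
        return;
    if (t->call == NULL)
        t->call = t->base->call;
    if (t->fastcall == NULL)
        t->fastcall = t->base->fastcall;
}

/* Paired inheritance: the fast slot is inherited only when the old-style
   slot is inherited too, so an overridden call is never shadowed. */
static void toy_inherit_paired(toy_type *t)
{
    if (t->base == NULL)
        return;
    if (t->call == NULL) {
        t->call = t->base->call;
        if (t->fastcall == NULL)
            t->fastcall = t->base->fastcall;
    }
}

/* Dispatcher that prefers the fast slot over the old-style slot. */
static int toy_dispatch(const toy_type *t)
{
    assert(t->call != NULL || t->fastcall != NULL);
    return t->fastcall != NULL ? t->fastcall() : t->call();
}
```

With the naive strategy, a child defining only child_call ends up dispatching to parent_fastcall; with the paired strategy, the child's own call is used.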

(*) slots: add a new "fastwrapper" field to the wrapperbase structure. Add a fast wrapper to all slots (really all? I should check).

I don't think that consumers of the C API are affected by this change, or maybe only a few projects.

TODO: maybe remove "fastwrapper" and reuse the "wrapper" field? (low risk of backward incompatibility?)

(*) Implement fast call for Python function (_PyFunction_FastCall) and C functions (PyCFunction_FastCall)

(*) Add a new METH_FASTCALL calling convention for C functions. Right now, it is used for 4 builtin functions: sorted(), getattr(), iter(), next().

Argument Clinic should be modified to emit C code using this new fast calling convention.

(*) Implement fast call in the following functions (types):

- method()
- method_descriptor()
- wrapper_descriptor()
- method_wrapper()
- operator.itemgetter => used by collections.namedtuple to get an item by its name

(*) Modify PyObject_Call*() functions to use the fast call internally. "tp_fastcall" is preferred over "tp_call" (FIXME: is it really useful to do that?).

The following functions are able to avoid temporary tuple/dict without having to modify the code calling them:

- PyObject_CallFunction()
- PyObject_CallMethod(), _PyObject_CallMethodId()
- PyObject_CallFunctionObjArgs(), PyObject_CallMethodObjArgs()

Code using these functions does not need to be modified to use the 3 new shiny functions (PyObject_CallNoArg, PyObject_CallArg1, _PyObject_FastCall). For example, replacing PyObject_CallFunctionObjArgs(func, NULL) with PyObject_CallNoArg(func) is just a micro-optimization: the tuple is already avoided. But PyObject_CallNoArg() should use less C stack memory and be a "little bit" faster.
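
As a rough illustration of how a varargs function can avoid the temporary tuple, here is a toy model in plain C. It is a sketch, not the patch's code: the toy_* names and the TOY_SMALL_STACK limit are hypothetical, and arguments are C strings rather than PyObject pointers. The NULL-terminated arguments are copied into an array on the C stack and passed to the callee directly.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

#define TOY_SMALL_STACK 8   /* hypothetical limit for this sketch */

typedef long (*toy_fastfunc)(const char *const *stack, int na);

/* Collect a NULL-terminated list of string arguments into an array that
   lives on the C stack, then hand that array to the callee. Nothing is
   allocated on the heap. A real implementation would need a fallback
   for longer argument lists; this sketch just asserts. */
static long toy_call_objargs(toy_fastfunc func, ...)
{
    const char *stack[TOY_SMALL_STACK];
    int na = 0;
    va_list ap;

    va_start(ap, func);
    for (;;) {
        const char *arg = va_arg(ap, char *);
        if (arg == NULL)
            break;                      /* sentinel: end of arguments */
        assert(na < TOY_SMALL_STACK);
        stack[na++] = arg;
    }
    va_end(ap);

    return func(stack, na);
}

/* Example callee: reports how many arguments it received. */
static long toy_count_args(const char *const *stack, int na)
{
    (void)stack;
    return na;
}
```

A call such as toy_call_objargs(toy_count_args, "a", "b", (char *)NULL) reaches the callee with na == 2. The sentinel is cast explicitly because a bare NULL is not guaranteed to have pointer type when passed through varargs. A dedicated zero-argument entry point can skip both the varargs walk and the array, which matches the remark about using less C stack.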

(*) Add new helpers: new Include/pystack.h file, Py_VaBuildStack(), etc.

Please ignore unrelated changes.
Date                 User   Action  Args
2016-04-22 11:10:16  haypo  set     recipients: + haypo, rhettinger, larry, serhiy.storchaka, yselivanov
2016-04-22 11:10:16  haypo  set     messageid: <>
2016-04-22 11:10:16  haypo  link    issue26814 messages
2016-04-22 11:10:16  haypo  create