Message266422
Since issue #26814 showed that avoiding the creation of temporary tuples to call Python and C functions makes Python faster (between 2% and 29% depending on the benchmark), I extracted a first "minimal" patch to start merging this work.
The first patch adds new functions:
* PyObject_CallNoArg(func) and PyObject_CallArg1(func, arg): public functions
* _PyObject_FastCall(func, args, nargs, kwargs): private function
I hesitate between the C types "int" and "Py_ssize_t" for nargs. I read once that using "int" can cause performance issues on a loop using "i++" and "data[i]" because the compiler has to handle integer overflow of the int type.
The "int" type is also annoying on 64-bit Windows: it causes compiler warnings on downcasts, for example when PyTuple_GET_SIZE(co->co_argcount) is stored into a C int.
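To make the type question concrete, here is a minimal sketch in plain C. It assumes a pointer-sized signed type (ptrdiff_t, named ssize_type here as a stand-in for Py_ssize_t); sum_items() and narrow_to_int() are hypothetical helpers, not part of the patch. A pointer-sized index matches the width used for addressing data[i], and narrowing to int has to be done explicitly and checked.

```c
#include <limits.h>
#include <stddef.h>

/* Stand-in for Py_ssize_t (assumption: a pointer-sized signed integer). */
typedef ptrdiff_t ssize_type;

/* Loop with a pointer-sized index: no widening is needed to compute data[i]. */
long sum_items(const long *data, ssize_type n)
{
    long total = 0;
    for (ssize_type i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Explicit, checked narrowing. Storing a pointer-sized size directly into
 * an int is the implicit downcast that 64-bit compilers warn about.
 * Returns 0 and stores the value on success, -1 if it does not fit. */
int narrow_to_int(ssize_type n, int *out)
{
    if (n < INT_MIN || n > INT_MAX)
        return -1;
    *out = (int)n;
    return 0;
}
```

If nargs were declared as int, every caller holding a pointer-sized size would need a check like narrow_to_int() first.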
_PyObject_FastCall() avoids creating a tuple for:
* All Python functions (PyFunction_Check)
* C functions using METH_NOARGS or METH_O
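The dispatch idea can be sketched with a toy model in plain C. Nothing below is CPython code: value_t, toy_tuple, toy_func, toy_fast_call() and the wrappers are hypothetical names, and keywords are left out. The sketch only models the C function conventions (no arguments, one argument, tuple of arguments) and assumes the convenience wrappers are thin layers over the fast call, which the message does not state.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins, not CPython types: a value and a temporary tuple. */
typedef long value_t;
typedef struct { ptrdiff_t size; value_t *items; } toy_tuple;

typedef enum { CALL_NOARGS, CALL_ONE, CALL_VARARGS } call_kind;

typedef struct {
    call_kind kind;
    union {
        value_t (*noargs)(void);
        value_t (*one)(value_t arg);
        value_t (*varargs)(const toy_tuple *args);
    } fn;
} toy_func;

/* Counts temporary tuples so the saving can be observed. */
int tuples_created = 0;

/* Slow path helper: copy the C array into a heap-allocated tuple. */
toy_tuple *tuple_from_array(const value_t *args, ptrdiff_t nargs)
{
    toy_tuple *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->size = nargs;
    /* Allocate at least one slot so a zero-length tuple is still valid. */
    t->items = malloc((size_t)(nargs > 0 ? nargs : 1) * sizeof *t->items);
    if (t->items == NULL) {
        free(t);
        return NULL;
    }
    for (ptrdiff_t i = 0; i < nargs; i++)
        t->items[i] = args[i];
    tuples_created++;
    return t;
}

/* Returns 0 and stores the result on success, -1 on error. */
int toy_fast_call(const toy_func *f, const value_t *args, ptrdiff_t nargs,
                  value_t *result)
{
    if (f->kind == CALL_NOARGS && nargs == 0) {
        *result = f->fn.noargs();          /* no tuple needed */
        return 0;
    }
    if (f->kind == CALL_ONE && nargs == 1) {
        *result = f->fn.one(args[0]);      /* no tuple needed */
        return 0;
    }
    if (f->kind == CALL_VARARGS) {
        toy_tuple *t = tuple_from_array(args, nargs);  /* fallback */
        if (t == NULL)
            return -1;
        *result = f->fn.varargs(t);
        free(t->items);
        free(t);
        return 0;
    }
    return -1;  /* argument count does not match the convention */
}

/* Convenience wrappers in the spirit of a no-argument and a one-argument call. */
int toy_call_noarg(const toy_func *f, value_t *result)
{
    return toy_fast_call(f, NULL, 0, result);
}

int toy_call_arg1(const toy_func *f, value_t arg, value_t *result)
{
    return toy_fast_call(f, &arg, 1, result);
}

/* Sample callees, one per convention. */
value_t answer(void) { return 42; }
value_t twice(value_t x) { return 2 * x; }
value_t sum_all(const toy_tuple *t)
{
    value_t total = 0;
    for (ptrdiff_t i = 0; i < t->size; i++)
        total += t->items[i];
    return total;
}
```

In this model, calling answer() or twice() through toy_fast_call() leaves tuples_created unchanged, while sum_all() still pays for one temporary tuple per call.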
The patch removes the "cache tuple" optimization from property_descr_get() and uses PyObject_CallArg1() instead. This means that the optimization is (currently) missed in some cases compared to the current code, but the code is safer and simpler.
The patch adds Python/pystack.c which currently only contains _PyStack_AsTuple(), but will contain more code later.
I tried to write the smallest patch, but I started to use PyObject_CallNoArg() and PyObject_CallArg1() when the code already created a tuple at each call: PyObject_CallObject(), call_function_tail() and PyEval_CallObjectWithKeywords().
In the patch, keywords are not used in fast calls, but they will be used later. I prefer to start directly with keywords rather than change the calling convention once again later.
--
Later, I will propose other patches to:
* add METH_FASTCALL calling convention for C functions
* modify Argument Clinic to use METH_FASTCALL
So the fast call will be taken in more cases.
--
The long-term plan is to slowly use the new FASTCALL calling convention "everywhere". The tricky points are the tp_new, tp_init and tp_call attributes of type objects. In issue #26814, I wrote a patch adding Py_TPFLAGS_FASTNEW, Py_TPFLAGS_FASTINIT and Py_TPFLAGS_FASTCALL flags to use the FASTCALL calling convention for tp_new, tp_init and tp_call. The problem is that calling these methods directly looks common. If we change the calling convention of these methods, it will break the C API. I propose to discuss that later ;-)
An alternative is to add a tp_fastcall method to PyTypeObject and use a wrapper for tp_call for backward compatibility. This option also has drawbacks. Again, I propose to discuss this later, and to focus first on the changes that don't break anything ;-)
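As a rough illustration of the wrapper alternative, here is a toy model in plain C. All names (toy_type, call_slot, fastcall_slot, call_via_fastcall) are hypothetical and keywords are omitted; it only shows how an old tuple-based slot could forward to a new array-based slot so that existing direct callers keep working.

```c
#include <stddef.h>

/* Toy stand-ins, not CPython types. */
typedef long value_t;
typedef struct { ptrdiff_t size; const value_t *items; } toy_tuple;

/* Old convention: positional arguments packed in a tuple. */
typedef value_t (*call_slot)(void *self, const toy_tuple *args);
/* New convention: C array of arguments plus a count. */
typedef value_t (*fastcall_slot)(void *self, const value_t *args,
                                 ptrdiff_t nargs);

typedef struct {
    call_slot call;          /* kept so existing direct callers still work */
    fastcall_slot fastcall;  /* hypothetical new slot */
} toy_type;

typedef struct {
    const toy_type *type;
    value_t base;
} toy_object;

/* Generic wrapper installed in the old slot: it unpacks the tuple and
 * forwards to the new slot, so a type only implements fastcall. */
value_t call_via_fastcall(void *self, const toy_tuple *args)
{
    const toy_object *obj = self;
    return obj->type->fastcall(self, args->items, args->size);
}

/* Sample type that implements only the new convention. */
value_t adder_fastcall(void *self, const value_t *args, ptrdiff_t nargs)
{
    const toy_object *obj = self;
    value_t total = obj->base;
    for (ptrdiff_t i = 0; i < nargs; i++)
        total += args[i];
    return total;
}

const toy_type adder_type = { call_via_fastcall, adder_fastcall };
```

Callers that still go through the old slot pay for the tuple they already built, but no longer force the callee to be written against the old convention.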
Date | User | Action | Args
2016-05-26 10:15:59 | vstinner | set | recipients: + vstinner, serhiy.storchaka, yselivanov
2016-05-26 10:15:56 | vstinner | set | messageid: <1464257756.81.0.101125062969.issue27128@psf.upfronthosting.co.za>
2016-05-26 10:15:56 | vstinner | link | issue27128 messages
2016-05-26 10:15:56 | vstinner | create |