Author vstinner
Recipients serhiy.storchaka, vstinner, yselivanov
Date 2016-05-26.10:15:54
Message-id <1464257756.81.0.101125062969.issue27128@psf.upfronthosting.co.za>
Content
Since issue #26814 showed that avoiding the creation of temporary tuples when calling Python and C functions makes Python faster (between 2% and 29%, depending on the benchmark), I extracted a first "minimal" patch to start merging this work.

The first patch adds new functions:

* PyObject_CallNoArg(func) and PyObject_CallArg1(func, arg): public functions
* _PyObject_FastCall(func, args, nargs, kwargs): private function
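
As a rough illustration of the calling convention behind these functions, here is a self-contained toy model in plain C. It is not the patch's code: toy_obj, toy_func, toy_fastcall, toy_call_noarg, toy_call_arg1 and toy_sum are hypothetical names, a "toy object" is just a long, ptrdiff_t stands in for Py_ssize_t, and the kwargs parameter is left out. The point is only that arguments travel as a C array plus a count, so no temporary tuple has to be allocated.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the CPython API: an "object" is just a long. */
typedef long toy_obj;

/* A callable receives its positional arguments as an array plus a count. */
typedef toy_obj (*toy_func)(const toy_obj *args, ptrdiff_t nargs);

/* Model of a fast call: forward the array and the count, allocate nothing. */
toy_obj toy_fastcall(toy_func func, const toy_obj *args, ptrdiff_t nargs)
{
    assert(nargs >= 0);
    assert(nargs == 0 || args != NULL);
    return func(args, nargs);
}

/* Model of a "no argument" helper: empty array, count 0. */
toy_obj toy_call_noarg(toy_func func)
{
    return toy_fastcall(func, NULL, 0);
}

/* Model of a "one argument" helper: the argument array is one local. */
toy_obj toy_call_arg1(toy_func func, toy_obj arg)
{
    return toy_fastcall(func, &arg, 1);
}

/* Example callee: sums its positional arguments. */
toy_obj toy_sum(const toy_obj *args, ptrdiff_t nargs)
{
    toy_obj total = 0;
    for (ptrdiff_t i = 0; i < nargs; i++) {
        total += args[i];
    }
    return total;
}
```

In this model, toy_call_arg1(toy_sum, 41) passes the address of a single local variable as the argument array, which is the kind of shortcut the one-argument helper is meant to enable.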

I am hesitating between the C types "int" and "Py_ssize_t" for nargs. I read once that using "int" can cause performance issues in a loop using "i++" and "data[i]", because the compiler has to handle integer overflow of the int type.

The "int" type is also annoying on 64-bit Windows: it causes compiler warnings on downcasts, for example when PyTuple_GET_SIZE(co->co_argcount) is stored into a C int.
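
To make the downcast concern concrete, here is a small sketch of what an "int" based API would need whenever a size comes from a wider signed type. It assumes ptrdiff_t as a stand-in for Py_ssize_t, and toy_size_as_int is a hypothetical helper, not something from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: narrow a wide signed size to int only if it fits.
 * Returns 1 and stores the value on success, returns 0 otherwise and
 * leaves *out untouched. */
int toy_size_as_int(ptrdiff_t size, int *out)
{
    assert(out != NULL);
    if (size < INT_MIN || size > INT_MAX) {
        return 0;
    }
    *out = (int)size;  /* explicit cast: the range was checked above */
    return 1;
}
```

Using the wider type for nargs everywhere would make this kind of check, and the explicit cast, unnecessary.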


_PyObject_FastCall() avoids creating a tuple for:

* All Python functions (PyFunction_Check)
* C functions using METH_NOARGS or METH_O
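
The dispatch described above can be sketched with a toy model. Everything here is hypothetical (toy_obj, toy_tuple, toy_callable, the TOY_* kinds, toy_fastcall, the example callees and the toy_tuples_created counter) and it only models C-style callables. The fallback that copies the array into a temporary "tuple" is an assumption about what happens for the remaining cases. Real code would report a wrong argument count as an error rather than assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef long toy_obj;  /* toy "object", NOT a PyObject */

typedef struct {
    ptrdiff_t size;
    toy_obj *items;
} toy_tuple;

/* Rough analogues of METH_NOARGS, METH_O and a tuple-based convention. */
enum toy_kind { TOY_NOARGS, TOY_O, TOY_VARARGS };

typedef struct {
    enum toy_kind kind;
    toy_obj (*noargs)(void);
    toy_obj (*one)(toy_obj arg);
    toy_obj (*varargs)(const toy_tuple *args);
} toy_callable;

/* Counts how many temporary tuples the fallback path had to build. */
int toy_tuples_created = 0;

toy_obj toy_fastcall(const toy_callable *f, const toy_obj *args, ptrdiff_t nargs)
{
    assert(nargs >= 0);
    switch (f->kind) {
    case TOY_NOARGS:
        assert(nargs == 0);
        return f->noargs();            /* no tuple needed */
    case TOY_O:
        assert(nargs == 1);
        return f->one(args[0]);        /* no tuple needed */
    default: {
        /* Fallback: copy the array into a temporary tuple. */
        size_t nbytes = (size_t)nargs * sizeof(toy_obj);
        toy_tuple t;
        toy_obj result;
        t.size = nargs;
        t.items = malloc(nbytes ? nbytes : 1);
        assert(t.items != NULL);
        if (nbytes) {
            memcpy(t.items, args, nbytes);
        }
        toy_tuples_created++;
        result = f->varargs(&t);
        free(t.items);
        return result;
    }
    }
}

/* Example callees, one per convention. */
toy_obj toy_answer(void) { return 42; }
toy_obj toy_double(toy_obj arg) { return 2 * arg; }
toy_obj toy_sum(const toy_tuple *args)
{
    toy_obj total = 0;
    for (ptrdiff_t i = 0; i < args->size; i++) {
        total += args->items[i];
    }
    return total;
}
```

Only the tuple-based callee increments the counter, which mirrors the claim that the no-argument and one-argument conventions can skip the temporary tuple entirely.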

The patch removes the "cache tuple" optimization from property_descr_get(); it uses PyObject_CallArg1() instead. It means that the optimization is (currently) missed in some cases compared to the current code, but the code is safer and simpler.


The patch adds Python/pystack.c which currently only contains _PyStack_AsTuple(), but will contain more code later.


I tried to write the smallest patch, but I started to use PyObject_CallNoArg() and PyObject_CallArg1() in places where the code already created a tuple on each call: PyObject_CallObject(), call_function_tail() and PyEval_CallObjectWithKeywords().


In the patch, keywords are not used in fast calls, but they will be used later. I prefer to start directly with keywords rather than change the calling convention once again later.

--

Later, I will propose other patches to:

* add METH_FASTCALL calling convention for C functions
* modify Argument Clinic to use METH_FASTCALL

So the fast call will be taken in more cases.

--

The long-term plan is to slowly use the new FASTCALL calling convention "everywhere". The tricky points are the tp_new, tp_init and tp_call attributes of type objects. In issue #26814, I wrote a patch adding Py_TPFLAGS_FASTNEW, Py_TPFLAGS_FASTINIT and Py_TPFLAGS_FASTCALL flags to use the FASTCALL calling convention for tp_new, tp_init and tp_call. The problem is that calling these methods directly looks common. If we change the calling convention of these methods, it will break the C API. I propose to discuss that later ;-)

An alternative is to add a tp_fastcall method to PyTypeObject and use a wrapper for tp_call for backward compatibility. This option also has drawbacks. Again, I propose to discuss this later, and to focus first on the changes that don't break anything ;-)
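
The wrapper idea can be sketched with a toy type structure. This is only a model under stated assumptions: toy_type, toy_tuple, toy_call_wrapper, toy_max_fast and toy_max_type are hypothetical, the slot names are borrowed for readability, and the real slots have different signatures (they also take keyword arguments). A legacy tuple-based slot is filled with a generic wrapper that unpacks the tuple and forwards to the array-based slot.

```c
#include <assert.h>
#include <stddef.h>

typedef long toy_obj;  /* toy "object", NOT a PyObject */

typedef struct {
    ptrdiff_t size;
    const toy_obj *items;
} toy_tuple;

struct toy_type;

/* Legacy convention: arguments arrive packed in a tuple. */
typedef toy_obj (*toy_callfunc)(const struct toy_type *type,
                                const toy_tuple *args);
/* Fast convention: arguments arrive as an array plus a count. */
typedef toy_obj (*toy_fastcallfunc)(const struct toy_type *type,
                                    const toy_obj *args, ptrdiff_t nargs);

typedef struct toy_type {
    toy_callfunc tp_call;          /* slot that existing callers use */
    toy_fastcallfunc tp_fastcall;  /* hypothetical new slot */
} toy_type;

/* Generic wrapper: lets tuple-based callers reach a fast implementation. */
toy_obj toy_call_wrapper(const toy_type *type, const toy_tuple *args)
{
    assert(type->tp_fastcall != NULL);
    return type->tp_fastcall(type, args->items, args->size);
}

/* Example fast implementation: returns the largest argument. */
toy_obj toy_max_fast(const toy_type *type, const toy_obj *args, ptrdiff_t nargs)
{
    toy_obj best;
    (void)type;
    assert(nargs >= 1);
    best = args[0];
    for (ptrdiff_t i = 1; i < nargs; i++) {
        if (args[i] > best) {
            best = args[i];
        }
    }
    return best;
}

/* A type that only implements the fast slot; tp_call is the wrapper. */
const toy_type toy_max_type = { toy_call_wrapper, toy_max_fast };
```

Code that calls tp_call directly keeps working in this model, at the cost of one extra indirection. Callers that still need to build a tuple first gain nothing from the fast slot.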