Message287261
pyobject_fastcall-4.patch combines a lot of changes. I wrote it to experiment with _PyObject_FastCall().
I will rewrite these changes as a patch series with smaller changes.
The design of PyObject_FastCall*() is to be short and simple.
Changes:
* Add _PyObject_FastCall(), _PyFunction_FastCall() and _PyCFunction_FastCall(): similar to _PyObject_FastCallKeywords() but without keyword arguments. Without keyword arguments, the checks on keyword arguments can be removed, and the stack usage is reduced: one less parameter is passed to C functions.
* Move the slow path out of _PyObject_FastCallDict() and _PyObject_FastCallKeywords() into a new subfunction tagged with _Py_NO_INLINE, to reduce the stack consumption of _PyObject_FastCallDict() and _PyObject_FastCallKeywords() in their fast paths, which are the most common code paths.
* Move all "call" functions into a new Objects/call.c file. This change should help code placement to enhance the usage of the CPU caches (especially the CPU L1 instruction cache). In my long experience with benchmarking last year, I noticed huge performance differences caused by code placement. See my blog post:
https://haypo.github.io/analysis-python-performance-issue.html
Sadly, _Py_HOT_FUNCTION didn't fix the issue completely.
* _PyObject_FastCallDict() and _PyObject_FastCallKeywords() "call" _PyObject_FastCall(). In fact, _PyObject_FastCall() is inlined manually in these functions, again, to minimize the stack usage. Similar change in _PyCFunction_FastCallDict() and PyCFunction_Call().
* Since the slow path is moved to a subfunction, I removed _Py_NO_INLINE from _PyStack_AsTuple() to allow the compiler to inline it if it wants to ;-) Since the stack usage is better with the patch, it seems like this change has no negative impact on the stack usage.
* Optimize PyMethodDescr_Type (call _PyMethodDescr_FastCallKeywords()) in _PyObject_FastCallKeywords()
* I moved Py_EnterRecursiveCall() closer to the final function call, to simplify error handling. It would be nice to fix issue #29306 first :-/
* I had to copy/paste the null_error() function from Objects/abstract.c. I don't think that it's worth it to add a _PyErr_NullError() function shared by abstract.c and call.c. Compilers are even able to merge duplicated functions ;-)
* A few other changes.
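
The first two items of the list (a call path without keyword arguments, and a slow path kept out of line) can be sketched in plain C. This is only a toy model and not code from the patch: toy_func, toy_kwarg, toy_sum(), toy_fastcall(), toy_fastcall_kwargs(), toy_call_slow(), TOY_MAX_ARGS and NO_INLINE are all hypothetical names, and NO_INLINE merely stands in for _Py_NO_INLINE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for _Py_NO_INLINE (compiler-specific attribute). */
#if defined(__GNUC__) || defined(__clang__)
#  define NO_INLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#  define NO_INLINE __declspec(noinline)
#else
#  define NO_INLINE
#endif

#define TOY_MAX_ARGS 64

/* Toy callable: receives a C array of positional arguments. */
typedef long (*toy_func)(const long *args, size_t nargs);

/* Toy keyword argument: a name and a value. */
typedef struct {
    const char *name;
    long value;
} toy_kwarg;

/* Example callee: returns the sum of its arguments. */
long
toy_sum(const long *args, size_t nargs)
{
    long total = 0;
    for (size_t i = 0; i < nargs; i++) {
        total += args[i];
    }
    return total;
}

/* Call without keyword arguments: no keyword checks, fewer parameters. */
long
toy_fastcall(toy_func func, const long *args, size_t nargs)
{
    assert(func != NULL);
    assert(args != NULL || nargs == 0);
    return func(args, nargs);
}

/* Slow path, kept out of line: the temporary buffer belongs to this
   frame only, so it is not part of the caller's stack frame. */
static NO_INLINE long
toy_call_slow(toy_func func, const long *args, size_t nargs,
              const toy_kwarg *kwargs, size_t nkwargs)
{
    long merged[TOY_MAX_ARGS];
    size_t n = 0;

    if (nargs > TOY_MAX_ARGS || nkwargs > TOY_MAX_ARGS - nargs) {
        return -1;  /* toy error code: too many arguments */
    }
    for (size_t i = 0; i < nargs; i++) {
        merged[n++] = args[i];
    }
    for (size_t i = 0; i < nkwargs; i++) {
        merged[n++] = kwargs[i].value;
    }
    return func(merged, n);
}

/* Call with optional keyword arguments. */
long
toy_fastcall_kwargs(toy_func func, const long *args, size_t nargs,
                    const toy_kwarg *kwargs, size_t nkwargs)
{
    assert(func != NULL);
    assert(args != NULL || nargs == 0);
    if (nkwargs == 0) {
        /* Fast path: the body of toy_fastcall(), written out by hand
           instead of adding one more call level. */
        return func(args, nargs);
    }
    return toy_call_slow(func, args, nargs, kwargs, nkwargs);
}
```

In this sketch, toy_fastcall_kwargs() only pays for the merged[] buffer when keyword arguments are really passed. Whether a real compiler produces a smaller frame for the fast path depends on the compiler and the optimization level, so the stack usage should be measured rather than assumed.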
Date | User | Action | Args
2017-02-07 22:42:44 | vstinner | set | recipients: + vstinner, methane, serhiy.storchaka
2017-02-07 22:42:44 | vstinner | set | messageid: <1486507364.36.0.535614769835.issue29465@psf.upfronthosting.co.za>
2017-02-07 22:42:44 | vstinner | link | issue29465 messages
2017-02-07 22:42:43 | vstinner | create |