Message263991
Hi, I started to work on a new top-down approach that completely avoids temporary tuples/dicts: "FASTCALL", issue #26814. As you may expect, the patch is much larger than Serhiy's patches attached to his issue, but IMHO the final code is much simpler: it doesn't have to use complex tricks on the reference count or set tuple items to NULL (which led to segfaults such as issue #26811).
Antoine Pitrou: "Maybe we want a facility to create on-stack static-size tuples?"
My current implementation of FASTCALL uses a buffer allocated on the stack to handle calls with up to 10 parameters (10 positional parameters or 5 keyword parameters, since a keyword uses 2 PyObject*).
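To make the slot arithmetic concrete, here is a minimal sketch of the stack-buffer idea in plain C. It is not the code from the FASTCALL branch: `obj_t`, `fast_func`, `call_flat`, `count_slots` and `SMALL_STACK_SLOTS` are hypothetical names invented for this illustration, and the heap fallback for larger calls is an assumption about how such a scheme might handle the overflow case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for PyObject*; not the real CPython type. */
typedef void *obj_t;

/* Stack buffer capacity, matching the 10-slot limit described above. */
#define SMALL_STACK_SLOTS 10

/* Hypothetical callee signature: one flat array holding npos positional
   values followed by nkw (name, value) pairs. */
typedef int (*fast_func)(obj_t *slots, size_t npos, size_t nkw);

/* A positional argument takes 1 slot, a keyword takes 2 (name + value). */
size_t slots_needed(size_t npos, size_t nkw)
{
    return npos + 2 * nkw;
}

/* True when the call fits in the on-stack buffer. */
int fits_on_stack(size_t npos, size_t nkw)
{
    return slots_needed(npos, nkw) <= SMALL_STACK_SLOTS;
}

/* Example callee: reports how many slots it received. */
int count_slots(obj_t *slots, size_t npos, size_t nkw)
{
    (void)slots;
    return (int)slots_needed(npos, nkw);
}

/* Pack the arguments into a flat array and call func.
   Small calls use the stack buffer; larger ones fall back to malloc
   (assumed behavior). Returns -1 if that allocation fails. */
int call_flat(fast_func func, obj_t *pos, size_t npos,
              obj_t *kw_pairs, size_t nkw)
{
    obj_t small_stack[SMALL_STACK_SLOTS];
    obj_t *slots = small_stack;
    size_t n = slots_needed(npos, nkw);
    int result;

    assert(func != NULL);
    if (n > SMALL_STACK_SLOTS) {
        slots = malloc(n * sizeof(obj_t));
        if (slots == NULL) {
            return -1;
        }
    }
    if (npos > 0) {
        memcpy(slots, pos, npos * sizeof(obj_t));
    }
    if (nkw > 0) {
        memcpy(slots + npos, kw_pairs, 2 * nkw * sizeof(obj_t));
    }
    result = func(slots, npos, nkw);
    if (slots != small_stack) {
        free(slots);
    }
    return result;
}
```

In this sketch a call with 10 positional arguments or 5 keyword arguments needs no heap allocation, while a call with 1 positional and 5 keyword arguments (11 slots) takes the fallback path.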
Antoine Pitrou: "How many functions can benefit from this approach, though?"
In my experimental FASTCALL branch, slot wrappers, PyObject_Call*() functions (except PyObject_Call()!), ceval.c, etc. benefit from FASTCALL. The question is not really how many functions benefit from FASTCALL, but rather how many changes are worth making to avoid temporary tuples/dicts. For example, right now I decided not to add a FASTCALL flavor of tp_new and tp_init (instantiating a class or creating a new type) because it becomes much more complex when you have to handle inheritance.
Maybe my FASTCALL requires too many changes and is overkill. Maybe we need to find a compromise between FASTCALL (issue #26814) and Serhiy's changes which are limited to a few functions.
Today, it's too early to decide, but I'm having fun with my FASTCALL experiment ;-) Slot wrappers are 40% faster, getting a namedtuple attribute is 25% faster (even though Serhiy already optimized this specific case!), etc.
Date | User | Action | Args
2016-04-22 09:38:05 | vstinner | set | recipients: + vstinner, rhettinger, pitrou, scoder, ezio.melotti, serhiy.storchaka
2016-04-22 09:38:05 | vstinner | set | messageid: <1461317885.82.0.627770302773.issue23507@psf.upfronthosting.co.za>
2016-04-22 09:38:05 | vstinner | link | issue23507 messages
2016-04-22 09:38:05 | vstinner | create |