Author vstinner
Recipients inada.naoki, python-dev, serhiy.storchaka, vstinner
Date 2017-01-26.14:08:48
Message-id <1485439729.14.0.66968452831.issue29259@psf.upfronthosting.co.za>
In-reply-to
Content
"While I feel your work is great, performance benefit seems very small,
compared complexity of this patch."

I have to agree. I spent a lot of time benchmarking these tp_fast* changes. While one or two benchmarks are faster, that is not really the case for the others.

I also agree about the complexity. In Python 3.6, most FASTCALL changes were internal. For example, PyObject_CallFunctionObjArgs() now uses FASTCALL internally, without requiring any change to callers of the API. I tried to use _PyObject_FastCallDict/Keywords() only in the few places where the speedup was significant.

The main visible change of the Python 3.6 FASTCALL work is the new METH_FASTCALL calling convention for C functions. Your change modifying print() to use METH_FASTCALL has a significant impact on the telco benchmark, without any drawback. I tested further changes to use METH_FASTCALL in the struct and decimal modules, and they optimize telco even more.
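As a rough way to see the kind of per-call overhead at stake, the sketch below times three short C-implemented calls (print(), struct.pack() and Decimal.quantize()) with the standard timeit module. The choice of calls, the helper names per_call_ns and run_cases, and the iteration counts are assumptions made for illustration. This is not the telco benchmark, and it is not the tooling used for the measurements discussed here.

```python
import decimal
import io
import struct
import timeit


def per_call_ns(func, number=20_000, repeat=5):
    """Return the best observed time per call, in nanoseconds."""
    # Taking the minimum over several runs reduces noise from other processes.
    best = min(timeit.repeat(func, number=number, repeat=repeat))
    return best / number * 1e9


# Hypothetical micro-benchmarks: each lambda makes one short call into C code.
# The lambda itself adds a small, constant Python-level call cost to every case.
sink = io.StringIO()
amount = decimal.Decimal("1.2345")
cent = decimal.Decimal("0.01")

CASES = {
    "print": lambda: print("x", file=sink),
    "struct.pack": lambda: struct.pack("<i", 12345),
    "Decimal.quantize": lambda: amount.quantize(cent),
}


def run_cases():
    """Time every case and return a {name: nanoseconds per call} dict."""
    return {name: per_call_ns(func) for name, func in CASES.items()}


if __name__ == "__main__":
    for name, ns in run_cases().items():
        print(f"{name:>18}: {ns:8.1f} ns per call")
```

Running the same script under two interpreter builds and comparing the numbers only gives a first impression. A micro-benchmark like this is sensitive to system noise, so a dedicated benchmark runner remains the more reliable way to confirm a speedup.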

To continue the optimization work, I guess that using METH_FASTCALL in more cases, using Argument Clinic whenever possible, would have a more concrete and measurable impact on performance than this big tp_fastcall patch.

But I'm not ready to abandon the whole approach yet, so I am changing the status to Pending. I may come back in one or two months to check whether I missed anything obvious that would unlock even more optimizations ;-)
History
Date User Action Args
2017-01-26 14:08:49 vstinner set recipients: + vstinner, inada.naoki, python-dev, serhiy.storchaka
2017-01-26 14:08:49 vstinner set messageid: <1485439729.14.0.66968452831.issue29259@psf.upfronthosting.co.za>
2017-01-26 14:08:49 vstinner link issue29259 messages
2017-01-26 14:08:48 vstinner create