Message 258204 - Python tracker


Author	yselivanov
Recipients	benjamin.peterson, brett.cannon, gvanrossum, ncoghlan, vstinner, yselivanov
Date	2016-01-14.17:09:04
Message-id	<1452791347.05.0.923215900831.issue26110@psf.upfronthosting.co.za>

Content
This issue supersedes issue #6033. I decided to open a new one, since the patch is about Python 3.6 (not 2.7) and is written from scratch.

The idea is to add new opcodes to avoid instantiation of BoundMethods. The patch only affects method calls of Python functions with positional arguments.
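To illustrate the cost being targeted, here is a small sketch of what "instantiation of BoundMethods" looks like from Python code. The `Greeter` class and its `hello` method are made up for this example; the behaviour shown is what CPython does on a plain attribute access, not something taken from the patch.

```python
import types


class Greeter:
    """Hypothetical class, used only for illustration."""

    def hello(self, name):
        return "hello " + name


g = Greeter()

# Every attribute access builds a fresh bound-method object that wraps
# the plain function stored in the class and the instance.
first = g.hello
second = g.hello
is_bound_method = type(first) is types.MethodType
same_object = first is second          # False in CPython: two separate objects
same_function = first.__func__ is Greeter.__dict__["hello"]

# Calling the underlying function with the instance passed explicitly gives
# the same result without going through a bound method.
direct = Greeter.__dict__["hello"](g, "world")
via_bound = g.hello("world")
```

The explicit `function(instance, ...)` call at the end is roughly the shape of call the new opcodes aim for.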

I'm working on the attached patch in this repo: https://github.com/1st1/cpython/tree/call_meth2

If the patch gets accepted, I'll update it with the docs etc.

Performance Improvements
------------------------

Method calls in micro-benchmarks are roughly 1.2x faster (1.17x to 1.21x):

### call_method ###
Avg: 0.330371 -> 0.281452: 1.17x faster

### call_method_slots ###
Avg: 0.333489 -> 0.280612: 1.19x faster

### call_method_unknown ###
Avg: 0.304949 -> 0.251623: 1.21x faster
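As a quick sanity check, the "x faster" figures match the old average divided by the new one. The snippet below only redoes that arithmetic on the numbers quoted above; that perf.py computes the ratio this way is an assumption.

```python
# (before, after) averages in seconds, copied from the output above.
timings = {
    "call_method": (0.330371, 0.281452),
    "call_method_slots": (0.333489, 0.280612),
    "call_method_unknown": (0.304949, 0.251623),
}

# Speedup = old time / new time.
speedups = {name: before / after for name, (before, after) in timings.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x faster")
```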

Improvements in mini-benchmarks, such as Richards, are less impressive; I'd say it's a 3-7% improvement. The full output of benchmarks/perf.py is here: https://gist.github.com/1st1/e00f11586329f68fd490

When the full benchmark suite is run, some benchmarks are reported as slower. When I ran those separately several times, none of them showed a real slowdown. It's just that some of them (such as nbody) are highly unstable.

It's actually possible to improve the performance by another 1-3% if we fuse __PyObject_GetMethod with the ceval/LOAD_METHOD code. I've tried this here: https://github.com/1st1/cpython/tree/call_meth4, however I don't like having so many details of object.c in ceval.c.

Changes in the Core
-------------------

Two new opcodes are added -- LOAD_METHOD and CALL_METHOD. Whenever the compiler sees a method call "obj.method(..)" with positional arguments, it compiles it as follows:

LOAD_FAST(obj)
LOAD_METHOD(method)
{call arguments}
CALL_METHOD
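To see what a given interpreter actually emits for such a call, the standard `dis` module can list the opcodes. This is only an inspection aid: the function `call_it` is a made-up example, and the opcode names printed depend on the Python version being run, so they may not match the listing above.

```python
import dis


def call_it(obj):
    # A method call with positional arguments only.
    return obj.method(1, 2)


# Collect the opcode names the running interpreter generated for call_it.
opnames = [instruction.opname for instruction in dis.get_instructions(call_it)]
print(opnames)
```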

The LOAD_METHOD implementation in ceval looks up "method" on obj's type, and checks that it wasn't overridden in obj.__dict__. Apparently, even with the __dict__ check this is still faster than creating a BoundMethod instance etc.

If the method is found and not overridden, LOAD_METHOD pushes the resolved method object, and 'obj'. If the method was overridden, the resolved method object and NULL are pushed to the stack.

CALL_METHOD then looks at the two stack values that sit below the call arguments. If the first one isn't NULL, it means that we have a method call.
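The behaviour described above can be modelled in pure Python. This is a rough sketch of the semantics only, not the C code from the patch: `load_method`, `call_method` and the `NULL` sentinel are hypothetical names, a returned pair stands in for the two stack slots, and cases such as a custom `__getattribute__` are ignored.

```python
import types

NULL = object()  # stand-in for the NULL pointer pushed on the C stack


def load_method(obj, name):
    """Return (callable, obj_or_NULL), modelling the two pushed values."""
    # Look the name up on the type's MRO without invoking descriptors.
    found = NULL
    for klass in type(obj).__mro__:
        if name in vars(klass):
            found = vars(klass)[name]
            break
    instance_dict = getattr(obj, "__dict__", {})
    # Fast path: a pure-Python function that the instance does not override.
    if isinstance(found, types.FunctionType) and name not in instance_dict:
        return found, obj
    # Fallback: ordinary attribute access (may create a bound method).
    return getattr(obj, name), NULL


def call_method(meth, obj_or_null, *args):
    """Model CALL_METHOD: pass obj as the first argument when it was pushed."""
    if obj_or_null is not NULL:
        return meth(obj_or_null, *args)
    return meth(*args)


class Point:
    """Hypothetical class, used only for illustration."""

    def __init__(self, x):
        self.x = x

    def shifted(self, dx):
        return self.x + dx


p = Point(1)
fast = call_method(*load_method(p, "shifted"), 10)   # no bound method created

p.shifted = lambda dx: -dx                            # override on the instance
slow = call_method(*load_method(p, "shifted"), 10)   # falls back to getattr
```

In the fast path the function is called as `function(obj, *args)`, which is the point of the two opcodes; the fallback behaves like a regular attribute load followed by a call.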

Why CALL_METHOD?
----------------

It's actually possible to hack CALL_FUNCTION to support LOAD_METHOD. I've tried this approach in https://github.com/1st1/cpython/tree/call_meth3. It looks like adding extra checks to CALL_FUNCTION has a negative impact on many benchmarks. It's really easier to add another opcode.

Why only pure-Python methods?
-----------------------------

LOAD_METHOD at the moment works only with methods defined in pure Python. C methods, such as `list.append`, are still wrapped in a descriptor that creates a PyCFunction object on each attribute access. I've tried to support C methods as well in https://github.com/1st1/cpython/tree/call_cmeth. It does impact C method calls in a positive way, although my implementation is very hacky. It still uses the LOAD_METHOD and CALL_METHOD opcodes, so my idea is to consider merging this patch first, and then introduce the necessary refactoring of PyCFunction and MethodDescriptors in a separate issue.
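The per-access object creation for C methods can be observed from Python. The snippet below is a small illustration of current CPython behaviour, not part of the patch; the variable names are arbitrary.

```python
items = []

# On the type, `append` is stored as a descriptor...
descriptor = list.__dict__["append"]
descriptor_type = type(descriptor).__name__      # 'method_descriptor'

# ...and each access through an instance wraps it in a new builtin
# method object bound to that instance.
first = items.append
second = items.append
wrapper_type = type(first).__name__              # 'builtin_function_or_method'
distinct_objects = first is not second           # True in CPython

# The descriptor can be called directly with the instance as first argument,
# which skips the intermediate wrapper object.
descriptor(items, 42)
```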

Why only calls with positional arguments?
-----------------------------------------

As shown in "Why CALL_METHOD?", making CALL_FUNCTION work with LOAD_METHOD slows it down. For keyword and var-arg calls we have three more opcodes -- CALL_FUNCTION_VAR, CALL_FUNCTION_KW, and CALL_FUNCTION_VAR_KW. I suspect that making them work with LOAD_METHOD would slow them down too, which would probably require us to add three (!) more opcodes for LOAD_METHOD.

And these kinds of calls carry much more overhead anyway, so I don't expect them to be as optimizable as positional-arg calls.

History
Date	User	Action	Args
2016-01-14 17:09:10	yselivanov	set	recipients: + yselivanov, gvanrossum, brett.cannon, ncoghlan, vstinner, benjamin.peterson
2016-01-14 17:09:07	yselivanov	set	messageid: <1452791347.05.0.923215900831.issue26110@psf.upfronthosting.co.za>
2016-01-14 17:09:06	yselivanov	link	issue26110 messages
2016-01-14 17:09:06	yselivanov	create