Issue 864059: optimize eval_frame


This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/39722

classification

Title:	optimize eval_frame
Type:		Stage:
Components:	Interpreter Core	Versions:	Python 2.4

process

Status:	closed	Resolution:
Dependencies:		Superseder:
Assigned To:	nnorwitz	Nosy List:	nnorwitz, rhettinger
Priority:	normal	Keywords:	patch

Created on 2003-12-21 18:30 by nnorwitz, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description
optimize2.patch	nnorwitz, 2003-12-21 18:30

Messages (6)
msg45061	Author: Neal Norwitz (nnorwitz)	Date: 2003-12-21 18:30
There are several different parts to this patch, and they are separable. Each seemed to have a small benefit. It would be interesting for others to test this patch as a whole and in its separate parts to see whether speed can be improved. I generally got between 1% and 10% improvement. I used pystone, pybench, and the total time to run all regression tests. Runs were on a RH9 Linux/Athlon 650. I used a non-debug build (gcc 3.2 with -O3). All regression tests pass with these changes.
I removed the register declarations from many variables. This seemed to have little to no effect, so I'm not sure about those. opcode does not need to be initialized to 0. I removed the freevars variable since it is rarely used.
I think the largest benefit was from adding the gotos for opcodes which set why: BREAK_LOOP, CONTINUE_LOOP, RETURN_VALUE, YIELD_VALUE. This skips many tests whose outcome is known a priori from the opcode.
I removed the special check for list in UNPACK_SEQUENCE since this path is rarely used (http://coverage.livinglogic.de/file.phtml?file%5fid=12442339). I also removed the predictions for JUMP_IF_TRUE since this wasn't executed often (see previous URL).
I added 2 opcodes for calling functions with 0 or 1 arguments. This removed a lot of code in call_function(). Removing test branches in several places seemed to speed up the code. However, the specialization for 0 arguments seemed to help more than the one for 1 argument; I'm not sure the specialization for 1 argument provides much benefit. Both of these specializations could possibly be improved to speed things up.
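To make the goto idea above concrete, here is a toy sketch in Python. It is not the code from optimize2.patch and not CPython's eval_frame: the `run` function, the `shortcut` flag, the `WHY_*` constants and the counted "epilogue tests" are all assumptions made up for illustration. It only shows that an opcode which already knows its `why` value can skip the generic tests that would otherwise run after the dispatch.

```python
# Hypothetical toy stack machine; the names only mirror those in the message.
WHY_NOT, WHY_EXCEPTION, WHY_RETURN = range(3)


def run(code, shortcut):
    """Run (opname, arg) pairs; return (result, epilogue_tests_executed)."""
    stack, tests, pc = [], 0, 0
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        if op == "LOAD_CONST":
            stack.append(arg)
            continue  # ordinary opcodes go straight back to the dispatch
        if op == "BINARY_ADD":
            right = stack.pop()
            stack.append(stack.pop() + right)
            continue
        if op == "RETURN_VALUE":
            retval = stack.pop()
            why = WHY_RETURN
            if not shortcut:
                # Generic epilogue: both answers are already known for this opcode.
                tests += 1
                if why == WHY_NOT:
                    continue
                tests += 1
                if why == WHY_EXCEPTION:
                    raise RuntimeError("toy traceback handling")
            # With shortcut=True we land here directly, like a goto past the tests.
            return retval, tests
        raise ValueError("unknown opcode: " + op)
    raise RuntimeError("code ended without RETURN_VALUE")


program = [
    ("LOAD_CONST", 2),
    ("LOAD_CONST", 3),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
print(run(program, shortcut=False))  # result plus the tests the generic path ran
print(run(program, shortcut=True))   # same result, no epilogue tests
```

Both calls return the same value; only the number of redundant tests differs. Whether that difference is measurable in the real interpreter is exactly what the benchmarks in this thread try to establish.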
msg45062	Author: Raymond Hettinger (rhettinger)	Date: 2003-12-24 08:20
I'll try these out and review the patch when I get back from vacation next week. The list special case for UNPACK_SEQUENCE and the prediction for JUMP_IF_TRUE should be left in -- they do provide speed-ups for code that exercises those features and they don't hurt the general cases.
msg45063	Author: Raymond Hettinger (rhettinger)	Date: 2004-01-01 03:42
The patch is promising. I'm able to measure a small speed-up for the two new function opcodes and for the setwhy gotos. Both optimizations make sense.
I don't measure a savings from not initializing opcode and oparg. That change makes sense conceptually because the variables are always assigned before use; however, the surrounding control flow statements hide that fact from the compiler. So, it is likely that they were initialized to suppress warnings on somebody's system. If so, then that change should not be made.
The other stuff should definitely be left out. The effect of register variables will vary from compiler to compiler, so if you can't measure an improvement, it is best to leave it alone. Some compilers do not do much in the way of optimization and the register declaration may be a valuable hint.
Please leave in the branch prediction for JUMP_IF_TRUE -- I put it in after finding measurable savings in real code. While it doesn't come up often, when it does it should run as fast as possible.
The special case for UNPACK_SEQUENCE is up for grabs. When that case occurs, the speedup is substantial. Also, given that the tuple check has failed, it becomes highly probable that the target is a list. OTOH, this inlined code fattens the already voluminous code for eval_frame. Maybe eliminating it will help someone's optimizer cope with all the code. Use your judgement on this one.
Removing the freevars variable did not show any speedup. It does keep one variable off the stack and shortens the startup time by a few instructions. OTOH, the in-lined replacements for it result in a net expansion of code size and cause a microscopic slowdown whenever it is used. I recommend leaving this one alone.
Executive summary: Only make the two big changes that show measurable speedups and make conceptual sense. Leave the other stuff alone.
One other thought: try making custom benchmarks for targeted optimizations. The broad spectrum benchmarks are too coarse to tell whether an improvement is really working. Also, be sure to check with Guido before adding the new opcodes. Ideally, each optimization should be loaded separately so its effects can be isolated and to allow any one to be backed out if necessary.
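As a rough sketch of what targeted benchmarks for these changes might look like, the snippet below times only the operations under discussion using the standard `timeit` module. The `bench` helper, the `cases` table and the iteration counts are assumptions chosen for illustration, and the syntax is for a current Python 3 rather than the Python 2.4 of this thread. It does not reproduce any measurement reported here.

```python
import timeit


def bench(stmt, setup="pass", number=200_000, repeat=5):
    """Return the best per-iteration time in seconds for stmt."""
    return min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)) / number


# Each case isolates one feature mentioned in the review (hypothetical selection).
cases = {
    "unpack tuple": ("a, b, c = t", "t = (1, 2, 3)"),
    "unpack list": ("a, b, c = t", "t = [1, 2, 3]"),
    "call, 0 args": ("f()", "def f(): pass"),
    "call, 1 arg": ("f(1)", "def f(x): pass"),
}

results = {name: bench(stmt, setup) for name, (stmt, setup) in cases.items()}
for name, seconds in results.items():
    print(f"{name:14s} {seconds * 1e9:8.1f} ns per iteration")
```

Running the same script against an interpreter built with and without a single optimization gives a per-feature comparison, which is harder to read off pystone or the full regression run. Taking the minimum of several repeats is a common way to reduce noise from other processes.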
msg45064	Author: Raymond Hettinger (rhettinger)	Date: 2004-02-06 18:37
Added a simplified version of the goto optimization. See Python/ceval.c 2.374.
msg45065	Author: Raymond Hettinger (rhettinger)	Date: 2004-03-07 09:13
Neal, assigning back to you in case you want to pursue the two new opcodes.
msg45066	Author: Neal Norwitz (nnorwitz)	Date: 2004-10-21 03:25
No reason to take this further.

History
Date	User	Action	Args
2022-04-11 14:56:01	admin	set	github: 39722
2003-12-21 18:30:29	nnorwitz	create