Issue864059
Created on 2003-12-21 18:30 by nnorwitz, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files
File name | Uploaded | Description
---|---|---
optimize2.patch | nnorwitz, 2003-12-21 18:30 |
Messages (6) | |||
---|---|---|---|
msg45061 | Author: Neal Norwitz (nnorwitz) * | Date: 2003-12-21 18:30 | |
There are several different parts to this patch, and they are separable. Each seemed to have a small benefit. It would be interesting for others to test this patch as a whole and in its separate parts to see if speed can be improved. I generally got between 1% and 10% improvement. I used pystone, pybench, and the total time to run all regression tests. Runs were on a RH9 Linux/Athlon 650. I used a non-debug build (so gcc 3.2 with -O3). All regression tests pass with these changes. I removed the register declaration from many variables. This seemed to have little to no effect, so I'm not sure about those. opcode does not need to be initialized to 0. I removed the freevars variable since it is rarely used. I think the largest benefit was from adding the gotos for opcodes which set why: BREAK_LOOP, CONTINUE_LOOP, RETURN_VALUE, YIELD_VALUE. This skips many tests whose outcomes are known a priori from the opcode. I removed the special check for list in UNPACK_SEQUENCE since this path is rarely used. (http://coverage.livinglogic.de/file.phtml?file%5fid=12442339) I also removed the predictions for JUMP_IF_TRUE since this wasn't executed often (see previous URL). I added 2 opcodes for calling functions with 0 or 1 arguments. This removed a lot of code in call_function(). Removing test branches in several places seemed to speed up the code. However, it seemed that just specializing for 0 arguments was better than for 1 arg. I'm not sure if the specialization for 1 argument provides much benefit. Both of these specializations could possibly be improved to speed things up. |
|||
msg45062 | Author: Raymond Hettinger (rhettinger) * | Date: 2003-12-24 08:20 | |
I'll try these out and review the patch when I get back from vacation next week. The list special case for UNPACK_SEQUENCE and the prediction for JUMP_IF_TRUE should be left in -- they do provide speed-ups for code that exercises those features and they don't hurt the general cases. |
|||
msg45063 | Author: Raymond Hettinger (rhettinger) * | Date: 2004-01-01 03:42 | |
The patch is promising. I'm able to measure a small speed-up for the two new function opcodes and for the setwhy gotos. Both optimizations make sense. I don't measure any savings from not initializing opcode and oparg. That change makes sense conceptually because the variables are always assigned before use; however, the surrounding control flow statements hide that fact from the compiler. So, it is likely that they were initialized to suppress warnings on somebody's system. If so, then that change should not be made. The other stuff should definitely be left out. The effect of register variables will vary from compiler to compiler, so if you can't measure an improvement, it is best to leave it alone. Some compilers do not do much in the way of optimization and the register declaration may be a valuable hint. Please leave in the branch prediction for JUMP_IF_TRUE -- I put it in after finding measurable savings in real code. While it doesn't come up often, when it does it should run as fast as possible. The special case for UNPACK_SEQUENCE is up for grabs. When that case occurs, the speedup is substantial. Also, given that the tuple check has failed, it becomes highly probable that the target is a list. OTOH, this inlined code fattens the already voluminous code for eval_frame. Maybe eliminating it will help someone's optimizer cope with all the code. Use your judgement on this one. Removing the freevars variable did not show any speedup. It does keep one variable off the stack and shortens the startup time by a few instructions. OTOH, the in-lined replacements for it result in a net expansion of code size and cause a microscopic slowdown whenever it is used. I recommend leaving this one alone. Executive summary: Only make the two big changes that show measurable speedups and make conceptual sense. Leave the other stuff alone. One other thought: try making custom benchmarks for targeted optimizations. The broad spectrum benchmarks are too coarse to tell whether an improvement is really working. Also, be sure to check with Guido before adding the new opcodes. Ideally, each optimization should be loaded separately so its effects can be isolated and to allow any one to be backed out if necessary. |
|||
msg45064 | Author: Raymond Hettinger (rhettinger) * | Date: 2004-02-06 18:37 | |
Added a simplified version of the goto optimization. See Python/ceval.c 2.374 |
|||
msg45065 | Author: Raymond Hettinger (rhettinger) * | Date: 2004-03-07 09:13 | |
Neal, assigning back to you in case you want to pursue the two new opcodes. |
|||
msg45066 | Author: Neal Norwitz (nnorwitz) * | Date: 2004-10-21 03:25 | |
No reason to take this further. |
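
The suggestion in msg45063 to use custom benchmarks for targeted optimizations, rather than broad ones such as pystone, could be sketched roughly as follows. This is only an illustration and not part of the patch: the helper names (`unpack`, `call_no_args`, `call_one_arg`, `best_of`, `run_benchmarks`) and the repeat counts are assumptions made here. It times tuple versus list unpacking and calls with 0 or 1 arguments, one case at a time. The lambda wrappers add their own call overhead, so only comparisons between similar cases are meaningful.

```python
import timeit


def unpack(seq):
    # Three-target unpacking; the thread refers to this opcode as UNPACK_SEQUENCE.
    a, b, c = seq
    return a


def call_no_args():
    return None


def call_one_arg(x):
    return x


def best_of(func, repeat=5, number=100_000):
    """Return the fastest of `repeat` runs, each timing `number` calls."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


def run_benchmarks():
    """Time each targeted case separately so its effect can be isolated."""
    as_tuple = (1, 2, 3)
    as_list = [1, 2, 3]
    return {
        "unpack tuple": best_of(lambda: unpack(as_tuple)),
        "unpack list": best_of(lambda: unpack(as_list)),
        "call, 0 args": best_of(call_no_args),
        "call, 1 arg": best_of(lambda: call_one_arg(1)),
    }


if __name__ == "__main__":
    for name, seconds in run_benchmarks().items():
        print(f"{name:>12}: {seconds:.4f} s")
```

Running the same script against an interpreter built with and without a given change gives one number per targeted case, which is easier to interpret than a single aggregate score. Taking the minimum over several repeats is a common way to reduce noise from other processes.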
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:01 | admin | set | github: 39722 |
2003-12-21 18:30:29 | nnorwitz | create |