Issue 44505: loop invariant code motion

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/88671

classification

Title:	loop invariant code motion
Type:	performance	Stage:	resolved
Components:	Interpreter Core	Versions:	Python 3.11

process

Status:	closed	Resolution:	not a bug
Dependencies:		Superseder:
Assigned To:		Nosy List:	BTaskaya, zhangchaospecial
Priority:	normal	Keywords:

Created on 2021-06-24 15:36 by zhangchaospecial, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (2)
msg396496 - (view)	Author: zcpara (zhangchaospecial) *	Date: 2021-06-24 15:36
It is a common scenario to call functions in loop, such as def foo(): for i in range(10): len("abc") Currently, we generate bytecode like this; 2 0 SETUP_LOOP 24 (to 26) 2 LOAD_GLOBAL 0 (range) 4 LOAD_CONST 1 (10) 6 CALL_FUNCTION 1 8 GET_ITER >> 10 FOR_ITER 12 (to 24) 12 STORE_FAST 0 (i) 3 14 LOAD_GLOBAL 1 (len) 16 LOAD_CONST 2 ('abc') 18 CALL_FUNCTION 1 20 POP_TOP 22 JUMP_ABSOLUTE 10 >> 24 POP_BLOCK >> 26 LOAD_CONST 0 (None) 28 RETURN_VALUE If we can make sure len is a loop invariant, we can generate code as follows. Only one LOAD_GLOBAL len is executed before the loop. LOAD_GLOBAL len is replaced with LOAD_FAST __local__len in the loop. 2 0 LOAD_GLOBAL 0 (range) 2 LOAD_CONST 1 (10) 4 CALL_FUNCTION 1 6 GET_ITER 8 LOAD_GLOBAL 1 (len) 10 STORE_FAST 1 (__local__len) >> 12 FOR_ITER 6 (to 26) 14 STORE_FAST 0 (i) 3 16 LOAD_FAST 1 (__local__len) 18 LOAD_CONST 2 ('abc') 20 CALL_FUNCTION 1 22 POP_TOP 24 JUMP_ABSOLUTE 6 (to 12) 2 >> 26 LOAD_CONST 0 (None) 28 RETURN_VALU I prototyped this optimization and gain 4.6% performance improvement on regex_v8 of PyPerformance. Currently, only the global invariants are moved out of the loop. There are some other loop invariants can be moved. Other optimizations such as function inline can provide more opportunities for LICM. I meet a problem with the following case; 1 def no_pop_blocks(): 2 y = 1 3 while not y: 4 bla 5 x = 1 bla is not defined anywhere. With LICM, LOAD_GLOBAL bla will be move out of the loop. Then it executes with an error: bla is not defined.
msg396504 - (view)	Author: Batuhan Taskaya (BTaskaya) *	Date: 2021-06-24 17:00
There are multiple issues with doing 'static' loop invariant code motion in python, but the most obvious one is how would you detect if anything in the body is related to the loop or not? The trivial way that the compiled languages do this optimization is generating some sort of data dependency graph, and seeing whether anything in an expression refers to the loop target though in Python you can't know for sure unless you know every possible type in the loop body and the type of the loop target which is something nearly impossible to infer (due to the dynamic nature of python). You could theoretically do this only if the whole loop only uses simple values: for x in [1, 2, 3, 4, 5]: a = 5 if a > 3: a += 2 continue and in that case, it is better to just optimize the whole loop away but since this kind of code doesn't exist at all in the public, there is no need to make the compiler more complicated. Another option is generating multiple code objects with guards, which would require to do first some sort of partial type inference and generate the code for the possible types. It would require a substantial amount of changes on both the compiler and the interpreter which is something I'd say can be deferred for later if we implement a tracing JIT. With traces, it should be way easier to do this sort of trivial compiler optimization (loop invariants, constant propagation, GVN, etc.) since we would know the types, and possible code paths.

History
Date	User	Action	Args
2022-04-11 14:59:47	admin	set	github: 88671
2021-06-24 17:00:37	BTaskaya	set	status: open -> closed resolution: not a bug stage: resolved
2021-06-24 17:00:04	BTaskaya	set	nosy: + BTaskaya messages: + msg396504
2021-06-24 15:36:41	zhangchaospecial	create