This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author zhangchaospecial
Recipients zhangchaospecial
Date 2021-06-24.15:36:41
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1624549001.38.0.871680374624.issue44505@roundup.psfhosted.org>
In-reply-to
Content
It is a common scenario to call functions in loop, such as
def foo():
  for i in range(10):
    len("abc")
Currently, we generate bytecode like this;
2           0 SETUP_LOOP              24 (to 26)
              2 LOAD_GLOBAL              0 (range)
              4 LOAD_CONST               1 (10)
              6 CALL_FUNCTION            1
              8 GET_ITER
        >>   10 FOR_ITER                12 (to 24)
             12 STORE_FAST               0 (i)

  3          14 LOAD_GLOBAL              1 (len)
             16 LOAD_CONST               2 ('abc')
             18 CALL_FUNCTION            1
             20 POP_TOP
             22 JUMP_ABSOLUTE           10
        >>   24 POP_BLOCK
        >>   26 LOAD_CONST               0 (None)
             28 RETURN_VALUE
If we can make sure len is a loop invariant, we can generate code as follows. Only one LOAD_GLOBAL len is executed before the loop. LOAD_GLOBAL len is replaced with LOAD_FAST __local__len in the loop.
2           0 LOAD_GLOBAL              0 (range)
              2 LOAD_CONST               1 (10)
              4 CALL_FUNCTION            1
              6 GET_ITER
              8 LOAD_GLOBAL              1 (len)
             10 STORE_FAST               1 (__local__len)
        >>   12 FOR_ITER                 6 (to 26)
             14 STORE_FAST               0 (i)

  3          16 LOAD_FAST                1 (__local__len)
             18 LOAD_CONST               2 ('abc')
             20 CALL_FUNCTION            1
             22 POP_TOP
             24 JUMP_ABSOLUTE            6 (to 12)

  2     >>   26 LOAD_CONST               0 (None)
             28 RETURN_VALU

I prototyped this optimization and gain 4.6% performance improvement on regex_v8 of PyPerformance. Currently, only the global invariants are moved out of the loop. There are some other loop invariants can be moved. Other optimizations such as function inline can provide more opportunities for LICM.
I meet a problem with the following case;
1 def no_pop_blocks():
2    y = 1
3    while not y:
4        bla
5    x = 1

bla is not defined anywhere. With LICM, LOAD_GLOBAL bla will be move out of the loop. Then it executes with an error: bla is not defined.
History
Date User Action Args
2021-06-24 15:36:41zhangchaospecialsetrecipients: + zhangchaospecial
2021-06-24 15:36:41zhangchaospecialsetmessageid: <1624549001.38.0.871680374624.issue44505@roundup.psfhosted.org>
2021-06-24 15:36:41zhangchaospeciallinkissue44505 messages
2021-06-24 15:36:41zhangchaospecialcreate