This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Never have GET_ITER not followed by FOR_ITER
Type: performance Stage: patch review
Components: Interpreter Core Versions: Python 3.6
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: Demur Rumed, dino.viehland, rhettinger, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2016-05-26 01:44 by Demur Rumed, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
gitfit.patch Demur Rumed, 2016-05-26 01:44 review
forbegin.patch Demur Rumed, 2016-05-29 18:24 Replace GET_ITER/FOR_ITER with FOR_BEGIN/FOR_ITER review
forbegin2.patch Demur Rumed, 2016-06-07 04:06
forbegin3.patch Demur Rumed, 2016-06-08 03:38
gitfit2.patch Demur Rumed, 2016-06-08 05:26 Pass test_genexps & test_dis review
Messages (19)
msg266400 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016-05-26 01:44
This is a small change to comprehensions pass in their iterable rather than calling GET_ITER before CALL_FUNCTION. This makes it so that the compiler never generates GET_ITER without following it with FOR_ITER, nor does it generate FOR_ITER without preceding it by GET_ITER

This is the standalone portion of a more constructive patch I'm wanting to submit: Replace GET_ITER with a FOR_BEGIN which effectively acts as GET_ITER/FOR_ITER. Then modify FOR_ITER to replace the JUMP_ABSOLUTE by changing it to a JABS instruction which jumps if the iterator does not return null. This reduces the bytecode of by 2 bytes for every for loop & reduces the dispatch count per loop by 1 (As we merge GET_ITER/FOR_ITER & JUMP_ABSOLUTE/FOR_ITER)
msg266453 - (view) Author: Dino Viehland (dino.viehland) * (Python committer) Date: 2016-05-26 20:31
I like how this makes the loop opcodes more regular - it should make things like Pyjion (http://www.github.com/Microsoft/Pyjion) which translate bytecode into native code.  We already have some code (currently commented out) checking that GET_ITER is followed by FOR_ITER to do some optimizations around loops.  

Another issue we ran into was that this leaves the iterable on the stack while the loop is running...  I'm just curious if your larger change alter that in anyway?
msg266609 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016-05-29 18:24
Attached is the larger potential change of replacing GET_ITER/FOR_ITER with FOR_BEGIN/FOR_ITER. I took awhile to getting this put together due to missing that CONTINUE_LOOP was jumping to the top of the loop, skipping past FOR_ITER
msg266615 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-29 19:17
Interesting.

How it works with continue?

    for x in a:
        if x: continue
        f()

In this case we can use new FOR_ITER instead of JUMP_ABSOLUTE. I would rename FOR_ITER to something more appropriate (maybe containing NEXT or CONTINUE).

How it works with continue in the try block?

    for x in a:
        try:
            if x: continue
            f()
        except Error:
            e()
msg266617 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016-05-29 19:31
Currently it'll work since in an except it'll generate a CONTINUE_LOOP that'll jump to the end of the loop & either jump back to the start or to the end

Your example is incorrect. If the continue's JUMP_ABS were a FOR_ITER then if we were on the last iteration of the loop we would end up executing the loop body with an invalid stack. So you'd have to follow the FOR_ITER with a JUMP_ABS past the loop. Not sure if there's a speed advantage worth complexity/larger wordcode
msg266618 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-29 19:37
Ah, sorry. You are right.

Do you have any microbenchmarks that shows the benefit?
msg266622 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016-05-29 19:59
Pretty small perf increase:

timeit("for a in range(256):pass", number=100000)
Without patch: ~2.1
With patch: ~1.84

pybench is 1-2% faster. Would prefer others' results. Especially a benchmark where it doesn't wrap payloads in for loops
msg266624 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016-05-29 20:03
Microbenchmark of continue regression: timeit("for a in range(256):\n\tif a:continue",number=100000)
Without patch: ~3.57
With patch: ~3.64
msg266631 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016-05-29 20:29
Summer heat is getting to me. Please ignore this issue until I've uploaded a fixed patch
msg266657 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016-05-30 01:13
Two issues:

One: `a = (i for i in 6)` does not throw until calling next(a). This applies to the first patch. If it's unacceptable to defer the throw then this whole issue should be closed unless there's interest in having a GET_ITER, FOR_ITER, FOR_BEGIN, FOR_NEXT set of opcodes between comprehensions & for loops. That seems excessive

Two: tracing is getting a bit confused by FOR_ITER being at the end of the loop body. The clearest example is that in test_pdb's test_pdb_until_command_generator it needs 3 steps instead of 2 to trace `print(i)` after the 'until 4' because whereas currently it traces the `for i in test_gen()` before calling into the iterator, it now calls into the iterator first & traces `for i in test_gen()` only after returning. Additionally it doesn't trace `for i in test_gen()` at the end of iteration, instead tracing the last line

It seems fragile to try fix this up by having the FOR_ITER tail receive a negative lnotab line offset to place it on the same line as the loop heading. My own research of the code base has been trying to determine if I can manually emit a trace in the FOR_ITER opcode, but I'd rather not muck with the frame to carry out this façade
msg267468 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-06-05 19:53
It looks to me this idea is dead.
msg267476 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016-06-05 20:58
I've gotten most tests to past by having FOR_ITER be traced as if the instruction index is that of the corresponding FOR_BEGIN. test_sys_settrace still fails on test_15_loops because an empty loop body doesn't have the 'pass' line traced (ie when FOR_ITER starts the line) which I'm currently pondering ways around

The first patch, which only moved GET_ITER into the closure, would still be good for list/set/dict comprehensions (to help PREDICT & JITs)

If there's essentially a decision that all loops should have JUMP_ABSOLUTE to their beginning for the sake of tracing simplicity, then FOR_BEGIN/FOR_ITER are dead
msg267583 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016-06-07 03:55
Attached is a patch which fixes test_sys_settrace & test_pdb & test_trace. Still failing is test_genexps. I'm unsure the best way to fix this one:

1. Allow generator expressions to defer type errors about non iterables until the initial call to next

2. Create a TEST_ITER opcode entirely for generator expressions to eagerly throw

3. Generate instructions like
FOR_BEGIN, if empty, jump over generator & put an empty generator on the stack
-> CALL_FUNCTION 2, have generator produce code as LOAD_FAST 0, LOAD_FAST 1, <code>, FOR_ITER repeat <code> (ie translate stack of FOR_BEGIN from before call as header of generator)

Empty generator can be created with 'LOAD_CONST (None,), FOR_BEGIN, POP' but it seems rather bad to have generators require 3 more opcodes around their creation than currently
msg267584 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-06-07 04:13
What if generate following code?

    GET_ITER
    JUMP_FORWARD L2
L1:
    # ...
L2:
    FOR_ITER L1

This saves one jump instruction per iteration.

Next, we can merge GET_ITER+JUMP_FORWARD in one FOR_BEGIN instruction for regular loops. Generators will start with JUMP_FORWARD.
msg267722 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016-06-07 18:48
I don't think this should go forward.  The current FOR_ITER and JUMP_ABSOLUTE combination is very efficient and shouldn't be changed lightly.  It is the inside of the loop that matters -- the GET_ITER step is outside of the loop and isn't expensive.

Also, I really like that GET_ITER and FOR_ITER correspond exactly to our high level understanding of how for-loops operate:

   for x in s:
       f(x)

corresponds to:

   it = iter(s)
   while True:
       try:
           x = next(it)
       except StopIteration:
           break
       f(x)

I really don't think this should be pursued further.  It messes with my understanding of for-loops, it combines features that should remain decoupled, it focuses its efforts on the exterior of the loop, and it alters code that is very mature (having been examined and tweaked by hundreds of your predecessors).
msg267768 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016-06-08 03:38
Attaching forbegin3.patch. It reintroduces GET_ITER for the sole purpose of eagerly throwing. I decided to reuse GET_ITER over something like TEST_ITER as this way we can have GET_ITER flow into FOR_BEGIN & rely on the fast path of iter(iter(x))

GET_ITER/JUMP_FORWARD idea doesn't work because FOR_ITER is carefully setup currently to trace as existing on 2 separate lines. If we JUMP_FORWARD into FOR_ITER then that tracing triggers & our trace will say we executed the last line of the loop immediately before executing the iteration logic
msg267769 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016-06-08 03:42
Didn't see Raymond's response before posting, forbegin3 at least exists as a completion of the experiment to a passes-tests state. The tracing hacks to support an instruction corresponding to two separate lines support rejecting this idea
msg267787 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016-06-08 05:26
I should've kept gitfit & forbegin more separated as issues. Attached is gitfit2, which only folds the GET_ITER into the comprehension if it isn't a generator so to pass test_genexps
msg267789 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-06-08 05:36
As Dino noted in msg266453, this leaves the iterable on the stack while the loop is running. I think opcode reworking shouldn't change behavior. You should call GET_ITER or FOR_BEGIN before calling generator code.
History
Date User Action Args
2022-04-11 14:58:31adminsetgithub: 71314
2016-06-08 15:59:56Demur Rumedsetstatus: open -> closed
resolution: rejected
2016-06-08 05:36:06serhiy.storchakasetmessages: + msg267789
2016-06-08 05:26:14Demur Rumedsetfiles: + gitfit2.patch

messages: + msg267787
2016-06-08 03:42:18Demur Rumedsetmessages: + msg267769
2016-06-08 03:38:18Demur Rumedsetfiles: + forbegin3.patch

messages: + msg267768
2016-06-07 18:48:08rhettingersetnosy: + rhettinger
messages: + msg267722
2016-06-07 04:13:06serhiy.storchakasetmessages: + msg267584
2016-06-07 04:06:28Demur Rumedsetfiles: + forbegin2.patch
2016-06-07 04:05:30Demur Rumedsetfiles: - forbegin2.patch
2016-06-07 03:55:39Demur Rumedsetfiles: + forbegin2.patch

messages: + msg267583
2016-06-05 20:58:43Demur Rumedsetmessages: + msg267476
2016-06-05 19:53:25serhiy.storchakasetmessages: + msg267468
2016-05-30 01:13:31Demur Rumedsetmessages: + msg266657
2016-05-29 20:29:22Demur Rumedsetmessages: + msg266631
2016-05-29 20:03:32Demur Rumedsetmessages: + msg266624
2016-05-29 19:59:10Demur Rumedsetmessages: + msg266622
2016-05-29 19:37:33serhiy.storchakasetmessages: + msg266618
stage: patch review
2016-05-29 19:31:37Demur Rumedsetmessages: + msg266617
2016-05-29 19:17:41serhiy.storchakasetnosy: + serhiy.storchaka
messages: + msg266615
2016-05-29 18:24:31Demur Rumedsetfiles: + forbegin.patch

messages: + msg266609
2016-05-26 20:31:41dino.viehlandsetnosy: + dino.viehland
messages: + msg266453
2016-05-26 01:44:24Demur Rumedcreate