classification
Title: Handle generator (and coroutine) state in the bytecode.
Type: performance Stage: patch review
Components: Versions: Python 3.11, Python 3.10
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Mark.Shannon Nosy List: Dennis Sweeney, Mark.Shannon, dpgeorge, pablogsal
Priority: release blocker Keywords: patch

Created on 2021-03-31 16:45 by Mark.Shannon, last changed 2021-12-08 11:04 by pablogsal.

Pull Requests
URL Status Linked Edit
PR 25137 closed Mark.Shannon, 2021-04-01 11:27
PR 25138 merged Mark.Shannon, 2021-04-01 15:23
PR 25224 merged Mark.Shannon, 2021-04-06 17:21
PR 25225 merged Dennis Sweeney, 2021-04-06 17:22
Messages (12)
msg389919 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-03-31 16:45
Every time we send, or throw, to a generator, the C code in genobject.c needs to check what state the generator is in. 
This is inefficient and couples the generator code, which should just be a thin wrapper around the interpreter, to the internals of the interpreter.

The state of the generator is known to the compiler. It should emit appropriate bytecodes to handle the different behavior for the different states.

While the main reason this is robustness and maintainability, removing the complex C code between Python caller and Python callee also opens up the possibility of some worthwhile optimizations.

There are three changes I want to make:

1. Add a new bytecode to handle starting a generator. This `GEN_START` bytecode would pop TOS, raising an exception if it is not None.
This adds some overhead for the first call to iter()/send() but speeds up all the others.

2. Handle the case of exhausted generators. This is a bit more fiddly, and involves putting an infinite loop at the end of the generator. Something like:

   CLEAR_FRAME
label:
   GEN_RETURN (Like RETURN_VALUE None, but does not discard the frame)
   JUMP label


This removes a lot of special case code for corner cases of exhausted generators and coroutines.

3. Handle throw() on `YIELD_FROM`. The problem here is that we need to differentiate between exceptions triggered by throw, which must call throw() on sub-generators, and exceptions propagating out of sub-generators which should be passed up the stack. By splitting the opcode into two (or more), it is clear which case is being handled in the interpreter without complicated logic in genobject.c
msg390305 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-04-06 10:49
New changeset b37181e69209746adc2119c471599a1ea5faa6c8 by Mark Shannon in branch 'master':
bpo-43683: Handle generator entry in bytecode (GH-25138)
https://github.com/python/cpython/commit/b37181e69209746adc2119c471599a1ea5faa6c8
msg390354 - (view) Author: Dennis Sweeney (Dennis Sweeney) * (Python triager) Date: 2021-04-06 17:27
Looks like we both opened PRs in the same minute.

The MAGIC constant didn't get updated, but perhaps that can just be included in the Minor Corrections PR.

I'd bet a CI check could be added to check that if the opcodes change then Python/importlib_external.h changes.
msg404564 - (view) Author: Damien George (dpgeorge) * Date: 2021-10-21 03:12
It looks like this change introduced a subtle, and maybe intended (?), behavioural change.

Consider (from MicroPython's test suite):

def f():
    n = 0 
    while True:
        n = yield n + 1 
        print(n)

g = f()
try:
    g.send(1)
except TypeError:
    print("caught")

print(g.send(None))
print(g.send(100))
print(g.send(200))

This used to work prior to commit b37181e69209746adc2119c471599a1ea5faa6c8.  But after that commit it fails on the print(g.send(None)) because the generator is now stopped.
msg405081 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-10-27 10:21
Damien, thanks for catching this.

The change was not intended.

There are two kind of exceptions raised by send.
1. Where a pre-condition is not met, e.g. a generator is already ruuning (caller errors)
2. When the generator/coroutine raises an exception (callee exceptions).

(1) does not change the state of the generator, (2) does.

Sending non-None to a generator that has not started falls into kind 1, IMO. So we should revert the change.
msg405083 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-10-27 11:03
Just to be clear, it is the behavior change that should be reverted, not necessarily the new bytecode.

In fact we should probably push on with (2) and add an exception handler wrapping the whole generator except the GEN_START. That way a GEN_START exception will not clear the generator.
GEN_START might need to do a bit of trickery like setting lasti back to -1.
msg405096 - (view) Author: Damien George (dpgeorge) * Date: 2021-10-27 13:31
Thanks for confirming the bug.

Sending non-None to a not-started generator could arguably be case (2), because that's exactly the semantics introduced by the commit that broke the test case :)

Honestly I don't have a strong opinion on which way this goes.  But I think it would be interesting to know if there was code out there that relied on the original behaviour.
msg407983 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-12-07 23:26
Mark, is something left in this issue?
msg407987 - (view) Author: Dennis Sweeney (Dennis Sweeney) * (Python triager) Date: 2021-12-07 23:43
bpo-46009 about the same behavior change
msg407988 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-12-07 23:48
Unfortunately, this has not been fixed into 3.10.1 and 3.11.0a3 as this hasn't version information and therefore has missed our automatic checks for blockers.

I have marked https://bugs.python.org/issue46009? as release blocker, as well as this issue and I have marked the versions.

Please, ensure this is fixed ASAP so it doesn't go into the next bugfix release or alpha release.
msg408006 - (view) Author: Mark Shannon (Mark.Shannon) * (Python committer) Date: 2021-12-08 10:19
Sorry about the delay; this dropped off my list. I'll fix it shortly.

This is quite an obscure bug, so I'm not sure that it is worth blocking the release for. But I'm not the release manager :)
msg408009 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-12-08 11:04
> This is quite an obscure bug, so I'm not sure that it is worth blocking the release for. But I'm not the release manager :)

Well, it certainly didn't block 3.10.1 or 3.11.0a3 ;)
History
Date User Action Args
2021-12-08 11:04:03pablogsalsetmessages: + msg408009
2021-12-08 10:19:38Mark.Shannonsetmessages: + msg408006
2021-12-07 23:48:41pablogsalsetmessages: + msg407988
versions: + Python 3.10, Python 3.11
2021-12-07 23:43:39Dennis Sweeneysetmessages: + msg407987
2021-12-07 23:26:09pablogsalsetnosy: + pablogsal
messages: + msg407983
2021-10-27 13:31:05dpgeorgesetmessages: + msg405096
2021-10-27 11:03:05Mark.Shannonsetmessages: + msg405083
2021-10-27 10:22:02Mark.Shannonsetpriority: normal -> release blocker
2021-10-27 10:21:53Mark.Shannonsetmessages: + msg405081
2021-10-21 03:12:43dpgeorgesetnosy: + dpgeorge
messages: + msg404564
2021-04-06 17:27:03Dennis Sweeneysetmessages: + msg390354
2021-04-06 17:22:50Dennis Sweeneysetnosy: + Dennis Sweeney
pull_requests: + pull_request23962
2021-04-06 17:21:09Mark.Shannonsetpull_requests: + pull_request23961
2021-04-06 10:49:03Mark.Shannonsetmessages: + msg390305
2021-04-01 15:23:45Mark.Shannonsetpull_requests: + pull_request23885
2021-04-01 11:27:57Mark.Shannonsetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request23884
2021-03-31 16:50:06Mark.Shannonsetassignee: Mark.Shannon
type: performance
stage: needs patch
2021-03-31 16:45:03Mark.Shannoncreate