Title: Handle generator (and coroutine) state in the bytecode.
Type: performance Stage: patch review
Components: Versions:
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Mark.Shannon Nosy List: Dennis Sweeney, Mark.Shannon
Priority: normal Keywords: patch

Created on 2021-03-31 16:45 by Mark.Shannon, last changed 2021-04-06 17:27 by Dennis Sweeney.

Pull Requests
URL Status Linked
PR 25137 closed Mark.Shannon, 2021-04-01 11:27
PR 25138 merged Mark.Shannon, 2021-04-01 15:23
PR 25224 merged Mark.Shannon, 2021-04-06 17:21
PR 25225 merged Dennis Sweeney, 2021-04-06 17:22
Messages (3)
msg389919 - Author: Mark Shannon (Mark.Shannon) (Python committer) Date: 2021-03-31 16:45
Every time we send, or throw, to a generator, the C code in genobject.c needs to check what state the generator is in. 
This is inefficient and couples the generator code, which should just be a thin wrapper around the interpreter, to the internals of the interpreter.
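
For reference, the states being checked can be observed from Python with `inspect.getgeneratorstate`. The sketch below is only an illustration of those states, not of the C code in genobject.c; `counter` and `observed_states` are hypothetical names introduced for the example. A fourth state, `GEN_RUNNING`, applies while the generator body is executing and is not shown here.

```python
import inspect


def counter():
    """A tiny generator used only to step through its life cycle."""
    yield 1
    yield 2


def observed_states():
    """Return the generator state seen at each point in its life."""
    states = []
    gen = counter()
    states.append(inspect.getgeneratorstate(gen))  # created, not yet started
    next(gen)
    states.append(inspect.getgeneratorstate(gen))  # paused at a yield
    for _ in gen:
        pass  # drain the remaining values
    states.append(inspect.getgeneratorstate(gen))  # ran to completion
    return states


print(observed_states())
# ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_CLOSED']
```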

The state of the generator is known to the compiler. It should emit appropriate bytecodes to handle the different behavior for the different states.

While the main reason for this is robustness and maintainability, removing the complex C code between Python caller and Python callee also opens up the possibility of some worthwhile optimizations.

There are three changes I want to make:

1. Add a new bytecode to handle starting a generator. This `GEN_START` bytecode would pop TOS, raising an exception if it is not None.
This adds some overhead for the first call to iter()/send() but speeds up all the others.

2. Handle the case of exhausted generators. This is a bit more fiddly, and involves putting an infinite loop at the end of the generator. Something like:

   label:
   GEN_RETURN (like RETURN_VALUE None, but does not discard the frame)
   JUMP label

This removes a lot of special case code for corner cases of exhausted generators and coroutines.

3. Handle throw() on `YIELD_FROM`. The problem here is that we need to differentiate between exceptions triggered by throw(), which must call throw() on sub-generators, and exceptions propagating out of sub-generators, which should be passed up the stack. By splitting the opcode into two (or more), it is clear which case is being handled in the interpreter, without complicated logic in genobject.c.
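
As a rough illustration of the three cases above, this sketch shows the Python-level behavior that any bytecode-based handling would presumably need to preserve. It does not show the proposed opcodes themselves; `inner`, `outer` and the three helper functions are hypothetical names used only for this example.

```python
def inner():
    """Sub-generator that recovers from a ValueError thrown into it."""
    try:
        yield "inner ready"
    except ValueError:
        yield "inner caught ValueError"


def outer():
    """Delegates to inner() with `yield from`."""
    yield from inner()


def start_with_value():
    """Case 1: a just-started generator rejects a non-None send()."""
    gen = inner()
    try:
        gen.send("too early")
    except TypeError:
        return "TypeError"
    return "no error"


def resume_exhausted():
    """Case 2: an exhausted generator raises StopIteration on every resume."""
    gen = inner()
    list(gen)  # run the generator to completion
    outcomes = []
    for _ in range(2):
        try:
            next(gen)
        except StopIteration:
            outcomes.append("StopIteration")
    return outcomes


def throw_through_yield_from():
    """Case 3: throw() on the outer generator reaches the sub-generator."""
    gen = outer()
    next(gen)  # now suspended inside `yield from inner()`
    return gen.throw(ValueError("injected"))


print(start_with_value())
print(resume_exhausted())
print(throw_through_yield_from())
```

In the third case the exception is raised inside `inner()`, which handles it and yields again, so the caller of `throw()` receives a value rather than an exception.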
msg390305 - Author: Mark Shannon (Mark.Shannon) (Python committer) Date: 2021-04-06 10:49
New changeset b37181e69209746adc2119c471599a1ea5faa6c8 by Mark Shannon in branch 'master':
bpo-43683: Handle generator entry in bytecode (GH-25138)
msg390354 - Author: Dennis Sweeney (Dennis Sweeney) Date: 2021-04-06 17:27
Looks like we both opened PRs in the same minute.

The MAGIC constant didn't get updated, but perhaps that can just be included in the Minor Corrections PR.

I'd bet a CI check could be added to check that if the opcodes change then Python/importlib_external.h changes.
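
A minimal sketch of what such a check might look like, assuming CI can supply the list of files changed by a PR. The function name and the set of "opcode" paths are assumptions made for illustration, not an existing CPython tool.

```python
# Assumed set of files whose modification signals an opcode change.
OPCODE_FILES = {"Lib/opcode.py", "Include/opcode.h"}

# The generated file mentioned above, which should change alongside them.
FROZEN_IMPORTLIB = "Python/importlib_external.h"


def missing_importlib_update(changed_files):
    """Return True if opcode files changed but the frozen importlib did not."""
    changed = set(changed_files)
    touches_opcodes = bool(changed & OPCODE_FILES)
    return touches_opcodes and FROZEN_IMPORTLIB not in changed


# Example: a PR that edits the opcode table but forgets the regenerated header.
print(missing_importlib_update(["Lib/opcode.py", "Python/ceval.c"]))
```

A real check would probably also need to tolerate opcode-file edits that do not alter the bytecode format, which this sketch does not try to detect.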
Date | User | Action | Args
2021-04-06 17:27:03 | Dennis Sweeney | set | messages: + msg390354
2021-04-06 17:22:50 | Dennis Sweeney | set | nosy: + Dennis Sweeney; pull_requests: + pull_request23962
2021-04-06 17:21:09 | Mark.Shannon | set | pull_requests: + pull_request23961
2021-04-06 10:49:03 | Mark.Shannon | set | messages: + msg390305
2021-04-01 15:23:45 | Mark.Shannon | set | pull_requests: + pull_request23885
2021-04-01 11:27:57 | Mark.Shannon | set | keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request23884
2021-03-31 16:50:06 | Mark.Shannon | set | assignee: Mark.Shannon; type: performance; stage: needs patch
2021-03-31 16:45:03 | Mark.Shannon | create |