Message 389919 - Python tracker

Author	Mark.Shannon
Recipients	Mark.Shannon
Date	2021-03-31.16:45:03
Message-id	<1617209103.34.0.926095267338.issue43683@roundup.psfhosted.org>
In-reply-to

Content

Every time we send, or throw, to a generator, the C code in genobject.c needs to check what state the generator is in.
This is inefficient and couples the generator code, which should just be a thin wrapper around the interpreter, to the internals of the interpreter.
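
To make the states concrete, here is a small sketch of how they can be observed from Python with inspect.getgeneratorstate(). The generator `counter` is a made-up example, not taken from CPython; the point is only that every send() has to behave differently depending on which of these states the generator is in.

```python
import inspect

def counter():
    # Hypothetical example generator, used only for illustration.
    received = yield 1
    yield received

gen = counter()
states = [inspect.getgeneratorstate(gen)]       # created, not yet started

first = next(gen)                               # runs up to the first yield
states.append(inspect.getgeneratorstate(gen))   # paused at a yield

echoed = gen.send("x")                          # resumes, yields the sent value
try:
    next(gen)                                   # runs off the end of the body
except StopIteration:
    pass
states.append(inspect.getgeneratorstate(gen))   # finished

print(states)
```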

The state of the generator is known to the compiler. It should emit appropriate bytecodes to handle the different behavior for the different states.

While the main reason for this is robustness and maintainability, removing the complex C code between Python caller and Python callee also opens up the possibility of some worthwhile optimizations.

There are three changes I want to make:

1. Add a new bytecode to handle starting a generator. This `GEN_START` bytecode would pop TOS, raising an exception if it is not None.
This adds some overhead for the first call to next()/send() but speeds up all the others.
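
As a rough illustration of the check such a bytecode would perform, the sketch below models it in Python and compares it with what a generator already does when a non-None value is sent before it has started. The function `gen_start` and the list used as a stack are assumptions made for this example; they are not the interpreter's actual implementation.

```python
def gen_start(stack):
    # Hypothetical model of the proposed GEN_START: pop the top of the
    # stack (the value passed to the first send) and reject non-None.
    sent = stack.pop()
    if sent is not None:
        raise TypeError("can't send non-None value to a just-started generator")

def echo():
    # Hypothetical example generator.
    received = yield "ready"
    yield received

# Observable behaviour of a real generator.
g = echo()
try:
    g.send(42)                  # not started yet, so a value is rejected
    real_error = None
except TypeError as exc:
    real_error = exc

first = g.send(None)            # starting with None is fine

# The toy model makes the same distinction.
gen_start([None])               # accepted silently
try:
    gen_start([42])
    model_error = None
except TypeError as exc:
    model_error = exc
```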

2. Handle the case of exhausted generators. This is a bit more fiddly, and involves putting an infinite loop at the end of the generator. Something like:

CLEAR_FRAME
label:
GEN_RETURN (Like RETURN_VALUE None, but does not discard the frame)
JUMP label

This removes a lot of special case code for corner cases of exhausted generators and coroutines.
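
The corner cases in question are visible from Python. The sketch below (with made-up names `once` and `answer`) shows that an exhausted generator raises StopIteration on every further resume, which is the behaviour the loop above would reproduce in bytecode, while an exhausted coroutine raises RuntimeError instead.

```python
def once():
    # Hypothetical one-shot generator used only for illustration.
    yield "only value"

gen = once()
outcomes = []
for _ in range(4):
    try:
        outcomes.append(next(gen))
    except StopIteration as stop:
        # Every resume after exhaustion ends up here.
        outcomes.append(("StopIteration", stop.value))

async def answer():
    # Hypothetical coroutine that finishes immediately.
    return 42

coro = answer()
try:
    coro.send(None)             # runs to completion
except StopIteration as stop:
    coro_result = stop.value

try:
    coro.send(None)             # resuming a finished coroutine
    reuse_error = None
except RuntimeError as exc:
    reuse_error = type(exc).__name__
```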

3. Handle throw() on `YIELD_FROM`. The problem here is that we need to differentiate between exceptions triggered by throw(), which must call throw() on sub-generators, and exceptions propagating out of sub-generators, which should be passed up the stack. By splitting the opcode into two (or more), it becomes clear which case is being handled in the interpreter, without complicated logic in genobject.c.
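
The two cases can be told apart at the Python level. In the sketch below (the generators `inner` and `outer` are made up for this example), the KeyError passed to throw() is forwarded to the sub-generator, whereas the ValueError raised inside the sub-generator propagates up to the delegating generator.

```python
def inner():
    try:
        yield "inner: first"
    except KeyError:
        # Reached only because throw() was forwarded to this sub-generator.
        yield "inner: handled thrown KeyError"
    # An exception that originates inside the sub-generator.
    raise ValueError("raised inside inner")

def outer():
    try:
        yield from inner()
    except ValueError as exc:
        # Reached because the exception propagated out of inner().
        yield f"outer: caught {exc}"

g = outer()
a = next(g)                      # delegated to inner()
b = g.throw(KeyError("thrown"))  # case 1: throw() goes down to inner()
c = next(g)                      # case 2: inner() raises, outer() handles it
print(a, b, c, sep="\n")
```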

History
Date	User	Action	Args
2021-03-31 16:45:03	Mark.Shannon	set	recipients: + Mark.Shannon
2021-03-31 16:45:03	Mark.Shannon	set	messageid: <1617209103.34.0.926095267338.issue43683@roundup.psfhosted.org>
2021-03-31 16:45:03	Mark.Shannon	link	issue43683 messages
2021-03-31 16:45:03	Mark.Shannon	create