classification
Title: Make generator state easier to introspect
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.2
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: ncoghlan Nosy List: Rodolpho.Eckhardt, eric.araujo, gvanrossum, ncoghlan, pitrou, rbp, zbysz
Priority: normal Keywords: easy, patch

Created on 2010-10-28 12:42 by ncoghlan, last changed 2010-11-30 06:36 by ncoghlan. This issue is now closed.

Files
File name Uploaded Description
getgeneratorstate.patch Rodolpho.Eckhardt, 2010-11-20 20:09
getgeneratorstate_v2.patch Rodolpho.Eckhardt, 2010-11-20 21:15
Messages (16)
msg119774 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-10-28 12:42
Generators can be in four different states that may be relevant to framework code making use of them (especially as coroutines). All of this state is currently available from Python code, but it is rather obscure and could be made more readable.

The four states are:

Created:
  "g.gi_frame is not None and g.gi_frame.f_lasti == -1"
  (Frame exists, but no instructions have been executed yet)

Currently executing:
  "g.gi_running"
  (This being true implies other things about the state as well, but this is all you need to check)

Suspended at a yield point:
  "g.gi_frame is not None and g.gi_frame.f_lasti != -1 and not g.gi_running"

Exhausted/closed:
  "g.gi_frame is None"

My API proposal is to add the following to the inspect module:

GEN_CREATED, GEN_RUNNING, GEN_SUSPENDED, GEN_CLOSED = range(4)

def getgeneratorstate(g):
  if g.gi_running:
    return GEN_RUNNING
  if g.gi_frame is None:
    return GEN_CLOSED
  if g.gi_frame.f_lasti == -1:
    return GEN_CREATED
  return GEN_SUSPENDED

(the lack of underscores in the function name follows the general style of the inspect module)
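
As a rough illustration of the proposal, the sketch below observes all four states from a single generator. It assumes the function is available as inspect.getgeneratorstate (the form that eventually shipped in Python 3.2); the names counter, gen and states are made up for this example.

```python
import inspect

states = []  # hypothetical log of the states observed along the way

def counter():
    # Seen from inside the body, the generator is currently executing.
    states.append(inspect.getgeneratorstate(gen))
    yield 1

gen = counter()
states.append(inspect.getgeneratorstate(gen))  # before the first next()
next(gen)                                      # runs the body up to the yield
states.append(inspect.getgeneratorstate(gen))  # parked at the yield
gen.close()
states.append(inspect.getgeneratorstate(gen))  # frame has been discarded
print(states)
```

The list is filled in the order created, running, suspended, closed, which matches the order of the four descriptions above.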
msg119775 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-10-28 12:55
Is it CPython-specific?
Does "currently executing" also include "currently closing"?
msg119776 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-10-28 13:05
On Thu, Oct 28, 2010 at 10:55 PM, Antoine Pitrou <report@bugs.python.org> wrote:
>
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
> Is it CPython-specific?

The states are not CPython-specific (they're logical states of the
underlying generator), but I don't know if other implementations
expose generator and frame details in the same way (all the more
reason to put this in inspect - other implementations can provide the
information without needing to exactly mimic gi_frame and f_lasti).

> Does "currently executing" also include "currently closing"?

"Currently executing" means the frame is being evaluated in a Python
thread (the thread running it may be suspended in a multi-threaded
environment, but the frame itself is in the middle of doing something,
which may include processing a thrown-in GeneratorExit).
msg119795 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2010-10-28 15:04
I could imagine separating the state into two parts:

- a three-valued enum distinguishing created, active, or exhausted

- a bool (only relevant in the active state) whether it is currently running or suspended

The latter is just g.gi_running so we don't need a new API for this. :-)
msg119830 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-10-29 00:26
So something like:

GEN_CREATED, GEN_ACTIVE, GEN_CLOSED = range(3)

def getgeneratorstate(g):
  """Get current state of a generator-iterator.
  
  Possible states are:
    GEN_CREATED: Created, waiting to start execution
    GEN_ACTIVE: Currently being executed (or suspended at yield)
    GEN_CLOSED: Execution has completed

  Use g.gi_running to determine if an active generator is running or
  is suspended at a yield expression.
  """
  if g.gi_frame is None:
    return GEN_CLOSED
  if g.gi_frame.f_lasti == -1:
    return GEN_CREATED
  return GEN_ACTIVE

Having 4 separate states actually makes the documentation a little easier to write:

def getgeneratorstate(g):
  """Get current state of a generator-iterator.
  
  Possible states are:
    GEN_CREATED: Waiting to start execution
    GEN_RUNNING: Currently being executed by the interpreter
    GEN_SUSPENDED: Currently suspended at a yield expression
    GEN_CLOSED: Execution has completed
  """

Checking if the generator is active is then just a matter of checking "gen_state in (GEN_RUNNING, GEN_SUSPENDED)".
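
A minimal sketch of that membership check, assuming the four-state API lives in inspect as proposed; is_active and ticker are hypothetical names used only here.

```python
import inspect

def is_active(gen):
    """Hypothetical helper: True once the generator has started and not yet finished."""
    state = inspect.getgeneratorstate(gen)
    return state in (inspect.GEN_RUNNING, inspect.GEN_SUSPENDED)

def ticker():
    yield "tick"

g = ticker()
before = is_active(g)   # created, not started yet
next(g)
during = is_active(g)   # suspended at the yield
g.close()
after = is_active(g)    # closed
print(before, during, after)
```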
msg119834 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2010-10-29 00:43
I take it back. The 4-value state looks better.

My initial hesitance was that if you ever see GEN_RUNNING you are
probably already in trouble, since you can't call send, next, throw or
even close on a running generator (they all raise ValueError), so why
are you looking at its state at all? But most reasons for looking at
the state are obscure anyway, and from a different perspective it's a
nice state machine. (Note that there's no transition from SUSPENDED to
CLOSED -- you have to go through RUNNING to possibly handle
GeneratorExit.)
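
To illustrate the point about running generators, here is a small sketch in which a generator tries to advance itself while it is executing. The names self_poker, me and message are invented for the example.

```python
def self_poker():
    try:
        # The generator is already executing, so re-entering it is refused.
        next(me)
    except ValueError as exc:
        yield str(exc)

me = self_poker()
message = next(me)  # the ValueError text, handed back through the yield
print(message)
```

The same refusal applies to send, throw and close, which is why a caller that sees the running state has little it can do with the generator at that moment.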
msg119836 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-10-29 01:28
Indeed, the minimal lifecycles are:

GEN_CREATED->GEN_CLOSED (exception thrown in before the generator was even started)
GEN_CREATED->GEN_RUNNING->GEN_CLOSED (initial next() with internal logic that skips all yields)
GEN_CREATED->GEN_RUNNING->GEN_SUSPENDED->GEN_RUNNING->GEN_CLOSED (initial next() that reaches a yield, followed by a throw, next or send that closes it)

Other cases follow the same basic pattern as the last one; they just bounce back and forth between suspended and running more times.

It occurred to me that threads really use the same state machine; it's just that almost nobody writes their own Python thread schedulers, so only _thread and threading care about the suspended/running distinction. There are quite a few different generator schedulers though, so the distinction matters to more 3rd party code than it does for threads.
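
The first two lifecycles can be sketched as follows, assuming inspect.getgeneratorstate as proposed. Only the states visible from outside the generator are recorded, so the intermediate running state does not show up in the lists; maybe_yield, s1 and s2 are hypothetical names.

```python
import inspect

def maybe_yield(flag):
    if flag:
        yield "only reached when flag is true"

# CREATED -> CLOSED: an exception thrown in before the generator was started.
g1 = maybe_yield(True)
s1 = [inspect.getgeneratorstate(g1)]
try:
    g1.throw(RuntimeError("thrown before start"))
except RuntimeError:
    pass  # the exception propagates straight back out to the caller
s1.append(inspect.getgeneratorstate(g1))

# CREATED -> RUNNING -> CLOSED: the body runs but skips every yield.
g2 = maybe_yield(False)
s2 = [inspect.getgeneratorstate(g2)]
try:
    next(g2)
except StopIteration:
    pass  # the body finished without yielding
s2.append(inspect.getgeneratorstate(g2))

print(s1, s2)
```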
msg121749 - (view) Author: Rodolpho Eckhardt (Rodolpho.Eckhardt) Date: 2010-11-20 20:09
This patch adds the getgeneratorstate function and related tests.
msg121771 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-11-20 20:56
Looks good, modulo two nitpicks:

1) Using literals for constants would avoid a lookup and call of range.

2) Please remove four leading spaces in your docstring.
msg121780 - (view) Author: Rodolpho Eckhardt (Rodolpho.Eckhardt) Date: 2010-11-20 21:15
Done, thank you!

Second version attached.
msg121781 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-11-20 21:18
Nice.  Now you can sit back, relax and wait for Nick to commit the patch or make further comments.
msg121844 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-11-21 03:04
I'll actually go with version 1 of the patch as far as the variable initialisation goes. Yes, it is fractionally slower, but you get a maintenance gain from the fact that the enum values are guaranteed to be distinct, and this is immediately obvious to the reader.

When you write the assignments out explicitly, the reader has to actually look at the assigned values to determine whether or not the same value is ever assigned twice.

(No need to post a modified patch - I'll fix it before I check it in)
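
A small sketch of the maintenance argument, not taken from either patch: the LIT_* names and the all_distinct helper are hypothetical, and the duplicated literal is a deliberate mistake for illustration.

```python
# Unpacking from range(): each name receives a different integer by construction.
GEN_CREATED, GEN_RUNNING, GEN_SUSPENDED, GEN_CLOSED = range(4)
from_range = (GEN_CREATED, GEN_RUNNING, GEN_SUSPENDED, GEN_CLOSED)

# Hand-written literals with a copy-paste slip: two names end up sharing 2.
LIT_CREATED = 0
LIT_RUNNING = 1
LIT_SUSPENDED = 2
LIT_CLOSED = 2  # deliberate mistake, easy to miss when skimming
from_literals = (LIT_CREATED, LIT_RUNNING, LIT_SUSPENDED, LIT_CLOSED)

def all_distinct(values):
    """Hypothetical check: True when no value appears twice."""
    return len(set(values)) == len(values)

range_ok = all_distinct(from_range)        # holds without reading any values
literals_ok = all_distinct(from_literals)  # fails because of the duplicate
print(range_ok, literals_ok)
```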
msg121856 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-11-21 03:46
Committed in r86633.

I added the missing docs changes, and tweaked the tests a little bit:
- added a helper method to retrieve the generator state in the test case
- this allowed test_running to be simplified a bit
- added an explicit test for the CREATED->CLOSED case
- renamed the test functions to match the existing idiom in test_inspect
msg122138 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-11-22 15:44
Temporarily reopening to remind me to switch from using integer constants to strings (which are much friendlier for debugging purposes).
msg122146 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2010-11-22 16:39
Yes please.

msg122889 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2010-11-30 06:36
Switched to strings in r86879.
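
With string constants, a logged state describes itself. A brief sketch, assuming the released inspect API where each constant is a string equal to its own name; pending and log_line are made-up names.

```python
import inspect

def pending():
    yield

g = pending()
state = inspect.getgeneratorstate(g)

# With the earlier integer constants this line would have ended in "0".
log_line = "generator state: {}".format(state)
print(log_line)
```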
History
Date | User | Action | Args
2010-11-30 06:36:50 | ncoghlan | set | status: open -> closed; messages: + msg122889
2010-11-22 16:39:24 | gvanrossum | set | messages: + msg122146
2010-11-22 15:44:56 | ncoghlan | set | status: closed -> open; messages: + msg122138
2010-11-21 03:46:30 | ncoghlan | set | status: open -> closed; resolution: accepted; messages: + msg121856; stage: patch review -> resolved
2010-11-21 03:04:00 | ncoghlan | set | messages: + msg121844
2010-11-20 21:18:01 | eric.araujo | set | messages: + msg121781; stage: needs patch -> patch review
2010-11-20 21:15:24 | Rodolpho.Eckhardt | set | files: + getgeneratorstate_v2.patch; messages: + msg121780
2010-11-20 21:14:02 | rbp | set | nosy: + rbp
2010-11-20 20:56:22 | eric.araujo | set | nosy: + eric.araujo; messages: + msg121771
2010-11-20 20:09:45 | Rodolpho.Eckhardt | set | files: + getgeneratorstate.patch; nosy: + Rodolpho.Eckhardt; messages: + msg121749; keywords: + patch
2010-10-30 09:37:46 | zbysz | set | nosy: + zbysz
2010-10-29 01:28:54 | ncoghlan | set | assignee: ncoghlan
2010-10-29 01:28:37 | ncoghlan | set | messages: + msg119836
2010-10-29 00:43:02 | gvanrossum | set | messages: + msg119834
2010-10-29 00:26:03 | ncoghlan | set | messages: + msg119830
2010-10-28 15:04:32 | gvanrossum | set | nosy: + gvanrossum; messages: + msg119795
2010-10-28 13:05:09 | ncoghlan | set | messages: + msg119776
2010-10-28 12:55:12 | pitrou | set | nosy: + pitrou; messages: + msg119775
2010-10-28 12:42:52 | ncoghlan | create |