classification
Title: Define a general "asynchronous operation introspection" protocol
Type: enhancement Stage: needs patch
Components: Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: ncoghlan, syncosmic, yselivanov
Priority: normal Keywords:

Created on 2017-08-18 02:25 by ncoghlan, last changed 2017-08-19 16:47 by yselivanov.

Messages (10)
msg300468 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-08-18 02:25
As covered in issue 31183, correctly disassembling functions, generator-iterators, coroutines, and async generator-iterators currently requires checking for 4 different attributes:

* __code__ (functions)
* gi_code (generator iterators)
* cr_code (coroutines)
* ag_code (async generator iterators)

This could be simplified if the latter three all *also* exposed their code attributes as `__code__`.
msg300471 - (view) Author: syncosmic (syncosmic) * Date: 2017-08-18 02:52
Some possibly helpful background (adapted from a discussion in PR 3077):

It looks as though `gi_code` was added to generators in bpo-1473257. At this time, function bytecode was still stored in `f.func_code`, so `gi_code` was a clear analogy.

My best guess is that `gi_code` was not chosen to deliberately create a namespace difference between functions and generators, but just because it wouldn't have made sense to add a `func_code` attribute to generators. In other words, if functions had had `code_object` instead of `func_code` at the time, bpo-1473257 might well have proposed `code_object` for generators. This doesn't, of course, mean that there might not be some post hoc reason to distinguish this part of the namespace now.

Then `func_code` was changed in 3.0 to `__code__` as part of the `f.func_X` to `f.__X__` renaming. The 3.0 whatsnew explained that the purpose of this change was to "free up [the f.func_X] names in the function attribute namespace for user-defined attributes". But this wasn't done for the analogous code object in generators. On a quick look, I didn't find any discussion of this at the time, but that doesn't mean there wasn't any.
msg300472 - (view) Author: syncosmic (syncosmic) * Date: 2017-08-18 03:00
A related issue (since this issue is contemplating restructuring these objects anyway):

Other than `??_code`, none of the `f.func_X` attributes which were eliminated in 3.0 have direct equivalents in generator-iterators, coroutines, or async generator-iterators.

However, all three of those have `[gi|cr|ag]_running` and `[gi|cr|ag]_frame` attributes. Generator-iterators also have `gi_yieldfrom`, and coroutines and async generator-iterators have `[cr|ag]_await`.

On a clean slate (with no attention paid to issues around breaking changes or how these attributes are already used in existing code), is there an argument for dundering some or all of these along with `__code__`? Or is special casing a better pattern when dealing with these other attributes?
msg300475 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-08-18 06:57
I think there's a strong case for a generic __frame__ attribute, since there are plenty of useful things you can do given "object with a linked frame" as a protocol (like extract the current locals namespace). We may even want to include traceback objects in that particular attribute access consolidation (since they have a tb_frame attribute).

Looking at the way inspect.getgeneratorstate() and inspect.getcoroutinestate() are implemented, I also think there may be a case for consolidating `[gi|cr|ag]_running` into a single `__running__` attribute, such that for objects that provide a `__frame__` attribute:

* `__running__` doesn't exist -> not an asynchronous operation
* `__running__ = True` -> running asynchronous operation
* `__running__ = False` -> paused or halted asynchronous operation

Then, rather than adding an inspect API specifically for async generators, we could add a general "inspect.getasyncstate()" one that covered all the object types, with the possible states:

    ASYNC_RUNNING: __running__ == True, assert __frame__ is not None
    ASYNC_CREATED: __running__ == False, __frame__.f_lasti == -1
    ASYNC_CLOSED: __running__ == False, __frame__ is None
    ASYNC_SUSPENDED: __running__ == False, neither closed nor created

Properly interpreting `gi_yieldfrom` and `[cr|ag]_await` is a bit trickier, since they require additional knowledge of how the control flow for those objects actually work. However, if we run with the "async operation introspection protocol" idea, then a suitably generic name would be `__async_call__`, and we could make it a doubly-linked list for the coroutine and async generator case by setting an `__async_return__` attribute on the target (to say where we expect control to return to when we're done).
msg300476 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-08-18 07:00
Actually, __async_return__ would be applicable for the synchronous generator case as well, since it would link to the generator containing the `yield from` expression.
msg300477 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-08-18 07:14
Slight amendment: `__delegated_to__` and `__returns_to__` might be better names for the doubly-linked list of the async call chain.

"__async_call__" and "__async_return__" both have the problem that they look like imperative commands (since "call" and "return" are typically used as verbs), and collection referring to "yield from" and "await" as "asynchronous delegation" would help avoid potential confusion with regular synchronous function calls.

I've also retitled the issue to cover the broader scope that also addresses the needs of the inspect module, not just the dis module.
msg300517 - (view) Author: syncosmic (syncosmic) * Date: 2017-08-18 16:58
I like where this is heading! Aside from the cleaner patterns for handling these objects, I think it'll make it a little easier for people who are just starting to use asynchronous objects in Python (e.g. me) to grasp what's similar about them.

+1 that `__async_call__` could be confusing, for the general reason you mention, but in particular because it would look exactly analogous to `function.__call__()` (what the French call a "faux ami").

A not-thought-out alternative: what about noun-ing the verbs into `__async_calls__` and `__async_returns__` (or maybe `__async_returns_to__`)?

BTW, I was thinking of taking a quick run at bpo-31197 while this simmers. I don't /think/ that risks being eventually mooted by these changes; if anything, it might be easier to adapt `dis` to this proposal if that refactoring has already been done. LMK if you think that's the wrong order, though.
msg300564 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-08-19 03:56
I agree https://bugs.python.org/issue31197 is orthogonal - refactoring the current logic is useful regardless of what we do at the object attribute layer.

Regarding __delegated_to__/__returns_to__, the reason I like *not* having "async" in the attribute names is that generators are actually used as a synchronous construct (via synchronous for loops and next() calls), even though they technically define an asynchronous operation from the interpreter's point of view.

The reason I like omitting "call" (or "calls") from the name is that even though "yield from" and "await" are both sometimes described as providing syntax for asynchronous calls, that terminology is really only defensible for "await" - PEP 380 describes the "yield from" operation as delegating to a subgenerator, and I think that's a better way of framing the general concept.
msg300580 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-08-19 11:15
I started a local PR at https://github.com/ncoghlan/cpython/pull/1/files to explore what this might look like in practice.

I think that what I've done so far shows that generic __frame__ and __running__ attributes would be sufficient to extend the inspect module's state introspection support to also cover async generators, while also allowing for the soft (i.e. documentation only) deprecation of the generator and coroutine specific variants of those APIs.

Issue 31183 already showed the potential value of a __code__ attribute, since it would allow all of the asynchronous operations to be handled by the same code path that already handles functions.

I'm less sure about __delegated_to__/__returns_to__, since we don't have *any* code in the standard library that reads gi_yieldfrom, and the only code that reads cr_await is a Python 3.5 compatibility hack in asyncio.
msg300587 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-08-19 16:47
> I started a local PR at https://github.com/ncoghlan/cpython/pull/1/files to explore what this might look like in practice.

Looks good to me.


> I'm less sure about __delegated_to__/__returns_to__, since we don't have *any* code in the standard library that reads gi_yieldfrom, and the only code that reads cr_await is a Python 3.5 compatibility hack in asyncio.

I'm -1 on this too.  gi_yieldfrom and cr_await are very special and rarely used things.  Refactoring or unifying them isn't trivial and ultimately not as useful.
History
Date User Action Args
2017-08-19 16:47:44yselivanovsetmessages: + msg300587
2017-08-19 11:15:12ncoghlansetmessages: + msg300580
2017-08-19 03:56:58ncoghlansetmessages: + msg300564
2017-08-18 16:58:21syncosmicsetmessages: + msg300517
2017-08-18 07:14:22ncoghlansetmessages: + msg300477
title: Add __code__ attributes to generator and coroutine objects -> Define a general "asynchronous operation introspection" protocol
2017-08-18 07:00:42ncoghlansetmessages: + msg300476
2017-08-18 06:57:53ncoghlansetnosy: + yselivanov
messages: + msg300475
2017-08-18 03:00:41syncosmicsetmessages: + msg300472
2017-08-18 02:52:32syncosmicsetmessages: + msg300471
2017-08-18 02:41:20syncosmicsetpull_requests: - pull_request3163
2017-08-18 02:41:01syncosmicsetnosy: + syncosmic
2017-08-18 02:37:54syncosmicsetpull_requests: + pull_request3163
2017-08-18 02:25:30ncoghlancreate