classification
Title: add a Generator ABC
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: rhettinger Nosy List: Ludovic.Gasc, gvanrossum, haypo, lukasz.langa, martin.panter, pitrou, python-dev, rhettinger, scoder, yselivanov
Priority: normal Keywords: patch

Created on 2015-04-20 19:36 by scoder, last changed 2015-05-09 18:38 by gvanrossum. This issue is now closed.

Files
File name Uploaded Description Edit
generator_abc.patch scoder, 2015-04-20 20:01 implementation of a Generator ABC review
generator_abc.patch scoder, 2015-04-21 04:57 updated patch that improves close() implementation and test coverage review
generator_abc.patch scoder, 2015-04-21 06:10 updated patch: check GeneratorExit propagation review
inspect_docs.patch scoder, 2015-04-23 06:37 update inspect docs to state that isinstance() is better than isgenerator() review
generator_abc.patch scoder, 2015-04-27 06:26 updated implementation: make throw() abstract review
Messages (36)
msg241675 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-20 19:36
Currently, CPython tends to assume that generators are always implemented by a Python function that uses the "yield" keyword. However, it is absolutely possible to implement generators as a protocol by adding send(), throw() and close() methods to an iterator.

The problem is that these will usually be rejected by code that needs to distinguish generators from other input, e.g. in asyncio, as this code will commonly do a type check against types.GeneratorType. Instead, it should check for the expected protocol. The best way to achieve this is to extend the existing ABCs with a Generator ABC that external generator implementations can register on.

Related to issue 24004 (asyncio). Asyncio could provide a backport copy of this for older Python versions.

I assume this is considered a new feature that cannot be added to 3.4 anymore?
msg241677 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-04-20 19:40
It's missing tests. :-)

Otherwise looks quite sensible.

Also, shouldn't you override __subclasshook__ so you don't inherit it from Iterator?
msg241681 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-20 19:58
Ok, sure. Here's a new patch that adds tests and docs.
msg241682 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-20 20:01
Sorry, here's another doc fix.
msg241683 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-04-20 20:11
Is it a problem that the check can't be done in a fast way from C code?

Other than that, sounds good to me.
msg241692 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-04-21 02:21
For the other ABCs, if you define the required abstract methods, you get working versions of all the mixin methods.

In the case of the Generator ABC, throw() and close() are useless empty stub methods.  In the other ABCs, we leave optional methods out entirely.  A user should expect that if isinstance(g, Generator) is true that all of the ABC methods will work.

Also, the return StopIteration in throw() doesn't seem correct.  Shouldn't it raise the exception that was thrown?

    >>> def f():
            yield x

            
    >>> g = f()
    >>> g.throw(KeyError)

    Traceback (most recent call last):
      File "<pyshell#11>", line 1, in <module>
        g.throw(KeyError)
      File "<pyshell#9>", line 1, in f
        def f():
    KeyError

Ideally, there should be a practical example of where this ABC would be useful with anything other than a real generator.
msg241700 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-21 04:57
Here's a new patch that addresses the review comments. I kept throw() and close() non-abstract and added an example to the tests instead that implements a minimal generator by inheriting these methods from the Generator base class, using it as a mixin. It only needs to implement send().

I don't think it's unreasonable to assume that there are use cases where a generator that is implemented in a class instead of a yield-function does not require special cleanup actions in its close()/throw(). Or is it *required* that a generator stops working when close() is called?

There are also iterators that raise StopIteration at some point to signal that they are temporarily exhausted, but then eventually restart returning values from their __next__() method when they have received more to return. Avoids recreating the iterator object after each exhaustion cycle.

I extended the default implementation of close() to call throw(GeneratorExit), though, because that's how the protocol is defined. Unless users implement throw(), it won't make a difference for them. And if they do, they may even get away with not reimplementing close().
msg241701 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-21 05:09
> Is it a problem that the check can't be done in a fast way from C code?

C code can still quickly special case the generator type in the positive case, and it will usually be interested in an exact type match anyway. Determining that an object is *not* (compatible with) a generator might lead to some degradation, but in most cases, at least in the interpreter core, this would be an error case anyway, so being slower here should rarely hurt.
msg241702 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-21 05:11
Next question: should inspect.isgenerator() be changed? Or should its usages be replaced by isinstance(obj, Generator) ?
msg241703 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-04-21 05:26
I agree that there is no big reason why we should force generators to stop working after close(). Your new default implementation of close() is probably the right thing too. I added a few new comments on Reitveld.
msg241704 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-21 06:10
Good catch with the RuntimeError. Patch updated.
msg241705 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-21 06:35
> should inspect.isgenerator() be changed?

Replying to myself, one thing that speaks against this approach is that "inspect" also has the functions "getgeneratorlocals()" and "getgeneratorstate()", which depend on "gen.gi_frame". Cython could emulate that, but normal user code usually can't. It's definitely not part of the Generator protocol but an implementation detail of GeneratorType.
msg241716 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-04-21 15:19
Is it an option to leave inspect alone? Its definition and use of generators is very focused on the builtin implementation.

Although this means that code that wants to do the right thing but also be backwards compatible has to use something like
```
def isgen(g):
    if hasattr(collections.abc, 'Generator'):
        return isinstance(c, collections.abc.Generator)
    else:
        return inspect.isgenerator(g)
```
msg241717 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-21 15:44
Yes, code usually doesn't fall from the sky with a new CPython release. If something wants to make use of this ABC, it still has to find ways to also work with older CPythons. What would be a good fallback? A backport on PyPI?

I wouldn't even mind shipping this class with Cython and dropping it right into "collections.abc" if it's missing. Cython already contains lots of compatibility code, and we need to patch *something* in all current CPythons either way. Then your code snippet would be the right thing to do for user code (with the twist that Py2.x doesn't have "collections.abc"...).
msg241723 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-04-21 16:10
Realistically only Cython will care much about this, so I'm okay if Cython just monkeypatches the collections package.
msg241725 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-04-21 16:49
In some distant future, Numba might also care too :)
(right now, we only support compiling basic generators, i.e. no send() and no throw())
msg241736 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-21 19:39
Yes, and there are certainly other compilers and tools. However, it's going to be a short list, so if Py3.5 takes the lead, they can all just agree on the one way to do it and let the first patcher win.
msg241846 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-23 06:32
Adding a patch for the inspect docs that refers to the Right Way To Do It.
msg241956 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-24 16:54
Searching for Python code that seems to implement the Generator protocol doesn't return much:

https://code.openhub.net/search?s=%22def%20throw%28%22%20%22def%20send%28%22%20%22def%20close%28%22&pp=0&fl=Python&mp=1&ml=1&me=1&md=1&ff=1&filterChecked=true

But at least it pointed me to this:

https://hg.python.org/cpython/file/5c0247a6f98a/Lib/multiprocessing/managers.py#l922

So, wrapping generators (apparently for remote calls in this case) seems to be another use case.
msg241993 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-25 06:03
FYI, here's the patch implementation for Cython:

https://github.com/cython/cython/blob/cf63ff71c06b16c3a30facdc7859743f4cd495f6/Cython/Utility/Generator.c#L849

The only difference is that it takes care of changing "__next__" to "next" in Py2.x.
msg242054 - (view) Author: Ludovic Gasc (Ludovic.Gasc) * Date: 2015-04-26 12:58
Sorry guys to be basic for you, but if I take my "AsyncIO end-user" hat, I'm not sure to understand the potential end-user source code impacts to use Cython with Python 3.5 and AsyncIO.

In concrete terms, it's only a low-level change, Cython will monkeypatch CPython if it's missing. I can continue to use asyncio.iscoroutine() function to detect coroutines.
Or it should be better to change something in AsyncIO libs and/or end-user source code ?

With the potential async/await inclusion in Python 3.5, it should be good to know if something else is necessary to help for the Cython support.
msg242071 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-04-26 19:37
In the lastest patch, the close() method is now a valid mixin method.  

However, the throw() method should be made abstract because it doesn't provide the required operation (it doesn't even use the "self" argument) or it should be left out entirely (i.e. Numba only supporting next()) if the presence of throw() is not required.

Going back to Stefan's original use case, the problem being solved is that isinstance(g, types.GeneratorType) rejects regular classes that implement send(), next(), throw(), and close(), presumably because it is going to call those methods and expect that they work.

If an object tests positive for isinstance(g, collections.abc.Generator), what can we then assume about the object.  It would be weird to pass that test, see a throw() method, call it and have it do something other than raise an exception inside the generator.

    g = lazy_readline_from_connection('171.0.0.1')
    if isinstance(g, collections.abc.Generator):
        # We passed the isinstance check, now what does that
        # actually mean?  What is guaranteed to work?
        next(g)
        g.close()    # Expect this to close the connection


If a working throw() isn't actually required, then the code ought to be checking for isinstance(obj, collections.abc.Iterator) or somesuch; otherwise, was is the point of doing any checks for a generator-like object?

I don't think this patch should go in until it is capable of doing something meaningful.  Otherwise, it looks like at attempt to bluff its way past generator checks that were presumably there for a reason.
msg242072 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-04-26 19:40
I think the throw() method should be required.
If you don't need throw(), send() or close(), then you aren't really asking for a full-blown generator: you are asking for an iterator, so you can just check for collections.Iterator.

(PS: why is this bug assigned to Lukasz?)
msg242077 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-26 20:32
PEP 342 isn't really conclusive here as it intended to define the protocol based on the de-facto design of a yield-based generator function. Trying to abstract from that poses the question how a class based generator implementation should look like. Specifically, it is not clear in this case what "raise an exception *inside* the generator" even means.

As Antoine pointed out, there definitely has to be more than an Iterator. However, it's not clear that throw() and close() are always required for an implementation. Simple Generators might get away without special error handling and cleanup.

Similarly, a Generator may not allow (or expect) values to be passed in and only make use of close() for cleanup (and potentially also throw()).

So, there seem to be at least three use cases for the Generator protocol here:

1) an Iterator that can accept input values via send()
2) an Iterator that needs to safely do certain cleanup operations after use (throw/close)
3) a complete Generator that accepts input values and cleans up after it

Given that there used to be only one implementation, I think we can assume that a lot of code depends on the complete protocol. Some recipients will certainly be aware of the exact subset of the protocol that the Generator they have in their hands makes use of, but if they don't, they'll just have to assume it supports everything and requires proper cleanup on errors and on termination.

Therefore, I think it's important to cover the complete protocol in the Generator ABC. I also think it's helpful to not require users to override throw() in a subclass, as they might not need it. "Throwing an exception inside of them" might simply not have a meaning for them. If they do need it, however, then the current implementation might help them to properly raise the exception, given that the 3-argument raise statement is no longer available in Py3.

Both reasons (whether you need throw() or not) make me think that it should not be abstract. The same applies to close(), which has a meaningful implementation now that subclasses can make use of in order to avoid having to implement both close() and throw().

And yes, this is about sneaking past generator type checks because most of them are actually *not* there for a reason. Asyncio is just one example.
msg242087 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-04-27 00:07
> Therefore, I think it's important to cover the complete protocol 
> in the Generator ABC. I also think it's helpful to not require
> users to override throw() in a subclass, as they might not need it.

Sorry, but I think you're fighting the fundament nature of what the ABCs are supposed to do and I object to the patch going in as-is.  

Please either 
1) drop the throw() method entirely or
2) make throw an abstractmethod()

If it is an abstractmethod, the user is free to implement a throw() method that raises a NotImplementedError, but at least they will do so consciously rather than having it come-up expectedly.

When I teach engineers how to use the collections ABCs, they rely on the rules:

 * when building a class that inherits from an ABC,
   if you supply the required abstract methods,
   then the mixin method just work.

 * if an object passes the isinstance check, 
   then it is fair game to call any of the listed
   methods an expect it to work.

Until now, those rules were followed by all the ABCs.  I really don't want to break the basic contract of what the ABC is supposed to mean.
msg242094 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-04-27 03:43
I'm with Raymond.
msg242102 - (view) Author: Stefan Behnel (scoder) * Date: 2015-04-27 06:26
> Please either 
> 1) drop the throw() method entirely or
> 2) make throw an abstractmethod()

Ok, as I already said, I think it's important to provide the complete protocol as code will usually expect that. Also, close() has a helpful implementation, but it depends on throw(). Thus, I'll go with 2) and make throw() abstract. It's easy enough to call super(), but then it's explicit. I agree that that's better here.

Patch updated.
msg242104 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015-04-27 07:33
The latest patch looks good overall.   

Łukasz, assigning back to you.
msg242445 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-05-03 01:51
Yeah, looks good -- Łuke, can you commit this?
msg242612 - (view) Author: Łukasz Langa (lukasz.langa) * (Python committer) Date: 2015-05-05 19:12
Yup, will do.
msg242794 - (view) Author: Stefan Behnel (scoder) * Date: 2015-05-09 03:34
This is blocking issue 24017 (async/await syntax).
msg242796 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-05-09 03:47
Ask Yury if he'll commit it for you. It's ready.
msg242797 - (view) Author: Roundup Robot (python-dev) Date: 2015-05-09 05:07
New changeset ba5d7041e2f5 by Raymond Hettinger in branch 'default':
Issue #24018: Add a collections.Generator abstract base class.
https://hg.python.org/cpython/rev/ba5d7041e2f5
msg242798 - (view) Author: Stefan Behnel (scoder) * Date: 2015-05-09 05:28
Thanks! Minor grouch: it should say "collections.*abc*.Generator" in the NEWS entry.
msg242826 - (view) Author: Roundup Robot (python-dev) Date: 2015-05-09 18:37
New changeset f7cc54086cd2 by Guido van Rossum in branch 'default':
Fix news entry for issue 24018.
https://hg.python.org/cpython/rev/f7cc54086cd2
msg242827 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2015-05-09 18:38
Fixed.

On Fri, May 8, 2015 at 10:28 PM, Stefan Behnel <report@bugs.python.org>
wrote:

>
> Stefan Behnel added the comment:
>
> Thanks! Minor grouch: it should say "collections.*abc*.Generator" in the
> NEWS entry.
>
> ----------
>
> _______________________________________
> Python tracker <report@bugs.python.org>
> <http://bugs.python.org/issue24018>
> _______________________________________
>
History
Date User Action Args
2015-05-10 21:25:37yselivanovlinkissue24017 dependencies
2015-05-09 18:38:11gvanrossumsetmessages: + msg242827
2015-05-09 18:37:45python-devsetmessages: + msg242826
2015-05-09 05:28:47scodersetmessages: + msg242798
2015-05-09 05:08:15rhettingersetstatus: open -> closed
resolution: fixed
stage: resolved
2015-05-09 05:07:32python-devsetnosy: + python-dev
messages: + msg242797
2015-05-09 04:52:56rhettingersetassignee: lukasz.langa -> rhettinger
2015-05-09 03:47:38gvanrossumsetmessages: + msg242796
2015-05-09 03:34:18scodersetmessages: + msg242794
2015-05-05 19:12:23lukasz.langasetmessages: + msg242612
2015-05-03 01:51:38gvanrossumsetmessages: + msg242445
2015-04-27 07:33:57rhettingersetassignee: rhettinger -> lukasz.langa
messages: + msg242104
2015-04-27 06:26:11scodersetfiles: + generator_abc.patch

messages: + msg242102
2015-04-27 03:43:59gvanrossumsetmessages: + msg242094
2015-04-27 00:07:06rhettingersetassignee: lukasz.langa -> rhettinger
messages: + msg242087
2015-04-26 20:32:06scodersetmessages: + msg242077
2015-04-26 19:40:43pitrousetmessages: + msg242072
2015-04-26 19:37:49rhettingersetmessages: + msg242071
2015-04-26 12:58:41Ludovic.Gascsetnosy: + Ludovic.Gasc
messages: + msg242054
2015-04-25 06:03:07scodersetmessages: + msg241993
2015-04-24 16:54:35scodersetmessages: + msg241956
2015-04-23 06:37:11scodersetfiles: + inspect_docs.patch
2015-04-23 06:36:51scodersetfiles: - inspect_docs.patch
2015-04-23 06:32:33scodersetfiles: + inspect_docs.patch

messages: + msg241846
2015-04-21 19:39:16scodersetmessages: + msg241736
2015-04-21 16:49:09pitrousetmessages: + msg241725
2015-04-21 16:10:33gvanrossumsetmessages: + msg241723
2015-04-21 15:44:02scodersetmessages: + msg241717
2015-04-21 15:19:31gvanrossumsetmessages: + msg241716
2015-04-21 06:35:10scodersetmessages: + msg241705
2015-04-21 06:10:31scodersetfiles: + generator_abc.patch

messages: + msg241704
2015-04-21 05:26:02martin.pantersetmessages: + msg241703
2015-04-21 05:11:02scodersetmessages: + msg241702
2015-04-21 05:09:06scodersetmessages: + msg241701
2015-04-21 04:57:55scodersetfiles: + generator_abc.patch

messages: + msg241700
2015-04-21 02:21:57rhettingersetnosy: + rhettinger
messages: + msg241692
2015-04-20 23:14:44martin.pantersetnosy: + martin.panter
2015-04-20 20:45:29lukasz.langasetassignee: lukasz.langa

nosy: + lukasz.langa
2015-04-20 20:11:22pitrousetnosy: + pitrou
messages: + msg241683
2015-04-20 20:01:44scodersetfiles: + generator_abc.patch

messages: + msg241682
2015-04-20 20:01:15scodersetfiles: - generator_abc.patch
2015-04-20 19:58:05scodersetfiles: + generator_abc.patch

messages: + msg241681
2015-04-20 19:57:12scodersetfiles: - generator_abc.patch
2015-04-20 19:40:48gvanrossumsetmessages: + msg241677
2015-04-20 19:36:25scodercreate