This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author hniksic
Recipients asvetlov, hniksic, yselivanov
Date 2018-05-16.06:48:52
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1526453333.0.0.682650639539.issue33533@psf.upfronthosting.co.za>
In-reply-to
Content
Judging by questions on the StackOverflow python-asyncio tag[1][2], it seems that users find it hard to understand how to use as_completed correctly. I have identified three issues:

* It's somewhat sparingly documented.

A StackOverflow user ([2]) didn't find it obvious that it runs the futures in parallel. Unless one is already aware of the meaning, the term "as completed" could suggest that they are executed and completed sequentially.

* Unlike its concurrent.futures counter-part, it's non-blocking.

This sounds like a good idea because it's usable from synchronous code, but it means that the futures it yields aren't completed, you have to await them first. This is confusing for a function with "completed" in the name, and is not the case with concurrent.futures.as_completed, nor with other waiting functions in asyncio (gather, wait, wait_for).

* It yields futures other than those that were passed in.

This prevents some usual patterns from working, e.g. associating the results with context data, such as Python docs itself uses for concurrent.futures.as_completed in https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor-example .  See SO question [1] for a similar request in asyncio.


Here is my proposal to address the issues.

I believe the usage problems stem from as_completed predating the concept of async iterators. If we had async iterators back then, as_completed would have been an obvious candidate to be one. In that case it could be both "blocking" (but not for the event loop) and return the original futures. For example:

async def as_completed2(fs):
    pending = set(map(asyncio.ensure_future(fs)))
    while pending:
        done, pending = await asyncio.wait(pending, return_when=asyncio.FIRST_COMPLETED)
        yield from done

(It is straightforward to add support for a timeout argument.)

I propose to deprecate asyncio.as_completed and advertise the async-iterator version like the one presented here - under a nicer name, such as as_done(), or as_completed_async().



[1] https://stackoverflow.com/questions/50028465/python-get-reference-to-original-task-after-ordering-tasks-by-completion
[2] https://stackoverflow.com/questions/50355944/yield-from-async-generator-in-python-asyncio
History
Date User Action Args
2018-05-16 06:48:53hniksicsetrecipients: + hniksic, asvetlov, yselivanov
2018-05-16 06:48:53hniksicsetmessageid: <1526453333.0.0.682650639539.issue33533@psf.upfronthosting.co.za>
2018-05-16 06:48:52hniksiclinkissue33533 messages
2018-05-16 06:48:52hniksiccreate