classification
Title: asyncio.gather does not cancel tasks if one fails
Type: behavior Stage: patch review
Components: asyncio, Library (Lib) Versions: Python 3.7, Python 3.6
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Andrew Lytvyn, asvetlov, gvanrossum, kwarunek, yselivanov
Priority: normal Keywords: patch

Created on 2017-09-13 14:53 by Andrew Lytvyn, last changed 2017-09-22 15:41 by gvanrossum.

Pull Requests
URL Status Linked Edit
PR 3597 open kwarunek, 2017-09-15 11:16
Messages (9)
msg302079 - (view) Author: Andrew Lytvyn (Andrew Lytvyn) Date: 2017-09-13 14:53
If you do not await gather directly, then when one of gather's inner coroutines fails, the others keep running, but they should not.


```python
import asyncio
import logging

logging.basicConfig(level=logging.DEBUG)

loop = asyncio.get_event_loop()

async def success_coro(seconds):
    await asyncio.sleep(seconds)
    print(seconds)


async def failed_coro(seconds):
    await asyncio.sleep(seconds)
    print(seconds)
    raise ZeroDivisionError

coros = [
    success_coro(2),
    failed_coro(3),
    success_coro(5),
]

async def waiter():
    await asyncio.gather(*coros)

asyncio.ensure_future(waiter())

loop.run_forever()
```
-------------------------------------------------------------
Console:
2
3
ERROR:asyncio:Task exception was never retrieved
future: <Task finished coro=<waiter() done, defined at tst.py:72> exception=ZeroDivisionError()>
Traceback (most recent call last):
  File "tst.py", line 73, in waiter
    await asyncio.gather(*coros)
  File "tst.py", line 64, in failed_coro
    raise ZeroDivisionError
ZeroDivisionError
5
-------------------------------------------------------------

Expected behavior: 5 should not be printed.
msg302186 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2017-09-14 17:36
Yury, it looks like a serious bug.

I expected `success_coro(5)` to be cancelled, but I still see its output printed.
msg302195 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-09-14 18:32
Andrew, looks like it.  You can take a look if you want (I don't have time to work on this right now).
msg302243 - (view) Author: Krzysztof Warunek (kwarunek) * Date: 2017-09-15 11:22
I hit the same thing. There is also a PR with a patch.
msg302720 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-09-21 20:41
I looked at the PR, and I'm not so sure about this change.  In short, it can be viewed as a backwards incompatible change to asyncio.gather.

Guido, what do you think?
msg302735 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2017-09-22 03:08
I'm afraid I no longer have all the details of this design in my head, and I have no idea what the fix does (and no time to read up on everything).

The OP says "If you do not await gather" -- what happens if you *do* await it? Do the tasks then get killed?

It seems strange to both cancel the task *and* set its exception.

The docstring for gather() seems to be pretty clear that cancellation and failure of one task should not affect other tasks. So this argues against the PR.
msg302737 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-09-22 04:36
> I'm afraid I no longer have all the details of this design in my head, and I have no idea what the fix does (and no time to read up on everything).

Let's say we have three tasks: t1, t2, t3.  Then we use gather on them:

   await gather(t1, t2, t3)

Let's say that t2 finishes first with an exception.

Currently, both t1 and t3 would continue their execution even though gather throws the t2 exception.

The PR for this issue makes 'gather' cancel both t1 and t3 as soon as t2 throws an exception.

I see the point of the PR, but I'm afraid that it's too late to change the semantics of asyncio.gather.  Instead, we should consider adding a new TaskGroup API (we discussed it briefly at the sprint).
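
A minimal sketch of the current behavior described above, assuming Python 3.7+ for `asyncio.run`. The `worker` helper, the task names, and the delays are made up for illustration and are not taken from the PR.

```python
import asyncio

finished = []  # names of the tasks that ran to completion


async def worker(name, delay, fail=False):
    # Hypothetical helper: sleep, then either fail or record completion.
    await asyncio.sleep(delay)
    if fail:
        raise ZeroDivisionError(name)
    finished.append(name)


async def main():
    t1 = asyncio.ensure_future(worker("t1", 0.1))
    t2 = asyncio.ensure_future(worker("t2", 0.01, fail=True))
    t3 = asyncio.ensure_future(worker("t3", 0.2))
    still_running = None
    try:
        await asyncio.gather(t1, t2, t3)
    except ZeroDivisionError:
        # gather() re-raised t2's exception, but t1 and t3 are untouched.
        still_running = [not t.done() for t in (t1, t3)]
    # Let the siblings finish so we can inspect their final state.
    await asyncio.wait([t1, t3])
    return still_running, t1.cancelled(), t3.cancelled()


result = asyncio.run(main())
```

Here `gather` propagates the failure of t2 to the awaiting coroutine, while t1 and t3 are neither cancelled nor interrupted.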
msg302743 - (view) Author: Andrew Lytvyn (Andrew Lytvyn) Date: 2017-09-22 12:18
Guido, look. The point is that if you change run_forever with run_until_complete, then behavior changes: success_coro(5) will not be executed.

I think it's strange that the behavior differs depending on the entry point: run_forever or run_until_complete.
msg302751 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2017-09-22 15:41
> if you change run_forever with run_until_complete, then behavior changes: success_coro(5) will not be executed

Oh, that's a red herring. The reason is that the event loop stops when you use run_until_complete(), but the execution of success_coro(5) is still pending, and when you resume the loop (e.g. call run_until_complete() on the same loop with a different task) it will run.
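
A small sketch of this effect, reusing the coroutine names from the original report with the delays scaled down (an assumption made to keep the run short). It collects values in a `printed` list instead of printing them, and it creates the loop with `asyncio.new_event_loop()` rather than `get_event_loop()`.

```python
import asyncio

printed = []  # stands in for the print() calls in the original report


async def success_coro(seconds):
    await asyncio.sleep(seconds)
    printed.append(seconds)


async def failed_coro(seconds):
    await asyncio.sleep(seconds)
    printed.append(seconds)
    raise ZeroDivisionError


async def waiter(coros):
    await asyncio.gather(*coros)


loop = asyncio.new_event_loop()
coros = [success_coro(0.02), failed_coro(0.03), success_coro(0.05)]

try:
    loop.run_until_complete(waiter(coros))
except ZeroDivisionError:
    pass
# The loop stopped when waiter() failed; the last coroutine is still pending.
after_first_run = list(printed)

# Resume the same loop with unrelated work: the pending task now completes.
loop.run_until_complete(asyncio.sleep(0.1))
after_second_run = list(printed)
loop.close()
```

After the first run only the first two values have been recorded. The third shows up once the same loop runs again, so the task was never cancelled.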

So the issue is really the semantics of gather(), and I agree with Yury that we designed it intentionally that way, and if you want different behavior we'll have to provide a different API.
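
For illustration only, the cancel-on-first-failure behavior requested in this issue can be approximated with existing primitives. The helper name `gather_or_cancel` is hypothetical. It is neither the code from the PR nor the TaskGroup API mentioned above, and it does not handle cancellation of the caller itself.

```python
import asyncio


async def gather_or_cancel(*aws):
    """Hypothetical helper: like gather(), but cancel siblings on first failure."""
    tasks = [asyncio.ensure_future(aw) for aw in aws]
    done, pending = await asyncio.wait(
        tasks, return_when=asyncio.FIRST_EXCEPTION)
    for task in pending:
        task.cancel()
    if pending:
        # Give the cancelled tasks a chance to actually finish.
        await asyncio.wait(pending)
    for task in done:
        if not task.cancelled() and task.exception() is not None:
            raise task.exception()
    return [task.result() for task in tasks]


printed = []  # stands in for the print() calls in the original report


async def success_coro(seconds):
    await asyncio.sleep(seconds)
    printed.append(seconds)


async def failed_coro(seconds):
    await asyncio.sleep(seconds)
    printed.append(seconds)
    raise ZeroDivisionError


async def main():
    try:
        await gather_or_cancel(
            success_coro(0.02), failed_coro(0.03), success_coro(0.05))
    except ZeroDivisionError:
        pass
    # Wait past the last delay to show the cancelled coroutine never resumes.
    await asyncio.sleep(0.1)


asyncio.run(main())
```

With this helper the last coroutine is cancelled as soon as `failed_coro` raises, which matches the behavior the original report expected, and `gather()` itself stays unchanged.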
History
Date | User | Action | Args
2017-09-22 15:41:27 | gvanrossum | set | messages: + msg302751
2017-09-22 12:18:40 | Andrew Lytvyn | set | messages: + msg302743
2017-09-22 04:36:42 | yselivanov | set | messages: + msg302737
2017-09-22 03:08:34 | gvanrossum | set | messages: + msg302735
2017-09-21 20:41:27 | yselivanov | set | nosy: + gvanrossum; messages: + msg302720
2017-09-15 11:22:18 | kwarunek | set | nosy: + kwarunek; messages: + msg302243
2017-09-15 11:16:31 | kwarunek | set | keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request3588
2017-09-14 18:32:41 | yselivanov | set | messages: + msg302195
2017-09-14 17:36:26 | asvetlov | set | stage: needs patch; messages: + msg302186; components: + Library (Lib); versions: + Python 3.7
2017-09-13 14:53:37 | Andrew Lytvyn | create |