classification
Title: asyncio create_task() odd behavior
Type: behavior
Stage: resolved
Components: asyncio
Versions: Python 3.9
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To:
Nosy List: asvetlov, yanghao.py, yselivanov, zach.ware
Priority: normal
Keywords:

Created on 2021-04-05 21:03 by yanghao.py, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (6)
msg390260 - Author: Yanghao Hua (yanghao.py) Date: 2021-04-05 21:03
This code behaves as expected: two tasks are created and run in an interleaved manner:

from time import time
from asyncio import run, create_task, sleep

async def task(name, n): 
    for i in range(n):
        print(f"task-{name}: ", i, time())
        await sleep(1)

async def top():
    t0 = create_task(task("T0", 10))
    t1 = create_task(task("T1", 10))
    print("starting tasks ...")
    await t0
    await t1

run(top())

Output:
starting tasks ...
task-T0:  0 1617656271.6513114
task-T1:  0 1617656271.6513336
task-T0:  1 1617656272.6526577
task-T1:  1 1617656272.652813
task-T0:  2 1617656273.654187
task-T1:  2 1617656273.6543217
task-T0:  3 1617656274.655706
task-T1:  3 1617656274.6558387
task-T0:  4 1617656275.65722
task-T1:  4 1617656275.657355
task-T0:  5 1617656276.6587365
task-T1:  5 1617656276.6588728
task-T0:  6 1617656277.660276
task-T1:  6 1617656277.6604114
task-T0:  7 1617656278.6617858
task-T1:  7 1617656278.66192
task-T0:  8 1617656279.6633058
task-T1:  8 1617656279.6634388
task-T0:  9 1617656280.6648436
task-T1:  9 1617656280.6649704

However, with a slightly modified `async def top()`, the tasks execute sequentially:

async def top():
    print("starting tasks ...")
    await create_task(task("T0", 10))
    await create_task(task("T1", 10))

Output:
starting tasks ...
task-T0:  0 1617656306.1343822
task-T0:  1 1617656307.1357212
task-T0:  2 1617656308.1369958
task-T0:  3 1617656309.1384225
task-T0:  4 1617656310.1398354
task-T0:  5 1617656311.1412706
task-T0:  6 1617656312.1427014
task-T0:  7 1617656313.1441336
task-T0:  8 1617656314.1455553
task-T0:  9 1617656315.1468768
task-T1:  0 1617656316.1482618
task-T1:  1 1617656317.1496553
task-T1:  2 1617656318.151089
task-T1:  3 1617656319.1525192
task-T1:  4 1617656320.153974
task-T1:  5 1617656321.1554224
task-T1:  6 1617656322.1568594
task-T1:  7 1617656323.1582792
task-T1:  8 1617656324.1597185
task-T1:  9 1617656325.1611636

This breaks my expectation that created tasks run concurrently. It seems that if a created task is awaited immediately, control does not return to top() right away; instead, the task executes and top() waits until it finishes.

Is this a bug, or did I miss something?

Thank you.
msg390262 - Author: Zachary Ware (zach.ware) (Python committer) Date: 2021-04-05 21:23
You missed something :)

By immediately awaiting the result of `create_task`, you're synchronizing things.  It's the same as rearranging the lines of the first example to:

async def top():
    t0 = create_task(task("T0", 10))
    print("starting tasks ...")
    await t0  # T0 runs to completion here
    t1 = create_task(task("T1", 10))  # T1 is only created afterwards
    await t1

Basically, `t1` simply doesn't exist yet when you ask `t0` to run to completion.
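
A minimal sketch of the difference, for illustration only. The names `worker`, `collect`, and the three `top`-style functions are hypothetical, and the steps are recorded in a list rather than printed with timestamps. The `asyncio.gather()` variant is not mentioned in this thread; it is included as one standard way to get concurrent execution from a single `await`.

```python
import asyncio


async def worker(name, n, log, delay=0.01):
    # Hypothetical stand-in for task() from the report: records each step
    # in `log` instead of printing, then yields to the event loop.
    for i in range(n):
        log.append((name, i))
        await asyncio.sleep(delay)


async def create_then_await(log):
    # Both tasks are scheduled before the first await, so they interleave.
    t0 = asyncio.create_task(worker("T0", 3, log))
    t1 = asyncio.create_task(worker("T1", 3, log))
    await t0
    await t1


async def await_immediately(log):
    # T1 is not created until T0 has finished, so the steps are sequential.
    await asyncio.create_task(worker("T0", 3, log))
    await asyncio.create_task(worker("T1", 3, log))


async def with_gather(log):
    # gather() schedules both coroutines up front, then waits for both.
    await asyncio.gather(worker("T0", 3, log), worker("T1", 3, log))


def collect(top):
    # Run one variant in a fresh event loop and return the recorded steps.
    log = []
    asyncio.run(top(log))
    return log


interleaved = collect(create_then_await)
sequential = collect(await_immediately)
gathered = collect(with_gather)
print(interleaved)
print(sequential)
print(gathered)
```

In the sequential variant every T0 step is recorded before the first T1 step; in the other two variants T1 starts right after T0's first step.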
msg390294 - Author: Yanghao Hua (yanghao.py) Date: 2021-04-06 08:05
This unfortunately contradicts all the other concurrency semantics
I know. I have implemented various event-driven schedulers myself, and
none of them behave like this.

Consider an OS as the simplest example: you have a main thread that
starts many child threads, and there shouldn't be a single case (not even
a possibility) in which starting a child thread blocks the main
thread.

Coming back to this particular example: semantically equivalent
code producing completely different behavior is, to me, a major bug.
The correct way to implement "await a_task" should be like
"process().run()", rather than waiting for completion. After all,
"await t0" did *NOT* wait until t0's completion, but
"await create_task()" does wait! That does not seem right to me.

I strongly ask for a second opinion before we close this bug ...

msg390298 - Author: Yanghao Hua (yanghao.py) Date: 2021-04-06 08:31
Poking around a bit more revealed another interesting behavior, and now I understand what went wrong with asyncio.create_task() :-)

In the example I showed, you don't have to "await t1"; you only have to await one of t0/t1, and both start executing. It seems the synchronous call to "create_task()" alone already creates the task and places it in a runnable queue. This is wrong!!!

Consider multiprocessing.Process(): do you place the new process in a runnable queue with a call to Process()? NO, you don't. You merely create a process object. It is Process().start() that actually flags the process as ready to run and lets the OS kernel create that process.

By analogy, the synchronous call "create_task()" shouldn't do more than create a coroutine task object. And "await task" should not simply block and wait for the task to complete, but rather place the task in the runnable queue. Also, the current behavior not only waits for t0 to complete, it also waits for t1 to complete! This is completely contrary to what the code says: it is simply "await t0", and t1 is not even in the picture!

The example works at all only because t0 and t1, once created with create_task(), have already caused side effects and been placed in the running queue. That is like user-mode code causing a side effect in the OS kernel. "await task" is the equivalent of making the actual OS kernel syscall to get things REALLY started ...
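
For reference, asyncio does have an object that is created without being scheduled: the coroutine object returned by calling an `async def` function. The sketch below, with hypothetical names `job` and `main`, shows that the coroutine object stays inert until `create_task()` wraps and schedules it, and that the task then first runs when the creating coroutine next yields to the event loop.

```python
import asyncio


async def job(started):
    # Hypothetical coroutine: records that it actually ran.
    started.append("job")


async def main():
    started = []

    coro = job(started)            # creates a coroutine object; nothing is scheduled
    await asyncio.sleep(0)         # yield to the loop once; job still has not run
    after_coro = list(started)

    t = asyncio.create_task(coro)  # wraps the coroutine in a Task and schedules it
    before_yield = list(started)   # not run yet: main() has not yielded since

    await asyncio.sleep(0)         # main() yields; the loop runs the scheduled task
    after_yield = list(started)

    await t                        # already finished; collects the result
    return after_coro, before_yield, after_yield


result = asyncio.run(main())
print(result)  # ([], [], ['job'])
```

Seen this way, calling the coroutine function plays the "create only" role, and `create_task()` plays the "start" role.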
msg390299 - Author: Andrew Svetlov (asvetlov) (Python committer) Date: 2021-04-06 08:45
1. Please think of `await` as a 'yield point': a point where the current task may be suspended to give other tasks a chance to execute.
It can be any `await`, not necessarily one waiting for a task.  It is just a point where the asyncio event loop gets a chance to run an iteration.

Note: no suspension happens if the argument is 'ready' already.

2. If we were designing asyncio from scratch, we could consider separating task creation from task start. Unfortunately, that ship sailed many years ago. The behavior cannot be changed without breaking virtually every asyncio program, sorry.
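
The note in point 1 about an argument that is already 'ready' can be illustrated with a small sketch. The names `note` and `main` are hypothetical. It assumes the default (non-eager) task behavior, where a newly created task first runs only when the current task yields to the event loop.

```python
import asyncio


async def note(log, label):
    # Hypothetical coroutine: records its label when it runs.
    log.append(label)


async def main():
    log = []

    first = asyncio.create_task(note(log, "first"))
    await first                    # first is pending: main() suspends, first runs

    second = asyncio.create_task(note(log, "second"))
    await first                    # first is already done: no suspension here,
    after_ready_await = list(log)  # so second has not had a chance to run

    await asyncio.sleep(0)         # a real yield point: the loop runs second
    after_yield = list(log)

    await second
    return after_ready_await, after_yield


result = asyncio.run(main())
print(result)  # (['first'], ['first', 'second'])
```

The second `await first` returns immediately because the task has finished, so other tasks only get to run at the following `await asyncio.sleep(0)`.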
msg390300 - Author: Yanghao Hua (yanghao.py) Date: 2021-04-06 09:04
By the way, one more piece of feedback: of course, curio works the way it
should, no matter where you await ;-)

Now I start to understand why David Beazley had to create curio.

The Python asyncio team should really, really think about this carefully,
please. You don't have to modify create_task(); at least a sane
version (maybe create_task_sane()) could be provided.

History
Date                 User        Action  Args
2022-04-11 14:59:43  admin       set     github: 87902
2021-04-06 09:04:16  yanghao.py  set     messages: + msg390300
2021-04-06 08:45:51  asvetlov    set     status: open -> closed; resolution: remind -> not a bug; messages: + msg390299; stage: resolved
2021-04-06 08:31:33  yanghao.py  set     resolution: not a bug -> remind; messages: + msg390298
2021-04-06 08:05:49  yanghao.py  set     status: pending -> open; messages: + msg390294
2021-04-05 21:23:45  zach.ware   set     status: open -> pending; nosy: + zach.ware; messages: + msg390262; resolution: not a bug
2021-04-05 21:03:00  yanghao.py  create