classification
Title: Killing asyncio subprocesses on timeout?
Type: behavior Stage: resolved
Components: asyncio Versions: Python 3.6
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: asvetlov, cjrh, dontbugme, terry.reedy, yselivanov
Priority: normal Keywords:

Created on 2019-12-06 17:41 by dontbugme, last changed 2020-02-24 15:37 by asvetlov. This issue is now closed.

Files
File name Uploaded Description Edit
subprocess_timeout.py dontbugme, 2019-12-06 17:41
Messages (4)
msg357930 - (view) Author: (dontbugme) Date: 2019-12-06 17:41
I'm trying to use asyncio.subprocess and am having difficulty killing the subprocesses after a timeout. My use case is launching processes that hold on to file handles and other exclusive resources, so subsequent processes can only be launched after the first ones have fully stopped.


The documentation at https://docs.python.org/3/library/asyncio-subprocess.html#asyncio.asyncio.subprocess.Process says there is no timeout parameter and suggests using wait_for() instead.
I tried this, but it's something of a footgun: wait_for() times out, yet the process lives on in the background. See Fail(1) and Fail(2) in the attached test1().


To solve this I tried catching the CancelledError and killing the process myself in the exception handler. While this worked, it is also somewhat dangerous, because it takes some time for the process to be killed and the wait() after kill() runs in the background as some kind of detached task. See Fail(3) in the attached test2().
This I can sort of understand, because after the TimeoutError something would have to block for wait() to actually finish, and this is impossible.

After writing this, I feel there is no good solution for Fail(3) because, again, timeouts can't be blocking. Maybe a warning in the documentation would be appropriate for Fail(1) and Fail(2), because the current suggestion is quite misleading: the wait_for() alternative does not behave like the timeout parameter of the ordinary subprocess.Popen.wait().
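
For illustration, here is a minimal sketch of the kill-then-reap pattern discussed above. It is not the attached subprocess_timeout.py; the helper name `run_with_timeout` is hypothetical, and the sketch assumes Python 3.7 or later (for `asyncio.run`) and that it is acceptable for the caller to wait, without a timeout, until the killed child has actually exited.

```python
import asyncio
import sys


async def run_with_timeout(args, timeout):
    """Hypothetical helper: run a child, kill and reap it if it exceeds `timeout`.

    Returns (timed_out, returncode).
    """
    proc = await asyncio.create_subprocess_exec(*args)
    try:
        # wait_for() only cancels the wait() coroutine on timeout;
        # it does not touch the child process itself.
        await asyncio.wait_for(proc.wait(), timeout)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
        try:
            proc.kill()
        except ProcessLookupError:
            # The child exited on its own just before we tried to kill it.
            pass
        # Awaited directly (not under wait_for), so the caller resumes
        # only after the child has been reaped.
        await proc.wait()
    return timed_out, proc.returncode


if __name__ == "__main__":
    child = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(asyncio.run(run_with_timeout(child, timeout=0.5)))
```

The second `proc.wait()` is deliberately left without a timeout: the point of the pattern is that any resources held by the child are released before the next process is launched.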
msg357952 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2019-12-06 22:56
There have been changes to asyncio since 3.6. If you have not already, please check the 3.8 version to see whether any of them are relevant to your issue.
msg361209 - (view) Author: Caleb Hattingh (cjrh) * Date: 2020-02-02 04:54
@dontbugme This is a very old problem with threads and sub-processes.  In the general case (cross-platform, etc) it is difficult to kill threads and sub-processes from the outside. The traditional solution is to somehow send a message to the thread or subprocess to tell it to finish up. Then, you have to write the code running the thread or subprocess to notice such a message, and then shut itself down. With threads, the usual solution is to pass `None` on a queue, and have the thread pull data off that queue. When it receives a `None` it knows that it's time to shut down, and the thread terminates itself. This model can also be used with the multiprocessing module because there is a Queue instance provided there that works across the inter-process boundary.  Unfortunately, we don't have that feature in the asyncio subprocess machinery yet. For subprocesses, there are three options available:

1) Send a "shutdown" sentinel via STDIN (asyncio.subprocess.Process.communicate)
2) Send a process signal (via asyncio.subprocess.Process.send_signal)
3) Pass messages between main process and child process via socket connections

My experience has been that (3) is the most practical, esp. in a cross-platform sense. The added benefit of (3) is that this also works, unchanged, if the "worker" process is running on a different machine. There are probably things we can do to make (3) easier. Not sure.
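
As a rough sketch of option (1), the following sends a "shutdown" sentinel over STDIN with `communicate()` and falls back to `kill()` if the child does not exit within a grace period. The worker source, the sentinel string, and the names `CHILD_SOURCE`, `stop_gracefully`, and `demo` are all assumptions made for this example; a real worker would need its own way of noticing the message.

```python
import asyncio
import sys

# Hypothetical worker: reads STDIN line by line and exits cleanly
# once it sees the "shutdown" sentinel.
CHILD_SOURCE = """
import sys
for line in sys.stdin:
    if line.strip() == "shutdown":
        sys.exit(0)
"""


async def stop_gracefully(proc, grace):
    """Ask the child to shut down; kill it if it ignores the request."""
    try:
        # communicate() writes the sentinel, closes STDIN and waits for exit.
        await asyncio.wait_for(proc.communicate(b"shutdown\n"), grace)
    except asyncio.TimeoutError:
        try:
            proc.kill()
        except ProcessLookupError:
            pass  # the child exited in the meantime
        await proc.wait()
    return proc.returncode


async def demo():
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", CHILD_SOURCE,
        stdin=asyncio.subprocess.PIPE,
    )
    return await stop_gracefully(proc, grace=5.0)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The same shape applies to option (2) by replacing the `communicate()` call with `proc.send_signal(...)` followed by `proc.wait()`; which signals are available depends on the platform.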

I don't know if my comment helps, but I feel your pain. You are correct that `wait_for` is not an alternative to `timeout`, because no actual cancellation of the process happens.
msg362594 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2020-02-24 15:37
asyncio doesn't kill the subprocess on timeout; that's why test1() doesn't work.
The kill is done by sending a signal, which is asynchronous. That's why test2 may sometimes fail at the "FAIL(3)" point.
A 1-second sleep is enough in this case, but it may not be enough under high load, say, or for a program much more complex than "sleep".

That's how the OS works; there is nothing specific to asyncio here.
You can observe the same behavior with a good old synchronous approach, written in any programming language.
Nothing to fix here.
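
To illustrate the synchronous case, here is a small sketch using the standard `subprocess` module. The helper name `run_sync_with_timeout` is hypothetical. It shows that `Popen.wait(timeout=...)` raises `TimeoutExpired` without stopping the child, so the caller still has to kill it and wait for it to exit.

```python
import subprocess
import sys


def run_sync_with_timeout(args, timeout):
    """Hypothetical helper: returns (alive_after_timeout, returncode)."""
    proc = subprocess.Popen(args)
    try:
        proc.wait(timeout=timeout)
        return False, proc.returncode
    except subprocess.TimeoutExpired:
        # The timeout only ended our wait; the child is still running.
        alive_after_timeout = proc.poll() is None
        proc.kill()
        # Block until the OS has actually terminated and reaped the child.
        proc.wait()
        return alive_after_timeout, proc.returncode


if __name__ == "__main__":
    child = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_sync_with_timeout(child, timeout=0.5))
```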
History
Date User Action Args
2020-02-24 15:37:49 asvetlov set status: open -> closed; resolution: not a bug; messages: + msg362594; stage: resolved
2020-02-02 04:54:57 cjrh set nosy: + cjrh; messages: + msg361209
2019-12-06 22:56:09 terry.reedy set nosy: + terry.reedy; messages: + msg357952
2019-12-06 17:41:00 dontbugme create