Classification
Title: asyncio.create_subprocess_exec throws RuntimeError yet still executes subprogram
Type:                Stage:
Components: asyncio  Versions: Python 3.9
Status: open         Resolution:
Dependencies:        Superseder:
Assigned To:         Nosy List: Clint Olsen, asvetlov, yselivanov
Priority: normal     Keywords:

Created on 2022-01-10 18:55 by Clint Olsen, last changed 2022-01-12 17:27 by Clint Olsen.

Files
File name: example    Uploaded: Clint Olsen, 2022-01-10 18:55
Messages (5)
msg410242 - (view) Author: Clint Olsen (Clint Olsen) Date: 2022-01-10 18:55
When stress testing my code in a process-limited environment, I found that despite asyncio.create_subprocess_exec throwing an exception, the child process still executes. Attempting to catch the error and retry results in a duplicate execution.

Attached is a script with which I was able to reproduce the problem on Linux. I cannot get it to behave similarly on macOS.
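[Editor's note: the attachment is not shown inline. Judging from the traceback further down, a minimal reconstruction might look like the following sketch; the iteration count and the use of asyncio.gather for concurrency are assumptions.]

```python
import asyncio
import uuid

async def run(_id: int) -> None:
    # Each job shells out and writes its id to a uniquely named file,
    # matching the command visible in the traceback.
    _uuid = uuid.uuid4()
    proc = await asyncio.create_subprocess_exec(
        '/bin/sh', '-c', f'/bin/echo {_id} > {_uuid}.out')
    await proc.wait()

async def main(n: int = 20) -> None:
    # Launch the jobs concurrently to stress the process/thread limits.
    await asyncio.gather(*(run(i) for i in range(n)))
```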

The exception looks like:

/bin/sh: fork: retry: Resource temporarily unavailable
Traceback (most recent call last):
  File "/home/colsen/async/./example", line 16, in run
    proc = await asyncio.create_subprocess_exec('/bin/sh','-c', f'/bin/echo {_id} > {_uuid}.out')
  File "/home/utils/Python/3.9/3.9.7-20211101/lib/python3.9/asyncio/subprocess.py", line 236, in create_subprocess_exec
    transport, protocol = await loop.subprocess_exec(
  File "/home/utils/Python/3.9/3.9.7-20211101/lib/python3.9/asyncio/base_events.py", line 1661, in subprocess_exec
    transport = await self._make_subprocess_transport(
  File "/home/utils/Python/3.9/3.9.7-20211101/lib/python3.9/asyncio/unix_events.py", line 202, in _make_subprocess_transport
    watcher.add_child_handler(transp.get_pid(),
  File "/home/utils/Python/3.9/3.9.7-20211101/lib/python3.9/asyncio/unix_events.py", line 1381, in add_child_handler
    thread.start()
  File "/home/utils/Python/3.9/3.9.7-20211101/lib/python3.9/threading.py", line 892, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread

So, this script ended up producing 21 output files (duplicate on iteration 18).

I need a way to catch these errors, pause, and retry when they are recoverable.
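[Editor's note: the catch/pause/retry pattern the reporter asks for might be sketched as below. The helper name and retry parameters are hypothetical; note the caveat in the docstring, which is exactly the duplicate-execution problem reported here.]

```python
import asyncio

async def run_with_retry(*cmd, retries=3, delay=1.0):
    """Retry create_subprocess_exec when thread creation fails.

    Caveat: with the default ThreadedChildWatcher, the child may already
    have been forked by the time the RuntimeError surfaces, so a retry
    can duplicate the command's side effects.
    """
    for attempt in range(retries):
        try:
            return await asyncio.create_subprocess_exec(*cmd)
        except RuntimeError:  # "can't start new thread"
            if attempt == retries - 1:
                raise
            await asyncio.sleep(delay)
```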
msg410285 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2022-01-11 10:32
What do you mean by *process-limited environment*?

It is a very unusual configuration, IMHO.
msg410321 - (view) Author: Clint Olsen (Clint Olsen) Date: 2022-01-11 18:51
In a multi-user environment, you should not expect to be able to spawn an unlimited number of processes. In some cases, system administrators reduce the limits to values that guarantee the machine stays responsive under heavy load. See limits.conf(5) for an example.

I just used ulimit to synthesize this situation.

Does this make sense?
msg410391 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2022-01-12 11:19
Yes, your environment is clear.

As I can see from the traceback, you are stuck not on creating a new process but on starting a new thread.

That thread is used to wait for the started process to finish.
It is the default configuration.
Maybe you need an alternative ChildWatcher, set up via an `asyncio.get_event_loop_policy().set_child_watcher(watcher)` call.

A relatively fresh Linux (kernel 5.3+) can use `PidfdChildWatcher`. `MultiLoopChildWatcher` is cross-platform.
`SafeChildWatcher` is the oldest and simplest implementation.

Please choose one and see how it works for you.
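[Editor's note: installing the suggested `PidfdChildWatcher` might look like the sketch below. The helper names are hypothetical; the guards reflect that child watchers exist only in Python 3.8–3.13 and pidfds only on Linux 5.3+.]

```python
import asyncio
import os

def pidfd_supported() -> bool:
    """Return True if this kernel actually supports pidfds (Linux >= 5.3)."""
    if not hasattr(os, "pidfd_open"):
        return False
    try:
        fd = os.pidfd_open(os.getpid())
    except OSError:
        return False
    os.close(fd)
    return True

def install_pidfd_watcher() -> bool:
    """Install PidfdChildWatcher when available (Python 3.9-3.13 only)."""
    if hasattr(asyncio, "PidfdChildWatcher") and pidfd_supported():
        asyncio.get_event_loop_policy().set_child_watcher(
            asyncio.PidfdChildWatcher())
        return True
    return False
```

`PidfdChildWatcher` needs no SIGCHLD handler and no extra thread per child, which avoids the `_start_new_thread` failure seen in the traceback.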
msg410417 - (view) Author: Clint Olsen (Clint Olsen) Date: 2022-01-12 17:27
Yes, I tried FastChildWatcher and I've had much better luck with that. If any exception gets thrown we _know_ for certain that the process wasn't created.
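[Editor's note: the reporter's `FastChildWatcher` setup might look roughly like this sketch. It assumes Python < 3.14, where the watcher classes still exist; the watcher must be attached from the main thread because it installs a SIGCHLD handler, hence the guards.]

```python
import asyncio
import sys

async def run_one() -> int:
    proc = await asyncio.create_subprocess_exec(sys.executable, "-c", "pass")
    return await proc.wait()

def main() -> int:
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    if hasattr(asyncio, "FastChildWatcher"):  # removed in Python 3.14
        watcher = asyncio.FastChildWatcher()
        try:
            watcher.attach_loop(loop)  # installs the SIGCHLD handler
            asyncio.get_event_loop_policy().set_child_watcher(watcher)
        except (RuntimeError, ValueError):
            pass  # not in the main thread; keep the default watcher
    try:
        return loop.run_until_complete(run_one())
    finally:
        loop.close()
```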
History
Date                 User         Action  Args
2022-01-12 17:27:53  Clint Olsen  set     messages: + msg410417
2022-01-12 11:19:43  asvetlov     set     messages: + msg410391
2022-01-11 18:51:55  Clint Olsen  set     messages: + msg410321
2022-01-11 10:32:10  asvetlov     set     messages: + msg410285
2022-01-10 18:55:53  Clint Olsen  create