This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: concurrent.futures.ProcessPoolExecutor can deadlock when tcmalloc is used
Type: resource usage Stage: patch review
Components: Library (Lib) Versions: Python 3.11, Python 3.10, Python 3.9
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: gregory.p.smith Nosy List: gregory.p.smith, methane, pitrou, yilei
Priority: normal Keywords: 3.9regression, patch

Created on 2022-01-21 23:32 by yilei, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked Edit
PR 30847 open gregory.p.smith, 2022-01-24 08:55
Messages (7)
msg411208 - (view) Author: Yilei Yang (yilei) * Date: 2022-01-21 23:32
When Python is built and linked with tcmalloc, using ProcessPoolExecutor may deadlock. Here is a reproducible example:

$ cat t.py
from concurrent import futures
import sys
def work(iteration, item):
  sys.stdout.write(f'working: iteration={iteration}, item={item}\n')
  sys.stdout.flush()
for i in range(0, 10000):
    with futures.ProcessPoolExecutor(max_workers=2) as executor:
        executor.submit(work, i, 1)
        executor.submit(work, i, 2)

$ python t.py
working: iteration=0, item=1
working: iteration=0, item=2
working: iteration=1, item=1
working: iteration=1, item=2
...
working: iteration=3631, item=1
working: iteration=3631, item=2
<hang here>

The child process fails to finish. It's more likely to reproduce when the system is busy.

With some bisect search internally, this commit https://github.com/python/cpython/commit/1ac6e379297cc1cf8acf6c1b011fccc7b3da2cbe "triggered" the deadlock threshold with tcmalloc.
msg411213 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2022-01-21 23:59
For context, the fundamental problem is that os.fork() is unsafe to use in any process with threads.  concurrent/futures/process.py violates this.  So long as its design involves a thread, it can never be guaranteed not to deadlock.

POSIX forbids execution of most code after fork() has been called.  Returning to the CPython interpreter after os.fork() in the child process is never safe unless the parent process had no threads at fork time.

The change that triggered the issue moved from doing all of the os.fork() calls for the workers up front *before the thread was spawned* to doing them on demand after the thread was running.

The bugfix for this is simply a rollback to revert the bpo-39207 change. It was a performance enhancement, but it causes a new deadlock in code that didn't already have one when thread tuned malloc implementations are used.

The only way to safely launch worker processes on demand is to spawn a worker launcher process spawned prior to any thread creation that remains idle, with a sole job of spawn new worker processes for us. That sounds complicated. That'd be a feature. Lets go with the bugfix first.
msg411451 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2022-01-24 07:30
https://bugs.python.org/issue44733 for 3.11 attempts to build upon the dynamic spawning ability and will need reverting unless a non-thread+fork implementation is provided.
msg411543 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2022-01-25 01:55
> The only way to safely launch worker processes on demand is to spawn a worker launcher process spawned prior to any thread creation that remains idle, with a sole job of spawn new worker processes for us. That sounds complicated. That'd be a feature. Lets go with the bugfix first.

fork is not the only way to launch worker process. We have spawn. And sapwn is the default for macOS since Python 3.8.

Simple reverting seems not good for macOS users, since they need to pay cost for both of pre-spawning and spawn.
Can't we just pre-spawn only when fork is used?
msg411544 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2022-01-25 02:25
It sounds like we need to introspect the mp_context= passed to ProcessPoolExecutor (and it's default when None) to raise an error when max_tasks_per_child is incompatible with it.
msg411545 - (view) Author: Gregory P. Smith (gregory.p.smith) * (Python committer) Date: 2022-01-25 02:28
and similarly, the dynamic spawning could be kept when the mp_context is spawn (and possibly forkserver).
msg411574 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2022-01-25 11:06
Indeed, it seems this should only be disabled when the "fork" model is used, especially as the optimization is mostly valuable when spawning a worker is expensive.
History
Date User Action Args
2022-04-11 14:59:55adminsetgithub: 90622
2022-01-25 11:06:46pitrousetnosy: + pitrou
messages: + msg411574
2022-01-25 02:28:41gregory.p.smithsetmessages: + msg411545
2022-01-25 02:25:09gregory.p.smithsetmessages: + msg411544
2022-01-25 01:55:56methanesetnosy: + methane
messages: + msg411543
2022-01-24 08:55:43gregory.p.smithsetkeywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request29029
2022-01-24 07:30:22gregory.p.smithsetmessages: + msg411451
2022-01-21 23:59:53gregory.p.smithsetassignee: gregory.p.smith
type: resource usage

keywords: + 3.9regression
nosy: + gregory.p.smith
messages: + msg411213
stage: needs patch
2022-01-21 23:32:53yileicreate