Title: Module level map & submit for concurrent.futures
Status: open
Priority: normal
Nosy List: bquinlan, josh.r, ncoghlan

Created on 2015-03-18 03:20 by ncoghlan, last changed 2019-05-07 01:23 by josh.r.

Messages (3)
msg238373 - Author: Nick Coghlan (ncoghlan) (Python committer) Date: 2015-03-18 03:20
Currently, concurrent.futures requires you to explicitly create and manage the lifecycle of a dedicated executor to handle multithreaded and multiprocess dispatch of concurrent activities.

It may be beneficial to provide module level tmap(), pmap(), tsubmit() and psubmit() APIs that use a default ThreadPoolExecutor and/or ProcessPoolExecutor instance to provide concurrent execution, with reasonable default settings for the underlying thread & process pools.

(Longer names like map_thread and map_process would also be possible, but tmap & pmap are much easier to type and seem sufficient for mnemonic purposes.)

This would allow usage like:

  done, not_done = wait(tmap(func, data))

and:

  for f in as_completed(tmap(func, data)):
    result = f.result()

Ways to explicitly set and get the default thread and process pools would also need to be provided, perhaps based on the design of the event loop management interface in the asyncio module.
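
As a rough illustration of the idea, the thread-based half might be sketched as below. Everything here is hypothetical: tsubmit(), tmap(), get_thread_executor() and set_thread_executor() do not exist in concurrent.futures, and the sketch assumes tmap() returns a list of futures (unlike Executor.map(), which yields results) so that it can be passed to wait() and as_completed().

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait

# Hypothetical module-level state; none of these names exist in the stdlib.
_lock = threading.Lock()
_default_thread_executor = None


def get_thread_executor():
    """Return the default thread pool, creating it lazily on first use."""
    global _default_thread_executor
    with _lock:
        if _default_thread_executor is None:
            _default_thread_executor = ThreadPoolExecutor()
        return _default_thread_executor


def set_thread_executor(executor):
    """Replace the default thread pool with a caller-supplied executor."""
    global _default_thread_executor
    with _lock:
        _default_thread_executor = executor


def tsubmit(fn, *args, **kwargs):
    """Submit a single call to the default thread pool."""
    return get_thread_executor().submit(fn, *args, **kwargs)


def tmap(fn, *iterables):
    """Submit one call per argument tuple and return the futures."""
    return [tsubmit(fn, *args) for args in zip(*iterables)]


done, not_done = wait(tmap(pow, [2, 3, 4], [2, 2, 2]))
results = sorted(f.result() for f in done)
```

A process-based pmap()/psubmit() pair would presumably follow the same shape with a ProcessPoolExecutor behind it.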
msg341634 - Author: Brian Quinlan (bquinlan) (Python committer) Date: 2019-05-06 19:58
Using a default executor could be dangerous because it could lead to deadlocks. For example:

mylib.py
--------

def my_func():
  tsubmit(...)
  tsubmit(...)
  tsubmit(somelib.some_func, ...)


somelib.py
----------

def some_func():
  tsubmit(...) # Potential deadlock if no more free threads.
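
The starvation can be reproduced in a small, self-terminating sketch. The assumptions are loud ones: tsubmit() is a hypothetical stand-in backed by a shared pool with a single worker, the outer task waits on the result of the task it submits, and a timeout is used so the script reports the problem instead of hanging forever.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Hypothetical shared default pool; one worker makes the problem easy to hit.
_default = ThreadPoolExecutor(max_workers=1)


def tsubmit(fn, *args):
    """Hypothetical module-level submit using the shared pool."""
    return _default.submit(fn, *args)


def some_func():
    return "inner done"


def my_func():
    # The only worker is busy running my_func, so some_func cannot start.
    inner = tsubmit(some_func)
    try:
        return inner.result(timeout=0.5)
    except FutureTimeout:
        # Without the timeout this wait would never return.
        return "starved"


outcome = tsubmit(my_func).result()
_default.shutdown()
```

With a larger pool the same situation arises once every worker is blocked waiting on work that is still queued behind it.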
msg341667 - Author: Josh Rosenberg (josh.r) (Python triager) Date: 2019-05-07 01:23
For the process-based versions, it makes it too easy to accidentally fork bomb yourself, since each process that calls psubmit would implicitly create another #CPUs workers of its own. A process-based version of Brian's case, with a mere four top-level psubmits each of which performs a single psubmit of its own, would logically involve 1 + #CPUs + #CPUs**2 total processes, without the user ever explicitly asking for them.

Especially in a fork-based context, this could easily trigger the Linux OOM-killer if the parent process is of even moderate size, since a four core system with a 1 GB parent process would suddenly be asking for up to 21 GB of memory. Most of that is only potentially used, given COW behavior, but the OOM killer assumes COW memory will eventually be copied (and it's largely right about that for CPython given the cyclic garbage collector's twiddling of reference counts), so it's hardly relevant if 21 GB is actually used; the OOM-killer doesn't care, and will murder the process anyway.
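
The arithmetic behind those figures, written out as a small sketch (the function name is made up for illustration, and the memory figure is the pessimistic case in which every forked process ends up with its own full copy of the parent):

```python
def implicit_process_count(cpus):
    """Parent + one pool of `cpus` workers + one pool per worker."""
    return 1 + cpus + cpus ** 2


cpus = 4        # four core system
parent_gb = 1   # size of the parent process in GB

total = implicit_process_count(cpus)
# Worst case: no copy-on-write page stays shared.
worst_case_gb = total * parent_gb
```

For four cores this gives 1 + 4 + 16 = 21 processes, hence the 21 GB figure.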

The alternative would be having the default process executor shared with the child processes, but that just means process usage would be subject to the same deadlocks as in Brian's threaded case.

This also defeats the purpose of the Executor model; Java, which pioneered it to my knowledge, intentionally required you to create the executor up front (typically in a single global location) because the goal is to allow you to change your program-wide parallelism model by changing a single line (the definition of the executor), with all uses of the executor remaining unchanged. Making a bunch of global functions implicitly tied to different executors/executor models means the parallelism is no longer centrally defined, so switching models means changes all over the code base (in Python, that's often unavoidable due to constraints involving pickling and data sharing, but there is no need to make it worse).
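
A minimal sketch of that centrally defined style, assuming a hypothetical make_executor() helper as the single place where the parallelism model is chosen:

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def make_executor() -> Executor:
    # The one line to change when switching models, for example to
    # ProcessPoolExecutor(max_workers=4), pickling constraints permitting.
    return ThreadPoolExecutor(max_workers=4)


def square(x):
    return x * x


def run_all(executor, data):
    """Call sites depend only on the Executor interface."""
    return list(executor.map(square, data))


with make_executor() as executor:
    results = run_all(executor, range(5))
```

The call sites never name a concrete executor class, which is what module-level tmap()/pmap() functions would give up.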
History
Date                 User       Action  Args
2019-05-07 01:23:19  josh.r     set     nosy: + josh.r; messages: + msg341667
2019-05-06 19:58:00  bquinlan   set     messages: + msg341634
2015-03-18 09:20:26  ned.deily  set     nosy: + bquinlan
2015-03-18 03:20:20  ncoghlan   create