classification
Title: concurrent.futures ProcessPoolExecutor submit() blocks on results being written
Type: performance
Stage: resolved
Components: Extension Modules
Versions: Python 3.6

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: bquinlan, dbarcay, pitrou, tomMoral
Priority: normal
Keywords:

Created on 2018-06-22 19:38 by dbarcay, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (6)
msg320257 - Author: Daniel Barcay (dbarcay) Date: 2018-06-22 19:38
I have tracked down the exact cause of a sizable performance issue when using concurrent.futures.ProcessPoolExecutor, especially visible in cases where large amounts of data are copied back in the result.

The line number causing the bad behavior and several possible remediations are included below. Since this affects core behavior of the module, I'm reluctant to try out a patch myself unless someone chimes in on the approach.

---Bug Symptoms:
  ProcessPoolExecutor.submit() hangs non-deterministically for long periods of time (over 20 seconds in my job). See the Bug Cause section below for the exact cause.
  This hanging makes it impossible to submit multiprocess jobs from a main thread with real-time constraints when the results are large objects.

---Ideal behavior:
   submit() should not block on any results of other jobs, and a non-blocking wake signal should be used instead of a blocking put() call.

---Bug Cause:
In ProcessPoolExecutor.submit() (line 473), a wake signal is sent to the management thread by posting a message to the result queue, which wakes the thread if it is blocked in recv().

I'm not even sure that this wake-up is necessary, as removing it seems to work just fine for my use case on OSX. However, let's presume that it is for the time being.

It is unnecessary for submit() to block on the result_queue being serviced, and doing so hinders large results from being sent back through Future.result().

---Possible remediations:

If a more fully-fledged Queue implementation were used, this signal could be replaced by the non-blocking version. Alternatively, the multiprocessing.SimpleQueue implementation used for the result queue could be extended to implement a non-blocking put().
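
A minimal sketch of the first remediation, assuming the result queue were a multiprocessing.Queue (which offers put_nowait() and hands writes to a background feeder thread). The helper name send_wake_signal is hypothetical and is not part of concurrent.futures.

```python
import multiprocessing
import queue


def send_wake_signal(result_queue):
    """Post a None wake-up marker without waiting.

    Returns True if the marker was queued, False if a bounded queue was full.
    This is an illustration only, not the code used by ProcessPoolExecutor.
    """
    try:
        # put_nowait() is equivalent to put(obj, block=False).
        result_queue.put_nowait(None)
        return True
    except queue.Full:
        # The caller decides whether to retry or to skip the wake-up.
        return False
```

With an unbounded multiprocessing.Queue() the call should always succeed; the queue.Full branch only matters if a maxsize is set.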


--- Reproduction Details
  I'm using concurrent.futures.ProcessPoolExecutor for a complicated data-processing use-case where the result is a large object to be sent across the result() channel. Create any such setup where the results are on the order of 50MB strings, submit 5-10 jobs at a time, and watch the time it takes to call submit().
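
The reproduction described above could be sketched roughly as follows. The function names make_payload and time_submits are hypothetical, and whether the stall actually shows up will depend on the Python version, platform, and payload size.

```python
import time


def make_payload(size_bytes):
    """Worker: return a large string so the result has to be sent back to the parent."""
    return "x" * size_bytes


def time_submits(executor, n_jobs, size_bytes):
    """Submit n_jobs tasks and measure how long each submit() call takes.

    Returns (latencies_in_seconds, result_lengths).
    """
    latencies = []
    futures = []
    for _ in range(n_jobs):
        start = time.perf_counter()
        futures.append(executor.submit(make_payload, size_bytes))
        latencies.append(time.perf_counter() - start)
    # Collect the results only after all submissions have been timed.
    result_lengths = [len(f.result()) for f in futures]
    return latencies, result_lengths
```

To approximate the report, pass a concurrent.futures.ProcessPoolExecutor as executor, with n_jobs between 5 and 10 and size_bytes around 50 * 1024 * 1024, from inside an `if __name__ == "__main__":` guard, then compare max(latencies) between Python 3.6 and 3.7.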
msg320258 - Author: Daniel Barcay (dbarcay) Date: 2018-06-22 19:42
Line number was incorrect due to local edits.

Correct line number is process.py:L464  "self._result_queue.put(None)"
msg320259 - Author: Daniel Barcay (dbarcay) Date: 2018-06-22 19:56
Adding experts bquinlan and pitrou for concurrent.futures to the nosy list, as per the bug tracker directions.
msg320377 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2018-06-24 17:50
I'm not sure what happens exactly in your workload, but waiting 20 seconds when posting some data on an unbounded queue sounds enormous.
msg320378 - Author: Thomas Moreau (tomMoral) Date: 2018-06-24 18:09
This behavior results from the fact that in 3.6, the result_queue is used to pass messages to the queue_manager_thread. This behavior has been changed in 3.7 as we rely on a _ThreadWakeup object.

In 3.6, when the result_queue is filled with many large objects, the call to result_queue.put(None) will hang while the previous objects are being handled by the queue_manager_thread, which adds latency to submit().
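
To illustrate the idea of a dedicated wake-up channel that is separate from the result queue, here is a minimal sketch built on a one-way multiprocessing pipe. The class name WakeupPipe and the helper wait_for_wakeup are hypothetical; this is not the CPython _ThreadWakeup implementation, only an example of the general technique.

```python
import multiprocessing
from multiprocessing.connection import wait


class WakeupPipe:
    """Hypothetical pipe-based wake-up object (illustration, not the CPython class)."""

    def __init__(self):
        # One-way pipe: the manager thread waits on the reader end.
        self.reader, self._writer = multiprocessing.Pipe(duplex=False)

    def wakeup(self):
        # Send an empty message; it does not queue behind large result objects.
        self._writer.send_bytes(b"")

    def clear(self):
        # Drain pending wake-ups so the next wait blocks again.
        while self.reader.poll():
            self.reader.recv_bytes()


def wait_for_wakeup(pipe, timeout=None):
    """Return True if a wake-up arrived before the timeout expired."""
    ready = wait([pipe.reader], timeout)
    return bool(ready)
```

In a design like this, the submitting thread calls wakeup() and returns immediately, while the manager thread would wait on both the wake-up reader and the result pipe and call clear() once it has woken up.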
msg320688 - Author: Daniel Barcay (dbarcay) Date: 2018-06-28 22:30
Just got the drop of the Python 3.7 release. I can confirm that this is fixed in Python 3.7 for my workload.

Nice job! Thanks for changing the mechanism of thread-sync. I'm grateful.
History
Date | User | Action | Args
2022-04-11 14:59:02 | admin | set | github: 78126
2018-06-28 22:30:04 | dbarcay | set | status: open -> closed; resolution: fixed; messages: + msg320688; stage: resolved
2018-06-24 18:09:16 | tomMoral | set | messages: + msg320378
2018-06-24 17:50:06 | pitrou | set | nosy: + tomMoral; messages: + msg320377
2018-06-22 19:56:58 | dbarcay | set | nosy: + bquinlan, pitrou; messages: + msg320259
2018-06-22 19:42:27 | dbarcay | set | messages: + msg320258
2018-06-22 19:38:44 | dbarcay | create |