I have tracked down the exact cause of a sizable performance issue in concurrent.futures.ProcessPoolExecutor, especially visible when large amounts of data are copied back as results.
The line number causing the bad behavior and several remediation paths are included below. Since this affects core behavior of the module, I'm reluctant to attempt a patch myself until someone weighs in on the approach.
---Bug Symptoms:
ProcessPoolExecutor.submit() hangs non-deterministically for long periods of time (over 20 seconds in my job). See the cause section below for the exact mechanism.
This hanging makes it impossible to submit multiprocess jobs from a real-time-constrained main thread when the results are large objects.
---Ideal behavior:
submit() should not block on the results of other jobs; a non-blocking wake signal should be used instead of a blocking put() call.
---Bug Cause:
In ProcessPoolExecutor.submit(), line 473, a wake signal is sent to the management thread by posting a message to the result queue, waking that thread if it was blocked in recv().
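The blocking comes down to the finite OS pipe buffer behind that queue: if workers are pushing large results through the same pipe, a writer stalls until a reader drains it. A minimal demonstration of that mechanism with a plain os.pipe (not the actual executor internals):

```python
import os
import threading
import time

r, w = os.pipe()
done = threading.Event()
TOTAL = 4096 * 1000  # ~4 MB, far beyond the typical 64 KiB pipe buffer


def writer():
    chunk = b"x" * 4096
    for _ in range(1000):
        os.write(w, chunk)  # stalls once the OS pipe buffer fills
    done.set()


t = threading.Thread(target=writer)
t.start()
time.sleep(0.2)
writer_stuck = not done.is_set()  # the writer is blocked in os.write()

received = 0
while received < TOTAL:           # draining the pipe unblocks the writer
    received += len(os.read(r, 1 << 16))
t.join()
writer_finished = done.is_set()
```

The same thing happens to submit()'s wake-up put(): it shares the pipe with in-flight 50 MB results and waits its turn behind them.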
I'm not even sure this wake-up is necessary, as removing it seems to work fine for my use case on macOS. However, let's presume that it is for the time being.
The fact that submit() blocks until the result_queue is serviced is unnecessary, and it hinders large results from being sent back via Future.result().
---Possible remediations:
If a more fully-fledged queue implementation were used, this signal could be sent with a non-blocking put. Alternatively, the multiprocessing queue implementation used here could be extended to implement a non-blocking put().
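One shape such a non-blocking wake signal could take is a dedicated self-pipe whose write end never blocks: if the pipe is already full, a wake-up is already pending, so the byte can simply be dropped. This is only a sketch; the class name and methods are illustrative, not an existing stdlib API:

```python
import os
import select


class ThreadWakeup:
    """Illustrative self-pipe wake-up channel (not the stdlib's internals)."""

    def __init__(self):
        self._reader, self._writer = os.pipe()
        os.set_blocking(self._writer, False)  # wakeup() must never stall

    def wakeup(self):
        try:
            os.write(self._writer, b"\0")
        except BlockingIOError:
            pass  # pipe already full: a wake-up is already pending

    def clear(self):
        # drain all pending wake-up bytes without blocking
        while select.select([self._reader], [], [], 0)[0]:
            os.read(self._reader, 4096)

    def wait(self, timeout=None):
        # returns True if a wake-up is pending
        return bool(select.select([self._reader], [], [], timeout)[0])
```

The management thread could then multiplex this reader with the result pipe in a single select() call, so submit() only ever performs a non-blocking write regardless of how much result data is in flight.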
---Reproduction Details:
I'm using concurrent.futures.ProcessPoolExecutor for a complicated data-processing use case where each result is a large object sent back across the result() channel. Create any such setup where the results are on the order of 50 MB strings, submit 5-10 jobs at a time, and measure how long each call to submit() takes.