Classification
Title: max_workers argument to concurrent.futures.ProcessPoolExecutor is not flexible enough
Type: enhancement    Stage:
Components: Library (Lib)    Versions: Python 3.8

Process
Status: open    Resolution:
Dependencies:    Superseder:
Assigned To:    Nosy List: aeros, bquinlan, pitrou, sam-s
Priority: normal    Keywords:

Created on 2020-02-12 17:21 by sam-s, last changed 2020-02-20 21:02 by sam-s.

Files
run-at-load-avg.py (uploaded by sam-s, 2020-02-17 00:24)
Messages (6)
msg361905 - Author: sds (sam-s) Date: 2020-02-12 17:21
The number of workers (max_workers) I want to use often depends on the server load.
Imagine this scenario: I have 64 CPUs and I need to run 200 processes.
However, others are using the server too, so currently loadavg is 50, thus I will set `max_workers` to (say) 20. 
But 2 hours later when those 20 processes are done, loadavg is now 0 (because the 50 processes run by my colleagues are done too), so I want to increase the pool size max_workers to 70.
It would be nice if it were possible to adjust the pool size depending on the server loadavg when a worker is started.
Basically, the intent is maintaining a stable load average and full resource utilization.
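The sizing heuristic implied by this scenario might be sketched as follows (this is purely illustrative; `suggested_workers` is not an existing or proposed API, and `os.getloadavg()` is Unix-only):

```python
import os

def suggested_workers(target_load=None):
    """Heuristic sketch: size the pool so that the current 1-minute load
    average plus the new workers roughly fills the machine."""
    ncpus = os.cpu_count() or 1
    target = float(ncpus) if target_load is None else target_load
    load = os.getloadavg()[0]  # 1-minute load average (Unix-only)
    return max(1, int(target - load))
```

With the numbers from the message (64 CPUs, load average 50) this yields 14, in the same ballpark as the 20 chosen above; once the load drops toward 0, the same formula allows a much larger pool.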
msg361906 - Author: sds (sam-s) Date: 2020-02-12 17:25
cf https://github.com/joblib/joblib/issues/1006
msg361956 - Author: sds (sam-s) Date: 2020-02-13 14:26
cf https://github.com/joblib/loky/issues/233
msg362065 - Author: Kyle Stanley (aeros) (Python committer) Date: 2020-02-16 10:12
So, essentially, are you looking for a way to dynamically adjust ProcessPoolExecutor's (PPE) max_workers, rather than just upon initialization?

This seems like it would be a very reasonable enhancement to the Executor API. Specifically for PPE, it would involve expanding the call queue, which is where pending work items are moved just before being executed by the worker processes (results then go to a separate results queue). In order for futures submitted to the executor to remain cancellable, the call queue has to have a fixed upper limit based on max_workers (but not one so low that workers sit idle).

Fortunately, we are able to adjust the *maxsize* of the call queue in real time, since maxsize is part of the public API for Queue (rather than having to create a new queue and copy all of the elements over).
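For what it's worth, the behavior referenced here can be demonstrated with the thread-safe `queue.Queue`, whose `maxsize` is a plain documented attribute that `put()` re-checks on every call (whether PPE's internal multiprocessing-based call queue supports the same adjustment is a separate question):

```python
import queue

q = queue.Queue(maxsize=1)
q.put("a")               # queue is now full
q.maxsize = 2            # raise the limit on the live queue
q.put("b", block=False)  # succeeds; would raise queue.Full at maxsize=1
print(q.qsize())         # 2
```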

Also, for this to be of real benefit, we would have to join all of the processes that are no longer being used. That would likely take some time to implement properly, but without it there wouldn't be much point in dynamically lowering max_workers; you'd still be spending resources keeping a bunch of idle processes alive.

I also suspect that it would require fairly extensive tests to ensure its stability, and that it would be decently involved to implement in a way that doesn't negatively impact or modify the existing behavior of ProcessPoolExecutor. But with some time and effort, I suspect this feature is feasible.

Assuming Antoine P. and Brian Q. (the primary maintainers/experts for concurrent.futures) are on board with implementing this feature, I would be willing to look into working on it when I get the chance.
msg362113 - Author: sds (sam-s) Date: 2020-02-17 00:24
I don't think you need this complexity: just keep the pool at its maximum size and submit jobs only when the loadavg drops below the threshold.
See my implementation attached.
msg362351 - Author: sds (sam-s) Date: 2020-02-20 21:02
On closer observation, I think you are eminently right.
Idle workers take far too much RAM.
In fact, I would like to be able to specify that workers are to be spun up on demand only and terminated immediately when they are done.
History
Date                 User   Action  Args
2020-02-20 21:02:20  sam-s  set     messages: + msg362351
2020-02-17 00:24:04  sam-s  set     files: + run-at-load-avg.py
                                    messages: + msg362113
2020-02-16 10:12:42  aeros  set     nosy: + bquinlan, pitrou
2020-02-16 10:12:25  aeros  set     nosy: + aeros
                                    messages: + msg362065
2020-02-13 14:26:39  sam-s  set     messages: + msg361956
2020-02-12 17:25:17  sam-s  set     messages: + msg361906
2020-02-12 17:21:04  sam-s  create