Classification
Title: max_workers argument to concurrent.futures.ProcessPoolExecutor is not flexible enough
Type: enhancement    Stage:
Components: Library (Lib)    Versions: Python 3.8

Process
Status: open    Resolution:
Dependencies:    Superseder:
Assigned To:    Nosy List: aeros, bquinlan, pitrou, sam-s
Priority: normal    Keywords:

Created on 2020-02-12 17:21 by sam-s, last changed 2020-02-20 21:02 by sam-s.

Files
run-at-load-avg.py (uploaded by sam-s, 2020-02-17 00:24)
Messages (6)
msg361905 - Author: sds (sam-s) Date: 2020-02-12 17:21
The number of workers (max_workers) I want to use often depends on the server load.
Imagine this scenario: I have 64 CPUs and I need to run 200 processes.
However, others are using the server too, so currently loadavg is 50, thus I will set `max_workers` to (say) 20. 
But 2 hours later when those 20 processes are done, loadavg is now 0 (because the 50 processes run by my colleagues are done too), so I want to increase the pool size max_workers to 70.
It would be nice if it were possible to adjust the pool size depending on the server loadavg when a worker is started.
Basically, the intent is maintaining a stable load average and full resource utilization.
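The sizing heuristic implied by this scenario might be sketched as follows (this is purely illustrative; `suggested_workers` is not an existing or proposed API, and `os.getloadavg()` is Unix-only):

```python
import os

def suggested_workers(target_load=None):
    """Heuristic sketch: size the pool so that the current 1-minute load
    average plus the new workers roughly fills the machine."""
    ncpus = os.cpu_count() or 1
    target = float(ncpus) if target_load is None else target_load
    load = os.getloadavg()[0]  # 1-minute load average (Unix-only)
    return max(1, int(target - load))
```

With the numbers from the message (64 CPUs, load average 50) this yields 14, in the same ballpark as the 20 chosen above; once the load drops toward 0, the same formula allows a much larger pool.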
msg361906 - Author: sds (sam-s) Date: 2020-02-12 17:25
cf https://github.com/joblib/joblib/issues/1006
msg361956 - Author: sds (sam-s) Date: 2020-02-13 14:26
cf https://github.com/joblib/loky/issues/233
msg362065 - Author: Kyle Stanley (aeros) (Python committer) Date: 2020-02-16 10:12
So, essentially, are you looking for a way to dynamically adjust ProcessPoolExecutor's (PPE) max_workers, rather than just upon initialization?

This seems like it would be a very reasonable enhancement to the Executor API. Specifically for PPE, it would involve expanding the call queue, which is where pending work items are moved just before being executed by the worker processes (results then go to a separate results queue). In order for futures submitted to the executor to remain cancellable, the call queue has to have a fixed upper limit based on max_workers (but not one so low that workers sit idle).

Fortunately, we are able to adjust the *maxsize* of the call queue in real time, since maxsize is part of the public API for Queue (rather than having to create a new queue and copy all of the elements over).
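For what it's worth, the behavior referenced here can be demonstrated with the thread-safe `queue.Queue`, whose `maxsize` is a plain documented attribute that `put()` re-checks on every call (whether PPE's internal multiprocessing-based call queue supports the same adjustment is a separate question):

```python
import queue

q = queue.Queue(maxsize=1)
q.put("a")               # queue is now full
q.maxsize = 2            # raise the limit on the live queue
q.put("b", block=False)  # succeeds; would raise queue.Full at maxsize=1
print(q.qsize())         # 2
```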

Also, for this to be of real benefit, we would have to join all of the processes that are no longer being used. That would likely take some time to implement properly, but without it there wouldn't be much point in dynamically lowering max_workers; you'd still be spending resources keeping a bunch of idle processes alive.

I also suspect that it would require fairly extensive tests to ensure its stability, and that it would be decently involved to implement in a way that doesn't negatively impact or modify the existing behavior of ProcessPoolExecutor. But with some time and effort, I suspect this feature is feasible.

Assuming Antoine P. and Brian Q. (the primary maintainers/experts for concurrent.futures) are on board with implementing this feature, I would be willing to look into working on it when I get the chance.
msg362113 - Author: sds (sam-s) Date: 2020-02-17 00:24
I don't think you need this complexity: just keep the pool at its maximum size and submit jobs only when the loadavg drops below the threshold.
See my implementation attached.
msg362351 - Author: sds (sam-s) Date: 2020-02-20 21:02
On closer observation, I think you are eminently right.
Idle workers take far too much RAM.
In fact, I would like to be able to specify that workers are to be spun up on demand only and terminated immediately when they are done.
History
Date                 User   Action  Args
2020-02-20 21:02:20  sam-s  set     messages: + msg362351
2020-02-17 00:24:04  sam-s  set     files: + run-at-load-avg.py
                                    messages: + msg362113
2020-02-16 10:12:42  aeros  set     nosy: + bquinlan, pitrou
2020-02-16 10:12:25  aeros  set     nosy: + aeros
                                    messages: + msg362065
2020-02-13 14:26:39  sam-s  set     messages: + msg361956
2020-02-12 17:25:17  sam-s  set     messages: + msg361906
2020-02-12 17:21:04  sam-s  create