classification
Title: asyncio uses too many threads by default
Type: resource usage
Stage: resolved
Components: asyncio
Versions: Python 3.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Epic_Wink, Vojtěch Boček, asvetlov, inada.naoki, yselivanov
Priority: normal
Keywords: patch

Created on 2018-11-19 14:26 by Vojtěch Boček, last changed 2019-05-28 12:03 by inada.naoki. This issue is now closed.

Pull Requests
URL Status Linked
PR 13618 merged inada.naoki, 2019-05-28 10:16
Messages (9)
msg330101 - (view) Author: Vojtěch Boček (Vojtěch Boček) Date: 2018-11-19 14:26
By default, asyncio spawns as many as os.cpu_count() * 5 threads to run I/O on. When combined with beefy machines (e.g. Kubernetes servers) with, say, 56 cores, it results in very high memory usage.

This is amplified by the fact that the `concurrent.futures.ThreadPoolExecutor` threads are never killed, and are not re-used until `max_workers` threads are spawned.

Workaround:

    loop.set_default_executor(concurrent.futures.ThreadPoolExecutor(max_workers=8))

This is still not ideal as the program might not need max_workers threads, but they are still spawned anyway.
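
As a rough sketch, the workaround could look like this in a complete program. The helper names `blocking_io` and `run_with_bounded_pool`, the `bounded` thread name prefix, and the task count are illustrative assumptions, not part of the report:

```python
import asyncio
import concurrent.futures
import threading


def blocking_io() -> str:
    # Stand-in for a blocking call; reports which worker thread ran it.
    return threading.current_thread().name


async def _main(max_workers: int, n_tasks: int) -> set:
    loop = asyncio.get_running_loop()
    # Replace asyncio's default executor with an explicitly bounded pool.
    executor = concurrent.futures.ThreadPoolExecutor(
        max_workers=max_workers, thread_name_prefix="bounded"
    )
    loop.set_default_executor(executor)
    # Passing None selects the default executor installed above.
    names = await asyncio.gather(
        *(loop.run_in_executor(None, blocking_io) for _ in range(n_tasks))
    )
    return set(names)


def run_with_bounded_pool(max_workers: int = 8, n_tasks: int = 32) -> set:
    # Hypothetical entry point: returns the names of the threads that did work.
    return asyncio.run(_main(max_workers, n_tasks))
```

However many tasks are submitted, the set of worker thread names should never grow beyond `max_workers`.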

I've hit this issue when running an asyncio program in Kubernetes. It created 260 idle threads and then ran out of memory.

I think the default max_workers should be limited to some max value and ThreadPoolExecutor should not spawn new threads unless necessary.
msg340013 - (view) Author: Laurie Opperman (Epic_Wink) * Date: 2019-04-12 08:22
What about making it dependent on memory as well as logical processor count:

`n_workers = min(RAM_GB / some_number, N_CORES * 5)`
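
A minimal sketch of that formula, assuming the caller already knows the available RAM in GB. The function name `suggest_workers` and the `gb_per_worker` parameter (standing in for `some_number`, which is left unspecified here) are hypothetical, as is the floor of one worker:

```python
def suggest_workers(ram_gb: float, n_cores: int, gb_per_worker: float) -> int:
    """Sketch of n_workers = min(RAM_GB / some_number, N_CORES * 5)."""
    if gb_per_worker <= 0:
        raise ValueError("gb_per_worker must be positive")
    # How many workers the memory budget allows (assumed budget per worker).
    by_memory = int(ram_gb // gb_per_worker)
    # The core-based limit that was the default at the time.
    by_cores = n_cores * 5
    # Always allow at least one worker, even on very small machines.
    return max(1, min(by_memory, by_cores))
```

With an assumed budget of 0.25 GB per worker, a 56-core container limited to 4 GB of RAM would be capped by memory rather than by core count.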
msg340015 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-04-12 08:28
The Node.js default thread pool size is 4, regardless of the number of cores.
https://nodejs.org/api/cli.html#cli_uv_threadpool_size_size

Since we have the GIL, I think a fixed-size pool is a better idea.
msg343688 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2019-05-27 20:23
asyncio uses a bare concurrent.futures.ThreadPoolExecutor.
The question is: should asyncio reduce the number of threads in the pool, or should concurrent.futures change its default value?
msg343733 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-05-28 05:15
The current default value was decided here:
https://bugs.python.org/review/21527/#ps11902

It seems there was no strong reason for the current cpu_count * 5.
I think cpu_count + 4 is a better default value, because:

* When people use a thread pool for CPU-heavy jobs that release the GIL, workers >= cpu_count is good.
* When people use a thread pool for multiplexing I/O, the best number of workers varies with the workload, but I think 4 to 16 is good for the typical case.
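
To make the difference concrete, here is a small sketch comparing the two formulas. The function names `old_default` and `proposed_default` are illustrative only, and the core counts are arbitrary examples apart from the 56-core machine mentioned in the report:

```python
def old_default(cpu_count: int) -> int:
    # Default at the time of the report: five workers per core.
    return cpu_count * 5


def proposed_default(cpu_count: int) -> int:
    # Proposal in this message: one worker per core, plus four.
    return cpu_count + 4


for cores in (4, 16, 56):
    print(cores, old_default(cores), proposed_default(cores))
```

For a 56-core machine this is the difference between 280 and 60 potential worker threads.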



> This is amplified by the fact that the `concurrent.futures.ThreadPoolExecutor` threads are never killed, and are not re-used until `max_workers` threads are spawned.

Note that this is fixed by #24882.
msg343737 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2019-05-28 07:00
I'm OK with changing the default limit on the number of threads.
I'm not sure about the numbers.
If you want to limit it to 16-20, that may be OK, but `cpu_count + 4` doesn't work in that case. On cloud servers, I very often see 128 or even more cores. 160+4 is surely not what you want to propose.

I insist on changing the default calculation schema in concurrent.futures, not in asyncio. There is no case for asyncio to be exceptional.
msg343740 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-05-28 07:17
> If you want to limit it to 16-20, that may be OK, but `cpu_count + 4` doesn't work in that case. On cloud servers, I very often see 128 or even more cores. 160+4 is surely not what you want to propose.


I proposed cpu_count + 4 because #24882 mostly fixed the problem of a large max_workers.
If you don't like it, how about min(32, cpu_count+4)?


> I insist on changing the default calculation schema in concurrent.futures, not in asyncio. There is no case for asyncio to be exceptional.

Makes sense.
msg343743 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2019-05-28 07:26
> how about min(32, cpu_count+4)?

I think it produces reasonably good numbers for any CPU count.
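
A short sketch of the capped formula, to show the numbers it yields. The function name `capped_default` is hypothetical, and falling back to 1 when `os.cpu_count()` returns `None` is an assumption of this sketch:

```python
import os


def capped_default(cpu_count=None) -> int:
    """Sketch of min(32, cpu_count + 4)."""
    if cpu_count is None:
        # os.cpu_count() may return None; assume a single core in that case.
        cpu_count = os.cpu_count() or 1
    return min(32, cpu_count + 4)


for cores in (1, 4, 28, 128, 160):
    print(cores, capped_default(cores))
```

The cap takes effect from 28 cores upward, so machines with 128 or 160 cores get the same limit of 32 workers.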

Do you have time to prepare a pull request?
msg343766 - (view) Author: Inada Naoki (inada.naoki) * (Python committer) Date: 2019-05-28 12:02
New changeset 9a7e5b1b42abcedb895b1ce49d83fe067d01835c by Inada Naoki in branch 'master':
bpo-35279: reduce default max_workers of ThreadPoolExecutor (GH-13618)
https://github.com/python/cpython/commit/9a7e5b1b42abcedb895b1ce49d83fe067d01835c
History
Date User Action Args
2019-05-28 12:03:20 inada.naoki set status: open -> closed
resolution: fixed
stage: patch review -> resolved
2019-05-28 12:02:59 inada.naoki set messages: + msg343766
2019-05-28 10:16:47 inada.naoki set keywords: + patch
stage: patch review
pull_requests: + pull_request13520
2019-05-28 07:26:21 asvetlov set messages: + msg343743
2019-05-28 07:17:02 inada.naoki set messages: + msg343740
2019-05-28 07:00:45 asvetlov set messages: + msg343737
2019-05-28 05:15:23 inada.naoki set messages: + msg343733
2019-05-27 20:23:58 asvetlov set messages: + msg343688
2019-04-12 08:28:12 inada.naoki set messages: + msg340015
2019-04-12 08:22:15 Epic_Wink set nosy: + Epic_Wink
messages: + msg340013
2019-04-12 07:12:02 inada.naoki set nosy: + inada.naoki
2018-11-19 14:26:55 Vojtěch Boček create