Message356586
> I understand that there's *some* overhead associated with spawning a new thread, but from my impression it's not substantial enough to make a significant impact in most cases.
Although I think this still stands to some degree, I will have to rescind the following:
> Each individual instance of threading.Thread is only 64 bytes.
The 64 bytes was measured by `sys.getsizeof(threading.Thread())`, which only provides a surface-level assessment: it reports the memory directly attributed to the Thread object itself, not the memory used by the objects it refers to.
To get a better estimate, I implemented a custom get_size() function that recursively adds the size of the object and all unique objects from gc.get_referents() (ignoring several redundant and/or unnecessary types). For more details, see https://gist.github.com/aeros/632bd035b6f95e89cdf4bb29df970a2a. Feel free to critique it if there are any apparent issues (for the purpose of measuring the size of threads).
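For readers who don't want to open the gist, here is a minimal sketch of the general technique described above. It is not the code from the gist: the name `deep_getsizeof` and the set of excluded types are assumptions chosen for illustration, so the numbers it produces may differ from the get_size() results in this message.

```python
import gc
import sys
from types import FunctionType, ModuleType

# Types skipped during traversal. This set is an assumption for the sketch;
# the gist may exclude a different set of types.
EXCLUDED_TYPES = (type, ModuleType, FunctionType)


def deep_getsizeof(obj):
    """Sum sys.getsizeof() over obj and every unique object reachable from it."""
    seen = set()      # id() of every object already counted
    total = 0
    pending = [obj]   # objects still waiting to be visited
    while pending:
        current = pending.pop()
        if isinstance(current, EXCLUDED_TYPES) or id(current) in seen:
            continue
        seen.add(id(current))
        total += sys.getsizeof(current)
        # gc.get_referents() returns the objects directly referred to by current.
        pending.extend(gc.get_referents(current))
    return total


shared = "a" * 1000
pair = [shared, shared]
# The shared string is only counted once, on top of the list itself.
print(deep_getsizeof(pair) == sys.getsizeof(pair) + sys.getsizeof(shared))
```

Tracking `id()` values is what keeps objects shared between several referrers from being counted more than once within a single call.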
Then, I used this function on three different threads to figure out how much memory was needed for each one:
Python 3.8.0+ (heads/3.8:1d2862a323, Nov 4 2019, 06:59:53)
[GCC 9.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import threading
>>> from get_size import get_size
>>> a = threading.Thread()
>>> b = threading.Thread()
>>> c = threading.Thread()
>>> get_size(a)
3995
>>> get_size(b)
1469
>>> get_size(c)
1469
1469 bytes seems to be roughly the amount of additional memory required for each new thread, at least on Linux kernel 5.3.8 and Python 3.8. I don't know if this is 100% accurate, but it at least provides an improved estimate over sys.getsizeof().
> But it spawns a new Python thread per process which can be a blocker issue if a server memory is limited. What if you want to spawn 100 processes? Or 1000 processes? What is the memory usage?
From my understanding, ~1.5KB/thread seems to be quite negligible for most modern equipment. The server's memory would have to be very limited for spawning an additional 1000 threads to be a bottleneck/blocker issue:
Python 3.8.0+ (heads/3.8:1d2862a323, Nov 4 2019, 06:59:53)
[GCC 9.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import threading
>>> from get_size import get_size
>>> threads = []
>>> for _ in range(1000):
...     th = threading.Thread()
...     threads.append(th)
...
>>> get_size(threads)
1482435
(~1.5MB)
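As a quick sanity check on the arithmetic, the total above can be divided by the thread count. This only reuses the figure from the session; note that the total also includes the list container itself, so it is a rough per-thread average rather than an exact per-thread cost.

```python
total_bytes = 1_482_435   # get_size(threads) from the session above
thread_count = 1000

per_thread = total_bytes / thread_count
print(f"{per_thread:.1f} bytes per thread")    # 1482.4 bytes per thread
print(f"{total_bytes / 1e6:.2f} MB in total")  # 1.48 MB in total
```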
Victor (or anyone else), in your experience, would the additional ~1.5KB per process be an issue for 99% of production servers? If not, it seems to me like the additional maintenance cost of keeping SafeChildWatcher and FastChildWatcher in asyncio's API wouldn't be worthwhile.
Date | User | Action | Args
2019-11-14 08:30:03 | aeros | set | recipients: + aeros, vstinner, benjamin.peterson, asvetlov, yselivanov
2019-11-14 08:30:02 | aeros | set | messageid: <1573720202.88.0.990406187175.issue38591@roundup.psfhosted.org>
2019-11-14 08:30:02 | aeros | link | issue38591 messages
2019-11-14 08:30:01 | aeros | create |