Issue 37193: Memory leak while running TCP/UDPServer with socketserver.ThreadingMixIn

➜

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/81374

classification

Title:	Memory leak while running TCP/UDPServer with socketserver.ThreadingMixIn
Type:	resource usage	Stage:	resolved
Components:	Library (Lib)	Versions:	Python 3.10, Python 3.9, Python 3.8, Python 3.7

process

Status:	closed	Resolution:	fixed
Dependencies:		Superseder:
Assigned To:		Nosy List:	Wei Li, jaraco, martin.panter, maru-n, miss-islington, ned.deily, pablogsal, vstinner
Priority:	normal	Keywords:	3.7regression, patch

Created on 2019-06-07 11:53 by maru-n, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name	Uploaded	Description	Edit
threadingmixin_memory_usage.png	maru-n, 2019-06-07 11:53

Pull Requests
URL	Status	Linked	Edit
PR 13893	merged	maru-n, 2019-06-07 13:02
PR 23087	closed	miss-islington, 2020-11-01 23:51
PR 23088	closed	miss-islington, 2020-11-01 23:51
PR 23089	closed	miss-islington, 2020-11-01 23:51
PR 23107	merged	jaraco, 2020-11-02 13:56
PR 23127	merged	jaraco, 2020-11-03 14:04
PR 24032	closed	miss-islington, 2021-01-01 00:37
PR 24033	closed	miss-islington, 2021-01-01 00:37
PR 24749	merged	miss-islington, 2021-03-04 15:49
PR 24750	merged	miss-islington, 2021-03-04 15:49

Messages (23)
msg344926 - (view)	Author: Norihiro Maruyama (maru-n) *	Date: 2019-06-07 11:53
UDP/TCPServer with socketserver.ThreadingMixin class (also ThreadingTCPServer and ThreadingUDPServer class) seems to be memory leak while running the server. https://docs.python.org/3/library/socketserver.html#socketserver.ThreadingMixIn My code which wrote to check this is the following. ``` class ThreadedTCPRequestHandler(socketserver.BaseRequestHandler): def handle(self): data = str(self.request.recv(1024), 'ascii') cur_thread = threading.current_thread() response = bytes("{}: {}".format(cur_thread.name, data), 'ascii') self.request.sendall(response) if __name__ == "__main__": HOST, PORT = "localhost", 0 server = socketserver.ThreadingTCPServer((HOST, PORT), ThreadedTCPRequestHandler) server.daemon_threads = False server.block_on_close = True with server: ip, port = server.server_address server_thread = threading.Thread(target=server.serve_forever) server_thread.daemon = True server_thread.start() for i in range(1000): with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock: sock.connect(server.server_address) sock.sendall(bytes("hello", 'ascii')) response = str(sock.recv(1024), 'ascii') print("Received: {}".format(response)) time.sleep(0.01) server.server_close() server.shutdown() ``` ( I wrote this based on https://docs.python.org/3/library/socketserver.html#asynchronous-mixins) Then I checked memory usage with profiling tool. (I used memory-profiler module https://pypi.org/project/memory-profiler/) $ mprof run python mycode.py $ mprof plot I attached result plot. And also I checked this also more long time and I found memory usage was increased endlessly. My environment is Hardware: MacBook Pro (15-inch, 2018) OS: MacOS 10.14 Python 3.7.1 I guess it caused by a thread object is not released in spite of the thread finished to process request and thread object will be made infinitely until server_close() is called.
msg345606 - (view)	Author: Wei Li (Wei Li)	Date: 2019-06-14 16:07
I got the same problem when uing the ThreadingTCPServer. I think adding "self._threads = list(filter(lambda x: x.is_alive(), self._threads))" at the last line in process_request method is a potential way to fix the bug
msg345796 - (view)	Author: Martin Panter (martin.panter) *	Date: 2019-06-17 01:47
Looking at the code, this would be caused by Issue 31233. I expect 3.7+ is affected. 3.6 has similar code, but the leaking looks to be disabled by default. 2.7 doesn't collect a "_threads" list at all. Looks like Victor was aware of the leak when he changed the code: <https://bugs.python.org/issue31233#msg304619>, but maybe he pushed the code and then forgot about the problem. A possible problem with Norihiro's solution is modifying the "_threads" list from multiple threads without any synchronization. (Not sure if that is a problem, or is it guaranteed to be okay due to GIL etc?) Also, since the thread is removing itself from the list, it will still run a short while after the removal, so there is a window when the "server_close" method will not wait for that thread. Might also defeat the "dangling thread" accounting that I believe was Victor's motivation for his change. Wei's proposal is to check for cleaning up when a new request is handled. That relies on a new request coming in to free up memory. Perhaps we could use similar strategy to the Forking mixin, which I believe cleans up expired children periodically, without relying on a new request.
msg345815 - (view)	Author: STINNER Victor (vstinner) *	Date: 2019-06-17 08:16
PR 13893 with an additional lock sounds like a reasonable solution. The code should be skipped if the thread is a daemon thread.
msg345817 - (view)	Author: STINNER Victor (vstinner) *	Date: 2019-06-17 08:20
Martin Panter: In addition to PR 13893 change, what do you think of also using a weakref? It might make the code even more reliable if something goes wrong.
msg346413 - (view)	Author: STINNER Victor (vstinner) *	Date: 2019-06-24 15:01
I marked bpo-37389 as a duplicate of this issue: """ msg346410 - (view) Author: Daniel W Forsyth (danf@dataforge.on.ca) Date: 2019-06-24 14:53 After putting a basic ThreadingUDPServer under load (500 messages per/second) I noticed that after a night it was consuming a lot of RAM given it does nothing with the data. On inception, I noticed the _thread count inside the server was growing forever even though the sub-threads are done. Setup a basic ThreadingUDPSever with handler that does nothing and check the request_queue_size, it seems to grow without limit. msg346411 - (view) Author: Daniel W Forsyth (danf@dataforge.on.ca) Date: 2019-06-24 14:59 The only way I could figure out to control it was to do this in a thread; for thread in server._threads: # type: Thread if not thread.is_alive(): server._threads.remove(thread) Shouldn't the server process do this when the thread is done? """
msg358133 - (view)	Author: Jason R. Coombs (jaraco) *	Date: 2019-12-09 19:26
This issue was also [reported in prometheus client](https://salsa.debian.org/debian/python-prometheus-client/commit/5aa256d8aab3b81604b855dc03f260342fc391fb) where the workaround was to use daemon threads.
msg358457 - (view)	Author: Martin Panter (martin.panter) *	Date: 2019-12-15 22:37
Another workaround might be to set the new "block_on_close" flag (Issue 33540) to False on the server subclass or instance. Victor: Replying to <https://bugs.python.org/issue37193#msg345817> "What do I think of also using a weakref?", I assume you mean maintaining "_threads" as a WeakSet rather than a list object. That seems a nice way to solve the problem, but it seems redundant to me if other code such as Maru's proposal was also added to clean up the list.
msg358647 - (view)	Author: Martin Panter (martin.panter) *	Date: 2019-12-18 22:08
FTR I have been trialling a patched Python 3.7 based on Maru's changes (revision 6ac217c) + review suggestions, and it has reduced the size of the leak (hit 1 GB over a couple days, vs only 60 MB increase over three days). The remaining leak could be explained by Issue 37788.
msg371339 - (view)	Author: Ned Deily (ned.deily) *	Date: 2020-06-12 08:31
I note this is marked as a 3.7regression and still open. Since the cutoff for the final 3.7 bugfix mode release is in a few days, I'm assuming this means that 3.7 users will have to live with this regression. If you feel that is a problem, speak up now.
msg371431 - (view)	Author: Jason R. Coombs (jaraco) *	Date: 2020-06-13 01:36
Thanks for the notice Ned. I've revived the PR and addressed all the comments from Victor. Any chance this can get into Python 3.7?
msg371433 - (view)	Author: Ned Deily (ned.deily) *	Date: 2020-06-13 03:53
> Any chance this can get into Python 3.7? Perhaps but there's a lot that needs to be done yet. Like any bugfix, it needs to be reviewed, merged to master, and get some buildbot exposure first before it is backported anywhere.
msg380168 - (view)	Author: Jason R. Coombs (jaraco) *	Date: 2020-11-01 23:51
New changeset c41559021213cfc9dc62a83fc63306b3bdc3e64b by MARUYAMA Norihiro in branch 'master': bpo-37193: remove thread objects which finished process its request (GH-13893) https://github.com/python/cpython/commit/c41559021213cfc9dc62a83fc63306b3bdc3e64b
msg380172 - (view)	Author: Pablo Galindo Salgado (pablogsal) *	Date: 2020-11-02 01:43
Commit c41559021213cfc9dc62a83fc63306b3bdc3e64b has introduced reference leaks: ---------------------------------------------------------------------- Ran 202 tests in 21.654s OK (skipped=1) ...... test_logging leaked [20, 20, 20] references, sum=60 test_logging leaked [20, 20, 20] memory blocks, sum=60 2 tests failed again: test_logging test_socketserver == Tests result: FAILURE then FAILURE == Example buildbot failure: https://buildbot.python.org/all/#/builders/562/builds/79/steps/5/logs/stdio As there is a release of 3.10 alpha 2 tomorrow I would be great if this could be fixed by tomorrow.
msg380209 - (view)	Author: STINNER Victor (vstinner) *	Date: 2020-11-02 11:43
The change fixing a leak in socketserver introduces a leak in socketserver :-) $ ./python -m test test_socketserver -u all -m test.test_socketserver.SocketServerTest.test_ThreadingTCPServer -R 3:3 0:00:00 load avg: 0.95 Run tests sequentially 0:00:00 load avg: 0.95 [1/1] test_socketserver beginning 6 repetitions 123456 ...... test_socketserver leaked [3, 3, 3] references, sum=9 test_socketserver leaked [3, 3, 3] memory blocks, sum=9 test_socketserver failed == Tests result: FAILURE == 1 test failed: test_socketserver Total duration: 497 ms Tests result: FAILURE
msg380212 - (view)	Author: STINNER Victor (vstinner) *	Date: 2020-11-02 11:49
I rejected the backport to 3.8 and 3.9 since the change causes a regression on master.
msg380217 - (view)	Author: Jason R. Coombs (jaraco) *	Date: 2020-11-02 13:10
I recommend a rollback. I’ll try to get to it later today.
msg380237 - (view)	Author: Pablo Galindo Salgado (pablogsal) *	Date: 2020-11-02 16:49
New changeset aca67da4fe68d5420401ac1782203d302875eb27 by Jason R. Coombs in branch 'master': Revert "bpo-37193: remove thread objects which finished process its request (GH-13893)" (GH-23107) https://github.com/python/cpython/commit/aca67da4fe68d5420401ac1782203d302875eb27
msg380354 - (view)	Author: Jason R. Coombs (jaraco) *	Date: 2020-11-04 19:43
I filed issue42263 to capture the underlying cause of the memory leak that led to the buildbot failures and the rollback.
msg384136 - (view)	Author: Pablo Galindo Salgado (pablogsal) *	Date: 2020-12-31 20:19
New changeset b5711c940f70af89f2b4cf081a3fcd83924f3ae7 by Jason R. Coombs in branch 'master': bpo-37193: Remove thread objects which finished process its request (GH-23127) https://github.com/python/cpython/commit/b5711c940f70af89f2b4cf081a3fcd83924f3ae7
msg388101 - (view)	Author: miss-islington (miss-islington)	Date: 2021-03-04 16:36
New changeset 0e76157b0ca70bd38157fed56680a7752a52668d by Miss Islington (bot) in branch '3.9': [3.9] bpo-37193: Remove thread objects which finished process its request (GH-23127) (GH-24750) https://github.com/python/cpython/commit/0e76157b0ca70bd38157fed56680a7752a52668d
msg388104 - (view)	Author: miss-islington (miss-islington)	Date: 2021-03-04 16:55
New changeset 0064d561b8e44f7a991188d7c5016c165bc89326 by Miss Islington (bot) in branch '3.8': [3.8] bpo-37193: Remove thread objects which finished process its request (GH-23127) (GH-24749) https://github.com/python/cpython/commit/0064d561b8e44f7a991188d7c5016c165bc89326
msg388503 - (view)	Author: STINNER Victor (vstinner) *	Date: 2021-03-11 12:01
Thank you for fixing the regression Jason R. Coombs ;-)

History
Date	User	Action	Args
2022-04-11 14:59:16	admin	set	github: 81374
2021-03-11 12:01:19	vstinner	set	messages: + msg388503
2021-03-04 16:55:43	miss-islington	set	messages: + msg388104
2021-03-04 16:36:50	miss-islington	set	messages: + msg388101
2021-03-04 15:49:32	miss-islington	set	pull_requests: + pull_request23520
2021-03-04 15:49:16	miss-islington	set	pull_requests: + pull_request23519
2021-01-01 00:37:39	miss-islington	set	pull_requests: + pull_request22874
2021-01-01 00:37:29	miss-islington	set	pull_requests: + pull_request22873
2020-12-31 20:26:17	pablogsal	set	status: open -> closed resolution: fixed stage: patch review -> resolved
2020-12-31 20:19:39	pablogsal	set	messages: + msg384136
2020-11-04 19:43:56	jaraco	set	messages: + msg380354
2020-11-03 14:04:47	jaraco	set	pull_requests: + pull_request22043
2020-11-02 16:49:31	pablogsal	set	messages: + msg380237
2020-11-02 13:56:01	jaraco	set	pull_requests: + pull_request22024
2020-11-02 13:10:52	jaraco	set	messages: + msg380217
2020-11-02 11:49:15	vstinner	set	messages: + msg380212
2020-11-02 11:43:41	vstinner	set	messages: + msg380209
2020-11-02 01:43:32	pablogsal	set	nosy: + pablogsal messages: + msg380172
2020-11-01 23:51:34	miss-islington	set	pull_requests: + pull_request22006
2020-11-01 23:51:27	miss-islington	set	pull_requests: + pull_request22005
2020-11-01 23:51:19	miss-islington	set	nosy: + miss-islington pull_requests: + pull_request22004
2020-11-01 23:51:12	jaraco	set	messages: + msg380168
2020-06-13 03:53:46	ned.deily	set	messages: + msg371433
2020-06-13 01:36:55	jaraco	set	messages: + msg371431
2020-06-12 22:37:53	gregory.p.smith	set	versions: + Python 3.8, Python 3.9, Python 3.10
2020-06-12 08:31:26	ned.deily	set	nosy: + ned.deily messages: + msg371339
2019-12-18 22:08:26	martin.panter	set	messages: + msg358647
2019-12-15 22:37:42	martin.panter	set	messages: + msg358457
2019-12-09 19:26:01	jaraco	set	messages: + msg358133
2019-12-09 19:17:40	jaraco	set	nosy: + jaraco
2019-06-24 15:01:41	vstinner	set	messages: + msg346413
2019-06-24 15:01:07	vstinner	link	issue37389 superseder
2019-06-17 08:20:03	vstinner	set	messages: + msg345817
2019-06-17 08:16:18	vstinner	set	messages: + msg345815
2019-06-17 01:47:45	martin.panter	set	keywords: + 3.7regression nosy: + martin.panter, vstinner messages: + msg345796
2019-06-14 16:07:44	Wei Li	set	nosy: + Wei Li messages: + msg345606
2019-06-07 13:02:57	maru-n	set	keywords: + patch stage: patch review pull_requests: + pull_request13767
2019-06-07 11:53:39	maru-n	create