classification
Title: Memory leak while running TCP/UDPServer with socketserver.ThreadingMixIn
Type: resource usage Stage: patch review
Components: Library (Lib) Versions: Python 3.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Wei Li, martin.panter, maru-n, vstinner
Priority: normal Keywords: 3.7regression, patch

Created on 2019-06-07 11:53 by maru-n, last changed 2019-06-24 15:01 by vstinner.

Files
File name Uploaded Description
threadingmixin_memory_usage.png maru-n, 2019-06-07 11:53
Pull Requests
URL Status Linked
PR 13893 open maru-n, 2019-06-07 13:02
Messages (6)
msg344926 - (view) Author: Norihiro Maruyama (maru-n) * Date: 2019-06-07 11:53
A UDP/TCPServer combined with the socketserver.ThreadingMixIn class (and likewise the ThreadingTCPServer and ThreadingUDPServer classes) seems to leak memory while the server is running.

https://docs.python.org/3/library/socketserver.html#socketserver.ThreadingMixIn

The code I wrote to check this is the following.

```
import socket
import socketserver
import threading
import time

class ThreadedTCPRequestHandler(socketserver.BaseRequestHandler):
    def handle(self):
        data = str(self.request.recv(1024), 'ascii')
        cur_thread = threading.current_thread()
        response = bytes("{}: {}".format(cur_thread.name, data), 'ascii')
        self.request.sendall(response)


if __name__ == "__main__":
    HOST, PORT = "localhost", 0

    server = socketserver.ThreadingTCPServer((HOST, PORT), ThreadedTCPRequestHandler)
    server.daemon_threads = False
    server.block_on_close = True

    with server:
        ip, port = server.server_address

        server_thread = threading.Thread(target=server.serve_forever)

        server_thread.daemon = True
        server_thread.start()

        for i in range(1000):
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
                sock.connect(server.server_address)
                sock.sendall(bytes("hello", 'ascii'))
                response = str(sock.recv(1024), 'ascii')
                print("Received: {}".format(response))
            time.sleep(0.01)

        server.shutdown()
        server.server_close()
```
(I wrote this based on https://docs.python.org/3/library/socketserver.html#asynchronous-mixins.)

Then I checked memory usage with a profiling tool
(I used the memory-profiler module, https://pypi.org/project/memory-profiler/):

$ mprof run python mycode.py
$ mprof plot

I attached result plot.

I also ran this for a longer time and found that memory usage increased endlessly.

My environment is

Hardware: MacBook Pro (15-inch, 2018)
OS: MacOS 10.14
Python 3.7.1


I guess it is caused by the thread objects not being released even though the threads have finished processing their requests, so thread objects accumulate without limit until server_close() is called.
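
For reference, my understanding of the relevant part of Lib/socketserver.py in 3.7 is roughly the following (a simplified sketch from reading the code, not the exact source): every non-daemon request thread is appended to self._threads, and the list is only drained in server_close().

```
import threading

class ThreadingMixIn:
    """Simplified sketch of the 3.7 mix-in; details may differ from the real source."""
    daemon_threads = False
    block_on_close = True
    _threads = None

    def process_request(self, request, client_address):
        t = threading.Thread(target=self.process_request_thread,
                             args=(request, client_address))
        t.daemon = self.daemon_threads
        if not t.daemon and self.block_on_close:
            if self._threads is None:
                self._threads = []
            # Every request appends a Thread object here; nothing ever removes it.
            self._threads.append(t)
        t.start()

    def server_close(self):
        super().server_close()
        if self.block_on_close:
            threads = self._threads
            self._threads = None
            if threads:
                # The Thread objects are only released here, at shutdown.
                for thread in threads:
                    thread.join()
```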
msg345606 - (view) Author: Wei Li (Wei Li) Date: 2019-06-14 16:07
I got the same problem when using the ThreadingTCPServer.

I think adding 
"self._threads = list(filter(lambda x: x.is_alive(), self._threads))"
as the last line of the process_request method is a potential way to fix the bug.
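
Applied to the 3.7 code, that would look roughly like this (only a sketch of a ThreadingMixIn.process_request replacement based on my reading of Lib/socketserver.py, not a tested patch):

```
    def process_request(self, request, client_address):
        """Start a new thread to process the request."""
        t = threading.Thread(target=self.process_request_thread,
                             args=(request, client_address))
        t.daemon = self.daemon_threads
        if not t.daemon and self.block_on_close:
            if self._threads is None:
                self._threads = []
            self._threads.append(t)
        t.start()
        # Proposed last line: drop threads that have already finished, so the
        # list is bounded by the number of concurrently running requests.
        if self._threads is not None:
            self._threads = list(filter(lambda x: x.is_alive(), self._threads))
```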
msg345796 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2019-06-17 01:47
Looking at the code, this would be caused by Issue 31233. I expect 3.7+ is affected. 3.6 has similar code, but the leaking looks to be disabled by default. 2.7 doesn't collect a "_threads" list at all.

Looks like Victor was aware of the leak when he changed the code: <https://bugs.python.org/issue31233#msg304619>, but maybe he pushed the code and then forgot about the problem.

A possible problem with Norihiro's solution is modifying the "_threads" list from multiple threads without any synchronization. (Not sure if that is a problem, or is it guaranteed to be okay due to GIL etc?) Also, since the thread is removing itself from the list, it will still run a short while after the removal, so there is a window when the "server_close" method will not wait for that thread. Might also defeat the "dangling thread" accounting that I believe was Victor's motivation for his change.

Wei's proposal is to check for cleaning up when a new request is handled. That relies on a new request coming in to free up memory. Perhaps we could use a similar strategy to the Forking mixin, which I believe cleans up expired children periodically, without relying on a new request.
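
As a rough illustration of that last idea (the _reap_threads name is invented for this sketch, not an existing API), the cleanup could hook into service_actions(), which serve_forever() already calls on every poll interval:

```
    def service_actions(self):
        # Called by serve_forever() on each poll interval, so finished
        # threads are reaped periodically even if no new request arrives.
        super().service_actions()
        self._reap_threads()

    def _reap_threads(self):
        # Hypothetical helper, analogous in spirit to ForkingMixIn's
        # periodic collect_children() cleanup.
        if self._threads:
            self._threads = [t for t in self._threads if t.is_alive()]
```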
msg345815 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-06-17 08:16
PR 13893 with an additional lock sounds like a reasonable solution. The code should be skipped if the thread is a daemon thread.
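
(Not quoting the PR itself here, but the shape I have in mind is something like the sketch below; the _threads_lock attribute name is made up for illustration, and the removal is skipped for daemon threads, which are never added to the list in the first place.)

```
    def process_request_thread(self, request, client_address):
        try:
            self.finish_request(request, client_address)
        except Exception:
            self.handle_error(request, client_address)
        finally:
            self.shutdown_request(request)
            current = threading.current_thread()
            if not current.daemon:
                # Several request threads can finish at the same time, so
                # guard the shared list with a lock while removing ourselves.
                with self._threads_lock:
                    if self._threads is not None and current in self._threads:
                        self._threads.remove(current)
```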
msg345817 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-06-17 08:20
Martin Panter: In addition to PR 13893 change, what do you think of also using a weakref? It might make the code even more reliable if something goes wrong.
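
To sketch what I mean (only an illustration, not the actual PR): the _threads container could be a weakref.WeakSet, so a finished thread drops out as soon as the last strong reference to it goes away, while server_close() still joins whatever is left in the set.

```
    # (assumes "import weakref" at the top of the module)
    def process_request(self, request, client_address):
        t = threading.Thread(target=self.process_request_thread,
                             args=(request, client_address))
        t.daemon = self.daemon_threads
        if not t.daemon and self.block_on_close:
            if self._threads is None:
                self._threads = weakref.WeakSet()
            # The threading module holds a strong reference to a running
            # thread, so it stays in the set; once it finishes and nothing
            # else references it, it silently disappears from the set.
            self._threads.add(t)
        t.start()

    def server_close(self):
        super().server_close()
        if self.block_on_close and self._threads is not None:
            # Copy first: the set can shrink while we iterate.
            for thread in list(self._threads):
                thread.join()
```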
msg346413 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-06-24 15:01
I marked bpo-37389 as a duplicate of this issue:

"""
msg346410 - (view) 	Author: Daniel W Forsyth (danf@dataforge.on.ca) 	Date: 2019-06-24 14:53

After putting a basic ThreadingUDPServer under load (500 messages per second), I noticed that after a night it was consuming a lot of RAM, given that it does nothing with the data.

On inspection, I noticed the _thread count inside the server was growing forever even though the sub-threads are done.

Set up a basic ThreadingUDPServer with a handler that does nothing and check the request_queue_size; it seems to grow without limit.

msg346411 - (view) 	Author: Daniel W Forsyth (danf@dataforge.on.ca) 	Date: 2019-06-24 14:59

The only way I could figure out to control it was to do this in a thread;

        for thread in server._threads:  # type: Thread
            if not thread.is_alive():
                server._threads.remove(thread)

Shouldn't the server process do this when the thread is done?
"""
History
Date User Action Args
2019-06-24 15:01:41  vstinner       set     messages: + msg346413
2019-06-24 15:01:07  vstinner       link    issue37389 superseder
2019-06-17 08:20:03  vstinner       set     messages: + msg345817
2019-06-17 08:16:18  vstinner       set     messages: + msg345815
2019-06-17 01:47:45  martin.panter  set     keywords: + 3.7regression
                                            nosy: + martin.panter, vstinner
                                            messages: + msg345796
2019-06-14 16:07:44  Wei Li         set     nosy: + Wei Li
                                            messages: + msg345606
2019-06-07 13:02:57  maru-n         set     keywords: + patch
                                            stage: patch review
                                            pull_requests: + pull_request13767
2019-06-07 11:53:39  maru-n         create