classification
Title: Severe open file leakage running asyncio SSL server
Type: resource usage Stage: resolved
Components: asyncio Versions: Python 3.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: yselivanov Nosy List: asvetlov, fafhrd91, giampaolo.rodola, kyuupichan, mocmocamoc, yselivanov
Priority: normal Keywords: patch

Created on 2017-04-03 14:10 by kyuupichan, last changed 2017-12-20 19:27 by asvetlov. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 4825 merged mocmocamoc, 2017-12-12 22:43
PR 4939 merged asvetlov, 2017-12-20 10:38
Messages (14)
msg291071 - (view) Author: Neil Booth (kyuupichan) * Date: 2017-04-03 14:10
Original report at old repo here:  https://github.com/python/asyncio/issues/483

There this is reported fixed by https://github.com/python/cpython/pull/480

I wish to report that whilst the above patch might have a small positive effect, it is far from solving the actual issue.  Several users report eventual exhaustion of the open file resource running SSL asyncio servers.

Here are graphs provided by a friend running my ElectrumX server software, first accepting SSL connections and the second accepting TCP connections only.  Both of the servers were monkey-patched with the pull-480 fix above, so this is evidence it isn't solving the issue.

http://imgur.com/a/cWnSu

As you can see, the TCP server (which has far less connections; most users use SSL) has no leaked file handles, whereas the SSL server has over 300.

This becomes an easy denial of service vector against asyncio servers.  One way to trigger this (though I doubt it explains the numbers above) is simply to connect to the SSL server from telnet, and do nothing.  asyncio doesn't time you out, the telnet session seems to sit there forever, and the open file resources are lost in the SSL handshake stage until the remote host kindly decides to disconnect.

I suspect these resource issues all revolve around the SSL handshake process, certainly at the opening of a connection, but also perhaps when closing.

As the application author I am not informed by asyncio of a potential connection until the initial handshake is complete, so I cannot do anything to close these phantom socket connections.  I have to rely on asyncio to be properly handling DoS issues and it is not currently doing so robustly.
msg291135 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-04-04 19:56
I'm assigning this to myself to make sure I don't forget about this. If someone wants to tackle this please feel free to reassign.
msg296294 - (view) Author: Nikolay Kim (fafhrd91) * Date: 2017-06-18 21:51
question is, should asyncio handle timeouts or leave it to caller?

https://github.com/python/cpython/pull/480 fixes leak during handshake.
msg296297 - (view) Author: Neil Booth (kyuupichan) * Date: 2017-06-18 22:27
@Nikolay Kim

As I note in the original submission, 480 was tested and does NOT solve this issue.  Thanks.
msg296298 - (view) Author: Nikolay Kim (fafhrd91) * Date: 2017-06-18 22:35
I see. this is server specific problem. as a temp solution I'd use proxy for ssl termination.
msg308084 - (view) Author: Neil Booth (kyuupichan) * Date: 2017-12-12 00:37
I'm not sure what you mean about this being a server-specific problem.  It's clearly a bug in the asyncio SSL wrapper as using TCP instead of SSL with otherwise identical code doesn't leak open files.
msg308146 - (view) Author: Neil Aspinall (mocmocamoc) * Date: 2017-12-12 17:49
I think there's been some confusion about what PR 480 was meant to fix - it helps in cases where connections are closed during handshake, but if a server connection is waiting for a handshake but never receives any data at all then it stays in that state forever.

As for a fix, how about giving SSLProtocol a method like:

    def checkHandshakeDone(self):
        if self._in_handshake == True:
            self._abort()

and then at the end of _start_handshake() adding:

    self._loop.call_later(10, self.checkHandshakeDone)

Then if the handshake is not complete within ten seconds of starting, the connection will be aborted.
msg308673 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2017-12-19 19:45
New changeset f7686c1f5553b24e3307506a18e18f6544de94d3 by Andrew Svetlov (Neil Aspinall) in branch 'master':
bpo-29970: Add timeout for SSL handshake in asyncio
https://github.com/python/cpython/commit/f7686c1f5553b24e3307506a18e18f6544de94d3
msg308676 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2017-12-19 20:02
Fixed in 3.7
msg308765 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2017-12-20 18:24
New changeset 51eb1c6b9c0b382dfd6e0428eacff0c7891a6fc3 by Andrew Svetlov in branch 'master':
bpo-29970: Make ssh_handshake_timeout None by default (#4939)
https://github.com/python/cpython/commit/51eb1c6b9c0b382dfd6e0428eacff0c7891a6fc3
msg308769 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-12-20 18:40
Should we backport this to 3.6?  This is a security issue.
msg308784 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2017-12-20 19:22
The fix introduces a new parameter in public API.

That's why I think we shouldn't backport it.
msg308785 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2017-12-20 19:23
> The fix introduces a new parameter in public API.

Maybe we can get away with this if we do not document it in 3.6 and add a comment to the source code that using this new parameter will make the code incompatible with earlier 3.6.x versions?
msg308786 - (view) Author: Andrew Svetlov (asvetlov) * (Python committer) Date: 2017-12-20 19:27
Don't know.
Ask other coredevs maybe?
History
Date User Action Args
2017-12-20 19:27:29asvetlovsetmessages: + msg308786
2017-12-20 19:23:54yselivanovsetmessages: + msg308785
2017-12-20 19:22:00asvetlovsetmessages: + msg308784
2017-12-20 18:40:27yselivanovsetmessages: + msg308769
2017-12-20 18:24:45asvetlovsetmessages: + msg308765
2017-12-20 10:38:06asvetlovsetpull_requests: + pull_request4830
2017-12-19 20:02:17asvetlovsetmessages: + msg308676
2017-12-19 20:02:01asvetlovsetstatus: open -> closed
stage: patch review -> resolved
resolution: fixed
versions: - Python 3.5, Python 3.6
2017-12-19 19:45:44asvetlovsetnosy: + asvetlov
messages: + msg308673
2017-12-12 22:43:21mocmocamocsetkeywords: + patch
stage: patch review
pull_requests: + pull_request4717
2017-12-12 17:49:48mocmocamocsetnosy: + mocmocamoc
messages: + msg308146
2017-12-12 11:00:04vstinnersetnosy: - vstinner
2017-12-12 00:37:33kyuupichansetmessages: + msg308084
2017-06-18 22:35:38fafhrd91setmessages: + msg296298
2017-06-18 22:27:21kyuupichansetmessages: + msg296297
2017-06-18 21:51:55fafhrd91setnosy: + fafhrd91
messages: + msg296294
2017-04-04 19:56:12yselivanovsetassignee: yselivanov
messages: + msg291135
versions: + Python 3.7
2017-04-04 19:52:13brett.cannonsetnosy: + vstinner, giampaolo.rodola
2017-04-03 14:10:15kyuupichancreate