classification
Title: asyncio: race condition in SSLProtocol
Type: Stage: resolved
Components: asyncio Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: asvetlov, miss-islington, ned.deily, vstinner, yselivanov
Priority: Keywords: patch

Created on 2018-05-28 20:37 by vstinner, last changed 2018-05-29 15:17 by yselivanov. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 7175 merged vstinner, 2018-05-28 20:51
PR 7187 merged miss-islington, 2018-05-28 23:34
PR 7188 merged vstinner, 2018-05-28 23:51
PR 7192 merged yselivanov, 2018-05-29 04:39
PR 7193 merged miss-islington, 2018-05-29 05:01
Messages (15)
msg317916 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-28 20:37
While debugging bpo-32458 (functional test on START TLS), I found a race condition in SSLProtocol of asyncio/sslproto.py.

Sometimes, _sslpipe.feed_ssldata() is called before _sslpipe.shutdown().

* SSLProtocol.connection_made() -> SSLProtocol._start_handshake(): self._loop.call_soon(self._process_write_backlog)
* SSLProtoco.data_received(): direct call to self._sslpipe.feed_ssldata(data)
* Later, self._process_write_backlog() calls self._sslpipe.do_handshake()

The first write is delayed by call_soon(), whereas the first read is a direct call to the SSL pipe.
msg317917 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-28 20:38
Workaround:

diff --git a/Lib/asyncio/sslproto.py b/Lib/asyncio/sslproto.py
index 2bfa45dd15..4a5dbb38a1 100644
--- a/Lib/asyncio/sslproto.py
+++ b/Lib/asyncio/sslproto.py
@@ -592,7 +592,7 @@ class SSLProtocol(protocols.Protocol):
         # (b'', 1) is a special value in _process_write_backlog() to do
         # the SSL handshake
         self._write_backlog.append((b'', 1))
-        self._loop.call_soon(self._process_write_backlog)
+        self._process_write_backlog()
         self._handshake_timeout_handle = \
             self._loop.call_later(self._ssl_handshake_timeout,
                                   self._check_handshake_timeout)
msg317919 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-28 20:43
loop.start_tls() method is new in Python 3.7. If possible, I would prefer to not see it with a builtin race condition, since such race condition is really hard to debug :-( So I raise the priority to release blocker. Sorry again Ned!
msg317922 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-28 20:59
Yury Selivanov told me that usually it's safer to add call_soon(), than to remove call_soon(). But adding many call_soon() can make asyncio SSL slower.

SSLProtocol doesn't seem to like call_soon(), it's only used at:

* connection_lost(): call_soon(self._app_protocol.connection_lost, exc)
* connection_made() ~~> call_soon(self._process_write_backlog)
* _on_handshake_complete(): call_soon(self._process_write_backlog)

That's all. All other operations are done immediately.

_on_handshake_complete() contains a long comment:

        # In case transport.write() was already called. Don't call
        # immediately _process_write_backlog(), but schedule it:
        # _on_handshake_complete() can be called indirectly from
        # _process_write_backlog(), and _process_write_backlog() is not
        # reentrant.
        self._loop.call_soon(self._process_write_backlog)
msg317923 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-05-28 21:05
The fix is correct and the bug is now obvious: data_received() occur pretty much any time after connection_made() call; if call_soon() is used in connection_made(), data_received() may find the protocol in an incorrect state.

Kudos Victor for debugging this.
msg317924 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-05-28 21:05
And I agree, this should make it to 3.7.0
msg317960 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-28 23:33
New changeset be00a5583a2cb696335c527b921d1868266a42c6 by Victor Stinner in branch 'master':
bpo-33674: asyncio: Fix SSLProtocol race (GH-7175)
https://github.com/python/cpython/commit/be00a5583a2cb696335c527b921d1868266a42c6
msg317974 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-05-29 01:44
New changeset 0dd8fd03584b61cd769be88f5a2fb59b0d8f6d18 by Ned Deily (Miss Islington (bot)) in branch '3.7':
bpo-33674: asyncio: Fix SSLProtocol race (GH-7175) (#7187)
https://github.com/python/cpython/commit/0dd8fd03584b61cd769be88f5a2fb59b0d8f6d18
msg317981 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-05-29 04:46
New changeset 7593b8a5075ff45d71be9f62980be6a9c005afa9 by Yury Selivanov (Victor Stinner) in branch '3.6':
bpo-33674: asyncio: Fix SSLProtocol race (GH-7175) (GH-7188)
https://github.com/python/cpython/commit/7593b8a5075ff45d71be9f62980be6a9c005afa9
msg317983 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-05-29 05:00
New changeset f295587c45f96b62d24f9a12cef6931b0805f596 by Yury Selivanov in branch 'master':
bpo-33674: Pause the transport as early as possible (#7192)
https://github.com/python/cpython/commit/f295587c45f96b62d24f9a12cef6931b0805f596
msg317984 - (view) Author: miss-islington (miss-islington) Date: 2018-05-29 05:59
New changeset eca085993cb8620ba9232560f46d91326a13cdd2 by Miss Islington (bot) in branch '3.7':
bpo-33674: Pause the transport as early as possible (GH-7192)
https://github.com/python/cpython/commit/eca085993cb8620ba9232560f46d91326a13cdd2
msg318003 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-29 10:16
I fixed the main issue, so I remove the "release blocker" priority.
msg318004 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2018-05-29 10:28
> bpo-33674: Pause the transport as early as possible (#7192)
> https://github.com/python/cpython/commit/f295587c45f96b62d24f9a12cef6931b0805f596

Is it ok to always resume reading? Previously reading was only resumed if the transport was reading.
msg318025 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-05-29 13:39
yes, the method is idempotent.
msg318044 - (view) Author: Yury Selivanov (yselivanov) * (Python committer) Date: 2018-05-29 15:17
Closing this one.  Please open new issues to track regressions.
History
Date User Action Args
2018-05-29 15:17:16yselivanovsetstatus: open -> closed
resolution: fixed
messages: + msg318044

stage: patch review -> resolved
2018-05-29 13:39:36yselivanovsetmessages: + msg318025
2018-05-29 10:28:08vstinnersetmessages: + msg318004
2018-05-29 10:16:40vstinnersetpriority: release blocker ->

messages: + msg318003
2018-05-29 05:59:06miss-islingtonsetnosy: + miss-islington
messages: + msg317984
2018-05-29 05:01:30miss-islingtonsetpull_requests: + pull_request6827
2018-05-29 05:00:14yselivanovsetmessages: + msg317983
2018-05-29 04:46:53yselivanovsetmessages: + msg317981
2018-05-29 04:39:01yselivanovsetpull_requests: + pull_request6826
2018-05-29 01:44:18ned.deilysetmessages: + msg317974
2018-05-28 23:51:25vstinnersetpull_requests: + pull_request6823
2018-05-28 23:34:45miss-islingtonsetpull_requests: + pull_request6822
2018-05-28 23:33:37vstinnersetmessages: + msg317960
2018-05-28 21:05:29yselivanovsetmessages: + msg317924
2018-05-28 21:05:15yselivanovsetmessages: + msg317923
2018-05-28 20:59:47vstinnersetmessages: + msg317922
2018-05-28 20:51:06vstinnersetkeywords: + patch
stage: patch review
pull_requests: + pull_request6810
2018-05-28 20:43:58vstinnersetpriority: normal -> release blocker
nosy: + ned.deily
messages: + msg317919

2018-05-28 20:38:02vstinnersetmessages: + msg317917
2018-05-28 20:37:18vstinnercreate