classification
Title: Possible regression in ssl module in 3.7.1 and master
Type: Stage: resolved
Components: SSL Versions: Python 3.8, Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: christian.heimes Nosy List: christian.heimes, miss-islington, ned.deily, njs, steve.dower
Priority: release blocker Keywords: 3.6regression, 3.7regression, patch

Created on 2018-09-21 02:05 by njs, last changed 2018-09-25 05:43 by njs. This issue is now closed.

Files
File name Uploaded Description Edit
ssl-unwrap-regression-example.py njs, 2018-09-21 04:51
Pull Requests
URL Status Linked Edit
PR 9468 merged njs, 2018-09-21 05:55
PR 9491 merged miss-islington, 2018-09-22 04:44
PR 9492 merged christian.heimes, 2018-09-22 04:47
Messages (16)
msg325940 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-09-21 02:05
Hey Ned, we just noticed that since a few days ago the trio testsuite is failing on 3.7-dev (but not 3.7.0), in a test checking an obscure feature in the ssl module: https://travis-ci.org/python-trio/trio/builds/431291929

And I just reproduced the issue locally with a build of the 3.7 branch HEAD, so somehow whatever is happening is definitely new in 3.7.1.

It could just be a bug in the test or something, I don't know because I haven't debugged it yet (I'll do that now). But with 3.7.1rc1 imminent I figured you'd rather know earlier rather than later.
msg325941 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-09-21 02:06
(And Christian, if you know of any risky-sounding recent changes in SSLObject.unwrap, lmk :-))
msg325942 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-09-21 02:23
Git bisect says:


1229664f30dd5fd4da32174a19258f8312464d45 is the first bad commit
commit 1229664f30dd5fd4da32174a19258f8312464d45
Author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
Date:   Mon Sep 17 12:12:13 2018 -0700

    bpo-32533: Fixed thread-safety of error handling in _ssl. (GH-7158)
    
    (cherry picked from commit c6fd1c1c3a65217958b68df3a4991e4f306e9b7d)
    
    Co-authored-by: Steve Dower <steve.dower@microsoft.com>


Now let's see if I can narrow down exactly what the behavioral change is, besides "this random test started failing"...
msg325943 - (view) Author: Steve Dower (steve.dower) * (Python committer) Date: 2018-09-21 02:35
It should just be gathering SSL error codes before acquiring the GIL, which could switch thread and then get the wrong code.

Of course, it's possible it is in the backport. How easily can you test against master? (I'm AFK, so can't easily do anything right now.)
msg325944 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-09-21 02:39
The test doesn't involve any threads, so it does seem strange that this PR changed its behavior.

I haven't checked against master carefully, but the original observation was that our Travis "3.7-dev" and "3.8-dev" tests started failing in the same way sometime in the last week-ish, so my tentative guess is that it it's the same for both.
msg325945 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-09-21 02:48
Thanks for the heads-up!  If you become convinced it's a post-3.7.0 regression that would affect 3.7.1, please change the priority to "release blocker".
msg325951 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-09-21 04:51
The ssl module's unwrap() method is intended to do a clean shutdown of TLS encryption on a socket (or memory BIO or whatever). The idea is that it sends a "close notify" frame (i.e., tells the peer that we're shutting down), and then it waits for the peer to send a "close notify" back, and then the TLS layer is cleanly shut down.

If you have a non-blocking socket or memory BIO, of course, it can't actually wait for the peer to send the "close notify" back. So in this case its semantics are: the first time you call it, it sends a "close notify", and then if it hasn't seen a "close notify", it raises SSLWantReadError to tell you that, and that you need to keep calling it.

In the current 3.7 branch, however, it never raises SSLWantReadError, so there's no way for calling code to figure out whether the other side has sent a "close notify". So in the current 3.7 branch, it's impossible to do a clean shutdown of the TLS layer on a non-blocking socket or memory BIO, and this is a regression from 3.7.0.

The attached script demonstrates this -- on 3.7.0 it completes successfully, but on the 3.7 branch it fails.

Impact: This is maybe not as bad as it sounds, because 'unwrap' is *very* rarely used? (Lots of protocols have STARTTLS, but not many have STOPTLS!) But I'm still going to mark it as a release blocker for now because:

- it is a genuine regression, that makes a rare-but-intentional-and-documented feature unusable

- while I haven't fully understood why the thread-safety patch causes this problem, the changes that patch makes to 'unwrap' are very similar to the changes it makes to much more important functions like 'do_handshake' and 'send' and 'recv', so I'm nervous that the underlying issue might affect those as well.

- I figure if Ned doesn't think it's a release blocker, he can always remove the tag :-)
msg325952 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-09-21 04:54
Steve?  Christian?
msg325954 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-09-21 05:36
Oh, here it is:

https://github.com/python/cpython/commit/1229664f30dd5fd4da32174a19258f8312464d45#diff-e1cc5bf74055e388cda128367a814c8fR2587


-    if (err < 0) {
+    if (err.ssl < 0) {

Before in this function, 'err' was the return code from SSL_shutdown, which is -1 if an error occurred (and you're supposed to look at the SSL error code) or 0/1 on success. Now 'err.ssl' is the actual SSL error code, which as it turns out is never negative.
msg325962 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-09-21 06:11
PR posted. Also seems to affect 3.6, so adding that to the tags.
msg325968 - (view) Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2018-09-21 06:49
Good catch, Nathaniel.

I was about to ask for a reproducer test, but then I saw that you have added one already. Fantastic!
msg326070 - (view) Author: miss-islington (miss-islington) Date: 2018-09-22 04:44
New changeset c0da582b227f311126e278b5553a7fa89c79b054 by Miss Islington (bot) (Nathaniel J. Smith) in branch 'master':
bpo-34759: Fix error handling in ssl 'unwrap()' (GH-9468)
https://github.com/python/cpython/commit/c0da582b227f311126e278b5553a7fa89c79b054
msg326074 - (view) Author: miss-islington (miss-islington) Date: 2018-09-22 05:00
New changeset c00f7037df3607c89323e68db3ab996b7df394de by Miss Islington (bot) in branch '3.7':
bpo-34759: Fix error handling in ssl 'unwrap()' (GH-9468)
https://github.com/python/cpython/commit/c00f7037df3607c89323e68db3ab996b7df394de
msg326075 - (view) Author: miss-islington (miss-islington) Date: 2018-09-22 05:10
New changeset 7529754d26f5e744ae25bee56fdc1937bcf08c7e by Miss Islington (bot) (Christian Heimes) in branch '3.6':
[3.6] bpo-34759: Fix error handling in ssl 'unwrap()' (GH-9468) (GH-9492)
https://github.com/python/cpython/commit/7529754d26f5e744ae25bee56fdc1937bcf08c7e
msg326323 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2018-09-25 05:41
This issue can now be closed, right?
msg326324 - (view) Author: Nathaniel Smith (njs) * (Python committer) Date: 2018-09-25 05:43
Whoops, yep, I forgot it doesn't auto-close.
History
Date User Action Args
2018-09-25 05:43:42njssetstatus: open -> closed
resolution: fixed
messages: + msg326324

stage: patch review -> resolved
2018-09-25 05:41:50ned.deilysetmessages: + msg326323
2018-09-22 05:10:10miss-islingtonsetmessages: + msg326075
2018-09-22 05:00:46miss-islingtonsetmessages: + msg326074
2018-09-22 04:47:53christian.heimessetpull_requests: + pull_request8902
2018-09-22 04:44:23miss-islingtonsetpull_requests: + pull_request8901
2018-09-22 04:44:15miss-islingtonsetnosy: + miss-islington
messages: + msg326070
2018-09-21 06:49:59christian.heimessetmessages: + msg325968
2018-09-21 06:11:40njssetkeywords: + 3.6regression

messages: + msg325962
versions: + Python 3.6
2018-09-21 05:55:03njssetkeywords: + patch
stage: patch review
pull_requests: + pull_request8881
2018-09-21 05:36:30njssetmessages: + msg325954
2018-09-21 04:54:02ned.deilysetmessages: + msg325952
2018-09-21 04:51:55njssetpriority: normal -> release blocker
files: + ssl-unwrap-regression-example.py
messages: + msg325951
2018-09-21 02:48:44ned.deilysetmessages: + msg325945
2018-09-21 02:39:51njssetmessages: + msg325944
2018-09-21 02:35:24steve.dowersetmessages: + msg325943
2018-09-21 02:23:25njssetnosy: + steve.dower
messages: + msg325942
2018-09-21 02:06:06njssetmessages: + msg325941
2018-09-21 02:05:23njscreate