Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ssl module doesn't support non-blocking handshakes #45592

Closed
chrisstawarz mannequin opened this issue Oct 9, 2007 · 29 comments
Closed

ssl module doesn't support non-blocking handshakes #45592

chrisstawarz mannequin opened this issue Oct 9, 2007 · 29 comments
Labels
docs Documentation in the Doc dir stdlib Python modules in the Lib dir tests Tests in the Lib/test dir type-feature A feature request or enhancement

Comments

@chrisstawarz
Copy link
Mannequin

chrisstawarz mannequin commented Oct 9, 2007

BPO 1251
Nosy @facundobatista, @jcea, @orsenthil, @pitrou, @giampaolo
Files
  • ssl_nonblocking_handshake_patch.txt
  • nonblocking_handshake.py
  • blocking_handshake.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2010-04-27.21:49:23.487>
    created_at = <Date 2007-10-09.22:48:18.002>
    labels = ['tests', 'type-feature', 'library', 'docs']
    title = "ssl module doesn't support non-blocking handshakes"
    updated_at = <Date 2012-06-28.16:04:26.694>
    user = 'https://bugs.python.org/chrisstawarz'

    bugs.python.org fields:

    activity = <Date 2012-06-28.16:04:26.694>
    actor = 'pitrou'
    assignee = 'janssen'
    closed = True
    closed_date = <Date 2010-04-27.21:49:23.487>
    closer = 'pitrou'
    components = ['Documentation', 'Library (Lib)', 'Tests']
    creation = <Date 2007-10-09.22:48:18.002>
    creator = 'chris.stawarz'
    dependencies = []
    files = ['8499', '8525', '8526']
    hgrepos = []
    issue_num = 1251
    keywords = ['patch']
    message_count = 29.0
    messages = ['56295', '56307', '56309', '56311', '56314', '56316', '56318', '56320', '56379', '56380', '56387', '56454', '56476', '56486', '57609', '63830', '63943', '66886', '66887', '66930', '72269', '72283', '72363', '72445', '79569', '79745', '104372', '164267', '164270']
    nosy_count = 9.0
    nosy_names = ['facundobatista', 'jafo', 'jcea', 'janssen', 'orsenthil', 'pitrou', 'giampaolo.rodola', 'chris.stawarz', 'gundlach']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue1251'
    versions = ['Python 2.6', 'Python 3.0']

    @chrisstawarz
    Copy link
    Mannequin Author

    chrisstawarz mannequin commented Oct 9, 2007

    The current version of the ssl module doesn't support non-blocking
    creation of SSLSocket objects. The reason for this is that the SSL
    handshaking (SSL_connect/SSL_accept) takes place during the
    construction of the SSLContext object (in newPySSLObject). This means
    that if the socket being wrapped is non-blocking, and the handshake
    fails with SSL_ERROR_WANT_READ/SSL_ERROR_WANT_WRITE, then the entire
    SSLContext is scrapped, and newPySSLObject must be run again in its
    entirety. Unfortunately, restarting from scratch on the same socket
    appears to confuse the remote host, and the new attempt fails.

    The attached patch fixes this problem by removing the handshaking code
    from newPySSLObject and adding a do_handshake method to SSLContext.
    It also adds a new parameter (do_handshake_on_connect) to the
    SSLSocket constructor and the wrap_socket function. The default value
    of the parameter is True, which preserves the current behavior of the
    module by immediately calling do_handshake after sslwrap. If
    do_handshake_on_connect is set to False, then the caller is
    responsible for calling do_handshake. This allows code that uses
    non-blocking sockets to first create the SSLSocket and then
    iteratively call do_handshake and select.select until the process
    completes (which is exactly how non-blocking reads and writes are
    handled).

    @chrisstawarz chrisstawarz mannequin added docs Documentation in the Doc dir stdlib Python modules in the Lib dir tests Tests in the Lib/test dir type-feature A feature request or enhancement labels Oct 9, 2007
    @gvanrossum
    Copy link
    Member

    Assigning to Bill Janssen.

    @janssen
    Copy link
    Mannequin

    janssen mannequin commented Oct 10, 2007

    By a bit of synchronicity, I've been looking at non-blocking SSL issues
    all day. Thanks, Chris. I'll take a look and fold it in. There are a
    number of other issues here, too, such as changing the socket's blocking-
    ness after it's wrapped (which asyncore does).

    @janssen
    Copy link
    Mannequin

    janssen mannequin commented Oct 10, 2007

    The larger problem here is that straightforward select() just doesn't
    work with SSL-wrapped sockets. If you depend on it for liveness, your
    program will eventually hang up.

    When packets in SSL arrive at a destination, they are pulled off the
    socket in chunks of sizes controlled by the encryption protocol being
    used, decrypted, and placed in SSL-internal buffers. The buffer content
    is then transferred to the application program through SSL_read(). If
    you've read only part of the decrypted data, there will still be pending
    input data on the SSL connection, but it won't show up on the underlying
    file descriptor via select(). Your code needs to call SSL_pending()
    explicitly to see if there is any pending data to be read.

    @chrisstawarz
    Copy link
    Mannequin Author

    chrisstawarz mannequin commented Oct 10, 2007

    Yeah, the pattern for doing non-blocking I/O with select() is
    different for SSL-wrapped sockets: You always have to try the
    potentially-blocking operation first, and then call select() and
    retry in response to SSL_ERROR_WANT_READ/WRITE. (You can also check
    SSL_pending(), but I don't think you really have to.) Also, unlike
    normal sockets, SSL-wrapped sockets *must* be set non-blocking.

    I can see how this pattern might not play nicely with asyncore. But
    I think this is a separate (though related) issue from the one I
    reported. As it's currently implemented, the ssl module provides no
    way of wrapping a socket without (potentially) blocking during the
    handshake, making it unusable by Twisted or any other package that
    requires all I/O to be non-blocking. Moving the handshaking into a
    separate method solves this problem.

    @janssen
    Copy link
    Mannequin

    janssen mannequin commented Oct 10, 2007

    "SSL-wrapped sockets *must* be set non-blocking."

    Can you say a bit more about why?

    @chrisstawarz
    Copy link
    Mannequin Author

    chrisstawarz mannequin commented Oct 10, 2007

    I meant that SSL-wrapped sockets must be set non-blocking in the case
    where you want to do non-blocking I/O with them using select(). This
    is another difference between SSL-wrapped sockets and normal
    sockets. With a normal socket, as long as you use select() to know
    when a read or write won't block, it shouldn't matter whether you've
    called setblocking(False) on the socket (although there may be corner
    cases where it does).

    With an SSL-wrapped socket, you have to try the I/O operation first,
    and then call select() if it fails with SSL_ERROR_WANT_READ/WRITE.
    But that won't happen if the socket is in blocking mode. In that
    case, the OpenSSL call will just block until the operation completes
    (or an error or disconnection occurs).

    That's my understanding, anyway, based on the OpenSSL man pages and
    my own usage.

    @janssen
    Copy link
    Mannequin

    janssen mannequin commented Oct 10, 2007

    Yes, that's correct.

    I've reviewed your patch, and it looks OK to me. I'll fold it in in the
    next go-around, once I've made some progress on the asyncore issues.

    @janssen
    Copy link
    Mannequin

    janssen mannequin commented Oct 12, 2007

    Chris,

    Looking at this a bit harder, I don't see that it accomplishes much.
    The loop in _ssl.c/do_handshake will never return WANT_READ or
    WANT_WRITE, so the loop in the test case, for instance, is unnecessary.
    The original code handled the non-blocking case just fine. I think this
    fix is unnecessary.

    @chrisstawarz
    Copy link
    Mannequin Author

    chrisstawarz mannequin commented Oct 12, 2007

    The loop in _ssl.c/do_handshake will never return WANT_READ or
    WANT_WRITE, so the loop in the test case, for instance, is
    unnecessary.

    I don't know why you think that, but it's easy enough to show that
    this statement is incorrect. I've attached two scripts
    (nonblocking_handshake.py and blocking_handshake.py). The first does
    basically the same thing as my test case, but connecting to a
    different server and with some print statements added. Here's the
    output I get when I run it using an up-to-date trunk checkout with my
    patch applied:

    $ ../build/bin/python2.6 nonblocking_handshake.py
    starting handshake
    need read
    need read
    need read
    handshake complete

    The second reproduces the situation that led me to file this bug
    report in the first place. Here's what happens when I run it with an
    *unpatched* trunk build:

    $ ../build/bin/python2.6 blocking_handshake.py
    starting handshake
    need read
    Traceback (most recent call last):
       File "blocking_handshake.py", line 14, in <module>
         s = ssl.wrap_socket(s,cert_reqs=ssl.CERT_NONE)
       File "/Users/cstawarz/Documents/Code/Python/svn/build/lib/ 
    python2.6/ssl.py", line 466, in wrap_socket
         ssl_version=ssl_version, ca_certs=ca_certs)
       File "/Users/cstawarz/Documents/Code/Python/svn/build/lib/ 
    python2.6/ssl.py", line 103, in __init__
         cert_reqs, ssl_version, ca_certs)
    ssl.SSLError: [Errno 1] _ssl.c:429: error:04077068:rsa  
    routines:RSA_verify:bad signature

    As you see, in both cases the handshaking fails with
    SSL_ERROR_WANT_READ. But without the fixes introduced by my patch,
    there's no way to handle the error.

    @janssen
    Copy link
    Mannequin

    janssen mannequin commented Oct 13, 2007

    It's my mistake; I was looking at too many patches at the same time.
    Thanks for the example.

    Bill

    @janssen
    Copy link
    Mannequin

    janssen mannequin commented Oct 15, 2007

    Perhaps we shouldn't expose this at the application level. We could
    check, in the C module's sslwrap, whether the socket is blocking or not,
    and do the right thing there, so that sslwrap would always succeed in
    one call. Since we are releasing the GIL whenever we do SSL_accept() or
    SSL_connect(), other threads get a chance to run, so doing it
    transparently shouldn't affect the Python program's liveness. And it
    would reduce the chances for application error.

    @chrisstawarz
    Copy link
    Mannequin Author

    chrisstawarz mannequin commented Oct 16, 2007

    Bill,

    You seem to be missing the whole point of doing non-blocking I/O,
    which is to handle multiple concurrent, I/O-bound tasks in a single
    OS thread. The application must be able to try the handshake, detect
    that it didn't complete because I/O needs to take place, and then
    retry it at a later time of its choosing (presumably when select()
    says it's OK to read from the socket). And between now and that
    later time, it can do other things, like read or write to other sockets.

    The point is that the *application* must have control over when it
    does I/O, when it waits for sockets to be ready for I/O, and what it
    does in between. There's just no way it can do this if the sslwrap()
    function doesn't return until the handshaking is complete. sslwrap()
    can't "do the right thing", because the right thing is to return
    control to the application until it's ready to try the handshake again.

    And this has nothing to do with the GIL or multiple threads. Like I
    said, the point of doing non-blocking I/O with select() is to *avoid*
    the need for multiple threads.

    @janssen
    Copy link
    Mannequin

    janssen mannequin commented Oct 16, 2007

    "You seem to be missing the whole point of doing non-blocking I/O,"

    You know, I think that's right. I'm so used to using threads by now
    that the whole select-based approach seems very odd to me by now.

    OK, I've folded your patch into the PyPI version, ssl 1.8. It will
    meander into the trunk and 3K version as well. Thanks for the help!

    Bill

    On 10/15/07, Chris Stawarz <report@bugs.python.org> wrote:

    Chris Stawarz added the comment:

    Bill,

    You seem to be missing the whole point of doing non-blocking I/O,
    which is to handle multiple concurrent, I/O-bound tasks in a single
    OS thread. The application must be able to try the handshake, detect
    that it didn't complete because I/O needs to take place, and then
    retry it at a later time of its choosing (presumably when select()
    says it's OK to read from the socket). And between now and that
    later time, it can do other things, like read or write to other sockets.

    The point is that the *application* must have control over when it
    does I/O, when it waits for sockets to be ready for I/O, and what it
    does in between. There's just no way it can do this if the sslwrap()
    function doesn't return until the handshaking is complete. sslwrap()
    can't "do the right thing", because the right thing is to return
    control to the application until it's ready to try the handshake again.

    And this has nothing to do with the GIL or multiple threads. Like I
    said, the point of doing non-blocking I/O with select() is to *avoid*
    the need for multiple threads.


    Tracker <report@bugs.python.org>
    <http://bugs.python.org/issue1251\>


    @janssen
    Copy link
    Mannequin

    janssen mannequin commented Nov 17, 2007

    This is now checked into the 3K branch.

    @jafo
    Copy link
    Mannequin

    jafo mannequin commented Mar 18, 2008

    Should this be back-ported to 2.6, or can it be closed?

    @janssen
    Copy link
    Mannequin

    janssen mannequin commented Mar 18, 2008

    I'm working on it. I'll close it when it's finished.

    @jcea
    Copy link
    Member

    jcea commented May 15, 2008

    I'm hitting this issue aswell. How is going?.

    I'm creating a socket with a, let say, 5 seconds timeout. The timeout
    works fine before the "wrap_socket()", and after it. But the timeout
    doesn't work WHILE in the "wrap_socket()" method call.

    What can I do?.

    If I need to call "do_handshake()" myself, working with
    SSL_ERROR_WANT_READ/WRITE, I think this *needs* to be documented somewhere.

    That is, any difference between "normal" sockets and "ssl" sockets need
    to be documented in docs. Explicitly.

    My opinion, of course :).

    @janssen
    Copy link
    Mannequin

    janssen mannequin commented May 16, 2008

    Here's what's in the 3.0 docs:

    The parameter do_handshake_on_connect specifies whether to do the SSL
    handshake automatically after doing a socket.connect(), or whether the
    application program will call it explicitly, by invoking the
    SSLSocket.do_handshake()<http://docs.python.org/dev/3.0/library/ssl.html#ssl.SSLSocket.do_handshake\>method.
    Calling
    SSLSocket.do_handshake()<http://docs.python.org/dev/3.0/library/ssl.html#ssl.SSLSocket.do_handshake\>explicitly
    gives the program control over the blocking behavior of the
    socket I/O involved in the handshake.

    Look at test.test_ssl.testNonBlockingHandshake() in 3.0alpha or in the PyPI
    module. I'm still working on 2.6.

    Bill

    On Thu, May 15, 2008 at 3:56 PM, Jesús Cea Avión <report@bugs.python.org>
    wrote:

    Jesús Cea Avión <jcea@jcea.es> added the comment:

    I'm hitting this issue aswell. How is going?.

    I'm creating a socket with a, let say, 5 seconds timeout. The timeout
    works fine before the "wrap_socket()", and after it. But the timeout
    doesn't work WHILE in the "wrap_socket()" method call.

    What can I do?.

    If I need to call "do_handshake()" myself, working with
    SSL_ERROR_WANT_READ/WRITE, I think this *needs* to be documented somewhere.

    That is, any difference between "normal" sockets and "ssl" sockets need
    to be documented in docs. Explicitly.

    My opinion, of course :).


    Tracker <report@bugs.python.org>
    <http://bugs.python.org/issue1251\>


    @jcea
    Copy link
    Member

    jcea commented May 16, 2008

    Thanks, Bill. I was reading 2.6 preview documentation, and nothing is
    said there.

    @orsenthil
    Copy link
    Member

    This issue is yet not fixed for both Py2.6 and Py3k. The tests which are
    present in code are not run (or disabled) in test_ssl.py

    I understand, customers have a good chance of hitting upon this issue.
    When ssl do_handshake() does not timeout and application just hangs!

    Janssen, would you like to close on this?

    bpo-1424152 (for certain scenarios) has a dependency upon this one.

    @janssen
    Copy link
    Mannequin

    janssen mannequin commented Sep 1, 2008

    I believe this is now implemented in all the branches. And when I run the
    tests, they run fine. There's still an issue with unwrap; it does a
    blocking tear-down of the SSL session, and can block when you don't want
    it to. I'll have to look further at that.

    @orsenthil
    Copy link
    Member

    Yes Janssen, I checked again and found it implemented in both trunk
    (py26) and py3k. All the tests pass as well.

    However, in one of my testcases for bpo-1424152, where I expected the
    timeout to happen for do_handshake(), it did not take effect. I shall
    look for the reasons.

    The following is the tail from the traceback.

    File "/usr/local/lib/python2.6/httplib.py", line 1095, in connect
    self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file)
    File "/usr/local/lib/python2.6/ssl.py", line 316, in wrap_socket
    suppress_ragged_eofs=suppress_ragged_eofs)
    File "/usr/local/lib/python2.6/ssl.py", line 116, in __init__
    self.do_handshake()
    File "/usr/local/lib/python2.6/ssl.py", line 260, in do_handshake
    self._sslobj.do_handshake()
    KeyboardInterrupt

    @janssen
    Copy link
    Mannequin

    janssen mannequin commented Sep 4, 2008

    Thanks. If you can identify a specific bug, I'll take a look at it.

    @facundobatista
    Copy link
    Member

    Bill, should this issue be closed?

    Or Senthil found a bug in the actual code and you're waiting for him to
    point out where is the problem or a way to reproduce it?

    @janssen
    Copy link
    Mannequin

    janssen mannequin commented Jan 13, 2009

    Well, maybe he found something -- never reported back. But it was a few
    months ago... I'm in no hurry to close it, though.

    @pitrou
    Copy link
    Member

    pitrou commented Apr 27, 2010

    do_handshake() not respecting the socket timeout was fixed in r80452.

    @pitrou pitrou closed this as completed Apr 27, 2010
    @gundlach
    Copy link
    Mannequin

    gundlach mannequin commented Jun 28, 2012

    I recently started getting what appears to be this bug in 2.6.6 and 2.7.3 when connecting to Google.

    My script does this:

      while True:
        get an IMAP connection to Google, if our old one timed out
        fetch unread messages.  if any:
          make an SMTP connection to Google and send some emails 
          mark the unread messages as read
        sleep 10 seconds

    It used to run fine on 2.6. I'm assuming something changed at Google, because it started hanging, whether in 2.6 or 2.7. Here's the stack trace when I eventually press Ctrl-C:

    File "./main.py", line 21, in loop
    the_emails = list(emails.messages('(UNSEEN)'))
    File "./emails.py", line 129, in messages
    for chunk in chunks_of_length(32, messages):
    File "./chunks.py", line 9, in chunks_of_length
    for item in iterable:
    File "./emails.py", line 90, in email_messages
    m = open_mailbox(label, readonly=True)
    File "./emails.py", line 30, in open_mailbox
    m = imaplib.IMAP4_SSL('imap.gmail.com', 993)
    File "/usr/lib64/python2.6/imaplib.py", line 1138, in __init__
    IMAP4.__init__(self, host, port)
    File "/usr/lib64/python2.6/imaplib.py", line 163, in __init__
    self.open(host, port)
    File "/usr/lib64/python2.6/imaplib.py", line 1150, in open
    self.sslobj = ssl.wrap_socket(self.sock, self.keyfile, self.certfile)
    File "/usr/lib64/python2.6/ssl.py", line 338, in wrap_socket
    suppress_ragged_eofs=suppress_ragged_eofs)
    File "/usr/lib64/python2.6/ssl.py", line 120, in __init__
    self.do_handshake()
    File "/usr/lib64/python2.6/ssl.py", line 279, in do_handshake
    self._sslobj.do_handshake()
    KeyboardInterrupt

    FWIW, for the last couple of days my script is *not* hanging, so perhaps Google changed something on their end, avoiding the Python bug.

    Should this Issue be reopened or should I file a new Issue?

    @pitrou
    Copy link
    Member

    pitrou commented Jun 28, 2012

    [...]

    Should this Issue be reopened or should I file a new Issue?

    I would prefer a new issue to be filed (assuming the error happens
    again).

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    docs Documentation in the Doc dir stdlib Python modules in the Lib dir tests Tests in the Lib/test dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants