Add support for Memory BIO to _ssl #66164

Closed
geertj mannequin opened this issue Jul 12, 2014 · 39 comments
Labels
extension-modules C modules in the Modules dir type-feature A feature request or enhancement

Comments

@geertj
Mannequin

geertj mannequin commented Jul 12, 2014

BPO 21965
Nosy @gvanrossum, @pitrou, @vstinner, @giampaolo, @tiran, @ezio-melotti, @alex, @bdarnell, @1st1, @dstufft
Files
  • ssl-memory-bio.patch
  • ssl-memory-bio-2.patch: Updated patch
  • bio_python_options.py: A few options for the Python-level API
  • ssl-memory-bio-3.patch: Updated patch (adds Python-level API).
  • ssl-memory-bio-4.patch
  • ssl-memory-bio-4-incr1.patch
  • ssl-memory-bio-4-incr2.patch: Updated patch, incremental to 4. Makes SSLSocket use SSLObject.
  • ssl-memory-bio-5.patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2014-10-05.22:22:59.394>
    created_at = <Date 2014-07-12.09:08:54.520>
    labels = ['extension-modules', 'type-feature']
    title = 'Add support for Memory BIO to _ssl'
    updated_at = <Date 2014-10-06.10:15:13.661>
    user = 'https://bugs.python.org/geertj'

    bugs.python.org fields:

    activity = <Date 2014-10-06.10:15:13.661>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2014-10-05.22:22:59.394>
    closer = 'pitrou'
    components = ['Extension Modules']
    creation = <Date 2014-07-12.09:08:54.520>
    creator = 'geertj'
    dependencies = []
    files = ['35928', '36189', '36191', '36248', '36475', '36483', '36791', '36806']
    hgrepos = []
    issue_num = 21965
    keywords = ['patch']
    message_count = 39.0
    messages = ['222833', '223518', '223597', '224478', '224480', '224704', '224723', '224724', '224733', '224734', '224834', '224835', '224926', '224932', '224952', '225103', '225893', '225895', '225910', '225949', '226131', '226184', '226237', '226939', '228321', '228322', '228324', '228481', '228504', '228506', '228509', '228579', '228583', '228614', '228620', '228623', '228624', '228626', '228654']
    nosy_count = 15.0
    nosy_names = ['gvanrossum', 'geertj', 'janssen', 'pitrou', 'vstinner', 'giampaolo.rodola', 'christian.heimes', 'ezio.melotti', 'alex', 'python-dev', 'sbt', 'Ben.Darnell', 'yselivanov', 'dstufft', 'chatgris']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue21965'
    versions = ['Python 3.5']

    @geertj
    Mannequin Author

    geertj mannequin commented Jul 12, 2014

    The attached patch adds a _MemoryBIO type to _ssl, and a _wrap_bio() method to _SSLContext. The patch also includes tests.

    For now I kept _wrap_bio() and _MemoryBIO semi-private. The reason is that _wrap_bio() returns an _SSLSocket instead of an SSLSocket, and this type has not been exposed before as part of the public API. Changing _wrap_bio() to return an SSLSocket is not appropriate IMHO, because the result should not inherit from socket.socket: that would waste a file descriptor, and none of the I/O methods are relevant.

    The patch works for me and gives no errors with --with-pydebug. I've also used it in an experimental branch of Gruvi and all the tests pass there too.
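As a rough illustration of what a memory BIO is, here is a minimal sketch that assumes the public name under which the type shipped in Python 3.5, `ssl.MemoryBIO`, rather than the semi-private `_MemoryBIO` from this first patch. It behaves like an in-memory byte pipe: bytes written at one end are read back at the other, and `write_eof()` marks the end of the stream.

```python
import ssl

# A memory BIO is a plain in-memory buffer; no socket or file descriptor is involved.
bio = ssl.MemoryBIO()

written = bio.write(b"hello")   # number of bytes accepted into the buffer
first = bio.read(2)             # read at most 2 bytes from the front
rest = bio.read()               # read everything that is left

# After write_eof() the BIO reports EOF as soon as the buffer has been drained.
bio.write_eof()
at_eof = bio.eof
```

An SSL object reads ciphertext from one such buffer and writes ciphertext to another, which is what lets the caller do the actual network I/O however it likes.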

    @geertj geertj mannequin added the type-feature A feature request or enhancement label Jul 12, 2014
    @ezio-melotti ezio-melotti added the extension-modules C modules in the Modules dir label Jul 16, 2014
    @geertj
    Mannequin Author

    geertj mannequin commented Jul 20, 2014

    Hi all (pitrou, haypo and all others), can I get some feedback on this patch?

    Thanks!

    @pitrou
    Member

    pitrou commented Jul 21, 2014

    The C part of the patch looks roughly ok to me (modulo a couple of comments). However, we must now find a way to expose this as a Python-level API.

    @geertj
    Mannequin Author

    geertj mannequin commented Aug 1, 2014

    I added a new patch that addresses the comments.

    @geertj
    Mannequin Author

    geertj mannequin commented Aug 1, 2014

    I've explored a few options for the Python-level API in the attachment "bio_python_options.py".

    Personally, I prefer the more lightweight option #3. This is partly out of selfish interest (less work for me), but I also believe that memory BIOs are an API that will be used almost exclusively by framework authors, not by end users (unlike SSLSocket itself). So a lower-level (but perfectly valid IMHO) API would be appropriate.

    @geertj
    Mannequin Author

    geertj mannequin commented Aug 4, 2014

    New patch with a Python-level API (option #3).

    This needs some more tests, and docs.

    @pitrou
    Member

    pitrou commented Aug 4, 2014

    I think the API choice looks reasonable, thank you (haven't looked at the patch in detail). A question though: does it support server-side SNI? AFAIR server-side SNI requires you to be able to change an SSL object's context.

    @pitrou
    Member

    pitrou commented Aug 4, 2014

    Am adding the asyncio maintainers as well as Ben Darnell (Tornado) to the nosy list, for feedback.

    @geertj
    Mannequin Author

    geertj mannequin commented Aug 4, 2014

    > A question though: does it support server-side SNI? AFAIR server-side SNI requires you to be able to change an SSL object's context.

    Yes, it does. See the following comment in _servername_callback():

        /* Pass a PySSLSocket instance when using memory BIOs, but an ssl.SSLSocket
         * when using sockets. Note that the latter is not a subclass of the
         * former, but both do have a "context" property. This supports the common
         * use case of setting this property in the servername callback. */

    The C-level _ssl._SSLSocket object is passed to the servername callback. It has a "context" property that can be set.

    I realize the above is an abstraction violation between the C and Python level. Now that we have an SSLObject Python level API, I could update the code to store a weakref to the SSLObject in the _SSLSocket (just like it does for SSLSocket). That way I can pass the Python level object into the callback. Any thoughts?
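For readers unfamiliar with the "common use case" mentioned in that comment, the sketch below shows a servername callback that switches the context based on the requested hostname. The hostname `alt.example`, the helper `make_sni_callback`, and the two bare contexts are assumptions made for illustration; real server contexts would also need `load_cert_chain()`.

```python
import ssl


def make_sni_callback(contexts_by_name):
    """Build a servername callback that picks a context per hostname (illustrative helper)."""

    def sni_callback(ssl_obj, server_name, initial_context):
        chosen = contexts_by_name.get(server_name)
        if chosen is not None:
            # Works for any object passed in, as long as it has a settable "context".
            ssl_obj.context = chosen
        return None  # returning None lets the TLS negotiation continue

    return sni_callback


# Bare contexts without certificates, only to show the wiring.
default_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
alt_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)

callback = make_sni_callback({"alt.example": alt_ctx})
default_ctx.set_servername_callback(callback)
```

Because the callback only touches the `context` attribute, it does not care whether it receives a socket-based or a BIO-based object.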

    @pitrou
    Member

    pitrou commented Aug 4, 2014

    On 04/08/2014 11:21, Geert Jansen wrote:

    > I realize the above is an abstraction violation between the C and
    > Python level. Now that we have an SSLObject Python level API, I could
    > update the code to store a weakref to the SSLObject in the _SSLSocket
    > (just like it does for SSLSocket). That way I can pass the Python level
    > object into the callback. Any thoughts?

    I think it would make the exposed API nicer, although the implementation
    would be a bit uglier. Given Python's philosophy, I think the nicer API
    wins :-)

    @bdarnell
    Mannequin

    bdarnell mannequin commented Aug 5, 2014

    Looks good to me. I've added exarkun and glyph to the nosy list since Twisted's experience with PyOpenSSL may provide useful feedback even though Twisted will presumably stick with what they've got instead of switching to this new interface.

    @pitrou
    Member

    pitrou commented Aug 5, 2014

    By the way, this would allow ProactorEventLoop to support SSL, since it decouples the SSL protocol handling from the actual socket I/O.
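To make the decoupling concrete, here is a minimal sketch using the names that were eventually released in Python 3.5 (`ssl.MemoryBIO` and `SSLContext.wrap_bio()`). The hostname `example.com` is a placeholder and no network connection is made: the handshake simply stops when it needs the peer's reply, and the bytes that would have gone to a socket are left in the outgoing BIO.

```python
import ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
incoming = ssl.MemoryBIO()   # ciphertext received from the peer goes in here
outgoing = ssl.MemoryBIO()   # ciphertext to send to the peer comes out here

# No socket involved: the SSL object only talks to the two buffers.
tls = ctx.wrap_bio(incoming, outgoing, server_hostname="example.com")

try:
    tls.do_handshake()
except ssl.SSLWantReadError:
    # The handshake cannot finish until the peer's reply is written to `incoming`.
    pass

# These bytes are the start of the handshake; any I/O mechanism can deliver them.
client_hello = outgoing.read()
```

How `client_hello` reaches the peer (a selector loop, an overlapped write on Windows, a test harness) is entirely up to the caller.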

    @exarkun
    Mannequin

    exarkun mannequin commented Aug 6, 2014

    Please do *not* add me to the nosy list of any issues.

    @pitrou
    Member

    pitrou commented Aug 6, 2014

    Perhaps Glyph wants to chime in :-)

    @glyph
    Mannequin

    glyph mannequin commented Aug 6, 2014

    I don't have a whole lot to add. I strongly recommended that this be done this way twice, once when ssl was added to Python and once when ssl was added to tulip, so I'm glad to see it's happening now. Regarding the specific implementation I am unlikely to have the interest in reviewing the code because I already have a working TLS implementation which does this. Nevertheless, if it works to get the proactor interfaces to support SSL, then it is almost certainly adequate.

    It would be great to eliminate the dependency on OpenSSL's writing-to-a-socket code entirely; Python already knows how to write to a socket, and it probably knows how to do it better than OpenSSL does.

    My only further input is that this code should all be deleted and replaced with pyOpenSSL or at least a separate thin wrapper over PyCA's Cryptography bindings. My Cassandra complex and I look forward to this advice becoming obvious to everyone else in 5-7 years :-). In the meanwhile, I will de-nosy myself.

    @geertj
    Mannequin Author

    geertj mannequin commented Aug 9, 2014

    Thanks to Ben and Glyph for their feedback. The memory BIO should allow ProactorEventLoop to support SSL. I say "should" because I have not looked at it myself. However, my Gruvi project is proactor (libuv) based and I have a private branch where SSL support is working using a proactor API.

    I need a few more days to create an updated patch. This patch will include Antoine's suggestion of passing the SSLObject instance to the servername callback, and an update to the docs.

    @pitrou
    Member

    pitrou commented Aug 25, 2014

    Geert, are you still trying to work on this?

    @geertj
    Mannequin Author

    geertj mannequin commented Aug 25, 2014

    Antoine, yes, I just got back from holiday. I will have an updated patch tomorrow.

    @geertj
    Mannequin Author

    geertj mannequin commented Aug 26, 2014

    Updated patch. Contains:

    • An "owner" attribute on _ssl._SSLSocket that is used as the first argument to the SNI servername callback (implemented as a weakref).
    • Documentation

    I think this covers all outstanding issues that were identified. Antoine, please let me know if you have further feedback or if not whether this can be committed.

    @geertj
    Mannequin Author

    geertj mannequin commented Aug 27, 2014

    Adding small patch (incremental to patch #4) to fix a test failure.

    @pitrou
    Member

    pitrou commented Aug 30, 2014

    Nice work, thank you! The new API looks mostly good to me. I am wondering about a couple of things:

    • is it necessary to start exposing server_hostname, server_side and pending()?
    • SSLObject is a bit vague, should we call it SSLMemoryObject? or do you expect we may want to support other kinds of BIOs some day?
    • should the basic implementations in SSLObject be shared (using some kind of mixin) with SSLSocket, or is it impractical to do so?

    I'll take a look at the code later.

    @geertj
    Mannequin Author

    geertj mannequin commented Aug 31, 2014

    Thanks Antoine. See my comments below:

    > • is it necessary to start exposing server_hostname, server_side and pending()?

    At the C level I need server_hostname and server_side exposed because they are needed to implement the cert check in do_handshake(). SSLObject gets a C-level _SSLSocket passed to its constructor and doesn't create it itself. So it can't store these attributes.

    At the Python level SSLSocket already had these, albeit undocumented, so that's why I added them to SSLObject as well.

    We can leave these undocumented at the Python level if you prefer.

    > • SSLObject is a bit vague, should we call it SSLMemoryObject? or do you expect we may want to support other kinds of BIOs some day?

    OpenSSL calls the struct just "SSL" which I think is even less descriptive. I think the best description in words is an "SSL protocol instance", however SSLProtocolInstance looks a bit too long to me. Maybe just "SSLInstance", would that be better than "SSLObject"?

    I don't think we want to tie the name to the Memory BIO, as I think it may be useful some day to support other BIOs, notably the Socket BIO. I believe that the overall _ssl/ssl code could be simplified by:

    • Making SSLSocket inherit from SSLObject and socket.
    • Removing all socket handling from _ssl and using a Socket BIO instead.
    • Implementing the blocking semantics for do_handshake(), unwrap(), read() and write() at the Python level.

    For testing and benchmarks, the null BIO might be useful as well.

    > • should the basic implementations in SSLObject be shared (using some kind of mixin) with SSLSocket, or is it impractical to do so?

    It's possible, but I am not sure it would simplify the code a lot. For example, there's no notion of a "closed" or an "unwrapped" socket in SSLObject. Consider the "cipher" method. This is how it looks for SSLSocket:

        def cipher(self):
            self._checkClosed()
            if not self._sslobj:
                return None
            else:
                return self._sslobj.cipher()

    And this is how it looks for SSLObject:

        def cipher(self):
            return self._sslobj.cipher()

    To use SSLObject as a mixin it would have to be aware of these two uses of its subclasses. It could be done but I don't think it's 100% clean either.

    @pitrou
    Member

    pitrou commented Sep 1, 2014

    > We can leave these undocumented at the Python level if you prefer.

    I'd rather that indeed. If there's a specific need, we can expose them as a separate issue.

    > Maybe just "SSLInstance", would that be better than "SSLObject"?

    That doesn't sound much better :-) Ok, let's keep SSLObject then.

    > I believe that the overall _ssl/ssl code could be simplified by: [snip]

    That would be nice. Would that also handle e.g. socket timeouts?

    > To use SSLObject as a mixin it would have to be aware of these two uses of its subclasses. It could be done but I don't think it's 100% clean either.

    Fair enough. We just have to make sure to implement and test new APIs twice (e.g. the version() method in bpo-20421).

    @geertj
    Mannequin Author

    geertj mannequin commented Sep 15, 2014

    Antoine, sorry for the delay, we just had a new kid and I changed jobs :)

    Let me see if I can create an updated patch in which SSLObject is a mixin for SSLSocket. I think the argument about writing tests once is important. Be back in a few days...

    @geertj
    Mannequin Author

    geertj mannequin commented Oct 3, 2014

    New patch attached. This patch makes SSLSocket use SSLObject. The big benefit here is obviously test coverage.

    I decided against using SSLObject as a mixin: all methods would need to be reimplemented anyway, because for SSLSocket they need to handle the non-SSL case. Instead, I made SSLSocket._sslobj an SSLObject rather than a _ssl._SSLSocket. The patch is rather small, so I kept it incremental to patch #4.
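The composition approach described here can be sketched with two toy classes. `ProtocolObject` and `SocketWrapper` are hypothetical stand-ins for SSLObject and SSLSocket, not the code from the patch; the point is only that the wrapper adds its own "closed" and "not wrapped" checks and then delegates.

```python
class ProtocolObject:
    """Toy stand-in for SSLObject: pure protocol state, no socket concepts."""

    def __init__(self, cipher_info):
        self._cipher_info = cipher_info

    def cipher(self):
        return self._cipher_info


class SocketWrapper:
    """Toy stand-in for SSLSocket: holds a ProtocolObject instead of inheriting from it."""

    def __init__(self):
        self._sslobj = None   # set once the connection has been wrapped
        self._closed = False

    def _check_closed(self):
        if self._closed:
            raise OSError("socket is closed")

    def cipher(self):
        self._check_closed()
        if self._sslobj is None:
            return None       # the non-SSL case the wrapper has to handle itself
        return self._sslobj.cipher()
```

Every test that calls the wrapper's method also exercises the inner object's method, which is the coverage benefit mentioned above.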

    Test suite runs fine. I had to update one SSL test (test_unknown_channel_binding). Because the test for the binding type is now in SSLObject, a non-connected SSLSocket will return None even for an unknown binding. Arguably this is even more correct because the binding type can depend on the cryptographic protocol used, e.g. tls-unique doesn't work for SSLv2 (it's currently not checked and nobody cares about SSLv2, I'm just arguing from theory here).

    A second change is that the private _sslobj is now a different type. However since this is clearly an internal attribute, I think people that are using this should expect breakage.

    Antoine, please let me know if this is now ready for merging in your view or if not what you'd like me to do still. Thanks.

    @pitrou
    Member

    pitrou commented Oct 3, 2014

    Well... I would have expected this approach to yield a bigger reduction in code size. If it doesn't shrink the code, then I'm not sure it's worthwhile. What do you think?

    (also, why do you have to add an "owner" attribute?)

    @geertj
    Mannequin Author

    geertj mannequin commented Oct 3, 2014

    > Well... I would have expected this approach to yield a bigger reduction in code size. If it doesn't shrink the code, then I'm not sure it's worthwhile. What do you think?

    I think the improved test coverage might still make it worthwhile. All tests are now exercising the SSLObject methods via SSLSocket. It's also more future-proof, since there is less risk of adding a new method to SSLSocket without adding it to SSLObject as well.

    It's not clear cut. Either way is fine I think.

    > (also, why do you have to add an "owner" attribute?)

    That is to support the first argument passed to the server name callback set with set_servername_callback(). This will be an SSLSocket or an SSLObject instance, depending on who's using it.

    @pitrou
    Member

    pitrou commented Oct 4, 2014

    One issue with the "owner" is that there is now a reference cycle between SSLSocket and SSLObject (something which the original design is careful to avoid by using weakrefs in the _ssl module).

    @geertj
    Mannequin Author

    geertj mannequin commented Oct 4, 2014

    > One issue with the "owner" is that there is now a reference cycle between SSLSocket and SSLObject (something which the original design is careful to avoid by using weakrefs in the _ssl module).

    Note that owner is a weakref :) Did you look at the code?
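A small sketch of why a weak "owner" back-reference avoids the cycle. The classes `Inner` and `Outer` are hypothetical stand-ins (for the wrapped object and its owner respectively), not the actual patch code.

```python
import gc
import weakref


class Inner:
    """Toy wrapped object that remembers its owner without keeping it alive."""

    def __init__(self):
        self._owner_ref = None

    @property
    def owner(self):
        # Dereference the weakref; yields None once the owner has been collected.
        return self._owner_ref() if self._owner_ref is not None else None

    @owner.setter
    def owner(self, obj):
        self._owner_ref = weakref.ref(obj)


class Outer:
    """Toy owner: holds a strong reference to Inner, gets only a weak one back."""

    def __init__(self):
        self.inner = Inner()
        self.inner.owner = self


outer = Outer()
inner = outer.inner
owner_while_alive = inner.owner is outer

del outer      # the only strong reference to the owner goes away
gc.collect()   # not strictly needed on CPython, but makes the point explicit
owner_after_delete = inner.owner
```

With a strong back-reference, `Outer` and `Inner` would keep each other alive until the cycle collector ran; with the weakref the owner is freed as soon as its last external reference disappears.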

    @pitrou
    Member

    pitrou commented Oct 4, 2014

    Ahhh. I had forgotten about that. It may be worthwhile to add a comment in SSLObject.__init__, then. Also, can you provide a cumulated patch?

    @geertj
    Mannequin Author

    geertj mannequin commented Oct 4, 2014

    Added the comment about owner being a weakref, and added a new consolidated patch (ssl-memory-bio-5).

    @geertj
    Mannequin Author

    geertj mannequin commented Oct 5, 2014

    Maybe an example is useful on how the Memory BIO stuff can be used to implement SSL on top of a proactor event loop. I just added support for this to my Gruvi project in the branch "feat-memory-bio":

    An "SslPipe" utility class that uses the memory BIOs:

    https://github.com/geertj/gruvi/blob/feat-memory-bio/gruvi/ssl.py#L23

    A PEP-3156 style transport:

    https://github.com/geertj/gruvi/blob/feat-memory-bio/gruvi/ssl.py#L234

    And a backport of this for Python 2.7, 3.3 and 3.4:

    https://github.com/geertj/gruvi/blob/feat-memory-bio/gruvi/_sslcompat.c
    https://github.com/geertj/gruvi/blob/feat-memory-bio/gruvi/sslcompat.py

    @pitrou
    Member

    pitrou commented Oct 5, 2014

    SSLPipe looks interesting. I wonder if it can be used to reimplement _SelectorSslTransport in asyncio.selector_events (at least as an experiment).
    I'll take a look at the cumulated patch soon, thank you.

    @python-dev
    Mannequin

    python-dev mannequin commented Oct 5, 2014

    New changeset a79003f25a41 by Antoine Pitrou in branch 'default':
    Issue bpo-21965: Add support for in-memory SSL to the ssl module.
    https://hg.python.org/cpython/rev/a79003f25a41

    @geertj
    Mannequin Author

    geertj mannequin commented Oct 5, 2014

    Thanks, Antoine, for the merge!

    > SSLPipe looks interesting. I wonder if it can be used to reimplement _SelectorSslTransport in asyncio.selector_events (at least as an experiment).

    Yes, it could be done quite easily. SslPipe has no dependency on other parts of Gruvi and if this is for Python 3.5 only then you don't need sslcompat either.

    Basically you want to install a read callback on the socket that, when fired, reads from the socket and stuffs the bytes into the memory BIO. It should then write() the returning data back to the socket. If there's a short write, then it should install a write callback to retry the write.

    The above is almost identical to what SslTransport in Gruvi does. The only difference is that Gruvi uses a proactor on all platforms, so it does not need to call read() itself; the callback is already called with the buffer.
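The read-callback and short-write handling described above could look roughly like the sketch below. It is not the Gruvi or asyncio code; the function names `flush_outgoing` and `on_readable` and the 65536-byte chunk size are assumptions, and how the callbacks get registered with an event loop is left out. SSL errors other than "want read" are deliberately left to propagate to the caller.

```python
import socket
import ssl


def flush_outgoing(sock, outgoing, backlog=b""):
    """Send pending TLS bytes; return the tail that a short write left unsent."""
    data = backlog + outgoing.read()
    if not data:
        return b""
    try:
        sent = sock.send(data)
    except BlockingIOError:
        sent = 0  # socket buffer is full; nothing went out this time
    # A non-empty result means the caller should install a write callback
    # and call flush_outgoing() again, passing this value as `backlog`.
    return data[sent:]


def on_readable(sock, tls, incoming, outgoing):
    """Read callback: move ciphertext into the BIO, return (plaintext, unsent)."""
    chunk = sock.recv(65536)
    if chunk:
        incoming.write(chunk)
    else:
        incoming.write_eof()  # the peer closed the connection
    plaintext = b""
    try:
        while True:
            data = tls.read(65536)  # OpenSSL also continues an unfinished handshake here
            if not data:
                break
            plaintext += data
    except ssl.SSLWantReadError:
        pass  # more ciphertext is needed from the socket
    # Reading may have produced handshake or alert records that must be sent back.
    return plaintext, flush_outgoing(sock, outgoing)
```

In a proactor-style loop, `on_readable` would receive the already-read buffer instead of calling `recv()` itself; the rest stays the same.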

    @pitrou
    Member

    pitrou commented Oct 5, 2014

    On 05/10/2014 23:24, Geert Jansen wrote:

    > Yes, it could be done quite easily. SslPipe has no dependency on other
    > parts of Gruvi and if this is for Python 3.5 only then you don't need
    > sslcompat either.

    Yes, it works. Note that I had to modify SSLPipe to also notify of
    handshake failures (by passing an argument to the handshake callback).

    Here is a draft diff against asyncio:
    https://gist.github.com/pitrou/f04fa9cbfec88cc37050

    However, I don't think this is the right approach actually. Rather, the SSL
    layer should be implemented as a Protocol object that's also able to act
    as a transport for the actual application-level Protocol. It would
    completely decouple it from the transport and event loop implementation
    details.

    (I think that's how Twisted does it, btw)
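A rough sketch of that layering, under stated assumptions: the class name `TLSClientProtocolSketch` is hypothetical, only the client side is shown, and error handling, shutdown and flow control are omitted. The object is a Protocol from the point of view of the real transport, and plays the transport role (via `write()`) for the application protocol.

```python
import asyncio
import ssl


class TLSClientProtocolSketch(asyncio.Protocol):
    """Speaks TLS to the underlying transport and plaintext to `app_protocol`."""

    def __init__(self, context, app_protocol, server_hostname):
        self._incoming = ssl.MemoryBIO()
        self._outgoing = ssl.MemoryBIO()
        self._tls = context.wrap_bio(
            self._incoming, self._outgoing, server_hostname=server_hostname)
        self._app = app_protocol
        self._transport = None
        self._handshake_done = False

    def connection_made(self, transport):
        self._transport = transport
        self._step()  # produces the first handshake bytes

    def data_received(self, data):
        self._incoming.write(data)
        self._step()

    def _step(self):
        try:
            if not self._handshake_done:
                self._tls.do_handshake()
                self._handshake_done = True
                self._app.connection_made(self)  # we act as the app's transport
            while True:
                chunk = self._tls.read(65536)
                if not chunk:
                    break
                self._app.data_received(chunk)
        except ssl.SSLWantReadError:
            pass  # wait for the next data_received() call
        finally:
            pending = self._outgoing.read()
            if pending:
                self._transport.write(pending)

    def write(self, data):
        """Transport-like method for the application; assumes the handshake is done."""
        self._tls.write(data)
        self._transport.write(self._outgoing.read())
```

Nothing in the class touches a socket or an event loop directly, so the same object could sit on top of a selector-based or a proactor-based transport.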

    @python-dev
    Mannequin

    python-dev mannequin commented Oct 5, 2014

    New changeset 8da1aa71cd73 by Antoine Pitrou in branch 'default':
    Remove unused "block" argument in SSLObject.do_handshake() (issue bpo-21965)
    https://hg.python.org/cpython/rev/8da1aa71cd73

    @pitrou
    Member

    pitrou commented Oct 5, 2014

    I'm closing this issue, and will open a new one for asyncio and/or SSLPipe. Thank you very much, Geert!

    @pitrou pitrou closed this as completed Oct 5, 2014
    @vstinner
    Member

    vstinner commented Oct 6, 2014

    I have some comments and suggestions to enhance the new API. I chose to open a new issue: bpo-22564.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022