classification
Title: Add support for Memory BIO to _ssl
Type: enhancement Stage: resolved
Components: Extension Modules Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Ben.Darnell, alex, chatgris, christian.heimes, dstufft, ezio.melotti, geertj, giampaolo.rodola, gvanrossum, janssen, pitrou, python-dev, sbt, vstinner, yselivanov
Priority: normal Keywords: patch

Created on 2014-07-12 09:08 by geertj, last changed 2014-10-06 10:15 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
ssl-memory-bio.patch geertj, 2014-07-12 09:08 review
ssl-memory-bio-2.patch geertj, 2014-08-01 08:33 Updated patch review
bio_python_options.py geertj, 2014-08-01 09:43 A few options for the Python-level API
ssl-memory-bio-3.patch geertj, 2014-08-04 10:04 Updated patch (adds Python-level API). review
ssl-memory-bio-4.patch geertj, 2014-08-26 07:37 review
ssl-memory-bio-4-incr1.patch geertj, 2014-08-27 08:12
ssl-memory-bio-4-incr2.patch geertj, 2014-10-03 10:52 Updated patch, incremental to 4. Makes SSLSocket use SSLObject.
ssl-memory-bio-5.patch geertj, 2014-10-04 23:28 review
Messages (39)
msg222833 - (view) Author: Geert Jansen (geertj) * Date: 2014-07-12 09:08
The attached patch adds a _MemoryBIO type to _ssl, and a _wrap_bio() method to _SSLContext. The patch also includes tests.

For now I kept _wrap_bio() and _MemoryBIO semi-private. The reason is that it returns an _SSLSocket instead of an SSLSocket and this type has not been exposed before as part of the public API. Changing the result of _wrap_bio to return an SSLSocket is not appropriate IMHO because it should not inherit from socket.socket which would waste a file descriptor and None of the IO methods are relevant.

The patch works for me and gives no errors with --with-pydebug. I've also used it in an experimental branch of Gruvi and all the tests pass there too.
msg223518 - (view) Author: Geert Jansen (geertj) * Date: 2014-07-20 16:16
Hi all (pitrou, haypo and all others) can I get some feedback on this patch?

Thanks!
msg223597 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-07-21 18:14
The C part of the patch looks roughly ok to me (modulo a couple of comments). However, we must now find a way to expose this as a Python-level API.
msg224478 - (view) Author: Geert Jansen (geertj) * Date: 2014-08-01 09:34
I added a new patch that addresses the comments.
msg224480 - (view) Author: Geert Jansen (geertj) * Date: 2014-08-01 09:39
I've explored a few options for the Python-level API in the attachment "bio_python_options.py".

Me personally I prefer the more light weight option #3. This is both out of selfish interest (less work for me), but also I believe that memory BIOs are an API that will be used almost exclusively by framework authors, not by end users like SSLSocket itself. So a more lower-level (but perfectly valid IMHO) API would be appropriate.
msg224704 - (view) Author: Geert Jansen (geertj) * Date: 2014-08-04 10:04
New patch with a Python-level API (option #3).

This needs some more tests, and docs.
msg224723 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-04 14:13
I think the API choice looks reasonable, thank you (haven't looked at the patch in detail). A question though: does it support server-side SNI? AFAIR server-side SNI requires you to be able to change a SSL object's context.
msg224724 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-04 14:15
Am adding the asyncio maintainers as well as Ben Darnell (Tornado) to the nosy list, for feedback.
msg224733 - (view) Author: Geert Jansen (geertj) * Date: 2014-08-04 15:21
> A question though: does it support server-side SNI? AFAIR server-side SNI requires you to be able to change a SSL object's context.

Yes, it does. See the following comment in _servername_callback():

  /* Pass a PySSLSocket instance when using memory BIOs, but an ssl.SSLSocket
   * when using sockets. Note that the latter is not a subclass of the
   * former, but both do have a "context" property. THis supports the common
   * use case of setting this property in the servername callback. */

The C-level _ssl._SSLSocket object is passed to the servername callback. It has a "context" property that can be set.

I realize the above is an abstraction violation between the C and Python level. Now that we have an SSLObject Python level API, I could update the code to store a weakref to the SSLObject in the _SSLSocket (just like it does for SSLSocket). That way I can pass the Python level object into the callback. Any thoughts?
msg224734 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-04 15:24
Le 04/08/2014 11:21, Geert Jansen a écrit :
>
> I realize the above is an abstraction violation between the C and
Python level. Now that we have an SSLObject Python level API, I could
update the code to store a weakref to the SSLObject in the _SSLSocket
(just like it does for SSLSocket). That way I can pass the Python level
object into the callback. Any thoughts?

I think it would make the exposed API nicer, although the implementation 
would be a bit uglier. Given Python's philosophy, I think the nicer API 
wins :-)
msg224834 - (view) Author: Ben Darnell (Ben.Darnell) * Date: 2014-08-05 14:36
Looks good to me.  I've added exarkun and glyph to the nosy list since Twisted's experience with PyOpenSSL may provide useful feedback even though Twisted will presumably stick with what they've got instead of switching to this new interface.
msg224835 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-05 14:39
By the way, this would allow ProactorEventLoop to support SSL, since it decouples the SSL protocol handling from the actual socket I/O.
msg224926 - (view) Author: Jean-Paul Calderone (exarkun) * (Python committer) Date: 2014-08-06 11:09
Please do *not* add me to the nosy list of any issues.
msg224932 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-06 13:21
Perhaps Glyph wants to chime in :-)
msg224952 - (view) Author: Glyph Lefkowitz (glyph) Date: 2014-08-06 17:13
I don't have a whole lot to add.  I strongly recommended that this be done this way twice, once when ssl was added to Python and once when ssl was added to tulip, so I'm glad to see it's happening now.  Regarding the specific implementation I am unlikely to have the interest in reviewing the code because I already have a working TLS implementation which does this.  Nevertheless, if it works to get the proactor interfaces to support SSL, then it is almost certainly adequate.

It would be great to eliminate the dependency on OpenSSL's writing-to-a-socket code entirely; Python already knows how to write to a socket, and it probably knows how to do it better than OpenSSL does.

My only further input is that this code should all be deleted and replaced with pyOpenSSL or at least a separate thin wrapper over PyCA's Cryptography bindings.  My Cassandra complex and I look forward to this advice becoming obvious to everyone else in 5-7 years :-).  In the meanwhile, I will de-nosy myself.
msg225103 - (view) Author: Geert Jansen (geertj) * Date: 2014-08-09 08:35
Thanks to Ben and Glyph for their feedback. The memory BIO should allow ProactorEventLoop to support SSL. I say "should" because I have not looked at it myself. However, my Gruvi project is proactor (libuv) based and I have a private branch where SSL support is working using a proactor API.

I need a few more days to create an updated patch. This patch will include Antoine's suggestion of passing the SSLObject instance to the servername callback, and an update to the docs.
msg225893 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-25 20:18
Geert, are you still trying to work on this?
msg225895 - (view) Author: Geert Jansen (geertj) * Date: 2014-08-25 20:26
Antoine, yes, I just got back from holiday. I will have an updated patch tomorrow.
msg225910 - (view) Author: Geert Jansen (geertj) * Date: 2014-08-26 07:37
Updated patch. Contains:

 * An "owner" attribute on a _ssl.SSLSocket that is used as the first argument to the SNI servername callback (implemented as a weakref).
 * Documentation

I think this covers all outstanding issues that were identified. Antoine, please let me know if you have further feedback or if not whether this can be committed.
msg225949 - (view) Author: Geert Jansen (geertj) * Date: 2014-08-27 08:12
Adding small patch (incremental to patch #4) to fix a test failure.
msg226131 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-08-30 09:01
Nice work, thank you! The new API looks mostly good to me. I am wondering about a couple of things:
- is it necessary to start exposing server_hostname, server_side and pending()?
- SSLObject is a bit vague, should we call it SSLMemoryObject? or do you expect we may want to support other kinds of BIOs some day?
- should the basic implementations in SSLObject be shared (using some kind of mixin) with SSLSocket, or is it unpractical to do so?

I'll take a look at the code later.
msg226184 - (view) Author: Geert Jansen (geertj) * Date: 2014-08-31 15:25
Thanks Antoine. See my comments below:

> - is it necessary to start exposing server_hostname, server_side and pending()?

At the C level I need server_hostname and server_side exposed because they are needed to implement the cert check in do_handshake(). SSLObject gets a C-level _SSLSocket passed to its constructor and doesn't create it itself. So it can't store these attributes.

At the Python level SSLSocket already had these, albeit undocumented, so that's why I added them to SSLObject as well.

We can leave these undocumented at the Python level if you prefer.

> - SSLObject is a bit vague, should we call it SSLMemoryObject? or do you expect we may want to support other kinds of BIOs some day?

OpenSSL calls the struct just "SSL" which I think is even less descriptive. I think the best description in words is an "SSL protocol instance", however SSLProtocolInstance looks a bit too long to me. Maybe just "SSLInstance", would that be better than "SSLObject"?

I don't think we want to tie the name to the Memory BIO as I think that it may be useful some day to support other BIOs notably the Socket BIO. I believe that the overall _ssl/ssl code could be simplified by:

 - Making SSLSocket inherit from SSLObject and socket.
 - Remove all socket handling from _ssl and use a Socket BIO instead.
 - Implement the blocking semantics for do_handshake(), unwrap(), read() and write() at the Python level.

For testing and benchmarks, the null BIO might be useful as well.

> - should the basic implementations in SSLObject be shared (using some kind of mixin) with SSLSocket, or is it unpractical to do so?

It's possible but I am not sure it would simplify the code a lot. For example, there's no notion of a "closed" or an "unwrapped" socket in SSLObject. If you look at the "cipher" method for example. This is how it looks for SSLSocket:

    def cipher(self):
        self._checkClosed()
        if not self._sslobj:
            return None
        else:
            return self._sslobj.cipher()

And this is how it looks for SSLObject:

  def cipher(self):
      return self._sslobj.cipher()

To use SSLObject as a mixin it would have to be aware of these two uses of its subclasses. It could be done but I don't think it's 100% clean either.
msg226237 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-09-01 17:43
> We can leave these undocumented at the Python level if you prefer.

I'd rather that indeed. If there's a specific need, we can expose them as a separate issue.

> Maybe just "SSLInstance", would that be better than "SSLObject"?

That doesn't sound much better :-) Ok, let's keep SSLObject then.

> I believe that the overall _ssl/ssl code could be simplified by: [snip]

That would be nice. Would that also handle e.g. socket timeouts?

> To use SSLObject as a mixin it would have to be aware of these two uses of its subclasses. It could be done but I don't think it's 100% clean either.

Fair enough. We just have to make sure to implement and test new APIs twice (e.g the version() method in issue20421).
msg226939 - (view) Author: Geert Jansen (geertj) * Date: 2014-09-15 20:52
Antoine, sorry for the delay, we just had a new kid and I changed jobs :)

Let me try if I can create an updated patch that where SSLObject is a mixin for SSLSocket. I think the argument about writing tests once is important. Be back in a few days..
msg228321 - (view) Author: Geert Jansen (geertj) * Date: 2014-10-03 10:52
New patch attached. This patch makes SSLSocket use SSLObject. The big benefit here is obviously test coverage.

I decided against using SSLObject as a mixin, because all methods need to be reimplemented anyway because for SSLSocket they need to handle the non-SSL case. Instead, I made SSLSocket._sslobj an SSLObject rather than a _ssl._SSLSocket. The patch is rather small, so I kept it incremental to patch4.

Test suite runs fine. I had to update one SSL test (test_unknown_channel_binding). Because the test for the binding type is now in SSLObject, a non-connected SSLSocket will return None even for an unknown binding. Arguably this is even more correct because the binding type can depend on the cryptographic protocol used, e.g. tls-unique doesn't work for SSLv2 (it's currently not checked and nobody cares about SSLv2, I'm just arguing from theory here).

A second change is that the private _sslobj is now a different type. However since this is clearly an internal attribute, I think people that are using this should expect breakage.

Antoine, please let me know if this is now ready for merging in your view or if not what you'd like me to do still. Thanks.
msg228322 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-10-03 10:58
Well... I would have expected this approach to yield a bigger reduction in code size. If it doesn't shrink the code, then I'm not sure it's worthwhile. What do you think?

(also, why do you have to add an "owner" attribute?)
msg228324 - (view) Author: Geert Jansen (geertj) * Date: 2014-10-03 12:02
> Well... I would have expected this approach to yield a bigger reduction in code size. If it doesn't shrink the code, then I'm not sure it's worthwhile. What do you think?

I think the improved test coverage might still make it worthwhile. All tests are now exercising the SSLObject methods via SSLSocket. Also it's more future proof as the risk is less that you'd add a new method to SSLSocket without adding it to SSLObject as well.

It's not clear cut. Either way is fine I think.

> (also, why do you have to add an "owner" attribute?)

That is to support the first argument passed to the sever name callback set with set_servername_callback(). This will be an SSLSocket or an SSLObject instance depending on who's using it.
msg228481 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-10-04 19:17
One issue with the "owner" is that there is now a reference cycle between SSLSocket and SSLObject (something which the original design is careful to avoid by using weakrefs in the _ssl module).
msg228504 - (view) Author: Geert Jansen (geertj) * Date: 2014-10-04 22:15
> One issue with the "owner" is that there is now a reference cycle between SSLSocket and SSLObject (something which the original design is careful to avoid by using weakrefs in the _ssl module).

Note that owner is a weakref :) Did you look at the code?
msg228506 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-10-04 22:27
Ahhh. I had forgotten about that. It may be worthwhile to add a comment in SSLObject.__init__, then. Also, can you provide a cumulated patch?
msg228509 - (view) Author: Geert Jansen (geertj) * Date: 2014-10-04 23:28
Addded the comment about owner being a weakref, and added a new consolidated patch (ssl-memory-bio-5).
msg228579 - (view) Author: Geert Jansen (geertj) * Date: 2014-10-05 15:22
Maybe an example is useful on how the Memory BIO stuff can be used to implement SSL on top of a proactor event loop. I just added support for this to my Gruvi project in the branch "feat-memory-bio":

An "SslPipe" utility class that uses the memory BIOs:

https://github.com/geertj/gruvi/blob/feat-memory-bio/gruvi/ssl.py#L23

A PEP-3156 style transport:

https://github.com/geertj/gruvi/blob/feat-memory-bio/gruvi/ssl.py#L234

And a backport of this for Python 2.7, 3,3 and 3.4:

https://github.com/geertj/gruvi/blob/feat-memory-bio/gruvi/_sslcompat.c
https://github.com/geertj/gruvi/blob/feat-memory-bio/gruvi/sslcompat.py
msg228583 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-10-05 15:34
SSLPipe looks interesting. I wonder if it can be used to reimplement _SelectorSslTransport in asyncio.selector_events (at least as an experiment).
I'll take a look at the cumulated patch soon, thank you.
msg228614 - (view) Author: Roundup Robot (python-dev) Date: 2014-10-05 18:42
New changeset a79003f25a41 by Antoine Pitrou in branch 'default':
Issue #21965: Add support for in-memory SSL to the ssl module.
https://hg.python.org/cpython/rev/a79003f25a41
msg228620 - (view) Author: Geert Jansen (geertj) * Date: 2014-10-05 21:24
Thanks Antoine for merge!

> SSLPipe looks interesting. I wonder if it can be used to reimplement _SelectorSslTransport in asyncio.selector_events (at least as an experiment).

Yes, it could be done quite easily. SslPipe has no dependency on other parts of Gruvi and if this is for Python 3.5 only then you don't need sslcompat either.

Basically you want to install a read callback on the socket that, when fired, reads from the socket and stuffs the bytes into the memory BIO. It should then write() the returning data back to the socket. If there's a short write, then it should install a write callback to retry the write.

The above is almost identical to what SslTransport in Gruvi does. The only different is that Gruvi uses a proactor on all platforms, so that it does not need to call read() itself but the callback is already called with the buffer.
msg228623 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-10-05 22:19
Le 05/10/2014 23:24, Geert Jansen a écrit :
> 
> Yes, it could be done quite easily. SslPipe has no dependency on
> other
parts of Gruvi and if this is for Python 3.5 only then you don't need
sslcompat either.

Yes, it works. Note that I had to modify SSLPipe to also notify of
handshake failures (by passing an argument to the handshake callback).

Here is draft diff against asyncio:
https://gist.github.com/pitrou/f04fa9cbfec88cc37050

However, I don't think this the right approach actually. Rather, the SSL
layer should be implemented as a Protocol object that's also able to act
as a transport for the actual application-level Protocol. It would
completely decouple it from the transport and event loop implementation
details.

(I think that's how Twisted does it, btw)
msg228624 - (view) Author: Roundup Robot (python-dev) Date: 2014-10-05 22:21
New changeset 8da1aa71cd73 by Antoine Pitrou in branch 'default':
Remove unused "block" argument in SSLObject.do_handshake() (issue #21965)
https://hg.python.org/cpython/rev/8da1aa71cd73
msg228626 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-10-05 22:22
I'm closing this issue, and will open a new one for asyncio and/or SSLPipe. Thank you very much, Geert!
msg228654 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-10-06 10:15
I have some comments and suggestions to enhance the new API. I chose to open a new issue: #22564.
History
Date User Action Args
2014-10-06 10:15:13vstinnersetmessages: + msg228654
2014-10-05 22:22:59pitrousetstatus: open -> closed
resolution: fixed
messages: + msg228626

stage: patch review -> resolved
2014-10-05 22:21:16python-devsetmessages: + msg228624
2014-10-05 22:19:08pitrousetmessages: + msg228623
2014-10-05 21:24:06geertjsetmessages: + msg228620
2014-10-05 18:42:16python-devsetnosy: + python-dev
messages: + msg228614
2014-10-05 15:34:44pitrousetmessages: + msg228583
2014-10-05 15:22:55geertjsetmessages: + msg228579
2014-10-04 23:29:00geertjsetfiles: + ssl-memory-bio-5.patch

messages: + msg228509
2014-10-04 22:27:13pitrousetmessages: + msg228506
2014-10-04 22:15:50geertjsetmessages: + msg228504
2014-10-04 19:17:56pitrousetmessages: + msg228481
2014-10-03 12:02:10geertjsetmessages: + msg228324
2014-10-03 10:58:29pitrousetmessages: + msg228322
2014-10-03 10:52:58geertjsetfiles: + ssl-memory-bio-4-incr2.patch

messages: + msg228321
2014-09-20 23:33:33chatgrissetnosy: + chatgris
2014-09-15 20:52:55geertjsetmessages: + msg226939
2014-09-01 17:43:41pitrousetmessages: + msg226237
2014-08-31 15:25:04geertjsetmessages: + msg226184
2014-08-30 09:01:11pitrousetmessages: + msg226131
2014-08-27 08:13:00geertjsetfiles: + ssl-memory-bio-4-incr1.patch

messages: + msg225949
2014-08-26 07:38:05geertjsetfiles: + ssl-memory-bio-4.patch

messages: + msg225910
2014-08-25 20:40:57alexsetnosy: + alex
2014-08-25 20:26:56geertjsetmessages: + msg225895
2014-08-25 20:18:05pitrousetmessages: + msg225893
2014-08-10 00:14:54glyphsetnosy: - glyph
2014-08-09 08:35:19geertjsetmessages: + msg225103
2014-08-06 17:13:06glyphsetnosy: gvanrossum, geertj, janssen, pitrou, vstinner, giampaolo.rodola, christian.heimes, glyph, ezio.melotti, sbt, Ben.Darnell, yselivanov, dstufft
messages: + msg224952
2014-08-06 13:21:18pitrousetmessages: + msg224932
2014-08-06 11:10:03exarkunsetnosy: - exarkun
2014-08-06 11:09:39exarkunsetnosy: gvanrossum, exarkun, geertj, janssen, pitrou, vstinner, giampaolo.rodola, christian.heimes, glyph, ezio.melotti, sbt, Ben.Darnell, yselivanov, dstufft
messages: + msg224926
2014-08-05 14:39:56pitrousetnosy: + sbt
messages: + msg224835
2014-08-05 14:36:24Ben.Darnellsetnosy: + exarkun, glyph
messages: + msg224834
2014-08-04 15:24:15pitrousetmessages: + msg224734
2014-08-04 15:21:40geertjsetmessages: + msg224733
2014-08-04 14:15:12pitrousetnosy: + gvanrossum, Ben.Darnell, yselivanov
messages: + msg224724
2014-08-04 14:13:27pitrousetmessages: + msg224723
2014-08-04 10:04:55geertjsetfiles: + ssl-memory-bio-3.patch

messages: + msg224704
2014-08-01 09:43:04geertjsetfiles: + bio_python_options.py
2014-08-01 09:42:31geertjsetfiles: - bio_python_options.py
2014-08-01 09:39:39geertjsetfiles: + bio_python_options.py

messages: + msg224480
2014-08-01 09:34:42geertjsetmessages: + msg224478
2014-08-01 08:33:32geertjsetfiles: + ssl-memory-bio-2.patch
2014-07-21 18:14:32pitrousetmessages: + msg223597
2014-07-20 16:16:43geertjsetmessages: + msg223518
2014-07-16 11:51:56ezio.melottisetnosy: + ezio.melotti

components: + Extension Modules
stage: patch review
2014-07-16 11:32:17vstinnersetnosy: + vstinner
2014-07-12 16:53:37pitrousetnosy: + janssen, pitrou, giampaolo.rodola, christian.heimes, dstufft
2014-07-12 09:08:54geertjcreate