classification
Title: ssl: post-commit review of the new memory BIO API
Type: Stage:
Components: Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: alex, geertj, pitrou, python-dev, vstinner
Priority: normal Keywords: patch

Created on 2014-10-06 10:14 by vstinner, last changed 2014-10-10 11:12 by vstinner. This issue is now closed.

Files
File name Uploaded Description Edit
ssl_doc.patch vstinner, 2014-10-09 12:50 review
Messages (11)
msg228653 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-10-06 10:14
Hi,

I just saw that the changeset a79003f25a41 was merged and it adds support for memory BIO to the Python ssl module, great! It's a nice addition, especially because it becomes possible to implement SSL for IOCP event loop in asyncio (#22560).

I read the changeset and I have a few comments. Sorry, I didn't have time before to review the patch. I prefer to open a new issue. I'm interested to work on patches, but I would prefer a first feedback before working on a patch.


+.. method:: SSLSocket.read(len=0, buffer=None)

Hum, I fear that some people will call read() with two parameters by mistake. I would prefer a keyword-only parameter: put a "$" in the call to PyArg_ParseTuple() in PySSL_SSLread().

The "read()" method is quite generic, many Python functions expect a "file-like" object with a read() method which only take one indexed parameter.

An alternative would be to implemented readinto() and drop the buffer parameter from read().

Same remark for SSLObject.read() and _ssl._SSLSocket.read().


+.. attribute:: SSLSocket.server_hostname
+
+   A ``bytes`` instance (...)
+
+   .. versionadded:: 3.5

Oh, I would prefer to have a Unicode string here, but I see that the attribute is *not* new, it's already present in Python 3.4.

The versionadded field is wrong. It looks like the attribute was introduced in Python 3.2:
https://docs.python.org/dev/whatsnew/3.2.html#ssl

Maybe the change is that the attribute now also exists in the _ssl module? But the _ssl module is not documented.


+.. method:: MemoryBIO.write_eof()
+
+   Write an EOF marker to the memory BIO. After this method has been called, it
+   is illegal to call :meth:`~MemoryBIO.write`.

What does it mean, "illegal"? An exception is raised? Maybe it would be more explicit to say that an exception is raised.


+- All IO on an :class:`SSLObject` is non-blocking. This means that for example
+  :meth:`~SSLSocket.read` will raise an :exc:`SSLWantReadError` if it needs
+  more data than the incoming BIO has available.

It would be nice to repeat this information in the documentation of the SSLObject.read() method.


+.. method:: SSLContext.wrap_bio(incoming, outgoing, server_side=False, \

Why not puting the documentation of this method where the SSLContext is documented?


+Some notes related to the use of :class:`SSLObject`:

I suggest to move these notes just after the documentation of the SSLObject class.


+The following methods are available from :class:`SSLSocket`:

methods and *attributes*.


+class SSLObject:
+    """This class implements an interface on top of a low-level SSL object as
+    implemented by OpenSSL. This object captures the state of an SSL connection
+    but does not provide any network IO itself. IO needs to be performed
+    through separate "BIO" objects which are OpenSSL's IO abstraction layer.
+
+    This class does not have a public constructor. Instances are returned by
+    ``SSLContext.wrap_bio``. This class is typically used by framework authors
+    that want to implement asynchronous IO for SSL through memory buffers.
+
+    When compared to ``SSLSocket``, this object lacks the following features:
+
+     * Any form of network IO incluging methods such as ``recv`` and ``send``.
+     * The ``do_handshake_on_connect`` and ``suppress_ragged_eofs`` machinery.
+    """

This documentation is useful. It should be copied in the documentation of the SSLObject class.


+.. class:: MemoryBIO
+
+   A memory buffer that can be used to pass data between Python and an SSL
+   protocol instance.
+
+.. attribute:: MemoryBIO.pending
+
+   Return the number of bytes currently in the memory buffer.

I prefer when class methods and attributes are part of the "class" block in the documentation, the documentation is rendered differently and it's more readable IMO.

Compare:
   https://docs.python.org/dev/library/ssl.html#ssl.MemoryBIO
with:
   https://docs.python.org/dev/library/io.html#io.IOBase

But the choice how to document methods and attributes was not made in the memory BIO changeset, it is older. I suggest to "upgrade" to style to use the same style than the io module (put everything in the ".. class:" block).

What do you think?
msg228664 - (view) Author: Geert Jansen (geertj) * Date: 2014-10-06 11:59
Hi Victor,

see below my comments:

* SSLSocket.read(), SSLOBject.read() and _ssl._SSLSocket.read() taking a buffer as the second positional argument.

Both SSLSocket.read() and _SSLSocket.read() already accepted two arguments so I went for consistency. The former has been publicly documented in prior releases so I don't think it can be changed?

* versionadded for server_hostname set to 3.5

This is when it was first documented. If it's more correct to specify when it was first implemented then I can put it to 3.2.

* server_hostname property is idna encoded bytes instead of unicode

Agreed that it should be changes to unicode.

Currently SSLSocket.server_hostname is whatever was passed to the constructor, which can be unicode or an already encoded bytes instance. SSLObject.server_hostname on the other hand is always a bytes instance.

Should SSLSocket.server_hostname also be changed to always return unicode even if a bytes was passed to the constructor? I'd tend to say yes especially because the attribute was not documented before. But it would be a change in behavior.

Now that I think of it - since SSLSocket now uses SSLObject to check the hostname, and SSLObject exposes server_hostname as a bytes instance, is hostname checking currently broken for non-ascii hostnames?

* Documentation suggestions.

Mostly make sense. I will have a look.
msg228687 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014-10-06 14:14
+.. attribute:: SSLSocket.server_hostname
+
+   A ``bytes`` instance (...)

Ah, this is a mistake. It's actually always a str instance (on SSLObject as well).
msg228693 - (view) Author: Geert Jansen (geertj) * Date: 2014-10-06 14:22
> +.. attribute:: SSLSocket.server_hostname
> +
> +   A ``bytes`` instance (...)
>
> Ah, this is a mistake. It's actually always a str instance (on SSLObject as well).

It is indeed, I stand corrected. I was confused by the decode -> encode roundtrip that happens in _ssl.

However I think that in theory SSLSocket.server_hostname could be a bytes, if a bytes was passed into the constructor. _ssl parses this argument with the "et" format which means it will let a correctly encoded byte string through. Not sure if anybody is using this though.
msg228741 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-10-06 22:03
> However I think that in theory SSLSocket.server_hostname could be a bytes, if a bytes was passed into the constructor.

newPySSLSocket() expects a char* string and use PyUnicode_Decode() to decode bytes.
msg228753 - (view) Author: Geert Jansen (geertj) * Date: 2014-10-07 02:10
> newPySSLSocket() expects a char* string and use PyUnicode_Decode() to decode bytes.

Yup, and this value is available as SSLSocket._sslobj.server_hostname. But SSLSocket.server_hostname is not this, it is what was passed to the constructor which can be a bytes instance.

For total cleanness maybe the constructor should raise a TypeError if server_hostname is passes as a bytes.
msg228763 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-10-07 10:02
> For total cleanness maybe the constructor should raise a TypeError if server_hostname is passes as a bytes.

Oh. So there are two attributes? SSLSocket.server_hostname and
SSLSocket._sslobj.server_hostname? The constructor should make sure
that both are consistent (ex: initialize SSLSocket.server_hostname
from sslobj.server_hostname), or SSLSocket.server_hostname should be a
property reading SSLSocket._sslobj.server_hostname (or a getter/setter
if we should be able to modify it).
msg228869 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-10-09 12:50
Here is a first patch for SSL documentation. If the patch looks fine, I will first apply revelant parts to Python 3.4 documentation.
msg228981 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-10-10 10:43
New changeset 54402b25f0b9 by Victor Stinner in branch '3.4':
Issue #22564: ssl doc: fix typos
https://hg.python.org/cpython/rev/54402b25f0b9

New changeset 61e52fda1006 by Victor Stinner in branch '3.4':
Issue #22564: ssl doc: document read(), write(), pending, server_side and
https://hg.python.org/cpython/rev/61e52fda1006

New changeset 63386eda0cb7 by Victor Stinner in branch '3.4':
Issue #22564: ssl doc: use "class" marker to document the SSLSocket class
https://hg.python.org/cpython/rev/63386eda0cb7

New changeset 8338ae599931 by Victor Stinner in branch '3.4':
Issue #22564: ssl doc: mention how SSLSocket are usually created
https://hg.python.org/cpython/rev/8338ae599931
msg228983 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2014-10-10 11:05
New changeset 61fbd3d5c307 by Victor Stinner in branch '3.4':
Issue #22564: ssl doc: mention asyncio in the non-blocking section
https://hg.python.org/cpython/rev/61fbd3d5c307

New changeset 11ad670ca663 by Victor Stinner in branch 'default':
Issue #22564: ssl doc: reorganize and reindent documentation of SSLObject and
https://hg.python.org/cpython/rev/11ad670ca663

New changeset 28fcf6c01bd9 by Victor Stinner in branch 'default':
Issue #22564: ssl doc, add more links to the non-blocking section
https://hg.python.org/cpython/rev/28fcf6c01bd9

New changeset 5f773540e2ef by Victor Stinner in branch 'default':
Issue #22564: cleanup SSLObject doc
https://hg.python.org/cpython/rev/5f773540e2ef
msg228984 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2014-10-10 11:12
Ok, I think I addressed most of my remarks and I consider the issue as done.

> For total cleanness maybe the constructor should raise a TypeError if server_hostname is passes as a bytes.

Please open a new issue to address this point.
History
Date User Action Args
2014-10-10 11:12:42vstinnersetstatus: open -> closed
resolution: fixed
messages: + msg228984
2014-10-10 11:05:59python-devsetmessages: + msg228983
2014-10-10 10:43:38python-devsetnosy: + python-dev
messages: + msg228981
2014-10-09 12:50:08vstinnersetfiles: + ssl_doc.patch
keywords: + patch
messages: + msg228869
2014-10-07 10:02:38vstinnersetmessages: + msg228763
2014-10-07 02:10:33geertjsetmessages: + msg228753
2014-10-06 22:03:09vstinnersetmessages: + msg228741
2014-10-06 14:22:16geertjsetmessages: + msg228693
2014-10-06 14:14:21pitrousetmessages: + msg228687
2014-10-06 13:34:34alexsetnosy: + alex
2014-10-06 11:59:00geertjsetmessages: + msg228664
2014-10-06 10:14:44vstinnersetnosy: + geertj, pitrou
2014-10-06 10:14:25vstinnercreate