classification
Title: ssl.wrap_socket on a connected but failed connection succeeds and .peer_certificate gives AttributeError
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Ben.Darnell, kiilerix, kristjan.jonsson, pitrou, python-dev
Priority: normal Keywords:

Created on 2012-01-06 17:38 by kiilerix, last changed 2016-11-29 12:21 by kristjan.jonsson. This issue is now closed.

Messages (15)
msg150754 - (view) Author: Mads Kiilerich (kiilerix) * Date: 2012-01-06 17:38
According to http://docs.python.org/release/2.7.2/library/ssl wrap_socket can be used either on connected sockets or on un-connected sockets which then can be .connect'ed.

In the latter case SSLSocket.connect() will (AFAIK) always raise a nice exception if the connection fails before the ssl negotiation has completed.

But when an existing connected but failing socket is wrapped then SSLSocket.__init__ will create an instance with self._sslobj = None without negotiating ssl and without raising any exception. Many SSLSocket methods (such as .cipher) checks for self._sslobj, but for example .getpeercert doesn't and will derefence None. That can lead to spurious crashes in applications.

This problem showed up with Mercurial and connections from China to code.google.com being blocked by the Chinese firewall - see for example https://bugzilla.redhat.com/show_bug.cgi?id=771691 .

In that case

  import socket, ssl, time
  s = socket.create_connection(('code.google.com', 443))
  time.sleep(1)
  ssl_sock = ssl.wrap_socket(s)
  ssl_sock.getpeercert(True)

would fail with
  ...
      ssl_sock.getpeercert(True)
    File "/usr/lib64/python2.7/ssl.py", line 172, in getpeercert
      return self._sslobj.peer_certificate(binary_form)
  AttributeError: 'NoneType' object has no attribute 'peer_certificate'

The problem occurs in the case where The Chinese Wall responds correctly to the SYN with SYN+ACK but soon after sends a RST. The sleep is necessary to reproduce it consistently; that makes sure the RST has been received and getpeername fails. Otherwise getpeername succeeds and the connection reset is only seen later on while negotiation ssl, and socket errors there are handled 'correctly'.

The problem can be reproduced on Linux with
  iptables -t filter -A FORWARD -p tcp --dport 443 ! --tcp-flags SYN SYN -j REJECT --reject-with tcp-reset

I would expect that wrap_socket / SSLSocket.__init__ raised an exception if the wrapped socket has been connected but failed. Calling getpeername is insufficient to detect that (and it is too racy to be reliable).

Alternatively all SSLSocket methods should take care not to dereference self._sslobj and they should respond properly - preferably with a socket/ssl exception.

Alternatively the documentation should describe how all applications that wraps connected sockets has to verify that it actually was connected. Checking .cipher() is apparently currently the least ugly way to do that?

One good(?) reason to wrap connected sockets is to be able to use socket.create_connection which tries all IP adresses of a fqdn before it fails. (Btw: That isn't described in the documentation! That confused me while debugging this.)

I guess applications (like Mercurial) that for that reason wants to use create_connection with 2.7.2 and older should check .cipher() as a workaround?
msg150755 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-01-06 17:52
> Alternatively all SSLSocket methods should take care not to dereference 
> self._sslobj and they should respond properly - preferably with a
> socket/ssl exception.

In getpeercert()'s case, I think None would be the right thing to return (as cipher() and compression() already do).
But, yes, this deserves fixing.

> One good(?) reason to wrap connected sockets is to be able to use
> socket.create_connection which tries all IP adresses of a fqdn before
> it fails. (Btw: That isn't described in the documentation! That
> confused me while debugging this.)

Which documentation? create_connection's? Feel free to propose an improved wording.
msg150773 - (view) Author: Mads Kiilerich (kiilerix) * Date: 2012-01-06 22:14
I think it would be confusing if getpeercert returned None both for valid connections without certificates and also for invalid connections. I would almost prefer the current behaviour (AttributeError) if just it was documented and there was a documented way to check if the connection actually was alive. Do you agree that checking .cipher() is the recommended way to do that in a way that is compatible with past and future 2.x versions?

I hope the proper fix will ensure that an exception always is raised if the ssl handshake fails - and that a successful wrap_socket means that the ssl negotiation did succeed with the given constraints. It might however only be feasible to fix that for 3.x.

I filed Issue13724 for the create_socket documentation.
msg150780 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-01-07 00:26
> I hope the proper fix will ensure that an exception always is raised
> if the ssl handshake fails - and that a successful wrap_socket means
> that the ssl negotiation did succeed with the given constraints. It
> might however only be feasible to fix that for 3.x.

Actually, the handshake is not even attempted because getpeername()
returns ENOTCONN, which is interpreted as meaning "the socket hasn't
been connected yet". Any idea how to improve that?

(I find it strange that ENOTCONN is returned rather than say ECONNRESET)
msg150880 - (view) Author: Mads Kiilerich (kiilerix) * Date: 2012-01-08 17:21
I won't claim to know more about socket error codes than what the Linux man pages says. According to them only send() can fail with ECONNRESET, even though POSIX Programmer's Manual man pages mentions many others. getpeername() is however not documented as being able to return ECONNRESET, and ENOTCONN is apparently the most appropriate error code for getpeername() both on linux (3) and posix (3p).

So: ENOTCONN from getpeername just means that the connection isn't connected. It doesn't indicate that no connection attempt has been mode, and the use of getpeername in SSLSocket.__init__ is thus not usable for checking "if the underlying socket isn’t connected yet".

The wrap_socket API promises to be able to wrap both connected and unconnected sockets without being told explicit what the programmer intended. A system network socket knows if it is unused or failed, but I am not aware of any reliable cross-platform way to observe that (but that doesn't mean that none exist). The only way to reliably implement the documented wrap_socket API might thus be to maintain a flag in PySocketSockObject.

Introducing a new and more explicit way of wrapping connected sockets might be a simpler and more stable solution.

From another perspective: Any user of sockets must be aware that socket operations can fail at any time. It might thus not be a problem that wrap_socket fails to fail, as long as the programmer knows how to catch the failure in the next operation. From that point of view the problem is that it is surprising and undocumented how getpeercert can fail.
msg150885 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-01-08 17:50
> The only way to reliably implement the documented wrap_socket API
> might thus be to maintain a flag in PySocketSockObject.

Agreed. With the annoyance that the flag must be exposed to Python code,
since ssl's wrap_socket is written in Python. It may be private, though.

> Introducing a new and more explicit way of wrapping connected sockets
> might be a simpler and more stable solution.

I'm a bit wary of API bloat here.

> From another perspective: Any user of sockets must be aware that
> socket operations can fail at any time. It might thus not be a problem
> that wrap_socket fails to fail, as long as the programmer knows how to
> catch the failure in the next operation. From that point of view the
> problem is that it is surprising and undocumented how getpeercert can
> fail.

Thanks. So fixing how getpeercert behaves and either raise a dedicated
error or return None would improve things here, right?
msg150889 - (view) Author: Mads Kiilerich (kiilerix) * Date: 2012-01-08 18:19
> I'm a bit wary of API bloat here.

Yes, but explicit is better than magic ...

> Thanks. So fixing how getpeercert behaves and either raise a dedicated
> error or return None would improve things here, right?

Well ... that would at least make it theoretically possible to claim 
that it works as intended ;-)

A counter argument could be that retrieving the certificate that already 
has been used for negotiation isn't a socket operation. It would make 
sense to be able to look at it even after the socket has been closed. 
 From that point of view _sslobj should be kept "forever".

A return value of None would still not indicate if we had a working 
connection without certificate or a failed connection. That would be 
annoying.

My primary concern with my Mercurial hat on is to get the documentation 
updated so we know how to write code that works correctly also with 
previous Python versions.
msg150890 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-01-08 18:32
> A return value of None would still not indicate if we had a working 
> connection without certificate or a failed connection. That would be 
> annoying.

Ah, right. Then raising e.g. a ValueError would be better.
msg150917 - (view) Author: Mads Kiilerich (kiilerix) * Date: 2012-01-09 00:45
I would find ValueError surprising here. socket.error or SSLError would be less surprising ... even though it technically isn't completely true. But isn't it more like a kind of RuntimeError to call getpeercert after the socket has been closed?
msg187950 - (view) Author: Ben Darnell (Ben.Darnell) * Date: 2013-04-28 02:46
We found a related issue with Tornado: https://github.com/facebook/tornado/pull/750

We call wrap_socket with do_handshake_on_connect=False and then call do_handshake when the socket is ready.  If the getpeername call in wrap_socket fails with ENOTCONN because the connection was reset immediately after it was opened, then do_handshake will fail with an AttributeError (because self._sslobj is None).  do_handshake should fail with a clearer error (perhaps socket.error with ENOTCONN or ECONNABORTED?) if self._sslobj is None.

Also, Mac OS X appears to return EINVAL from getpeername in this case, instead of ENOTCONN.  wrap_socket should probably treat ENOTCONN and EINVAL from getpeername as equivalent.
msg188088 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-04-29 19:43
Ok, so this is a slightly annoying issue, but which isn't critical since there's a workaround (catch the AttributeError). Therefore I'm inclined not to change the behaviour in a bugfix release, for fear of regressions for people who are used to the workaround.

In 3.4, it makes sense to improve behaviour. I would propose the following: when self._sslobj is None, call getpeername() on the underlying socket, and let it raise, so that the user gets an appropriate exception for the situation (ENOTCONN mostly, but who knows?). If getpeername() doesn't raise, raise a ValueError instead (since it's akin to calling read() or fileno() on a closed file).

What do you think?
msg188121 - (view) Author: Ben Darnell (Ben.Darnell) * Date: 2013-04-30 01:08
That proposal sounds good to me.  I agree that any change here should target 3.4, not backports to older versions.
msg188235 - (view) Author: Roundup Robot (python-dev) Date: 2013-05-01 18:52
New changeset e6b962fa44bb by Antoine Pitrou in branch 'default':
Issue #13721: SSLSocket.getpeercert() and SSLSocket.do_handshake() now raise an OSError with ENOTCONN, instead of an AttributeError, when the SSLSocket is not connected.
http://hg.python.org/cpython/rev/e6b962fa44bb
msg188236 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-05-01 18:57
Ok, this is fixed now. Thanks for the comments!
msg281988 - (view) Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) Date: 2016-11-29 12:21
fyi, I just observed this in the field in 2.7.3 using requests 2.5.3
I don't think requests has a workaround for 2.7 from reading the release logs.
History
Date User Action Args
2016-11-29 12:21:02kristjan.jonssonsetnosy: + kristjan.jonsson
messages: + msg281988
2013-05-01 18:57:52pitrousetstatus: open -> closed
resolution: fixed
messages: + msg188236

stage: needs patch -> resolved
2013-05-01 18:52:15python-devsetnosy: + python-dev
messages: + msg188235
2013-04-30 01:08:12Ben.Darnellsetmessages: + msg188121
2013-04-29 19:43:22pitrousetmessages: + msg188088
versions: + Python 3.4, - Python 2.7, Python 3.2, Python 3.3
2013-04-28 02:46:47Ben.Darnellsetnosy: + Ben.Darnell
messages: + msg187950
2012-01-09 00:45:30kiilerixsetmessages: + msg150917
2012-01-08 18:32:41pitrousetmessages: + msg150890
2012-01-08 18:19:31kiilerixsetmessages: + msg150889
2012-01-08 17:50:59pitrousetmessages: + msg150885
2012-01-08 17:21:54kiilerixsetmessages: + msg150880
2012-01-07 00:26:52pitrousetmessages: + msg150780
2012-01-06 22:14:34kiilerixsetmessages: + msg150773
2012-01-06 17:52:01pitrousetversions: + Python 3.2, Python 3.3
nosy: + pitrou

messages: + msg150755

stage: needs patch
2012-01-06 17:38:32kiilerixcreate