Title: ssl.SSLSocket timeout not working correctly when remote end is hanging
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 2.7, Python 2.6
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: janssen Nosy List: Jim.Duchek, dandrzejewski, giampaolo.rodola, jacques, janssen, jcea, matejcik, pitrou, vbabiy
Priority: high Keywords: patch

Created on 2009-01-29 22:44 by jacques, last changed 2010-09-27 03:51 by jcea. This issue is now closed.

File name          Uploaded                  Description
ssltimeout.patch   pitrou, 2010-04-21 19:18
ssltimeout2.patch  pitrou, 2010-04-23 21:51
Messages (9)
msg80790 - (view) Author: Jacques Grove (jacques) Date: 2009-01-29 22:44
In Python 2.6.1 we have this code in SSLSocket.__init__():

            if do_handshake_on_connect:
                timeout = self.gettimeout()

The problem is, what happens if the remote end (server) is hanging when
do_handshake() is called?  The result is that the user-requested timeout
will be ignored, and the connection will hang until the TCP socket
timeout expires.

This is easily reproducible with this test code:

import urllib2
urllib2.urlopen("https://localhost:9000/", timeout=2.0)

and running netcat on port 9000, i.e.:

nc -l -p 9000 localhost

If you use "http" instead of "https", the timeout works as expected
(after 2 seconds in this case).
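For readers on Python 3, here is a rough self-contained sketch of the same scenario. It is not the urllib2/netcat reproduction from the report: it assumes a local listener that never calls accept() (the kernel completes the TCP connection from the listen backlog, so the client connects but never gets a reply), and the function name handshake_against_silent_server is made up for illustration. On an interpreter that honours the socket timeout during the handshake, the call should give up after roughly the requested timeout instead of hanging.

```python
import socket
import ssl
import time


def handshake_against_silent_server(timeout=0.5):
    """Try a TLS handshake against a peer that never answers.

    Returns (timed_out, elapsed_seconds). Hypothetical helper, for
    illustration only.
    """
    # The listener never accept()s, so the peer stays silent, much like
    # the idle netcat process in the report.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    # Certificate checks are irrelevant here: the handshake never gets
    # far enough to see a certificate.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    raw = socket.create_connection(("127.0.0.1", port), timeout=timeout)
    start = time.monotonic()
    timed_out = False
    try:
        # do_handshake_on_connect defaults to True, so the handshake
        # runs inside wrap_socket().
        ctx.wrap_socket(raw)
    except socket.timeout:
        timed_out = True
    finally:
        elapsed = time.monotonic() - start
        raw.close()
        listener.close()
    return timed_out, elapsed


if __name__ == "__main__":
    print(handshake_against_silent_server(timeout=2.0))
```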
msg88326 - (view) Author: Vitaly Babiy (vbabiy) Date: 2009-05-26 00:38
Why not just remove the removal of the timeout?
msg90886 - (view) Author: jan matejek (matejcik) * Date: 2009-07-24 15:00
i believe that the bug lies in bad implementation/backport of feature
from 3.0 patch for issue1251.

see this revision:
where the code was added for py3k branch.

the logic behind that code is that when the timeout is zero
(non-blocking socket), but the caller explicitly specifies that they
want to block (which only happens when an application handles
do_handshake by itself), timeout is set to None and reset to zero after
the call.
and this change was made to "better support non-blocking sockets", the
original patch (
) did not have this code at all

on the other hand, what the 2.6 version does is not making any sense. i
don't even see how it got to be that way instead of using py3k's
do_handshake with optional "block" parameter - seeing as the change was
made six months later and changed the API anyway

the correct thing to do here is simply

if do_handshake_on_connect:

without the timeout setting/resetting stuff
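To make the comparison above concrete, here is a small sketch of the three handshake patterns discussed in this message. It is reconstructed from the description in this thread, not copied from ssl.py. RecordingSocket, _raw_handshake and the three function names are hypothetical stand-ins that only record which timeout is in effect while the handshake runs.

```python
class RecordingSocket:
    """Hypothetical stand-in for SSLSocket; records the timeout that is
    in effect at the moment the handshake runs."""

    def __init__(self, timeout):
        self._timeout = timeout
        self.timeout_during_handshake = "not called"

    def gettimeout(self):
        return self._timeout

    def settimeout(self, value):
        self._timeout = value

    def _raw_handshake(self):
        self.timeout_during_handshake = self._timeout


def handshake_clearing_timeout(sock):
    # Pattern attributed to 2.6 in this thread: always drop the timeout
    # before the handshake and restore it afterwards.
    timeout = sock.gettimeout()
    try:
        sock.settimeout(None)
        sock._raw_handshake()
    finally:
        sock.settimeout(timeout)


def handshake_block_param(sock, block=False):
    # Pattern attributed to py3k in this thread: only a non-blocking
    # socket (timeout 0.0) whose caller asked to block is switched to
    # blocking mode for the duration of the handshake.
    timeout = sock.gettimeout()
    try:
        if timeout == 0.0 and block:
            sock.settimeout(None)
        sock._raw_handshake()
    finally:
        sock.settimeout(timeout)


def handshake_plain(sock):
    # Proposed fix: leave the caller's timeout untouched.
    sock._raw_handshake()


def timeout_seen(handshake, timeout, **kwargs):
    """Run one pattern and report the timeout the handshake saw."""
    sock = RecordingSocket(timeout)
    handshake(sock, **kwargs)
    return sock.timeout_during_handshake
```

With a 2.0 second timeout, timeout_seen(handshake_clearing_timeout, 2.0) reports None, meaning the handshake would block indefinitely, while handshake_plain and handshake_block_param both report 2.0. Only timeout_seen(handshake_block_param, 0.0, block=True) reports None, which is the non-blocking case the py3k code was written for.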
msg101722 - (view) Author: Jim Duchek (Jim.Duchek) Date: 2010-03-25 17:12
This is happening 'in the wild' to me fairly regularly.  Since it's not hanging in python, I can't watchdog/kill the thread it's happening in.  matejcik seems correct on fixing this, there's no need to unset the timeout here.
msg102381 - (view) Author: David Andrzejewski (dandrzejewski) Date: 2010-04-05 15:54
I believe this issue may be responsible for causing a very long hang in my application.  Here's an example of it hanging for 30 minutes.  Yes - minutes. 

[UI] 2010-04-03 11:33:34,209 DEBUG: Communicating with GUI on - timeout 10 seconds
[UI] 2010-04-03 12:03:16,118 ERROR: Error in send_gui_command()!
Traceback (most recent call last):
  File "javaui.pyc", line 72, in send_message
  File "client.pyc", line 506, in send_message_with_params
  File "client.pyc", line 230, in call
  File "client.pyc", line 94, in do_request
  File "httplib.pyc", line 1100, in connect
  File "ssl.pyc", line 350, in wrap_socket
  File "ssl.pyc", line 118, in __init__
  File "ssl.pyc", line 293, in do_handshake
error: [Errno 10053] An established connection was aborted by the software in your host machine

This is Python 2.6.4 on Windows.  (This particular time it happened on a 32-bit Server 2008 system). 

My guess is that after that 30 minute time period, Windows is seeing that the connection is hung and it's just aborting it?
msg103743 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-04-20 20:09
Bill, I think we should move forward with this. Do you agree that removing the timeout dance is the right solution?
msg103891 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-04-21 19:18
Here is a patch, with tests. I also had to rework the asyncore-based test server in test_ssl, and fixed an omission in _ssl.c's do_handshake method.

(works with OpenSSL 0.9.8k and 1.0.0)
msg104056 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-04-23 21:51
New patch fixing test_poplib failures.
msg104132 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-04-24 21:33
Fixed in trunk (r80452) and 2.6 (r80453). Also ported relevant parts to 3.x (one half of the test had to be disabled because of #8524).
Date                 User               Action  Args
2010-09-27 03:51:55  jcea               set     nosy: + jcea
2010-04-24 21:33:13  pitrou             set     status: open -> closed; resolution: fixed; messages: + msg104132; stage: patch review -> resolved
2010-04-23 21:51:15  pitrou             set     files: + ssltimeout2.patch; messages: + msg104056
2010-04-21 19:18:37  pitrou             set     files: + ssltimeout.patch; keywords: + patch; messages: + msg103891; stage: patch review
2010-04-20 20:09:00  pitrou             set     priority: high; versions: + Python 2.7; nosy: + pitrou; messages: + msg103743
2010-04-05 15:54:57  dandrzejewski      set     messages: + msg102381
2010-04-05 15:37:36  dandrzejewski      set     nosy: + dandrzejewski
2010-03-25 17:12:00  Jim.Duchek         set     nosy: + Jim.Duchek; messages: + msg101722
2009-07-24 15:00:32  matejcik           set     messages: + msg90886
2009-07-23 15:41:34  matejcik           set     nosy: + matejcik
2009-05-26 00:38:23  vbabiy             set     nosy: + vbabiy; messages: + msg88326
2009-01-31 01:52:27  benjamin.peterson  set     assignee: janssen; nosy: + janssen
2009-01-30 20:59:38  giampaolo.rodola   set     nosy: + giampaolo.rodola
2009-01-29 22:44:25  jacques            create