Issue8108
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2010-03-10 13:44 by flox, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
ssl_shutdown.patch | vstinner, 2010-03-12 00:53 | |||
testftpd.patch | giampaolo.rodola, 2010-03-12 22:01 | |||
issue8108_openssl_098m.diff | flox, 2010-03-25 00:28 | Patch, apply to 2.x | ||
issue8108_print_ssl_want_read.diff | flox, 2010-03-25 14:11 | debugging on 2.x | ||
issue8108_ftplib_ssl_098m_v2.diff | flox, 2010-03-25 14:52 | Patch, apply to 2.x | ||
newssl.patch | pitrou, 2010-04-06 10:46 | |||
newssl2.patch | pitrou, 2010-04-06 11:39 | |||
newssl3.patch | pitrou, 2010-04-06 13:17 | |||
newssl4.patch | pitrou, 2010-04-08 18:24 | |||
newssl5.patch | pitrou, 2010-04-08 20:15 | |||
python_ssl_v2.c.txt | dlmiles, 2010-04-09 16:46 | Revised newssl5.patch | ||
Modules__ssl.c.patch | dlmiles, 2010-04-11 16:47 | Patch for Modules/_ssl.c (still need newssl5.patch for test_ftplib.py fixes) | ||
newssl6.patch | pitrou, 2010-04-16 19:32 |
Messages (66) | |||
---|---|---|---|
msg100782 - (view) | Author: Florent Xicluna (flox) * ![]() |
Date: 2010-03-10 13:44 | |
There's such failure on trunk and py3k when openssl 0.9.8m is installed. (Debian platform) No failure with 0.9.8k. test_ftplib Exception in thread Thread-40: Traceback (most recent call last): File "./Lib/threading.py", line 530, in __bootstrap_inner self.run() File "./Lib/test/test_ftplib.py", line 223, in run asyncore.loop(timeout=0.1, count=1) File "./Lib/asyncore.py", line 211, in loop poll_fun(timeout, map) File "./Lib/asyncore.py", line 154, in poll write(obj) File "./Lib/asyncore.py", line 88, in write obj.handle_error() File "./Lib/asyncore.py", line 84, in write obj.handle_write_event() File "./Lib/test/test_ftplib.py", line 290, in handle_write_event super(SSLConnection, self).handle_write_event() File "./Lib/asyncore.py", line 440, in handle_write_event self.handle_write() File "./Lib/asynchat.py", line 174, in handle_write self.initiate_send() File "./Lib/asynchat.py", line 215, in initiate_send self.handle_close() File "./Lib/test/test_ftplib.py", line 43, in handle_close self.close() File "./Lib/test/test_ftplib.py", line 316, in close self.socket.unwrap() File "./Lib/ssl.py", line 272, in unwrap s = self._sslobj.shutdown() SSLError: [Errno 2] _ssl.c:1367: The operation did not complete (read) |
|||
msg100783 - (view) | Author: Florent Xicluna (flox) * ![]() |
Date: 2010-03-10 13:47 | |
It occurs only for trunk and 3.x, where the "ftp over ssl" is implemented. See #2054. And the Debian alpha buildbots have the same library 0.9.8m: http://www.python.org/dev/buildbot/all/builders/alpha%20Debian%20trunk http://www.python.org/dev/buildbot/all/builders/alpha%20Debian%203.x |
|||
msg100784 - (view) | Author: Florent Xicluna (flox) * ![]() |
Date: 2010-03-10 13:49 | |
OpenSSL changelog (scroll to 0.9.8m): http://www.openssl.org/news/changelog.html |
|||
msg100888 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2010-03-11 21:39 | |
The problem comes from SSL_Shutdown() on a non blocking connection. See: http://www.mail-archive.com/openssl-dev@openssl.org/msg24097.html |
|||
msg100889 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2010-03-11 21:48 | |
The patch changing SSL_Shutdown() behaviour on non blocking socket was written in 2006 by Darryl Miles and it will commited into 0.9.8m. The original patch: http://marc.info/?l=openssl-dev&m=115153998821797&w=2 In our testcase, SSL_Shutdown() fails with SSL_ERROR_WANT_READ, which means (extract of the email): "Return -1 WANT_READ while we are still waiting for the inbound recv notify alert to be received." |
|||
msg100892 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2010-03-11 23:06 | |
More informations: http://www.openssl.org/docs/ssl/SSL_shutdown.html http://rt.openssl.org/Ticket/Display.html?user=guest&pass=guest&id=1891 Extract of the manual page: If the underlying BIO is non-blocking, SSL_shutdown() will also return when the underlying BIO could not satisfy the needs of SSL_shutdown() to continue the handshake. In this case a call to SSL_get_error() with the return value of SSL_shutdown() will yield SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE. ***The calling process then must repeat the call after taking appropriate action to satisfy the needs of SSL_shutdown().*** The action depends on the underlying BIO. When using a non-blocking socket, nothing is to be done, but select() can be used to check for the required condition. When using a buffering BIO, like a BIO pair, data must be written into or retrieved out of the BIO before being able to continue. |
|||
msg100899 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2010-03-12 00:53 | |
There are different kind of bugs, but all of them are related to SSL_Shutdown(). It depends on the socket timeout: - blocking socket: nothing special (no change from previous OpenSSL version) - timeout or non blocking socket: if SSL_shutdown() returns -1 and SSL_get_error()==SSL_ERROR_WANT_READ, we have to read bytes from the socket (to read the "close" notify acknowledge?) I did't understood how much bytes have to be read, nor if we have to check the socket status using select() or not. Attached patch is a draft to demonstrate how the bug have to be fixed, but I don't know OpenSSL nor asynchat enough to write a good fix. TODO: unwrap() returns None on SSL_shutdown() error and if the socket has been closed. |
|||
msg100958 - (view) | Author: Bill Janssen (janssen) * ![]() |
Date: 2010-03-12 19:23 | |
Victor, I'll take a look. |
|||
msg100961 - (view) | Author: Bill Janssen (janssen) * ![]() |
Date: 2010-03-12 19:48 | |
What's happening is that the new state returns from SSL_shutdown() are saying, "the shutdown you asked for didn't happen this time, but call me again when you get a chance. And here is a hint about why it didn't happen, so that if you are sophisticated enough, you can get better information about whether subsequent calls to shutdown() will succeed." Given the underlying architectures we are dealing with, is it possible to handle this properly? Basically, what the SSL_shutdown code is saying is that all higher-level event loops should understand an additional kind of event, "shutting down the SSL socket", on a par with "read bytes" and "write bytes". But our event processing handlers, like asyncore, basically only deal with read and write. I don't see how to tell them that a "close" operation should turn into a "read" or "write". We could also think about turning non-blocking sockets into blocking sockets, by adding a time.sleep(0.1) and re-calling SSL_shutdown(). But maybe the right thing to do would be for the write handler in the asyncore loop to catch this, and re-try the write when it gets a chance. |
|||
msg100962 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2010-03-12 20:05 | |
I'm not sure this should be handled in ssl.py. ssl.py correctly raises ERROR_WANT_READ/WRITE if the shutdown operation didn't complete and that's ok. It should be up to the upper application (in our case asyncore) to deal with that. Moreover, the patch tries to solve the issue by using a while loop which is *evil* for asynchronous environments (a similar problem is debated in issue 3890). My suggestion is to modify the SSLConnection.close() method of the test server so that it just "return" in case of ERROR_WANT_READ/WRITE errors without calling the original asyncore's close method since the socket object must remain in the select loop as long as the connection is still around. A subsequent call to handle_read() and/or handle_write() should not be necessary as they should automatically be called by asyncore on the next poll() loop but I'm not 100% sure about this. In both cases I'd say that the only thing needed to be fixed here is the test server. |
|||
msg100963 - (view) | Author: Bill Janssen (janssen) * ![]() |
Date: 2010-03-12 20:10 | |
Looking at this further, what we're seeing is the trace of an ineffective attempt to handle an exception presumably raised from the FTP code. Can we see that exception? What's the actual state of the TCP connection at this point? |
|||
msg100964 - (view) | Author: Bill Janssen (janssen) * ![]() |
Date: 2010-03-12 20:18 | |
I think Giampaolo is right about this not being an ssl.py issue. Could the exception handling in ftp.py perhaps also be made more sophisticated? I'd really love to understand what the state of the TCP connection is here. I'm presuming that it's still open, because otherwise you'd get a different error from OpenSSL. So what's the error that's triggering this call? Is it simply that the sender has run out of stuff to send? Also, I think that asyncore.py is stingy about what information it sends. At the very least, it could invoke handle_error() with the exception it's handling as an argument. |
|||
msg100965 - (view) | Author: Bill Janssen (janssen) * ![]() |
Date: 2010-03-12 20:19 | |
And it would be interesting to know why all the SSL module tests don't fail in the same way. |
|||
msg100977 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2010-03-12 22:01 | |
> I'd really love to understand what the state of the TCP connection > is here. I'm presuming that it's still open, because otherwise you'd > get a different error from OpenSSL. So what's the error that's > triggering this call? Is it simply that the sender has run out of > stuff to send? "almost closed"? :) I don't know but I wouldn't label this as an "unexpected error" because when dealing with non blocking sockets errors like this one are perfectly legitimate as they "tell" the upper application what it is expected to do (for example EWOULDBLOCK on send() means "retry later"). A patch is in attachment but I have not been able to reproduce this issue on python 2.7 r78878 on Ubuntu 8.04 so it might not work (I suspect it might end up in an endless loop). If this doesn't work we can try the second option of calling handle_read/write right after shutdown(). |
|||
msg100980 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2010-03-12 23:24 | |
I posted on openssl-users mailing list. The author of patch (change SSL_shutdown() behaviour for non blocking socket) replied! http://marc.info/?l=openssl-users&m=126838806919896&w=2 |
|||
msg100985 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-03-13 02:58 | |
> I posted on openssl-users mailing list. The author of patch (change > SSL_shutdown() behaviour for non blocking socket) replied! > http://marc.info/?l=openssl-users&m=126838806919896&w=2 Interesting read, thank you. AFAICT, his proposed "quick fix" snippet should be good enough for us. Also, it seems the new "graceful shutdown" logic is really trying to handle at the transport level things which should be handled by the application protocol (detecting whether the other end still has things to send). |
|||
msg101065 - (view) | Author: Bill Janssen (janssen) * ![]() |
Date: 2010-03-14 19:44 | |
> AFAICT, his proposed "quick fix" snippet should be good enough for us. Depends on what we want. It just suppresses information that's now available. What we'd really like is for the caller to recognize that close() can fail, and should be re-tried if it does. That requires that we signal the error back up and out of the ssl module. It seems to me that any non-blocking code should recognize this and respect it. Again, why does this failure only show up in the FTP test? Why aren't we seeing it elsewhere? |
|||
msg101066 - (view) | Author: STINNER Victor (vstinner) * ![]() |
Date: 2010-03-14 20:02 | |
> close() can fail Not close, unwrap() (shutdown()) ;-) > Again, why does this failure only show up in the FTP test? Maybe because only the FTP test uses an SSL socket in non blocking mode. |
|||
msg101070 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-03-14 22:42 | |
> Depends on what we want. It just suppresses information that's now > available. What we'd really like is for the caller to recognize that > close() can fail, and should be re-tried if it does. That requires > that we signal the error back up and out of the ssl module. It seems > to me that any non-blocking code should recognize this and respect it. Well, first, it will break code which used to work perfectly well (as the failing tests show). That OpenSSL decided to break compatibility in what looks like a minor bugfix release (0.9.8m) doesn't mean we should propagate the breakage. Second, I'm not an SSL expert, but it's not obvious to me that the application should have to ensure a "complete shutdown" before closing the socket. Often, stateful applicative protocols close their "session" anyway (*) before shutting down the transport layer, so I don't see what this additional precaution can buy (apart from code complication). (*) for example, FTP has the "QUIT" command |
|||
msg101073 - (view) | Author: Bill Janssen (janssen) * ![]() |
Date: 2010-03-15 00:05 | |
> Maybe because only the FTP test uses an SSL socket in non blocking mode. No, the SSL unit tests also do this. I think Giampaolo is right, what we're seeing is bad error handling in the FTP test code. |
|||
msg101074 - (view) | Author: Bill Janssen (janssen) * ![]() |
Date: 2010-03-15 00:11 | |
>> Depends on what we want. It just suppresses information that's now >> available. What we'd really like is for the caller to recognize that >> close() can fail, and should be re-tried if it does. That requires >> that we signal the error back up and out of the ssl module. It seems >> to me that any non-blocking code should recognize this and respect it. > Well, first, it will break code which used to work perfectly well (as > the failing tests show). That OpenSSL decided to break compatibility > in what looks like a minor bugfix release (0.9.8m) doesn't mean we > should propagate the breakage. I admit to being powerfully swayed by this argument. Of course, you could also say that it exposes bugs in code which hasn't been tested enough. Again, look at where this stack trace is coming from. It's a default error handler in code that was written 10 years ago. Perhaps that code needs more than just a default error handler; perhaps the handler in FTP_TLS should be more sophisticated. > Second, I'm not an SSL expert, but it's not obvious to me that the > application should have to ensure a "complete shutdown" before > closing the socket. Often, stateful applicative protocols close > their "session" anyway (*) before shutting down the transport layer, > so I don't see what this additional precaution can buy > (apart from code complication). What it buys is the ability for careful code to properly shut down an SSL session when using non-blocking sockets. If we apply the fix, we deny that, and I'm sure someone will immediately file a bug about it :-). |
|||
msg101517 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-03-22 15:58 | |
Ok, I finally agree that test_ftplib should be patched. The patch isn't exactly correct: it should ideally retry the unwrap() call later, rather than simply ignore the error. But since it's just used for testing, it looks sufficient. (we probably lack a higher-level abstraction around ssl sockets, especially non-blocking ones; there's too many domain-specific stuff to know for the average programmer.) |
|||
msg101534 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2010-03-22 21:14 | |
> The patch isn't exactly correct: it should ideally retry the unwrap() > call later, rather than simply ignore the error. But since it's just > used for testing, it looks sufficient. Actually unwrap() should already be called. What the patch does is returning in case of ERROR_WANT_READ/WRITE, and *explicitly avoids to close() the connection when such an event occurs*. This should grant that another asyncore poll() loop is executed, handle_close() automatically be called again and so unwrap(). As I said in comment #100977 I'm not 100% sure of this since I've not been able to reproduce the issue. I'm just presuming this as I know that asyncore works like this with plain-text sockets. I'd like to make sure this works as expected even if it's just a test. If some of the folks able to reproduce this issue could confirm that unwrap() gets called again it would be great. |
|||
msg101654 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-03-24 22:09 | |
Giampaolo, I tried committing your patch but the buildbot still fails, with the following errors: test_auth_issued_twice (test.test_ftplib.TestTLS_FTPClass) ... ok test_auth_ssl (test.test_ftplib.TestTLS_FTPClass) ... ok test_control_connection (test.test_ftplib.TestTLS_FTPClass) ... ok test_data_connection (test.test_ftplib.TestTLS_FTPClass) ... Exception in thread Thread-462: Traceback (most recent call last): File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/threading.py", line 530, in __bootstrap_inner self.run() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/test/test_ftplib.py", line 223, in run asyncore.loop(timeout=0.1, count=1) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/asyncore.py", line 211, in loop poll_fun(timeout, map) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/asyncore.py", line 154, in poll write(obj) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/asyncore.py", line 88, in write obj.handle_error() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/asyncore.py", line 84, in write obj.handle_write_event() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/test/test_ftplib.py", line 290, in handle_write_event super(SSLConnection, self).handle_write_event() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/asyncore.py", line 440, in handle_write_event self.handle_write() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/asynchat.py", line 174, in handle_write self.initiate_send() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/asynchat.py", line 215, in initiate_send self.handle_close() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/test/test_ftplib.py", line 43, in handle_close self.close() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/test/test_ftplib.py", line 323, in close self.socket.unwrap() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ssl.py", line 258, in unwrap s = self._sslobj.shutdown() error: [Errno 32] Broken pipe ERROR test_login (test.test_ftplib.TestTLS_FTPClass) ... ok ====================================================================== ERROR: test_nlst (test.test_ftplib.TestTLS_FTPClassMixin) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/test/test_ftplib.py", line 515, in test_nlst self.assertEqual(self.client.nlst(), NLST_DATA.split('\r\n')[:-1]) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 495, in nlst self.retrlines(cmd, files.append) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 695, in retrlines conn = self.transfercmd(cmd) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 357, in transfercmd return self.ntransfercmd(cmd, rest)[0] File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 670, in ntransfercmd conn, size = FTP.ntransfercmd(self, cmd, rest) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 324, in ntransfercmd host, port = self.makepasv() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 302, in makepasv host, port = parse227(self.sendcmd('PASV')) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 786, in parse227 raise error_reply, resp error_reply: 226 transfer complete ====================================================================== ERROR: test_retrbinary_rest (test.test_ftplib.TestTLS_FTPClassMixin) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/test/test_ftplib.py", line 475, in test_retrbinary_rest self.client.retrbinary('retr', received.append, rest=rest) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 678, in retrbinary conn = self.transfercmd(cmd, rest) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 357, in transfercmd return self.ntransfercmd(cmd, rest)[0] File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 670, in ntransfercmd conn, size = FTP.ntransfercmd(self, cmd, rest) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 324, in ntransfercmd host, port = self.makepasv() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 302, in makepasv host, port = parse227(self.sendcmd('PASV')) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 786, in parse227 raise error_reply, resp error_reply: 200 type ok ====================================================================== ERROR: test_storbinary (test.test_ftplib.TestTLS_FTPClassMixin) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/test/test_ftplib.py", line 488, in test_storbinary self.client.storbinary('stor', f) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 727, in storbinary conn.unwrap() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ssl.py", line 258, in unwrap s = self._sslobj.shutdown() SSLError: [Errno 2] _ssl.c:1367: The operation did not complete (read) ====================================================================== ERROR: test_storbinary_rest (test.test_ftplib.TestTLS_FTPClassMixin) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/test/test_ftplib.py", line 500, in test_storbinary_rest self.client.storbinary('stor', f, rest=r) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 727, in storbinary conn.unwrap() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ssl.py", line 258, in unwrap s = self._sslobj.shutdown() SSLError: [Errno 2] _ssl.c:1367: The operation did not complete (read) ====================================================================== ERROR: test_storlines (test.test_ftplib.TestTLS_FTPClassMixin) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/test/test_ftplib.py", line 505, in test_storlines self.client.storlines('stor', f) File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 746, in storlines conn.unwrap() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ssl.py", line 258, in unwrap s = self._sslobj.shutdown() SSLError: [Errno 2] _ssl.c:1367: The operation did not complete (read) ====================================================================== ERROR: test_data_connection (test.test_ftplib.TestTLS_FTPClass) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/test/test_ftplib.py", line 619, in test_data_connection self.client.prot_c() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 663, in prot_c resp = self.voidcmd('PROT C') File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 249, in voidcmd return self.voidresp() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 224, in voidresp resp = self.getresp() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 210, in getresp resp = self.getmultiline() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 196, in getmultiline line = self.getline() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/ftplib.py", line 183, in getline line = self.file.readline() File "/home/doko/buildarea/trunk.klose-debian-alpha/build/Lib/socket.py", line 445, in readline data = self._sock.recv(self._rbufsize) timeout: timed out |
|||
msg101661 - (view) | Author: Florent Xicluna (flox) * ![]() |
Date: 2010-03-25 00:28 | |
This patch solves the problem with 0.9.8m on my Debian workstation. Actually there's 4 kind of errors on shutdown(): - SSL_ERROR_WANT_READ - SSL_ERROR_WANT_WRITE - socket.error: errno.PIPE (32) - socket.error: 0 Since the command is "shutdown()", the patch silence all the socket errors, because they are probably due to the remote side closing the connection abruptly. See the details on the Darryl's analysis: http://bugs.python.org/issue8108#msg100980 This patch is very closed to Darryl's proposal. |
|||
msg101677 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2010-03-25 08:31 | |
@Antoine Pitrou: ok, I think I'll have to be able to replicate the error in order to try to fix this. I'm gonna try on FreeBSD and another Linux box later today. |
|||
msg101678 - (view) | Author: Florent Xicluna (flox) * ![]() |
Date: 2010-03-25 08:32 | |
I don't know if it is better to fix it in the ftplib or the ssl module. The previous patch fixes it at the SSL level (stdlib only). This patch implements the fix at the ftplib level (stdlib + test). |
|||
msg101694 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-03-25 13:30 | |
Florent, you shouldn't need to silence socket.error at this level. I think what needs to happen is that, when unwrap() fails in SSLConnection.close(), handle_read() and handle_write() retry the unwrap() call. That's what I gather from the explanation of the new behaviour. |
|||
msg101699 - (view) | Author: Florent Xicluna (flox) * ![]() |
Date: 2010-03-25 14:11 | |
You're right, there's no need to catch the socket error in "ftplib" module. Here is the result of the "debugging" patch in the console: $ ./python -m test.regrtest -uall test_ftplib test_ftplib success success success success success success success success 2702 ssl want read 2801 ssl want read 2661 ssl want read 2274 ssl want read 16742 ssl want read 2397 ssl want read 1 test OK. IMHO, there's no need to loop on "unwrap()", according to Darryl's analysis. |
|||
msg101704 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2010-03-25 14:50 | |
Florent can you give me a clue on how to make python use OpenSSL 0.9.8k on Ubuntu? Despite I compiled and installed 0.9.8k from sources the system version keeps being 0.9.8g and I guess that Python is using that one. Do I have to recompile Python by using a directive to tell what specific OpenSSL version to use? |
|||
msg101705 - (view) | Author: Florent Xicluna (flox) * ![]() |
Date: 2010-03-25 14:52 | |
Patch ftplib/test_ftplib updated. It is as before (0.9.8k): SSL_ERROR_WANT_READ is ignored on connection unwrap(). |
|||
msg101706 - (view) | Author: Florent Xicluna (flox) * ![]() |
Date: 2010-03-25 14:57 | |
Giampaolo, IMHO you need 0.9.8m to reproduce the issue. I installed it from the Debian testing repo (I'm on debian). ~ $ apt-cache policy openssl openssl: Installed : 0.9.8m-2 Candidate : 0.9.8m-2 Table de version : *** 0.9.8m-2 0 400 ftp://ftp.lip6.fr testing/main Packages 300 ftp://ftp.lip6.fr sid/main Packages 100 /var/lib/dpkg/status 0.9.8g-15+lenny6 0 900 ftp://ftp.lip6.fr lenny/main Packages 900 http://security.debian.org lenny/updates/main Packages Maybe you could catch it on "Lucid" repositories, and if you have some chance, it may be compatible with your ubuntu version. |
|||
msg101729 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2010-03-25 19:33 | |
Ok, I've finally been able to reproduce the issue by installing OpenSSL 0.9.8m on my Ubuntu box. By doing some debugging I'm starting to think that maybe there's something wrong with the shutdown() method implementation itself because no matter what, sooner or later it always ends up raising "error: [Errno 0] Error". I've cheated by trapping shutdown() in a while loop waiting for it to succeed, I've tried to manually call handle_read() and handle_write(), I've also tried to set the socket back to blocking mode just to see if shutdown() would return but I've always ended up encountering "error: [Errno 0] Error". IMHO, before going any further I'd investigate on why this call can just fail without providing any information about what happened. From the upper application's point of view there's nothing it can do in case of such an error. Assuming that "the remote side closed the connection abruptly" and close the socket is just wrong. |
|||
msg102292 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-03 20:15 | |
I don't think this should be a release blocker for the beta, although I agree it should be one for the final release (and RCs). By the way, it doesn't affect maintenance branches. |
|||
msg102293 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-03 20:16 | |
Giampaolo, which call exactly triggers the "error: [Errno 0] Error"? It doesn't seem it can be _sslobj.shutdown itself. |
|||
msg102394 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2010-04-05 18:25 | |
Yes, it's the _sslobj.shutdow() call: File "test_ftplib.py", line 332, in handle_close self.socket = self.socket.unwrap() File "/usr/local/lib/python2.7/ssl.py", line 258, in unwrap s = self._sslobj.shutdown() error: [Errno 0] Error Since it's not clear to me where exactly this comes from, whether it's from the Python C binding or OpenSSL itself, I tried to put some debugging printf() calls in Modules/_ssl.c, but it seems that after installing OpenSSL 0.9.8m I'm no longer able to compile Python 2.7 from sorces. This is what I get when I run ./configure; make: gcc -pthread -fPIC -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -IInclude -I./Include -I/usr/local/include -I/home/giampaolo/svn/python-2.7/Include -I/home/giampaolo/svn/python-2.7 -c /home/giampaolo/svn/python-2.7/Modules/_ssl.c -o build/temp.linux-i686-2.7/home/giampaolo/svn/python-2.7/Modules/_ssl.o gcc -pthread -shared build/temp.linux-i686-2.7/home/giampaolo/svn/python-2.7/Modules/_ssl.o -L/usr/local/lib -lssl -lcrypto -o build/lib.linux-i686-2.7/_ssl.so *** WARNING: renaming "_ssl" since importing it failed: build/lib.linux-i686-2.7/_ssl.so: undefined symbol: inflate |
|||
msg102395 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-05 18:30 | |
> Since it's not clear to me where exactly this comes from, whether it's > from the Python C binding or OpenSSL itself, I tried to put some > debugging printf() calls in Modules/_ssl.c, but it seems that after > installing OpenSSL 0.9.8m I'm no longer able to compile Python 2.7 > from sorces. Have you tried `make distclean`? |
|||
msg102396 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2010-04-05 18:37 | |
No I haven't, but I tried just now and I get the same error. |
|||
msg102398 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-05 19:28 | |
> No I haven't, but I tried just now and I get the same error. I think "inflate" is a function exported by the zlib. Perhaps you can add "-lz" to the linking flags. (Googling hints that OpenSSL can depend on the zlib if compression is enabled) |
|||
msg102407 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2010-04-05 20:43 | |
As suggested in this thread: http://mirt.net/pipermail/stunnel-users/2005-July/000661.html ...I made the following change to the Makefile: - LIBS= -lpthread -ldl -lutil + LIBS= -lpthread -ldl -lutil -lz That made the error change as follows: running build_ext INFO: Can't locate Tcl/Tk libs and/or headers Python build finished, but the necessary bits to build these modules were not found: _bsddb _curses _curses_panel _sqlite3 _tkinter bsddb185 bz2 dbm gdbm readline sunaudiodev |
|||
msg102410 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-05 20:46 | |
I've tried to build against OpenSSL 1.0.0, and I get further failures in test_ssl. Since I don't know whether they are related, I've created a separate issue for them: issue8322 |
|||
msg102449 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-06 10:45 | |
Ok, here is a patch fixing all test_ftplib failures (tested with openssl 0.9.8k and 1.0.0). There are a couple of fixes here: - a workaround for the strange SSL_shutdown() error return, also witnessed by LightHTTPd users (see URL in comments) - an auto-retry for SSLSocket.unwrap() in blocking mode - a bizarre issue in test_ftplib where DTPHandler.handle_close() gets called many times in a test (test_data_connection) I think the SSL_shutdown() issue should be checked on the OpenSSL mailing-list. |
|||
msg102450 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2010-04-06 11:05 | |
def handle_error(self): - raise + # Ignore errors while closing, because the remote end could have + # abruptly shut down the TCP connection while we are still + # waiting for SSL shutdown to finish. + if not self._ssl_closing: + raise I don't like this. I should have said that I think my original patch is wrong as it attempts to shutdown() the SSL layer in asyncore's close() method. I think that I should have done that in handle_close() instead of close() because the two methods are called in different circumstances and have different meanings: - handle_close() is supposed to be called when the connection is a about to be closed "in a clean way" (for example: the other end properly finished to transmit all the data). - close() may be called in different situations, also when the connection gets closed abruptly. In this case we should avoid to shutdown() the SSL layer as we might be dealing with a "dead" connection. The right thing to do in this case is just close() the socket without doing anything else. IMHO, what we should try to avoid to do here is silently ignoring any error condition as you did in handle_error() method. What happens if you move the SSL shutdown() call in handle_close() method and leave handle_error() unchanged? |
|||
msg102451 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-06 11:11 | |
> What happens if you move the SSL shutdown() call in handle_close() > method and leave handle_error() unchanged? I get occasional crashes such as the following: Exception in thread Thread-60: Traceback (most recent call last): File "/home/antoine/cpython/newssl/Lib/threading.py", line 530, in __bootstrap_inner self.run() File "/home/antoine/cpython/newssl/Lib/test/test_ftplib.py", line 229, in run asyncore.loop(timeout=0.1, count=1) File "/home/antoine/cpython/newssl/Lib/asyncore.py", line 211, in loop poll_fun(timeout, map) File "/home/antoine/cpython/newssl/Lib/asyncore.py", line 154, in poll write(obj) File "/home/antoine/cpython/newssl/Lib/asyncore.py", line 88, in write obj.handle_error() File "/home/antoine/cpython/newssl/Lib/asyncore.py", line 84, in write obj.handle_write_event() File "/home/antoine/cpython/newssl/Lib/test/test_ftplib.py", line 313, in handle_write_event super(SSLConnection, self).handle_write_event() File "/home/antoine/cpython/newssl/Lib/asyncore.py", line 440, in handle_write_event self.handle_write() File "/home/antoine/cpython/newssl/Lib/asynchat.py", line 174, in handle_write self.initiate_send() File "/home/antoine/cpython/newssl/Lib/asynchat.py", line 215, in initiate_send self.handle_close() File "/home/antoine/cpython/newssl/Lib/test/test_ftplib.py", line 352, in handle_close self._do_ssl_shutdown() File "/home/antoine/cpython/newssl/Lib/test/test_ftplib.py", line 290, in _do_ssl_shutdown self.socket.unwrap() File "/home/antoine/cpython/newssl/Lib/ssl.py", line 264, in unwrap s = self._sslobj.shutdown() error: [Errno 32] Broken pipe |
|||
msg102452 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-06 11:15 | |
By the way, this "broken pipe" error could be due to the fact that the FTP_TLS client never tries to call unwrap() on its SSL socket. Perhaps the close() method should be overriden? |
|||
msg102455 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-06 11:39 | |
Here is a new patch. It doesn't change handle_error() but instead ignores abruptly closed sockets when doing the SSL shutdown. The underlying problem seems to be an asyncore limitation: close() can be called after an error such as EPIPE or ECONNABORTED, but asyncore doesn't give us any way to know in advance. |
|||
msg102456 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2010-04-06 11:43 | |
> By the way, this "broken pipe" error could be due to the fact that the > FTP_TLS client never tries to call unwrap() on its SSL socket. Perhaps > the close() method should be overriden? ftplib.FTP_TLS class already calls unwrap() but only when closing a "secured" *data* connection. This is never done for the *control* connection as the examples shown in RFC-4217 do that only when dealing with the CCC command which is intended to switch the control connection back to clear text. Since ftplib.py does not implement the CCC command I would avoid to override its close() method. As for our test server we are currently shutting down the SSL layer for both connections (control and data) so this might be the cause of EPIPE. At this proposal I suggest to move the shutdown() call from SSLConnection.handle_close() to DummyTLS_DTPHandler.handle_close() and see what happens. If EPIPE keeps being raised then I would investigate on what EPIPE means in terms of OpenSSL itself. It might mean that the socket is supposed to be closed in which case it's perfectly legitimate. I'd have a few questions: > a workaround for the strange SSL_shutdown() error return, also > witnessed by LightHTTPd users (see URL in comments) What happens now? I mean... what do we get instead of socket.error: 0? > an auto-retry for SSLSocket.unwrap() in blocking mode Wasn't SSL_MODE_AUTO_RETRY option applied in issue 8222 supposed to already do that for us? > I think the SSL_shutdown() issue should be checked on the OpenSSL > mailing-list. If socket.error: 0 comes from OpenSSL then yes. I was suspecting something like this. |
|||
msg102457 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-06 11:54 | |
> As for our test server we are currently shutting down the SSL layer for > both connections (control and data) so this might be the cause of EPIPE. I don't understand your reasoning. These are separate connections, closing one shouldn't affect the other. > At this proposal I suggest to move the shutdown() call from > SSLConnection.handle_close() to DummyTLS_DTPHandler.handle_close() and > see what happens. What happens is that you are taking shortcuts with SSL since you aren't trying to do a proper shutdown :) If you really think this is a good idea, feel free to give it a try. But please see newssl2.patch as well. (of course this is only a test server but AFAIU you would like to make it right) > What happens now? I mean... what do we get instead of socket.error: 0? A normal successful return. > Wasn't SSL_MODE_AUTO_RETRY option applied in issue 8222 supposed to > already do that for us? SSL_MODE_AUTO_RETRY deals with SSL negotiation. SSL_shutdown() has its own separate semantics, which have changed *even in blocking mode*. If I disable auto-retry I get failures such as: ====================================================================== ERROR: test_storbinary (test.test_ftplib.TestTLS_FTPClassMixin) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/antoine/cpython/newssl/Lib/test/test_ftplib.py", line 505, in test_storbinary self.client.storbinary('stor', f) File "/home/antoine/cpython/newssl/Lib/ftplib.py", line 727, in storbinary conn.unwrap() File "/home/antoine/cpython/newssl/Lib/ssl.py", line 264, in unwrap s = self._sslobj.shutdown() SSLError: [Errno 2] _ssl.c:1374: The operation did not complete (read) (note how "the operation did not complete" seems to mean that the socket is non-blocking, but the error is happening with the regular blocking FTP client) |
|||
msg102459 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2010-04-06 12:23 | |
> I don't understand your reasoning. These are separate connections, > closing one shouldn't affect the other. I meant another thing. I was talking about the fact that the test server attempts to shutdown() both control and data connections while the client only shutdown() the latter one. Maybe that's why you get EPIPE. > If you really think this is a good idea, feel free to give it a try. I whish but I'm having serious setup problems problems because I attempted to install different OpenSSL versions and now I'm no longer able to compile Python. Anyway, I'll give another try later today. =) > But please see newssl2.patch as well. - ftplib.FTP_TLS.quit: I wouldn't call unwrap() here as RFC-4217 does that only when dealing with CCC command which is not implemented by ftplib.py. - I'd be for calling _do_ssl_shutdown() from handle_close() instead of from close() (reasons explained in my previous message) - If EPIPE is still raised I'd try to move the shutdown() code from SSLConnection.handle_close() to DummyTLS_DTPHandler.handle_close(). |
|||
msg102462 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-06 13:10 | |
> - ftplib.FTP_TLS.quit: I wouldn't call unwrap() here as RFC-4217 does > that only when dealing with CCC command which is not implemented by > ftplib.py. Ok. > - I'd be for calling _do_ssl_shutdown() from handle_close() instead of > from close() (reasons explained in my previous message) This provokes test failures such as: ====================================================================== ERROR: test_dir (test.test_ftplib.TestTLS_FTPClassMixin) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/antoine/cpython/newssl/Lib/test/test_ftplib.py", line 537, in test_dir self.client.dir(lambda x: l.append(x)) File "/home/antoine/cpython/newssl/Lib/ftplib.py", line 511, in dir self.retrlines(cmd, func) File "/home/antoine/cpython/newssl/Lib/ftplib.py", line 714, in retrlines return self.voidresp() File "/home/antoine/cpython/newssl/Lib/ftplib.py", line 224, in voidresp resp = self.getresp() File "/home/antoine/cpython/newssl/Lib/ftplib.py", line 210, in getresp resp = self.getmultiline() File "/home/antoine/cpython/newssl/Lib/ftplib.py", line 196, in getmultiline line = self.getline() File "/home/antoine/cpython/newssl/Lib/ftplib.py", line 183, in getline line = self.file.readline() File "/home/antoine/cpython/newssl/Lib/socket.py", line 445, in readline data = self._sock.recv(self._rbufsize) File "/home/antoine/cpython/newssl/Lib/ssl.py", line 98, in <lambda> self.recv = lambda buflen=1024, flags=0: SSLSocket.recv(self, buflen, flags) File "/home/antoine/cpython/newssl/Lib/ssl.py", line 217, in recv return self.read(buflen) File "/home/antoine/cpython/newssl/Lib/ssl.py", line 138, in read return self._sslobj.read(len) SSLError: The read operation timed out I have already tested quite a bunch of possibilities (handle_close or close, silencing EPIPE or not...) > - If EPIPE is still raised I'd try to move the shutdown() code from > SSLConnection.handle_close() to DummyTLS_DTPHandler.handle_close(). This only makes the server less respectful of the protocol... I think it is fine to silence EPIPE and friends during SSL shutdown since, as I've said, asyncore will call handle_close() when the remote ends has closed the TCP connection. |
|||
msg102464 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-06 13:17 | |
New patch without the quit() method change. |
|||
msg102639 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-08 18:24 | |
Ok, I raised the "errno 0" issue on the openssl mailing-list, and I had a nice response by Darryl Miles here: http://www.mail-archive.com/openssl-users@openssl.org/msg60710.html The bottom line is that any "socket error" return from unwrap(), that is any "SSL_ERROR_SYSCALL" return from SSL_shutdown(), means that the socket is closed (abruptly or not). In this case, we can't do anything except ignore the error in test_ftplib. We should also remove the special casing hack in _ssl.c, even though a socket error with "errno 0" isn't particularly informative. Here is a new patch. |
|||
msg102641 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-08 20:15 | |
My analysis of SSL_shutdown() was missing something: timeout sockets are really non-blocking sockets in disguise. I guess it's never too late to notice. OpenSSL doesn't know about the timeout, so we have to do it ourselves. Ironically, the infrastructure is already in place in _ssl.c, and used for read() and write() methods. So here is a new patch, taking this into account. |
|||
msg102689 - (view) | Author: Darryl Miles (dlmiles) | Date: 2010-04-09 06:54 | |
In order to build Python with a specific version of OpenSSL followed the CYGWIN instructions and edited Modules/Setup to make it read (note - I added "-L$(SSL)" into the linker options too, since by default on CentOS 5.4 i386 OpenSSL build in static library mode ala ../openssl-1.0.0/libssl.a) : SSL=../openssl-1.0.0 _ssl _ssl.c \ -DUSE_SSL -I$(SSL)/include -I$(SSL)/include/openssl \ -L$(SSL)/lib -L$(SSL) -lssl -lcrypto It is not clear to me what Python's goals are: * To be backward compatible, in which case I don't know your historical use of SSL_shutdown(). * To be a thin-layer (1:1) over OpenSSL, so that "power users" can harness the full potential of OpenSSL if they are willing to understand the finer points. * To provide a full-featured Python API. * To provide a Python API that is easy to use within the Python paradigm. These goals may not be convergent. |
|||
msg102731 - (view) | Author: Darryl Miles (dlmiles) | Date: 2010-04-09 16:46 | |
I've updated my attachment to the bug, if you read the old one please re-read the updated version (since some points in there were not accurate). With regards to the OpenSSL error return -1/ERROR_SYSCALL with errno==0 being observed, I shall respond into the OpenSSL mailing list with a fuller response. The man page SSL_get_error(3) does explain what getting a zero error means in relation to end-of-file at the BIO/TCP socket level. In light of the presumption by me that the problem was because one end did a syscall close(fd) this makes perfect sense in the context of your observation and OpenSSL appears to be working as documented. There is also code to print out the error in Python at Modules/_ssl.c method PySSL_SetError() so I'm not sure of the source of the funny looking error printing in relation to the ftpcli test case, consider it to be an error message formatting glitch. Now the issue I see here is that there are clearly 3 use cases Python should provide: * one-shot raw mode (don't enter the loop at all, as per newssl5.patch/my attachment, this is more or less what you already have in CVS, but I would remove the 2nd call to SSL_shutdown(), raw mode means exactly that; the caller is in charge of calling it again, thin layer for Python power users) [case-1] * perform IO sleep/wait as necessary until we observe SSL_shutdown()==0 (or better! so this will return if 0 or 1 is returned) [case-2] * perform IO sleep/wait as necessary until we observe SSL_shutdown()==1 [case-3] I presume you already have a way of handling the configuration of I/O timeouts as per Python's IO processing model (that is provided by some other API mechanism). The question is what is the best way to provide them (what is inline with the Python paradigm?) : * one method, keep existing named method, add new optional argument that can indicate all 3 modes of operation. Debate which of the 3 modes of operation is the default when the argument is not specified, case-1 seems to most backwardly compatible. [I am presuming python supports optional arguments] * new method, keep existing as-is (to cover case 1), implement case-2 and case-3 in the method which also take an argument for the user to specify which use case they want. From this a patch should be straight-forward. Then we can look to see if the FTP client or server is doing anything wrong in light of having the building blocks in place to achieve any goal on top of OpenSSL. |
|||
msg102852 - (view) | Author: Darryl Miles (dlmiles) | Date: 2010-04-11 16:47 | |
I am unable to get "make test" to run from an unpatched version in SVN (details below of make output). Please find attached an updated patch for your consideration (and testing, as I can't test it due to 'make test' SIGSEGV on CentOS 5.4 i386). Patch Notes: 1) Some thing that concern me, the "unwrap()" philosophy looks to be used to remove SSL from the Python high-level socket handle, so you can go back to plaintext mode. You can ONLY perform an unwrap() AFTER an SSL_shutdown()==1 has been observed (you need to wait for the other end to do something voluntarily). So you must retry the SSL_shutdown() over and over while you sleep-wait for IO, so this is akin to calling the ssl.shutdown(ssl.SSL_SHUTDOWN_MODE_BOTH) and getting back success. Also if it is your intention to properly implement an unwrap() like this you should disable IO read-ahead mode before calling shutdown for the second time, SSL_set_read_ahead(ssl, 0). This stops OpenSSL from eating too many bytes accidentally (probably from the kernel into its own buffers), from the inbound IO stream, which may not be SSL protocol data, it maybe plain text data (behind the last byte of SSL protocol data). 2) Due to the IO waiting it looks also necessary to copy the setup of SSL_set_nbio() from the read/write paths so the check_socket_and_wait_for_timeout() works in sympathy to the callers IO timeout reconfiguration. 3) My patch presumes the allocation of the type "struct PySSLObject" uses calloc() or some other memory zeroing strategy. There is a new member in that struct to track if SSL_shutdown() has previously returned a zero value. 4) The SSL_peek() error path needs checking to see if the error return is consistent with the Python paradigm. 5) Please check I have understand the VARARGS method correctly. I have made the default to SSL_SHUTDOWN_MODE_SENT (despite backward compatibly being SSL_SHUTDOWN_MODE_ONCE), this is because I would guess that most high-level applications did not intend to use it in raw mode; nor be bothered with the issues surrounding correct usage. I would guess high-level applications wanted Python to take the strain here. 6) I suspect you need to address your unwrap() policy a little better, the shutdown operation and the unwrap() are two different matters. The shutdown() should indicate success or not (in respect of the mode being requested, raw mode is a tricky one as the caller would want to the exact error return so it can do the correct thing), unwrap() should itself call ssl.shutdown(ssl.SSL_SHUTDOWN_MODE_BOTH) until it sees success and then remove the socket (and deallocate SSL objects). As things stand SSL_SHUTDOWN_MODE_ONCE does not work in a useful way since the error returns are not propagated to the caller, because "unwrap" is mixed into this. So that would still need fixing. building works ok, testing fails with SIGSEGV. Is this something to do with no having _bsddb built ? I have db-4.3 working. Maybe someone can reply by email on the matter. # make running build running build_ext building dbm using gdbm Python build finished, but the necessary bits to build these modules were not found: bsddb185 sunaudiodev To find the necessary bits, look in setup.py in detect_modules() for the module's name. running build_scripts # make test running build running build_ext building dbm using gdbm Python build finished, but the necessary bits to build these modules were not found: bsddb185 sunaudiodev To find the necessary bits, look in setup.py in detect_modules() for the module's name. running build_scripts find ./Lib -name '*.py[co]' -print | xargs rm -f ./python -Wd -3 -E -tt ./Lib/test/regrtest.py -l == CPython 2.7a4+ (trunk:79902M, Apr 11 2010, 16:38:55) [GCC 4.1.2 20080704 (Red Hat 4.1.2-46)] == Linux-2.6.18-164.15.1.el5-i686-with-redhat-5.4-Final == /root/python-svn/build/test_python_29248 test_grammar test_opcodes test_dict test_builtin test_exceptions test_types test_unittest test_doctest test_doctest2 test_MimeWriter test_SimpleHTTPServer test_StringIO test___all__ /root/python-svn/Lib/test/test___all__.py:10: DeprecationWarning: in 3.x, the bsddb module has been removed; please use the pybsddb project instead import bsddb /root/python-svn/Lib/bsddb/__init__.py:67: PendingDeprecationWarning: The CObject type is marked Pending Deprecation in Python 2.7. Please use capsule objects instead. import _bsddb make: *** [test] Segmentation fault |
|||
msg102857 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-11 17:45 | |
To just run the test_ftplib tests, you can use: $ ./python -m test.regrtest -v -uall test_ftplib (Other tests involving SSL sockets are test_ssl, test_smtpnet, test_imaplib and test_poplib) > 1) Some thing that concern me, the "unwrap()" philosophy looks to be > used to remove SSL from the Python high-level socket handle, so you > can go back to plaintext mode. You can ONLY perform an unwrap() AFTER > an SSL_shutdown()==1 has been observed (you need to wait for the other > end to do something voluntarily). When the SSL shutdown fails, an exception is raised, which means the rest of the unwrapping (at the Python "high-level" socket level) doesn't occur. Therefore, it is safe to call unwrap() again from user code because the SSL object is still there. > Also if it is your intention to properly implement an unwrap() like > this you should disable IO read-ahead mode before calling shutdown for > the second time, SSL_set_read_ahead(ssl, 0). This stops OpenSSL from > eating too many bytes accidentally (probably from the kernel into its > own buffers), from the inbound IO stream, which may not be SSL > protocol data, it maybe plain text data (behind the last byte of SSL > protocol data). Do you know how to cook a simple test to exercise this? > 2) Due to the IO waiting it looks also necessary to copy the setup of > SSL_set_nbio() from the read/write paths so the > check_socket_and_wait_for_timeout() works in sympathy to the callers > IO timeout reconfiguration. Thanks for spotting this. > 5) Please check I have understand the VARARGS method correctly. I > have made the default to SSL_SHUTDOWN_MODE_SENT (despite backward > compatibly being SSL_SHUTDOWN_MODE_ONCE), this is because I would > guess that most high-level applications did not intend to use it in > raw mode; nor be bothered with the issues surrounding correct usage. > I would guess high-level applications wanted Python to take the strain > here. Yes, sounds right indeed. I'm not sure we need a choice of shutdown modes at all. > building works ok, testing fails with SIGSEGV. Is this something to > do with no having _bsddb built ? I have db-4.3 working. Maybe > someone can reply by email on the matter. _bsddb seems to be built, it's the old bsddb185 which isn't. The module apparently breaks when importing it, can you open a separate issue for it? I'd like Bill Janssen's opinion on these proposed changes. Bill, can you take a look? |
|||
msg102867 - (view) | Author: Darryl Miles (dlmiles) | Date: 2010-04-11 18:54 | |
To explain why you need 2 modes, a client/server would expect to do the following pseudo actions for maximum efficiency: set_socket_timeout(600_SECONDS) # or useful default send_data_over_ssl("QUIT\r\n") shutdown(SSL_SHUTDOWN_MODE_SENT) flush_data_down_to_socket() # maybe automatic/implied (OpenSSL users with custom BIO layers should be aware of this step) shutdown(socket, SHUT_WR) # this is optional, TCP socket level shutdown recv_data_over_ssl() = "250 Bye bye!\r\n" # this will take time to arrive set_socket_io_timeout(5_SECONDS) shutdown(SSL_SHUTDOWN_MODE_BOTH) # this is optional! some clients may choose to skip it entirely close()/unwrap() A server would: recv_data_over_ssl() = "QUIT\r\n" # would be sitting idle waiting for this command send_data_over_ssl("250 Bye bye!\r\n") shutdown(SSL_SHUTDOWN_MODE_SENT) flush_data_down_to_socket() # maybe automatic/implied (OpenSSL users with custom BIO layers should be aware of this step) shutdown(socket, SHUT_WR) # this is optional, TCP socket level shutdown set_socket_io_timeout(30_SECONDS) shutdown(SSL_SHUTDOWN_MODE_BOTH) # a good server would implement this step close()/unwrap() Now if your outbound data is CORKed and flushed, the flush points would cause all the SSL data from both the 'last sent data' and the 'send shutdown notify' to go out in the same TCP segment and arrive at the other end more or less together. Doing any of the above in a different order introduces some kind of inefficiency. shutdown(fd, SHUT_WR) are often used at the socket level to help the manage TIME_WAIT. The client has to wait for the QUIT response message anyway. With the above sequence there is no additional time delay or cost with both parties performing a SSL protocol shutdown at the same time. Despite the IO timeouts existing (to provide a safety net). If the client is talking to a buggy server the worst case scenario is that it receives the quit response but the server never does an SSL shutdown and the server doesn't close the socket connection. In this situation the client will have to wait for IO timeout, some clients in other software use blocking sockets and don't have a timeout so they end up hooked (forever). |
|||
msg103354 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-16 19:32 | |
After thinking a bit about it, we're in quasi-feature freeze now for 2.x and it doesn't sound reasonable to add options to unwrap(); we would need to write tests to exercise these options and their behaviour. Also, they aren't strictly needed to fix the issues at hand, but provide a functional benefit. So I've taken some of Darryl's ideas and fixes and incorporated them. Here is a new patch with them. I've checked compatibility with OpenSSL 1.0.0 and 0.9.8k. If nobody manifests him/herself I will commit it soon. (note: I dropped the idea of checking for application data with SSL_peek(), since it seems to be a blocking call; at least one of the test_ssl test cases froze here) |
|||
msg103412 - (view) | Author: Giampaolo Rodola' (giampaolo.rodola) * ![]() |
Date: 2010-04-17 17:27 | |
If everyone agrees on "error: [Errno 0] Error" being a legitimate alias for a "connection closed event" condition then I'd say the test server looks good, even if I think that expecting a ssl.SSLError derived exception would have made more sense, but I understand this is a OpenSSL inherited behavior. Anyway, I'd like to improve the SSL test server in future as I'm sure it hides some other problems but I can do that once pyftpdlib's SSL server code is more mature and after 2.7 is released. |
|||
msg103418 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-17 18:11 | |
> If everyone agrees on "error: [Errno 0] Error" being a legitimate alias > for a "connection closed event" condition then I'd say the test server > looks good, even if I think that expecting a ssl.SSLError derived > exception would have made more sense, but I understand this is a > OpenSSL inherited behavior. Well, it's quite an unfortunate error display, so perhaps we could detect it and turn into something more meaningful instead. However, I think it should remain a socket error or an OS error, because it's really the underlying socket which is closed, not the SSL wrapper. > Anyway, I'd like to improve the SSL test server in future as I'm sure > it hides some other problems but I can do that once pyftpdlib's SSL > server code is more mature and after 2.7 is released. Shouldn't it become part of asyncore at some point? It's both useful (well, for people using asyncore anyway :-)) and non-trivial. |
|||
msg103450 - (view) | Author: Darryl Miles (dlmiles) | Date: 2010-04-17 23:18 | |
With regards to create test cases for certain situations, sure this would be possible but not with "pure python" since your APIs deny/inhibit the particular things required to force a situation for a test case. With regards to SSL_peek() blocking, you'd need to explain yourself better on that one. The patch has been tested with the test cases from Python SVN enough to be happy they run ok. Maybe you have some offline not yet checked in SSL test cases you are referring to. To clarify why this is being done, if there is unread data then SSL_shutdown() will never return 1. Maybe you can simulate this situation by using SSL_write() with 1 byte payloads and a 10ms delay between each SSL_write() of the "QUIT response message" (you are trying to simulate network propagation delay). Then you have a client that tries to do unwrap() right after having sent the quit command, but makes no attempt to receive the response. I'll leave you guys too it about how you want to handle things with python (i.e. to make the design choice trade offs). I think all the points have been covered. |
|||
msg103604 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-19 14:26 | |
> With regards to SSL_peek() blocking, you'd need to explain yourself > better on that one. The patch has been tested with the test cases > from Python SVN enough to be happy they run ok. IIRC it worked with OpenSSL 1.0.0 but failed with 0.9.8k. I could test it again if you think that is unlikely. Can SSL_pending() be used in place of SSL_peek() in this case? (note: there doesn't seem to be any documentation for SSL_peek() on the openssl.org website) > I'll leave you guys too it about how you want to handle things with > python (i.e. to make the design choice trade offs). I think all the > points have been covered. Thanks a lot for your help. I believe it has been fruitful. |
|||
msg103987 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-23 00:18 | |
Patch committed to r80392 (trunk) and r80394 (py3k). Am pondering a partial backport of the SSL shutdown fixes to 2.6 and 3.1 (test_ftplib doesn't need any backport since it doesn't support TLS in these branches). |
|||
msg104053 - (view) | Author: Antoine Pitrou (pitrou) * ![]() |
Date: 2010-04-23 21:13 | |
Ok, I've backported the shutdown fixes to 2.6 (r80419) and 3.1 (r80420). Thanks again everyone. |
|||
msg151688 - (view) | Author: Maxim Egorushkin (max0x7ba) | Date: 2012-01-20 12:13 | |
I am still having this issue with Python-2.7.2 (which includes the patch) and openssl-devel-1.0.0e-1.fc14.x86_64: File "./download.py", line 195, in download_file ftp.retrbinary("retr " + filename, lambda data: file.write(data)) File "/usr/local/ots/python/lib/python2.7/ftplib.py", line 691, in retrbinary conn.unwrap() File "/usr/local/ots/python/lib/python2.7/ssl.py", line 275, in unwrap s = self._sslobj.shutdown() socket.error: [Errno 0] Error |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:58 | admin | set | github: 52355 |
2012-01-20 12:13:45 | max0x7ba | set | nosy:
+ max0x7ba messages: + msg151688 |
2010-04-23 21:13:53 | pitrou | set | status: pending -> closed messages: + msg104053 versions: + Python 2.6, Python 3.1 |
2010-04-23 00:18:29 | pitrou | set | status: open -> pending messages: + msg103987 assignee: pitrou resolution: fixed stage: patch review -> resolved |
2010-04-19 14:26:22 | pitrou | set | messages: + msg103604 |
2010-04-17 23:18:16 | dlmiles | set | messages: + msg103450 |
2010-04-17 18:11:27 | pitrou | set | messages: + msg103418 |
2010-04-17 17:27:52 | giampaolo.rodola | set | messages: + msg103412 |
2010-04-16 19:32:05 | pitrou | set | files:
+ newssl6.patch messages: + msg103354 |
2010-04-11 18:54:10 | dlmiles | set | messages: + msg102867 |
2010-04-11 17:45:34 | pitrou | set | messages: + msg102857 |
2010-04-11 16:47:42 | dlmiles | set | files:
+ Modules__ssl.c.patch messages: + msg102852 |
2010-04-09 16:46:20 | dlmiles | set | files: - python_ssl.c.txt |
2010-04-09 16:46:10 | dlmiles | set | files:
+ python_ssl_v2.c.txt messages: + msg102731 |
2010-04-09 06:54:12 | dlmiles | set | files:
+ python_ssl.c.txt nosy: + dlmiles messages: + msg102689 |
2010-04-08 20:15:10 | pitrou | set | files:
+ newssl5.patch messages: + msg102641 |
2010-04-08 18:24:44 | pitrou | set | files:
+ newssl4.patch messages: + msg102639 |
2010-04-06 13:17:02 | pitrou | set | files:
+ newssl3.patch messages: + msg102464 |
2010-04-06 13:10:55 | pitrou | set | messages: + msg102462 |
2010-04-06 12:23:13 | giampaolo.rodola | set | messages: + msg102459 |
2010-04-06 11:54:30 | pitrou | set | messages:
+ msg102457 versions: - Python 2.6, Python 3.1 |
2010-04-06 11:43:09 | giampaolo.rodola | set | messages: + msg102456 |
2010-04-06 11:39:26 | pitrou | set | files:
+ newssl2.patch messages: + msg102455 |
2010-04-06 11:15:09 | pitrou | set | messages: + msg102452 |
2010-04-06 11:11:15 | pitrou | set | messages: + msg102451 |
2010-04-06 11:05:35 | giampaolo.rodola | set | messages: + msg102450 |
2010-04-06 10:46:05 | pitrou | set | files: + newssl.patch |
2010-04-06 10:45:42 | pitrou | set | messages: + msg102449 |
2010-04-05 20:46:52 | pitrou | set | messages: + msg102410 |
2010-04-05 20:43:16 | giampaolo.rodola | set | messages: + msg102407 |
2010-04-05 19:28:16 | pitrou | set | messages: + msg102398 |
2010-04-05 18:37:04 | giampaolo.rodola | set | messages: + msg102396 |
2010-04-05 18:30:54 | pitrou | set | messages: + msg102395 |
2010-04-05 18:28:47 | flox | set | nosy:
+ jcea |
2010-04-05 18:25:38 | giampaolo.rodola | set | nosy:
- jcea messages: + msg102394 versions: + Python 2.6, Python 3.1 |
2010-04-05 13:33:15 | jcea | set | nosy:
+ jcea |
2010-04-03 20:44:34 | benjamin.peterson | set | priority: release blocker -> critical |
2010-04-03 20:16:41 | pitrou | set | messages: + msg102293 |
2010-04-03 20:15:16 | pitrou | set | messages:
+ msg102292 versions: - Python 2.6, Python 3.1 |
2010-04-03 16:41:58 | benjamin.peterson | set | priority: high -> release blocker |
2010-03-25 19:33:19 | giampaolo.rodola | set | messages: + msg101729 |
2010-03-25 14:57:11 | flox | set | messages: + msg101706 |
2010-03-25 14:52:59 | flox | set | files:
+ issue8108_ftplib_ssl_098m_v2.diff messages: + msg101705 |
2010-03-25 14:50:52 | giampaolo.rodola | set | messages: + msg101704 |
2010-03-25 14:49:06 | flox | set | files: - issue8108_ftplib_ssl_098m.diff |
2010-03-25 14:11:58 | flox | set | files:
+ issue8108_print_ssl_want_read.diff messages: + msg101699 |
2010-03-25 13:30:12 | pitrou | set | messages: + msg101694 |
2010-03-25 08:32:48 | flox | set | files:
+ issue8108_ftplib_ssl_098m.diff messages: + msg101678 |
2010-03-25 08:31:40 | giampaolo.rodola | set | messages: + msg101677 |
2010-03-25 00:28:44 | flox | set | files:
+ issue8108_openssl_098m.diff messages: + msg101661 stage: needs patch -> patch review |
2010-03-24 22:16:13 | pitrou | set | stage: test needed -> needs patch versions: + Python 2.6, Python 3.1 |
2010-03-24 22:09:26 | pitrou | set | messages: + msg101654 |
2010-03-22 21:14:33 | giampaolo.rodola | set | messages: + msg101534 |
2010-03-22 15:58:30 | pitrou | set | messages: + msg101517 |
2010-03-15 00:11:28 | janssen | set | messages: + msg101074 |
2010-03-15 00:05:22 | janssen | set | messages: + msg101073 |
2010-03-14 22:42:26 | pitrou | set | messages: + msg101070 |
2010-03-14 20:02:49 | vstinner | set | messages: + msg101066 |
2010-03-14 19:44:49 | janssen | set | messages: + msg101065 |
2010-03-14 17:26:42 | Arfrever | set | nosy:
+ Arfrever |
2010-03-13 02:58:16 | pitrou | set | messages: + msg100985 |
2010-03-12 23:24:49 | vstinner | set | messages: + msg100980 |
2010-03-12 22:01:42 | giampaolo.rodola | set | files:
+ testftpd.patch messages: + msg100977 |
2010-03-12 20:19:54 | janssen | set | messages: + msg100965 |
2010-03-12 20:18:01 | janssen | set | messages: + msg100964 |
2010-03-12 20:10:40 | janssen | set | messages: + msg100963 |
2010-03-12 20:05:24 | giampaolo.rodola | set | messages: + msg100962 |
2010-03-12 19:48:36 | janssen | set | messages: + msg100961 |
2010-03-12 19:23:55 | janssen | set | messages: + msg100958 |
2010-03-12 00:53:35 | vstinner | set | files:
+ ssl_shutdown.patch keywords: + patch messages: + msg100899 |
2010-03-11 23:09:57 | pitrou | set | nosy:
+ janssen |
2010-03-11 23:06:25 | vstinner | set | messages: + msg100892 |
2010-03-11 21:48:34 | vstinner | set | messages: + msg100889 |
2010-03-11 21:39:28 | vstinner | set | messages: + msg100888 |
2010-03-10 13:49:02 | flox | set | messages: + msg100784 |
2010-03-10 13:47:34 | flox | set | keywords:
+ buildbot messages: + msg100783 |
2010-03-10 13:45:11 | flox | set | nosy:
+ pitrou, vstinner, giampaolo.rodola |
2010-03-10 13:44:37 | flox | create |