New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change socket.close() to ignore ECONNRESET #74504
Comments
http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%203.6/builds/128/steps/test/logs/stdio test_invalid_authentication (test.test_imaplib.NewIMAPSSLTests) ... SENT: b'* OK IMAP4rev1' ====================================================================== Traceback (most recent call last):
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/test/test_imaplib.py", line 223, in _cleanup
self.client.shutdown()
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/imaplib.py", line 326, in shutdown
self.sock.close()
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socket.py", line 417, in close
self._real_close()
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/ssl.py", line 1052, in _real_close
socket._real_close(self)
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socket.py", line 411, in _real_close
_ss.close(self)
ConnectionResetError: [Errno 54] Connection reset by peer Ran 95 tests in 32.694s FAILED (errors=1, skipped=2) |
@our SSL experts: any idea why sock.close() fails with "ConnectionResetError: [Errno 54] Connection reset by peer" in imaplib? Is that a new error? Should imaplib catch ConnectionResetError on sock.close()? By the way, see also bpo-30329: I proposed a patch to catch the Windows socket error 10022 on shutdown() for poplib and imaplib SSL connections. No idea why this error only occurs on SSL connections. |
Making this an index of related reports: bpo-30319: test_imap These all look like a side effect of my change to raise an error from the OS as an exception when closing a socket, bpo-26685. Only 3.6+ is affected. According to <https://www.freebsd.org/cgi/man.cgi?close%282%29\>, ECONNRESET means “The underlying object was a stream socket that was shut down by the peer before all pending data was delivered”. It seems this is specific to Free BSD. See bug report about Posix compliance: <https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=159179\>. According to <https://forums.zeroc.com/discussion/5569/patch-to-network-cpp-for-freebsd-for-econnreset-on-close-2-problem\> this started in Free BSD 6.3 in 2006. I suppose the options are:
|
Before discussing revert, I would like experimenting fixing calls to |
I think fixing all affected calls to socket.close in the world (option 3) would be too much. I just added two new reports (bpo-30652 and bpo-30391) as dependencies. These are about testing socketserver.TCPServer. As an example, to fix the socket.close call there, I think the change would look like class TCPServer:
def close_request(self, request):
try:
request.close()
except ConnectionError:
# Suppress asynchronous errors such as ECONNRESET on Free BSD
pass Instead of that change all over the place, I am thinking option 2 would be safest. In Modules/socketmodule.c, something like sock_close(PySocketSockObject *s)
{
Py_BEGIN_ALLOW_THREADS
res = SOCKETCLOSE(fd);
Py_END_ALLOW_THREADS
/* ECONNRESET can occur on Free BSD */
if (res < 0 && errno != ECONNRESET) {
return s->errorhandler();
}
} |
+1 from me. |
Please, let's not use the "dependencies" field for issues which are not actual dependencies. |
None of "27784,30106,30315,30328,30391,30652,30543" are dependencies of this issue. |
tl; dr I agree to ignore ECONNRESET-like errors on socket.shutdown() and sock.close(). I'm not sure that I understand *exactly* the problem here. In which case closing a socket fails with ECONNRESET? Can it mean that the last write() was buffered and failed? If socket.send() is blocking, I expect that send() gets the ECONNRESET error. If socket.send() is non-blocking, the application has to be very carefully written to handle many kinds of errors. From a high lever point of view, I don't think that ECONNRESET is interested. I expect that a protocol at the application level doesn't rely on ECONNRESET, but a command like "QUIT". No? For a file on disk, it's different, since write() is always buffered and close() has to flush pending write on disk. For a network protocol, loosing the connection, loosing data, etc. is something "normal", not something execptional. That's why there are application-level commands to close cleanly a connection. Ok, now for SSL sockets... Is it also ok to ignore ECONNRESET on sock.close() for an SSL socket? -- ECONNRESET can occur on sock.close(), but not only: see bpo-30329, shutdown() fails can ENOTCONN on UNIX or WSAEINVAL on Windows. I modified poplib and imaplib recently to handle WSAEINVAL: commit 83a2c28
|
About versions, socket.close() was modified in Python 3.6 (bpo-26685) to raise an OSError on error. So only Python 3.6 and 3.7 are impacted here. |
Another buildbot failure which may be related: http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%203.6/builds/154/steps/test/logs/stdio test_write (test.test_socketserver.SocketWriterTest) ... Exception in thread <class 'socketserver.TCPServer'> serving:
Traceback (most recent call last):
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/threading.py", line 916, in _bootstrap_inner
self.run()
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socketserver.py", line 238, in serve_forever
self._handle_request_noblock()
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socketserver.py", line 319, in _handle_request_noblock
self.handle_error(request, client_address)
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socketserver.py", line 317, in _handle_request_noblock
self.process_request(request, client_address)
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socketserver.py", line 349, in process_request
self.shutdown_request(request)
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socketserver.py", line 510, in shutdown_request
self.close_request(request)
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socketserver.py", line 514, in close_request
request.close()
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socket.py", line 417, in close
self._real_close()
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/socket.py", line 411, in _real_close
_ss.close(self)
ConnectionResetError: [Errno 54] Connection reset by peer /usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/unittest/case.py:633: ResourceWarning: unclosed <socket.socket fd=5, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 29862), raddr=('127.0.0.1', 29860)> ====================================================================== Traceback (most recent call last):
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/test/test_socketserver.py", line 175, in test_TCPServer
self.stream_examine)
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/test/support/__init__.py", line 2045, in decorator
return func(*args)
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/test/test_socketserver.py", line 141, in run_server
testfunc(svrcls.address_family, addr)
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/test/test_socketserver.py", line 153, in stream_examine
buf = data = receive(s, 100)
File "/usr/home/buildbot/python/3.6.koobs-freebsd9/build/Lib/test/test_socketserver.py", line 46, in receive
raise RuntimeError("timed out on %r" % (sock,))
RuntimeError: timed out on <socket.socket fd=5, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 29862), raddr=('127.0.0.1', 29860)> |
I propose to ignore ECONNRESET in socket.close() and ENOTCONN and WSAEINVAL on socket.shutdown(). If we see more failures later, we can extend these lists. asyncore also ignores ENOTCONN on socket.close(), but I don't trust this module at this point: it seems like asyncore also handles pipes, not only socket.socket. Errors seen on buildbots:
The Python standard library already ignores some errors on socket.shutdown() and socket.close() depending on the module:
# Exceptions which must not call the exception handler in fatal error
# methods (_fatal_error())
_FATAL_ERROR_IGNORE = (BrokenPipeError, ConnectionResetError, ConnectionAbortedError) Mapping of exceptions to error codes:
|
shutdown() is not the same thing as close(). If you want to ignore errors on shutdown() you should open a separate issue (and argument for it, because "I saw the error on the buildbots" is not a sufficient reason IMHO). |
Antoine Pitrou: "shutdown() is not the same thing as close()." I consider that the bug is similar to socket.close() error, since we also get an exception because the peer closed the connection. It's just the error code which is different. Antoine Pitrou: "If you want to ignore errors on shutdown() you should open a separate issue (and argument for it, because "I saw the error on the buildbots" is not a sufficient reason IMHO)." bpo-4473 modified poplib to ignore ENOTCONN on shutdown(), patch by Lorenzo Catucci committed by you :-) Commit d89824b. Commit 81c87c5 modified impalib to ignore ENOTCONN, commit written and pushed by you ;-) In the bpo-30329, I modified poplib and imaplib to also ignore WSAEINVAL on shutdown(). I don't see the need to separate socket.close() and socket.shutdown() error handling in two issues. |
I wrote #2565
|
Le 04/07/2017 à 14:00, STINNER Victor a écrit :
First, let's not call it a bug when an error is reported to the user. A Second, close() and shutdown() are different functions operating at shutdown() operates at the transport level. Someone who calls It makes sense to silence some errors in close() because, most of the |
Ok, I rewrote my PR to only ignore ECONNRESET in socket.close(). |
I wrote #2565 because the ConnectionResetError error on socket.close() is more and more a cause of failures on FreeBSD buildbots. I mean that I fixed enough other bugs to start to only see one specific bug more often ;-) (Before we had dozens of random bugs.) |
Ok. I modified socket.close() to ignore ECONNRESET in Python 3.6 and 3.7. I will leave the issue open a few days to see if it helps to reduce buildbot failure rate. |
Antoine: FYI I abandonned my idea of ignoring errors on socket.shutdown(), since I agree with your rationale. An application may rely on shutdown() exception to trigger some events, and a socket can still be used after a shutdown(). |
Thank you Victor :-) |
Thanks for handling this Victor. To answer some of your earlier questions, this is my understanding of the Free BSD behaviour (although I don't have Free BSD to experiment with): When writing to TCP sockets, I believe the data is buffered by the local OS (as well as the network, remote peer, etc). The send call will normally return straight away. In the background, the OS might combine the data with existing buffers, send it to the network, wait for acknowledgements, retransmit it, etc. On Free BSD, steps to trigger ECONNRESET might be:
I understand ECONNRESET is an _indication_ that not all pending data was delivered. But this is asynchronous, and a lack of ECONNRESET does not guarantee that all pending data was delivered. Imagine if steps 3 and 4 were swapped above. I doubt Free BSD will block the close call until the data is acknowledged, so it won't know if the peer will reset the connection in the future. To guarantee the data was delivered to the application (not just the remote OS), you do need an application-level acknowledgement. For SSL, when you call the top-level SSLSocket.close, I don't think that does much at the SSL level. Again, if you need delivery indication, I would use an app-level acknowledgement. Also beware that by default, Python doesn't report a secure EOF signal sent from the remote peer, so I think you either need a specific app-level message, or should disable the suppress_ragged_eofs mode (see bpo-27815). Antoine: sorry for abusing the dependencies list; I will try to avoid that in the future. It seemed the easiest way to get a two-way link to a bunch of other bugs that could be duplicates, but I wasn't sure at the time. My theory was if this bug was fixed, someone could review those other bugs and see if they could also be closed. |
Thanks Martin for the long explanation. To simplify a lot, there is |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: