Issue5293
Created on 2009-02-17 10:06 by techtonik, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Messages (11) | |||
---|---|---|---|
msg82311 - (view) | Author: anatoly techtonik (techtonik) | Date: 2009-02-17 10:06 | |
The code below exits with a timeout after about 20 seconds on Windows with Python 2.5.4:

    import socket
    # address of server routable, but offline
    server = "192.168.1.2"
    s = socket.socket()
    s.setblocking(1)
    s.connect((server, 139))
    s.close()

The output is:

    Traceback (most recent call last):
      File "D:\.env\test.py", line 6, in <module>
        s.connect((server, 139))
      File "<string>", line 1, in connect
    socket.error: (10060, 'Operation timed out')

If the timeout is set to 1, it exits almost immediately. If the timeout is large, it waits for about 20 seconds and exits. I use the socket to wait for the network service to appear. The target machine 192.168.1.2 belongs to the local network but is offline. |
|||
msg82353 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2009-02-17 18:24 | |
Why do you think this is a bug in Python? |
|||
msg82355 - (view) | Author: anatoly techtonik (techtonik) | Date: 2009-02-17 19:14 | |
Because the documentation doesn't say that Python should time out 20 seconds after entering blocking mode if a socket to the remote host cannot be opened. |
|||
msg82356 - (view) | Author: Gregory P. Smith (gregory.p.smith) * | Date: 2009-02-17 19:18 | |
You can't use a connect() call for the purpose of waiting for your network to be up. This has nothing to do with Python. This is how all network APIs work regardless of OS and language. The "timeout" is due to the network stack being unable to find the remote host (read up on ARP) and eventually returning an error. You need to deal with that in your own code. |
|||
msg82367 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2009-02-17 20:51 | |
> Because documentation doesn't say that Python should timeout after 20
> seconds after entering blocking mode if socket to remote host can not be
> opened.

That's not true: The documentation says "In blocking mode, operations block until complete." It takes 20 seconds for the connect attempt to complete (with error 10060), therefore, you block 20s. |
|||
msg82370 - (view) | Author: anatoly techtonik (techtonik) | Date: 2009-02-17 21:30 | |
After rewriting my reply several times I noticed my mistake, but it took more time to understand the problem than could be expected for a language that we all would like to see as easy and intuitive as possible. That's why I would still like to see this bug report reopened. At first it seemed that adding the missing details to the documentation would be enough, but now I see that the problem may be deeper.

The problem: As far as I know, the socket module is the only way to wait for a service on server:139 to appear. The socket documentation doesn't describe what will happen if the network server is down (the server, as opposed to the local network adapter).

Analysis: In this specific timeout condition, when the server is offline, socket.connect can throw two different errors:

    socket.error: (10060, 'Operation timed out')

and

    socket.timeout: timed out

Which one fires and should be caught depends on the visible timeout setting of the socket and on the invisible timeout value of the underlying network library. Whichever occurs first wins. For example, this code will warn you about a network timeout:

    import socket
    s = socket.socket()
    s.settimeout(12.0)
    try:
        s.connect(("192.168.1.2", 139))
    except socket.timeout:
        print "connect timeout"

But this one won't:

    import socket
    s = socket.socket()
    s.settimeout(120.0)
    try:
        s.connect(("192.168.1.2", 139))
    except socket.timeout:
        print "connect timeout"

So, for reliable socket programming you should catch both.

Solution: If there is a possibility for a socket to time out when it is not expected, then at least it should be documented. An alternative solution would be to document socket.error 10060 and merge it into the socket.timeout exception. |
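The "catch both" approach from the message above could look roughly like the following sketch. It assumes Python 3, where socket.error is an alias of OSError and socket.timeout is a subclass of it; the function name `can_connect` is hypothetical and not part of the socket module.

```python
import socket


def can_connect(host, port, timeout):
    """Try one TCP connect; return True on success, False on any failure.

    Hypothetical helper for illustration only.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
    except socket.timeout:
        # The settimeout() limit expired before the OS gave up.
        # On Python 3.10+ socket.timeout is TimeoutError, so an OS-level
        # ETIMEDOUT is also caught by this branch.
        return False
    except OSError:
        # The OS reported the failure first, e.g. 10060 on Windows or
        # 110 on Linux; refused connections also end up here.
        return False
    else:
        return True
    finally:
        s.close()
```

The socket.timeout clause has to come before the OSError clause, because the more general handler would otherwise swallow it.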
|||
msg82376 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2009-02-17 22:03 | |
10060 is a winsock error, and there are many, MANY more of them. Read the winsock documentation for details. It's both impossible and pointless to document them, since some never occur, others aren't documented by Microsoft well enough in the first place. Many aspects of the Microsoft TCP stack are also fairly obscure, and can also change across Windows releases. |
|||
msg82387 - (view) | Author: anatoly techtonik (techtonik) | Date: 2009-02-17 23:07 | |
Isn't it the job of a cross-platform programming language to abstract away low-level platform details?

The scope of this bug is not about handling all possible Winsock errors. It is about properly handling the sole timeout error from the list at http://www.winsock-error.com/ to make the socket.connect() interface consistent for both Windows and Linux.

In addition, I believe that the new socket.create_connection() function is vulnerable to the same issue, and it's only a matter of time before somebody reports that its additional "timeout" argument should be less than the mystic system network timeout value. That's why some sort of generalized socket.connection_timeout exception is still needed.

BTW, I have tested the behaviour on Linux: the system timeout on the socket does occur, but with a different error code.

    socket.error: (110, 'Connection timed out')

Note that the error message is different too. That means that to properly wait for a service to appear (or to retry connecting later if the server is not available) you need to handle three error cases. |
|||
msg82390 - (view) | Author: Martin v. Löwis (loewis) * | Date: 2009-02-17 23:25 | |
> Isn't it a job of crossplatform programming language to abstract from
> low-level platform details?

It's certainly not Python's job. It's an explicit design goal, and a long tradition, to expose system interfaces *as is*, with the very same parameters, and the very same error codes. This is useful for developers who need to understand what problem they encounter - they can trust that Python doesn't second-guess the operating system.

It might be useful to put a layer on top of the system interfaces, but such a layer needs to use different names.

> The scope of this bug is not about handling all possible Winsock errors.
> It is about proper handling the sole timeout error from the list
> http://www.winsock-error.com/ to make socket.connect() interface
> consistent for both windows and linux.

That's nearly impossible, with respect to specific error conditions. The TCP stacks are too different. You can easily define a common (but useless) error handling scheme yourself: catch Exception, and interpret it as "it didn't work".

> BTW, I have tested the behaviour on linux - the system timeout on socket
> does occur, but with different error code.
>
> socket.error: (110, 'Connection timed out')
>
> Note that the error message is different too. That means that to
> properly wait for service to appear (or retry to reconnect later if
> server is not available) you need to handle three error cases.

That's correct. However, you shouldn't look at the error message when handling the error on Linux. Instead, you should check whether the error code is errno.ETIMEDOUT. The error message is only meant for a human reader.

Also notice that other possible errors returned from connect are EACCES, EPERM, EADDRINUSE, EAFNOSUPPORT, EAGAIN, EALREADY, EBADF, ECONNREFUSED, EFAULT, EINPROGRESS, EINTR, EISCONN, ENETUNREACH, ENOTSOCK. |
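Checking the error code instead of the message, as suggested above, might be sketched like this. The helper name `is_connect_timeout` is hypothetical, the sketch assumes Python 3 (where socket errors are OSError instances carrying an `errno` attribute), and the fallback value 10060 is the Winsock code quoted earlier in this thread.

```python
import errno
import socket

# Codes that mean "the OS gave up waiting for the peer".
# errno.ETIMEDOUT is 110 on Linux; 10060 is the Winsock value from this
# thread, used as a fallback where errno.WSAETIMEDOUT is not defined.
TIMEOUT_CODES = {errno.ETIMEDOUT, getattr(errno, "WSAETIMEDOUT", 10060)}


def is_connect_timeout(exc):
    """Return True if exc is a Python-level or OS-level connect timeout."""
    if isinstance(exc, socket.timeout):
        # Raised when the limit set with settimeout() expires.
        return True
    # Compare the numeric code, never the human-readable message.
    return isinstance(exc, OSError) and exc.errno in TIMEOUT_CODES
```

A caller would typically use it inside an `except OSError as exc:` block and re-raise anything for which it returns False.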
|||
msg82409 - (view) | Author: Gregory P. Smith (gregory.p.smith) * | Date: 2009-02-18 05:48 | |
Yes, it is annoying to have to deal with the different OS-specific error numbers when handling socket.error, OSError, IOError or EnvironmentError subclasses in general, but that is life. Python does not attempt to figure out what all possible behaviors and errors are and coerce them into some common representation, because there often is not a common thing.

Fortunately, while there are many network stack behaviors, there are really only two APIs (POSIX and Windows), so there are only two sets of error numbers to check for. The burden is not that great in cross-platform code.

The issue that prompted this bug report: calling socket.connect() once is not and has never been a way to wait for a server on the network to come up. If the server isn't there at the time it is called, it will return an error once the OS has decided that it has no way to connect. The absence of a specified timeout does not imply that it will retry the underlying system call for you, merely that it won't bail out early.

I have updated the socket module documentation to clarify this a bit in r69731. |
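Since a single connect() call does not retry, waiting for a service means looping in your own code. Below is a minimal sketch of such a loop, assuming Python 3; `wait_for_service` and its parameters `total_timeout`, `attempt_timeout` and `pause` are hypothetical names chosen for illustration, while socket.create_connection() is the standard-library function mentioned earlier in the thread.

```python
import socket
import time


def wait_for_service(host, port, total_timeout, attempt_timeout=1.0, pause=0.5):
    """Retry TCP connects until one succeeds or total_timeout seconds pass."""
    deadline = time.monotonic() + total_timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            # Each attempt gets its own, bounded timeout.
            conn = socket.create_connection(
                (host, port), timeout=min(attempt_timeout, remaining)
            )
        except OSError:
            # Covers socket.timeout, OS-level timeouts, refusals and
            # unreachable hosts alike; wait a little, then try again.
            time.sleep(min(pause, max(0.0, deadline - time.monotonic())))
        else:
            conn.close()
            return True
```

Treating every OSError as "not up yet" keeps the loop simple; code that needs to distinguish a refused connection from a timeout would inspect the exception's errno instead.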
|||
msg82548 - (view) | Author: anatoly techtonik (techtonik) | Date: 2009-02-20 21:35 | |
Thanks for pointing me to the list of possible network errors. This information is invaluable. Too bad it is easily lost among other details. I've seen similar errors in other modules that use the socket module, and it's no wonder now why people can't handle them correctly. I still feel that the information about error handling should be given at the very beginning, before the references to wizard books.

In the meantime I made a script that probes a remote service with proper timeout checks, which could be included in the examples chapter. It requires the time module to calculate the timeout shared between the two exceptions. http://code.activestate.com/recipes/576655/ |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:56:45 | admin | set | github: 49543 |
2009-02-20 21:35:13 | techtonik | set | messages: + msg82548 |
2009-02-18 05:49:00 | gregory.p.smith | set | nosy: + georg.brandl resolution: not a bug -> fixed messages: + msg82409 components: + Documentation assignee: georg.brandl |
2009-02-17 23:25:16 | loewis | set | messages: + msg82390 |
2009-02-17 23:07:27 | techtonik | set | messages: + msg82387 |
2009-02-17 22:03:25 | loewis | set | messages: + msg82376 |
2009-02-17 21:30:08 | techtonik | set | messages: + msg82370 |
2009-02-17 20:51:37 | loewis | set | messages: + msg82367 |
2009-02-17 19:18:44 | gregory.p.smith | set | status: open -> closed nosy: + gregory.p.smith resolution: not a bug messages: + msg82356 |
2009-02-17 19:14:24 | techtonik | set | messages: + msg82355 |
2009-02-17 18:24:33 | loewis | set | nosy: + loewis messages: + msg82353 |
2009-02-17 10:06:33 | techtonik | create |