Title: Blocking sockets take entirely too long to time out
Created on 2008-02-17 18:56 by khiltd, last changed 2008-03-19 22:40 by jafo. This issue is now closed.

Messages (3)
msg62498 - Author: Nathan Duran (khiltd) Date: 2008-02-17 18:56
The following code:

import smtplib

test = smtplib.SMTP('')

will hang the entire script for about ten minutes when run on a machine 
that is connected to the internet via an ISP that blocks port 25 (which is 
pretty much all of them these days). Closer inspection of the smtplib 
sources reveals that it is making use of blocking sockets, which is all 
well and good, but I do not believe they should be allowed to block for 
such a ridiculously lengthy period of time. 

My task is to walk the list of MX records provided by a DNS query, connect 
to each server listed and check a user-supplied address for validity. If I 
have to wait 10 minutes for an exception to be thrown for each 
unresponsive server, this process could easily take hours per address, 
which is completely unacceptable. 

My workaround is to ditch smtplib entirely and write my own socket code, 
but setting a timeout on a socket throws it into non-blocking mode, which 
is not entirely what I want, either. PHP allows one to apply a perfectly 
sane timeout to a blocking socket without any difficulty, and this is 
something I'm frankly rather surprised to see Python choke on.

What I'd expect to see is an exception after less than a minute's worth of 
repeated failures so the caller can catch it and try another port.
msg62503 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2008-02-17 20:19
I recommend that you stay with non-blocking sockets, and use select/poll
on all sockets. Then you can simultaneously check multiple servers, and
select will tell you which ones you got connected to. For this
application, putting a time-out on the socket and doing the connections
sequentially seems unreasonable - that's exactly what select was
invented for.
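
As a rough illustration of the select-based approach described above, here
is a minimal sketch written for Python 3. The function name probe_hosts and
its parameters are hypothetical, and it assumes the addresses are numeric
IPv4 (ip, port) pairs so that no blocking DNS lookup happens inside
connect_ex.

```python
import errno
import select
import socket
import time


def probe_hosts(addresses, timeout=30.0):
    """Start non-blocking connects to every (ip, port) pair and return the
    set of addresses that accepted the connection within `timeout` seconds.

    Assumes numeric IPv4 addresses; host names would need resolving first.
    """
    pending = {}       # socket -> address still being connected
    connected = set()

    for addr in addresses:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setblocking(False)
        err = sock.connect_ex(addr)
        # 0 means connected already; EINPROGRESS/EWOULDBLOCK mean "in flight".
        if err in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
            pending[sock] = addr
        else:
            sock.close()

    deadline = time.monotonic() + timeout
    while pending:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        socks = list(pending)
        # A finished connect makes the socket writable; some platforms
        # report a failed connect in the exceptional set instead.
        _, writable, failed = select.select([], socks, socks, remaining)
        for sock in set(writable) | set(failed):
            addr = pending.pop(sock)
            # SO_ERROR is 0 only if the connect actually succeeded.
            if sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0:
                connected.add(addr)
            sock.close()

    # Anything still pending has run out of time.
    for sock in pending:
        sock.close()
    return connected
```

This only tests whether a TCP connection can be established. A real check
would keep the connected sockets open and continue the SMTP conversation on
them, which is omitted here.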

I don't understand the "setting a timeout ... is not entirely what I
want, either" remark. The only way to specify a timeout for connect *is*
to set it into non-blocking mode. I'm sure PHP does the same.
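
For reference, a minimal Python 3 sketch of a timeout on a single socket.
The helper name connect_with_timeout is hypothetical. Python's
socket.settimeout switches the underlying socket to non-blocking mode
internally, but connect() still blocks the caller until it succeeds, fails,
or the timeout expires.

```python
import socket


def connect_with_timeout(host, port, timeout=30.0):
    """Return a connected socket, or None if the connect fails or times out."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Applies to connect() and to every later send/recv on this socket.
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        sock.close()
        return None
    return sock
```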

If you want to see timeouts for smtplib, a work-around is to set a
global timeout for all sockets, through socket.setdefaulttimeout. This
will transparently apply to smtplib as well.
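
A minimal sketch of that workaround, written for Python 3. The wrapper name
try_smtp_connect is hypothetical, and restoring the previous default
afterwards is an added precaution rather than something the workaround
requires. Later Python versions also accept a timeout argument on
smtplib.SMTP directly.

```python
import smtplib
import socket


def try_smtp_connect(host, port=25, timeout=30.0):
    """Return an smtplib.SMTP object, or None if the server could not be
    reached or did not send its greeting within `timeout` seconds."""
    previous = socket.getdefaulttimeout()
    # Every socket created from here on inherits this timeout,
    # including the one smtplib opens internally.
    socket.setdefaulttimeout(timeout)
    try:
        return smtplib.SMTP(host, port)
    except OSError:
        # smtplib's own exceptions derive from OSError in Python 3.4+.
        return None
    finally:
        # Put the process-wide default back so other code is unaffected.
        socket.setdefaulttimeout(previous)
```

Because the default is process-wide, it also affects sockets created by
other threads while it is set.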
msg64116 - Author: Sean Reifschneider (jafo) * (Python committer) Date: 2008-03-19 22:40
smtplib is for sending messages via SMTP, not for testing to see if a
user is behind an ISP that is incorrectly blocking outgoing SMTP
connections.  I say "incorrectly" because they are dropping the
connection packets rather than rejecting them.

One mechanism would be to use an alarm(30) call which would limit the
whole transaction to 30 seconds (connection, EHLO, RCPT, QUIT), not just
the connection.  My experience with dealing with remote machines in a
time-sensitive manner is that this approach is much more usable than
using socket timeouts.
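
A minimal sketch of the alarm approach, written for Python 3. The names
TransactionTimeout and run_with_deadline are hypothetical. It assumes a Unix
platform, since signal.alarm is not available on Windows, and it must be
called from the main thread.

```python
import signal


class TransactionTimeout(Exception):
    """Raised when the wrapped call exceeds its deadline."""


def _on_alarm(signum, frame):
    # Raising here interrupts whatever blocking call is in progress.
    raise TransactionTimeout()


def run_with_deadline(func, seconds, *args, **kwargs):
    """Call func(*args, **kwargs), raising TransactionTimeout if the whole
    call takes longer than `seconds` (a whole number of seconds)."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        return func(*args, **kwargs)
    finally:
        signal.alarm(0)                          # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler
```

Wrapping a function that performs the connection and the subsequent SMTP
commands in run_with_deadline(..., 30) would bound the whole transaction
rather than each individual socket operation.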

A better mechanism is probably to use non-blocking socket I/O directly
and make connections to all or many of the remote servers and test them
in parallel, meaning all servers could be tested in 30 seconds (or
whatever your timeout is) rather than 5 minutes (you imply in your
message that you may be doing more than a dozen requests).

So I believe the current behavior of smtplib, which is simple and
conservative about getting messages through rather than optimized for
checking whether an account exists, is reasonable.  In any case, using
alarm() is a better solution than socket timeouts.
