classification
Title: test_multiprocessing: test_listener_client flakiness
Type: enhancement Stage:
Components: Library (Lib) Versions: Python 3.0, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: jnoller Nosy List: jnoller, ocean-city, pitrou, roudkerk, trent
Priority: normal Keywords: patch

Created on 2008-07-03 12:58 by jnoller, last changed 2009-04-01 16:42 by jnoller. This issue is now closed.

Files
File name Uploaded Description Edit
connection_v2.patch jnoller, 2008-08-08 14:28 FQDN removal patch v2
Messages (16)
msg69197 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2008-07-03 12:58
Per mail thread:
http://mail.python.org/pipermail/python-dev/2008-June/080497.html

Attached is the patch to connection.py to drop the fqdn call.

Final suggestion from Trent:

> This is a common problem.  Binding to '127.0.0.1' will bind to *only*
> that address;

Indeed.

> binding to "" will bind to *all* addresses the machine
> is known by.

Agreed again.  I believe what we're dealing with here though is a lack 
of clarity regarding what role the 'address' attribute exposed by 
multiprocess.connection.Listener should play.  The way 
test_listener_client() is written, it effectively treats 'address' as an 
end-point that can be connected to directly (irrespective of the 
underlying family (i.e. AF_INET, AF_UNIX, AF_PIPE)).

I believe the problems we've run into stem from the fact that the API 
doesn't provide any guarantees as to what 'address' represents.  The 
test suite assumes it always reflects a connectable end-point, which I 
think is more than reasonable.  Unfortunately, nothing stops us from 
breaking this invariant by constructing the object as 
Listener(family='AF_INET', address=('0.0.0.0', 0)).

How do I connect to an AF_INET Listener (i.e. SocketListener) instance 
whose 'address' attribute reports '0.0.0.0' as the host?  I can't.

So, for now, I think we should enforce this invariant by raising an 
exception in Listener.__init__() if self._socket.getsockbyname()[0] 
returns '0.0.0.0'.  In effect, tightening up the API such that we can 
guarantee  Listener.address will always represent a connectable end-
point.  We can look at how to service 'listen on all available 
interfaces' semantics at a later date -- that adds far less value IMO 
than being able to depend on the said guarantee.
msg69682 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2008-07-15 13:52
The connection patch was applied r64961
msg69701 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2008-07-15 18:30
I made a mistake and reverted r64961 - self._address is used pretty 
heavily, and altering it the way outlined in the patch does not fix all 
instances.
msg70832 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2008-08-07 14:42
Hello. I also experienced test_multiprocessing hang on win2k and I
workarounded this by this adhok patch.

Index: Lib/multiprocessing/connection.py
===================================================================
--- Lib/multiprocessing/connection.py	(revision 65551)
+++ Lib/multiprocessing/connection.py	(working copy)
@@ -217,7 +217,12 @@
         self._socket.listen(backlog)
         address = self._socket.getsockname()
         if type(address) is tuple:
-            address = (socket.getfqdn(address[0]),) + address[1:]
+            def getfqdn(s): # ???
+                if s == '127.0.0.1':
+                    return 'localhost'
+                else:
+                    return socket.getfqdn(s)
+            address = (getfqdn(address[0]),) + address[1:]
         self._address = address
         self._family = family
         self._last_accepted = None

I'm not familiar to socket, so probably this info is useless. ;-)
msg70890 - (view) Author: Trent Nelson (trent) * (Python committer) Date: 2008-08-08 09:12
Jesse, thanks for capturing my e-mail thread in this issue.  Can you 
comment on my last three paragraphs?  Essentially, I think we should 
lock down the API and assert that Listener.address will always be 
a 'connectable' end-point.  (i.e. not a wildcard host, 0.0.0.0, that 
can't be bound to by a socket, for example)

This would mean raising an exception in Listener.__init__ if this 
invariant is violated.
msg70892 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-08 09:34
> This would mean raising an exception in Listener.__init__ if this 
> invariant is violated.

If I understand the suggestion correctly, it would forbid people to
listen on 0.0.0.0. I'm not sure it is the right correction for the
problem. Listening on 0.0.0.0 can be handy when you are not sure which
address to use; it would be better to address the problem elsewhere.

IMO, the "FQDN removal patch" as uploaded by Jesse is simple and
straight-forward enough, provided it does fix the bug.
msg70898 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2008-08-08 10:31
Unfortunately, the patch while simple, is too simple. The removal of the 
_address attribute breaks a lot more than it fixes (it's heavily used 
elsewhere)
msg70903 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-08 12:57
Selon Jesse Noller <report@bugs.python.org>:
>
> Unfortunately, the patch while simple, is too simple. The removal of the
> _address attribute breaks a lot more than it fixes (it's heavily used
> elsewhere)

I don't understand what you mean. The patch you uploaded doesn't remove the
_address attribute.

See http://bugs.python.org/file10801/connection.patch
msg70904 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2008-08-08 13:03
> I don't understand what you mean. The patch you uploaded doesn't remove the
> _address attribute.
>
> See http://bugs.python.org/file10801/connection.patch

Hmm. Then that was a mistake on my part: it's entirely possible the
patch didn't apply cleanly. I'll circle back to this shortly and
re-apply/test.
msg70906 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2008-08-08 14:28
Trent/Antoine - I'm stuck on the fence about this. Per trent's own 
suggestion - removing the allowance for the 0.0.0.0 style address means 
that the self._address is always a connectable end-point: this provides 
clarity to the API.

However - to Antoine's point - using 0.0.0.0 to listen on "all 
addresses" is actually pretty common, especially when working with 
bigger servers (which generally have multiple cores). Using the 0.0.0.0 
address makes it easy to say "listen everywhere" - where, in the case of 
a Manager is actually very useful (you could have processes connecting 
to a Manager on a machine from different networks).

Attached is a cleaned up diff of the removal of the fqdn call - this 
should resolve the original problem while leaving the door open for the 
ability to connect to the 0.0.0.0 "address".

I need someone with a windows machine (Hi Trent!) that exposed the 
original problem to test the patch. Currently I'm favoring this rather 
than locking out the 0.0.0.0 option.
msg70928 - (view) Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2008-08-09 10:41
I confirmed this patch works on my win2000.
And I believe it works on Trent's machine, too.
http://mail.python.org/pipermail/python-dev/2008-June/080525.html
msg71017 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2008-08-11 14:29
I've removed the fqdn call per the patch as of r65641

Lowering this to an enhancement to consider the removal of the 0.0.0.0 
functionality
msg71033 - (view) Author: Trent Nelson (trent) * (Python committer) Date: 2008-08-11 19:57
I can confirm the patch works in Windows.  Regarding the 0.0.0.0 issue, 
perhaps we can compromise on something like this:

% svn diff connection.py
Index: connection.py
===================================================================
--- connection.py       (revision 65645)
+++ connection.py       (working copy)
@@ -217,6 +217,8 @@
         self._socket.bind(address)
         self._socket.listen(backlog)
         self._address = self._socket.getsockname()
+        if self._address[0] == '0.0.0.0':
+            self._address[0] = '127.0.0.1'
         self._family = family
         self._last_accepted = None


That way, we still support binding on all addresses, and self._address 
will always represent a connectable end-point.  Thoughts?
msg71034 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2008-08-11 20:08
Le lundi 11 août 2008 à 19:58 +0000, Trent Nelson a écrit :
> +        if self._address[0] == '0.0.0.0':
> +            self._address[0] = '127.0.0.1'

Please no. If the user asks for 0.0.0.0, either obey or raise an
exception, but do not silently change the value.

My own humble opinion is that 0.0.0.0 should be allowed and, at worse,
the documentation may carry a warning about it.
msg71045 - (view) Author: Trent Nelson (trent) * (Python committer) Date: 2008-08-12 06:31
I was thinking about this on the way home last night and concluded that 
my last suggestion (s/0.0.0.0/127.0.0.1/) is a terrible one as well.  
I'd be happy with a mention in the documentation (for now) stating that 
if you listen on '0.0.0.0', Listener._address won't be a connectable 
end-point (and you'll have to explicitly connect to 127.0.0.1, for 
example).

As for the original issue, Jesse I'm +1 on your connection_v2.patch.
msg85046 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2009-04-01 16:42
documented in r70960
History
Date User Action Args
2009-04-01 16:42:49jnollersetstatus: open -> closed
resolution: fixed
messages: + msg85046
2008-08-12 06:31:18trentsetmessages: + msg71045
2008-08-11 20:08:53pitrousetmessages: + msg71034
2008-08-11 19:57:46trentsetmessages: + msg71033
2008-08-11 14:29:21jnollersetpriority: high -> normal
type: enhancement
messages: + msg71017
2008-08-09 10:41:52ocean-citysetmessages: + msg70928
2008-08-08 14:29:26jnollersetfiles: - connection.patch
2008-08-08 14:28:50jnollersetfiles: + connection_v2.patch
messages: + msg70906
2008-08-08 13:03:43jnollersetmessages: + msg70904
2008-08-08 12:57:28pitrousetmessages: + msg70903
2008-08-08 10:31:54jnollersetmessages: + msg70898
2008-08-08 09:34:36pitrousetnosy: + pitrou
messages: + msg70892
2008-08-08 09:12:04trentsetnosy: + trent
messages: + msg70890
2008-08-07 14:42:03ocean-citysetnosy: + ocean-city
messages: + msg70832
2008-07-15 18:30:46jnollersetmessages: + msg69701
2008-07-15 13:52:00jnollersetmessages: + msg69682
2008-07-03 12:58:44jnollercreate