Created on 2009-05-18 17:48 by ryles, last changed 2012-07-29 12:28 by sbt. This issue is now closed.
|mpsock.patch||underrun, 2011-06-16 06:06||make python2.7 mp sockets blocking||review|
|mp6056fix.patch||underrun, 2012-07-24 13:49||updated patch against 3.3b1(ish)||review|
|msg88040 - (view)||Author: Ryan Leslie (ryles)||Date: 2009-05-18 17:48|
Terminal 1: Python 2.6.1 (r261:67515, Apr 2 2009, 18:25:55) [GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from multiprocessing.managers import SyncManager >>> manager = SyncManager(authkey="mykey") >>> manager.start() >>> queue = manager.Queue() >>> import pickle >>> pickle.dump(queue, open("myqueue.pkl", "w")) >>> Terminal 2: Python 2.6.1 (r261:67515, Apr 2 2009, 18:25:55) [GCC 4.1.2 20071124 (Red Hat 4.1.2-42)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import socket >>> socket.setdefaulttimeout(30) >>> import multiprocessing, pickle >>> multiprocessing.current_process().authkey = "mykey" >>> queue = pickle.load(open("myqueue.pkl")) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "python2.6/pickle.py", line 1370, in load return Unpickler(file).load() File "python2.6/pickle.py", line 858, in load dispatch[key](self) File "python2.6/pickle.py", line 1133, in load_reduce value = func(*args) File "python2.6/multiprocessing/managers.py", line 845, in RebuildProxy return func(token, serializer, incref=incref, **kwds) File "python2.6/multiprocessing/managers.py", line 894, in AutoProxy incref=incref) File "python2.6/multiprocessing/managers.py", line 700, in __init__ self._incref() File "python2.6/multiprocessing/managers.py", line 749, in _incref conn = self._Client(self._token.address, authkey=self._authkey) File "python2.6/multiprocessing/connection.py", line 140, in Client answer_challenge(c, authkey) File "python2.6/multiprocessing/connection.py", line 376, in answer_challenge response = connection.recv_bytes(256) # reject large message IOError: [Errno 11] Resource temporarily unavailable >>> This works as expected without socket.setdefaulttimeout(). However, the timeout is useful since if the listening process on terminal 1 goes to sleep, e.g. ^Z, it would avoid blocking. I suspect the cause is similar to http://bugs.python.org/issue976613
|msg88045 - (view)||Author: Jesse Noller (jnoller) *||Date: 2009-05-18 19:18|
Pickling the queue and then unpickling it in a new process is something I never thought of. That's interesting in and of itself ;)
|msg88064 - (view)||Author: Ryan Leslie (ryles)||Date: 2009-05-19 05:19|
Yeah, storing pickled queues in the file system makes for some easy IPC :) It wasn't a very original idea, I took the pickling comments in the documentation at face value: http://docs.python.org/library/multiprocessing.html#proxy-objects So, from what I can tell this issue is related to the mixing of standard python socket I/O with multiprocessing socket I/O, with state not being carried from the former to the latter. In multiprocessing/connection.py, SocketClient() creates a familiar python socket object which, when a default timeout has been set in the module, will be made non-blocking. In addition, the timeout is remembered in the socket object, and when calling socket.recv(), the function internal_select() will use this to perform the expected poll() or select(). However, after a connection is established, SocketClient() will not use python's socket implementation any further, and instead pass its low-level socket descriptor to a multiprocessing Connection object. This object has its own specialized socket I/O implementation, which is not at all aware of the timeout previously associated with the socket. As a result no select/poll occurs and, due to the socket's non-blocking status, recv() may return EAGAIN immediately. I suspect this is what's happening. There might be a number of ways to make SocketClient() more timeout friendly, but possibility could be to simply check if the python socket has a timeout associated, and if so, use connection.poll() in addition to connection.recv(). There may be other places in the code where similar changes would occur. You obviously have more experience with this code base so I'll be curious to see the outcome.
|msg88078 - (view)||Author: Jesse Noller (jnoller) *||Date: 2009-05-19 13:10|
Well; I'm pretty tapped out right now - I think your idea of checking to see if a timeout has been set elsewhere makes sense. If you have the time to put together a patch (with a unit test or three :)) I can review it. Might take me a bit of time to get to this.
|msg137830 - (view)||Author: Derek Wilson (underrun)||Date: 2011-06-07 14:52|
This should be higher priority as one of the major benefits of the multiprocessing module is remote process management in a completely transparent manner. socket timeouts are very important in this context as blocking forever waiting for a connection is not always an option. The problem of not being able to use a default socket timeout for other purposes in combination with multiprocessing managers is definitely an issue, but making multiprocessing actually use the timeout itself if set would be a huge advantage. This might not be the place to ask for it, but it would make sense for manager objects to gain a timeout attribute to be used as a timeout for local or remote communications. At the very least, the manager.connect() method should accept a timeout argument.
|msg137860 - (view)||Author: Jesse Noller (jnoller) *||Date: 2011-06-07 17:14|
I agree derek, I think that would be a fine addition, however we lack a patch and I don't have the current bandwidth to add it.
|msg138415 - (view)||Author: Derek Wilson (underrun)||Date: 2011-06-16 06:06|
While having multiprocessing use a timeout would be great, I didn't really have the time to fiddle with the c code. Instead of using the socket timeout, I'm modifying all the sockets created by the socket module to have no timeout (and thus to be blocking) which makes the multiprocessing module 'immune' to the socket module's default timeout. For testing, I just run the test suite twice, once with the initial default socket timeout and once with a 60 second timeout. Nothing there failed with this issue. It is worth noting, however, that when using a default socket timeout, for some reason processes that have have put data into a queue no longer block at exit waiting for the data to be consumed. I'm not sure if there is some additional cleanup code that uses sockets and might need to block? Or maybe whatever the issue was with blocking sockets is not an issue with non-blocking sockets?
|msg138489 - (view)||Author: Derek Wilson (underrun)||Date: 2011-06-17 04:39|
I was wrong about exit behavior of a process that has put to a queue -- it seems to behave as expected. i had been playing with a proxy to a queue rather than a queue (to which, if you put, the process can exit right away because the actual put happens in the process that owns the queue). I think this works as intended, but lmk. Also, I haven't really played with the tests that much, so that bit could use some review. It hasn't broken anything in any of my real world tests though. Also, have I mentioned that the multiprocessing module is amazing? Cause it is. I sort of feel like antigravity == multiprocessing ...
|msg151968 - (view)||Author: Derek Wilson (underrun)||Date: 2012-01-25 19:24|
Any chance this patch will be accepted (or at least reviewed) soon?
|msg151982 - (view)||Author: Jim Jewett (Jim.Jewett)||Date: 2012-01-25 22:55|
The wording in 138415 suggested this patch was changing socket to not support timeouts -- which would be unacceptable. But the actual patch only seems to touch multiprocessing/connection.py -- a far more reasonable change. Unfortunately, the patch no longer applies to the development tip. I *think* the places you wanted to change are still there, and just moved. (1) Is it sufficiently clear that this is not-a-feature to justify a backport? (2) Are the problems already fixed by some of the other changes? (It doesn't look like it, but I'm not sure.) (3) Can you produce an updated patch? (The current tip is http://hg.python.org/cpython/file/fec45282dc28/Lib/multiprocessing/connection.py ) (4) If I understand the intent, then s.setblocking(True) would be slightly more clear than s.settimeout(None), though that change obviously isn't essential.
|msg166292 - (view)||Author: Derek Wilson (underrun)||Date: 2012-07-24 13:49|
Thanks, Jim, here is an updated patch. 1) I feel like it is clearly not-a-feature. Currently 2.7 crashes if remote managers are used and socket.setdefaulttimeout is anything other than None. Crashing seems bad and all this does is keep multiprocessing connection sockets non-blocking even if a default timeout is specified (so it maintains current behavior rather than crashing). 2) This problem is still evident on 2.7, 3.2 and 3.3 beta 1. This patch is against the current dev tip as of a few days ago. 3) here it is! 4) I agree that setblocking is more clear. I made the change. My test modifications cover the entire suite twice, once without a default timeout and once with. This may be excessive? I'm not sure where non-blocking sockets might pop up as an issue since there is C code that relies on blocking sockets and I haven't dug that deep.
|msg166567 - (view)||Author: Roundup Robot (python-dev)||Date: 2012-07-27 13:57|
New changeset 4e85e4743757 by Richard Oudkerk in branch '2.7': Issue #6056: Make multiprocessing use setblocking(True) on the sockets it uses. http://hg.python.org/cpython/rev/4e85e4743757 New changeset 290f04722be3 by Richard Oudkerk in branch '3.2': Issue #6056: Make multiprocessing use setblocking(True) on the sockets it uses. http://hg.python.org/cpython/rev/290f04722be3 New changeset f03839401420 by Richard Oudkerk in branch 'default': Issue #6056: Make multiprocessing use setblocking(True) on the sockets it uses. http://hg.python.org/cpython/rev/f03839401420
|2012-07-29 12:28:11||sbt||set||status: open -> closed|
type: crash -> behavior
stage: needs patch -> committed/rejected
messages: + msg166567
messages: + msg166292
messages: + msg151982
|2012-01-25 19:24:10||underrun||set||type: behavior -> crash|
messages: + msg151968
versions: + Python 3.3, Python 3.4
|2011-11-01 14:27:50||neologix||link||issue9897 superseder|
|2011-06-17 04:39:59||underrun||set||messages: + msg138489|
keywords: + patch
messages: + msg138415
|2011-06-07 17:14:04||jnoller||set||messages: + msg137860|
messages: + msg137830
|2010-08-27 15:28:21||BreamoreBoy||set||stage: needs patch|
versions: + Python 3.1, Python 2.7, Python 3.2, - Python 2.6
|2009-05-19 13:10:36||jnoller||set||messages: + msg88078|
|2009-05-19 05:20:18||ryles||set||type: behavior|
messages: + msg88064
|2009-05-18 19:18:19||jnoller||set||messages: + msg88045|
|2009-05-18 18:46:59||benjamin.peterson||set||assignee: jnoller|
nosy: + jnoller