classification
Title: Limitations in objects returned by multiprocessing Pool
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.1, Python 3.2, Python 2.7
process
Status: closed Resolution: out of date
Dependencies: Superseder:
Assigned To: Nosy List: asksol, eliquious, flox, jnoller, macfreek, max, serhiy.storchaka, vstinner
Priority: normal Keywords: buildbot

Created on 2010-08-13 19:02 by macfreek, last changed 2017-02-19 14:11 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description
multiprocessingbugs.py macfreek, 2010-08-13 19:02 Small test file showing the errors
Messages (14)
msg113816 - Author: Freek Dijkstra (macfreek) Date: 2010-08-13 19:02
I came across three limitations in the multiprocessing module that are not handled correctly.

Attached is a file that reproduces the errors in minimal code. I tested them with Python 2.6.5 and 3.1.2.

Expected result:
multiprocessing.Pool promises a map function where each result is returned transparently to the main process (even though the calculation was done in a subprocess).

Actual result:
Not all values returned by a subprocess are returned transparently.
I expected multiprocessing to handle these cases gracefully by raising an exception in the main process.

The cases I found are:

1) A multiprocessing worker cannot return (return, not raise!) an Exception.
If this is attempted, the result handler thread in the Pool calls the exception class with no arguments,
which raises an error if the constructor requires arguments:
TypeError: ('__init__() takes exactly 2 arguments (1 given)', <class '__main__.MyException'>, ())    


2) A multiprocessing worker cannot return a hashlib object.
If this is attempted, pickle returns an error:
PicklingError: Can't pickle <type '_hashlib.HASH'>: attribute lookup _hashlib.HASH failed


3) A multiprocessing worker cannot return an object that overrides __getattr__ and accesses an attribute of self inside __getattr__.
If this is attempted, Python 2.6 crashes with a bus error:
Program terminated by uncaught signal #10 after 1.56 seconds.
Python 3.1 yields the error:
RuntimeError: maximum recursion depth exceeded while calling a Python object
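
A minimal sketch of case 3 using pickle alone, without a Pool (a Pool sends results back through pickle). This is not the attached test file: StreamProxy and roundtrip_error are hypothetical names, and the exact error may vary by version. On Python 3 the expected failure is a RecursionError, which is a subclass of RuntimeError.

```python
import pickle


class StreamProxy:
    """Hypothetical stand-in for case 3: unknown attributes go to self.stream."""

    def __init__(self, stream):
        self.stream = stream

    def __getattr__(self, attr):
        # Only called when normal lookup fails. During unpickling the
        # instance exists before 'stream' is restored, so self.stream
        # re-enters __getattr__ and recurses.
        return getattr(self.stream, attr)


def roundtrip_error(obj):
    """Return the exception raised by a pickle round trip, or None."""
    try:
        pickle.loads(pickle.dumps(obj))
    except Exception as exc:
        return exc
    return None


error = roundtrip_error(StreamProxy([]))
print(type(error).__name__)
```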
msg113959 - Author: Florent Xicluna (flox) * (Python committer) Date: 2010-08-15 13:44
Case 3 seen on buildbot Windows 7 3.1:

http://www.python.org/dev/buildbot/all/builders/x86%20Windows7%203.1/builds/676

test_array (test.test_multiprocessing.WithProcessesTestArray) ... Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\cygwin\home\db3l\buildarea\3.1.bolen-windows7\build\lib\multiprocessing\forking.py", line 344, in main
    self = load(from_parent)
  File "D:\cygwin\home\db3l\buildarea\3.1.bolen-windows7\build\lib\pickle.py", line 1356, in load
    encoding=encoding, errors=errors).load()
  File "D:\cygwin\home\db3l\buildarea\3.1.bolen-windows7\build\lib\unittest.py", line 1363, in __getattr__
    return getattr(self.stream,attr)
(...)
  File "D:\cygwin\home\db3l\buildarea\3.1.bolen-windows7\build\lib\unittest.py", line 1363, in __getattr__
    return getattr(self.stream,attr)
RuntimeError: maximum recursion depth exceeded while calling a Python object
msg113960 - Author: Florent Xicluna (flox) * (Python committer) Date: 2010-08-15 13:56
A different issue on XP-5 buildbot (Python 3.1):

test test_multiprocessing failed -- Traceback (most recent call last):
  File "C:\buildslave\3.1.moore-windows\build\lib\test\test_multiprocessing.py", line 1234, in test_rapid_restart
    manager.shutdown()
  File "C:\buildslave\3.1.moore-windows\build\lib\multiprocessing\util.py", line 174, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "C:\buildslave\3.1.moore-windows\build\lib\multiprocessing\managers.py", line 602, in _finalize_manager
    process.terminate()
  File "C:\buildslave\3.1.moore-windows\build\lib\multiprocessing\process.py", line 111, in terminate
    self._popen.terminate()
  File "C:\buildslave\3.1.moore-windows\build\lib\multiprocessing\forking.py", line 276, in terminate
    _subprocess.TerminateProcess(int(self._handle), TERMINATE)
WindowsError: [Error 5] Access is denied


Then, on replay, it ended with the "RuntimeError: maximum recursion depth exceeded while calling a Python object" like the Windows 7 case.


http://www.python.org/dev/buildbot/all/builders/x86%20XP-5%203.1/builds/581
msg113963 - Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-08-15 15:06
Florent - Are you running the script from Freek on the buildbots, or are you just updating this bug with other run failures? I'm having a really hard time separating things.
msg113966 - Author: Florent Xicluna (flox) * (Python committer) Date: 2010-08-15 15:14
It is an update with 2 similar failures on Windows XP and 7 buildbots (on normal runs).

FWIW, I ran the script from Freek on my laptop (64-bit Debian) and saw similar failures on 3.1 and 3.2 (you need to uncomment one of the three commented lines of the script to see the failures).
msg114040 - Author: Freek Dijkstra (macfreek) Date: 2010-08-16 08:37
If it would help to separate things, let me know, and I will split this up into three separate bug reports.

(For the record, knowing these limitations, I could work around them in my code, so they are low priority for me; I just think other users would benefit if multiprocessing failed gracefully with a clear exception. Though I probably can't help solve the issues, I can write a unit test.)
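
One possible user-side workaround, as a rough sketch only: wrap the worker function so that an unpicklable result is reported with a clear exception from inside the worker. PickleChecked, make_hash and make_digest are hypothetical names, nothing here is part of multiprocessing, and the extra round trip doubles the pickling cost.

```python
import hashlib
import pickle


class PickleChecked:
    """Hypothetical wrapper: call func, then verify the result survives pickling."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        result = self.func(*args, **kwargs)
        try:
            # Same transport step a Pool would perform on the result.
            pickle.loads(pickle.dumps(result))
        except Exception as exc:
            name = getattr(self.func, "__name__", repr(self.func))
            raise ValueError(
                "result of %s cannot be sent between processes: %r" % (name, exc)
            ) from exc
        return result


def make_hash(data):
    # Returns a hash object, which pickle is expected to reject.
    return hashlib.sha256(data)


def make_digest(data):
    # Returns a plain str, which pickles without trouble.
    return hashlib.sha256(data).hexdigest()


checked = PickleChecked(make_digest)
print(checked(b"abc"))
```

An instance such as PickleChecked(make_digest) could then be passed to Pool.map in place of the bare function, provided the wrapped function is defined at module level so that the wrapper itself can be pickled.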
msg114045 - Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-08-16 14:17
Thanks Freek - we're actually discussing some stuff like this in issue9205 as well
msg133023 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-05 12:14
> 1) A multiprocessing worker can not return (return, not raise!) an
> Exception. (...) raise an error if multiple arguments are required:
> TypeError: ('__init__() takes exactly 2 arguments (1 given)', 
> <class '__main__.MyException'>, ())

This problem comes from pickle, not multiprocessing: issue #1692335.
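
The pickle side of this can be sketched without multiprocessing. The class below is a hypothetical example, not the one from the attached file: it passes only part of its arguments to Exception.__init__, so unpickling replays the constructor with too few arguments.

```python
import pickle


class MyException(Exception):
    """Hypothetical exception whose __init__ needs two arguments."""

    def __init__(self, code, message):
        # Only 'message' reaches Exception.__init__, so self.args == (message,).
        Exception.__init__(self, message)
        self.code = code


data = pickle.dumps(MyException(404, "not found"))  # pickling succeeds

try:
    pickle.loads(data)  # unpickling calls MyException(*args) with one argument
except TypeError as exc:
    load_error = exc
else:
    load_error = None

print(load_error)
```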

> 2) A multiprocessing worker can not return an hashlib Object.
> If this is attempted, pickle returns an error:

It is related to pickle and hashlib: #11771
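
A common way around this, sketched under the assumption that only the digest is needed in the main process: have the worker return the digest (a str or bytes) rather than the hash object. digest_worker is a hypothetical name, and the exception raised for the hash object has varied between versions, so the sketch catches Exception broadly.

```python
import hashlib
import pickle


def digest_worker(data):
    """Hypothetical worker: return the hex digest (a str), not the hash object."""
    return hashlib.sha256(data).hexdigest()


try:
    pickle.dumps(hashlib.sha256(b"abc"))
    hash_object_pickles = True
except Exception:  # exact exception type differs across Python versions
    hash_object_pickles = False

digest = digest_worker(b"abc")
restored = pickle.loads(pickle.dumps(digest))  # plain str round-trips fine
print(hash_object_pickles, restored == digest)
```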
msg133619 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-04-12 23:21
> Thanks Freek - we're actually discussing some stuff like this
> in issue9205 as well

I'm unable to see the relation between issue #9205 and point (3) of this issue (RuntimeError: maximum recursion depth exceeded while calling a Python object // WindowsError: [Error 5] Access is denied).

--

Are the "RuntimeError: maximum recursion depth exceeded while calling a Python object" and "WindowsError: [Error 5] Access is denied" errors part of this issue or not?
msg140308 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-07-13 21:21
A recent test_rapid_restart hang:

[ 14/357] test_multiprocessing
Timeout (1:00:00)!
Thread 0xb18c4b70:
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 237 in wait
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/queue.py", line 185 in get
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/pool.py", line 376 in _handle_results
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 690 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 737 in _bootstrap_inner
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 710 in _bootstrap

Thread 0xb20c5b70:
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 237 in wait
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/queue.py", line 185 in get
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/pool.py", line 335 in _handle_tasks
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 690 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 737 in _bootstrap_inner
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 710 in _bootstrap

Thread 0xb28c6b70:
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/pool.py", line 326 in _handle_workers
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 690 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 737 in _bootstrap_inner
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 710 in _bootstrap

Thread 0xb30c7b70:
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 237 in wait
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/queue.py", line 185 in get
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/pool.py", line 102 in worker
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 690 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 737 in _bootstrap_inner
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 710 in _bootstrap

Thread 0xb38c8b70:
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 237 in wait
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/queue.py", line 185 in get
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/pool.py", line 102 in worker
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 690 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 737 in _bootstrap_inner
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 710 in _bootstrap

Thread 0xb40c9b70:
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 237 in wait
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/queue.py", line 185 in get
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/pool.py", line 102 in worker
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 690 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 737 in _bootstrap_inner
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 710 in _bootstrap

Thread 0xb48cab70:
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 237 in wait
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/queue.py", line 185 in get
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/pool.py", line 102 in worker
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 690 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 737 in _bootstrap_inner
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 710 in _bootstrap

Thread 0xb50cbb70:
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/connection.py", line 411 in _recv
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/connection.py", line 432 in _recv_bytes
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/connection.py", line 275 in recv
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/pool.py", line 376 in _handle_results
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 690 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 737 in _bootstrap_inner
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 710 in _bootstrap

Thread 0xb58ccb70:
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 237 in wait
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/queue.py", line 185 in get
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/pool.py", line 335 in _handle_tasks
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 690 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 737 in _bootstrap_inner
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 710 in _bootstrap

Thread 0xb60cdb70:
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/pool.py", line 326 in _handle_workers
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 690 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 737 in _bootstrap_inner
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/threading.py", line 710 in _bootstrap

Thread 0xb76a36c0:
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/connection.py", line 411 in _recv
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/connection.py", line 432 in _recv_bytes
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/connection.py", line 241 in recv_bytes
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/connection.py", line 759 in recv
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/multiprocessing/managers.py", line 762 in _callmethod
  File "<string>", line 2 in get
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/test/test_multiprocessing.py", line 1417 in test_rapid_restart
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/unittest/case.py", line 386 in _executeTestPart
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/unittest/case.py", line 441 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/unittest/case.py", line 493 in __call__
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/unittest/suite.py", line 105 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/unittest/suite.py", line 67 in __call__
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/unittest/suite.py", line 105 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/unittest/suite.py", line 67 in __call__
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/unittest/suite.py", line 105 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/unittest/suite.py", line 67 in __call__
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/unittest/runner.py", line 168 in run
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/test/support.py", line 1260 in _run_suite
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/test/support.py", line 1286 in run_unittest
  File "/var/lib/buildslave/3.x.murray-gentoo-wide/build/Lib/test/test_multiprocessing.py", line 2228 in test_main
  File "./Lib/test/regrtest.py", line 1070 in runtest_inner
  File "./Lib/test/regrtest.py", line 861 in runtest
  File "./Lib/test/regrtest.py", line 669 in main
  File "./Lib/test/regrtest.py", line 1648 in <module>
make: *** [buildbottest] Error 1

http://www.python.org/dev/buildbot/all/builders/x86%20Gentoo%20Non-Debug%203.x/builds/389/steps/test/logs/stdio
msg155097 - Author: Max Franks (eliquious) Date: 2012-03-07 16:59
Case 3 is not related to the other two; see http://bugs.python.org/issue5370. As haypo said, it has to do with unpickling objects. That issue gives a solution using the __setstate__ method.
msg170348 - Author: Max (max) * Date: 2012-09-12 01:12
I propose to close this issue as fixed.

The first two problems in the OP are now resolved through patches to pickle.

The third problem is addressed by issue5370: it is a documented feature of pickle that anyone who defines __setattr__ / __getattr__ that depend on an internal state must also take care to restore that state during unpickling. Otherwise, the code is not pickle-safe, and by extension, not multiprocessing-safe.
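
A minimal sketch of what restoring the state can look like, assuming a proxy class like the one in case 3. StreamProxy is a hypothetical name, and the guard inside __getattr__ is an extra precaution that is not taken from this thread.

```python
import pickle


class StreamProxy:
    """Hypothetical proxy that forwards unknown attributes to self.stream."""

    def __init__(self, stream):
        self.stream = stream

    def __getattr__(self, attr):
        # Extra guard: if 'stream' itself is missing (object not yet
        # restored), fail with AttributeError instead of recursing.
        if attr == "stream":
            raise AttributeError(attr)
        return getattr(self.stream, attr)

    def __getstate__(self):
        return self.__dict__.copy()

    def __setstate__(self, state):
        # Defined on the class, so pickle finds it without going
        # through __getattr__ on a half-built instance.
        self.__dict__.update(state)


proxy = StreamProxy(["a", "b"])
clone = pickle.loads(pickle.dumps(proxy))
print(clone.stream)
print(clone.count("a"))  # forwarded to the restored list
```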
msg170369 - Author: Ask Solem (asksol) (Python committer) Date: 2012-09-12 11:54
I vote to close too as it's very hard to fix in a clean way.

A big problem, though, is that there is a standard way of defining exceptions that also ensures the exception is pickleable (always call Exception.__init__ with the original args), and it is not documented (http://docs.python.org/tutorial/errors.html#user-defined-exceptions).
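
A short sketch of that convention, using a hypothetical MyException: every constructor argument is passed on to Exception.__init__, so self.args matches the signature and pickle can rebuild the instance.

```python
import pickle


class MyException(Exception):
    """Hypothetical exception that passes all its arguments to the base class."""

    def __init__(self, code, message):
        # self.args becomes (code, message), which is what pickle replays
        # when it calls MyException(*args) during unpickling.
        Exception.__init__(self, code, message)
        self.code = code
        self.message = message


original = MyException(404, "not found")
restored = pickle.loads(pickle.dumps(original))
print(restored.code, restored.message)
```

One side effect of this style is that str() of the exception shows the whole args tuple rather than just the message.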

Celery has an elaborate mechanism to rewrite unpickleable exceptions, but it's a massive workaround just to keep the workers running, and shouldn't be part of the stdlib.  It would help if the Python documentation mentioned this though.

Related: http://docs.celeryproject.org/en/latest/userguide/tasks.html#creating-pickleable-exceptions
msg288136 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-02-19 14:11
The problem with pickling exceptions should be addressed in another issue (issue29466). The other problems seem to be solved.
History
Date User Action Args
2017-02-19 14:11:09 serhiy.storchaka set status: open -> closed; nosy: + serhiy.storchaka; messages: + msg288136; resolution: out of date; stage: resolved
2012-09-12 11:54:49 asksol set messages: + msg170369
2012-09-12 01:12:03 max set nosy: + max; messages: + msg170348
2012-03-07 16:59:17 eliquious set nosy: + eliquious; messages: + msg155097
2011-07-13 21:21:09 vstinner set messages: + msg140308
2011-04-12 23:21:10 vstinner set messages: + msg133619
2011-04-05 12:14:00 vstinner set messages: + msg133023
2011-04-04 13:39:51 vstinner set nosy: + vstinner
2011-02-02 23:15:34 belopolsky set nosy: macfreek, jnoller, asksol, flox; type: crash -> behavior
2011-01-04 01:40:04 pitrou set nosy: + asksol; versions: + Python 2.7, - Python 2.6
2010-08-16 14:17:16 jnoller set messages: + msg114045
2010-08-16 08:37:37 macfreek set messages: + msg114040
2010-08-15 15:14:35 flox set messages: + msg113966; versions: + Python 3.2
2010-08-15 15:06:31 jnoller set messages: + msg113963
2010-08-15 13:56:27 flox set messages: + msg113960
2010-08-15 13:44:46 flox set keywords: + buildbot; nosy: + flox; messages: + msg113959
2010-08-13 19:05:06 r.david.murray set nosy: + jnoller
2010-08-13 19:02:47 macfreek create