classification
Title: multiprocessing.pool: Worker crashes if result can't be encoded
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 2.7, Python 2.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: jnoller Nosy List: asksol, gdb, jnoller, python-dev, sbt
Priority: normal Keywords: patch

Created on 2010-07-13 11:51 by asksol, last changed 2012-05-25 19:55 by sbt. This issue is now closed.

Files
File name Uploaded Description Edit
multiprocessing-trunk@82502-handle_worker_encoding_errors.patch asksol, 2010-07-13 11:51 patch against trunk@82502
multiprocessing-trunk@82502-handle_worker_encoding_errors2.patch asksol, 2010-07-15 14:05
Messages (7)
msg110173 - (view) Author: Ask Solem (asksol) (Python committer) Date: 2010-07-13 11:51
If the target function returns an unpickleable value the worker process crashes. This patch tries to safely handle unpickleable errors, while enabling the user to inspect such errors after the fact.

In addition a new argument has been added to apply_async: error_callback.
This is an optional callback that is called if the job raises an exception. The signature of the callback is `callback(exc)`.
msg110203 - (view) Author: Greg Brockman (gdb) Date: 2010-07-13 15:24
This looks pretty reasonable to my untrained eye.  I successfully applied and ran the test suite.

To be clear, the errback change and the unpickleable result change are actually orthogonal, right?  Anyway, I'm not really familiar with the protocol here, but assuming that you're open to code review:

> -    def apply_async(self, func, args=(), kwds={}, callback=None):
> +    def apply_async(self, func, args=(), kwds={}, callback=None,
> +            error_callback=None):
>          '''
>          Asynchronous equivalent of `apply()` builtin
>          '''
>          assert self._state == RUN
> -        result = ApplyResult(self._cache, callback)
> +        result = ApplyResult(self._cache, callback, error_callback)
>          self._taskqueue.put(([(result._job, None, func, args, kwds)], None))
>          return result
Sure.  Why not add an error_callback for map_async as well?

> -    def __init__(self, cache, callback):
> +    def __init__(self, cache, callback, error_callback=None):
>          self._cond = threading.Condition(threading.Lock())
>          self._job = job_counter.next()
>          self._cache = cache
>          self._ready = False
>          self._callback = callback
> +        self._errback = error_callback
>          cache[self._job] = self
Any reason you chose to use a different internal name (errback versus error_callback)?   It seems cleaner to me to be consistent about the name.

>  def sqr(x, wait=0.0):
>      time.sleep(wait)
>      return x*x
> +
>  class _TestPool(BaseTestCase):
>      def test_apply(self):
> @@ -1020,6 +1021,7 @@ class _TestPool(BaseTestCase):
>          self.assertEqual(get(), 49)
>          self.assertTimingAlmostEqual(get.elapsed, TIMEOUT1)
>  
> +
>      def test_async_timeout(self):
In general, I'm wary of nonessential whitespace changes... did you mean to include these?

> +        scratchpad = [None]
> +        def errback(exc):
> +            scratchpad[0] = exc
> +
> +        res = p.apply_async(raising, error_callback=errback)
> +        self.assertRaises(KeyError, res.get)
> +        self.assertTrue(scratchpad[0])
> +        self.assertIsInstance(scratchpad[0], KeyError)
> +
> +        p.close()
Using "assertTrue" seems misleading.  "assertIsNotNone" is what you really mean, right?  Although, I believe that's redundant, since presumably self.assertIsInstance(None, KeyError) will error out anyway (I haven't verified this).


> +    def test_unpickleable_result(self):
> +        from multiprocessing.pool import MaybeEncodingError
> +        p = multiprocessing.Pool(2)
> +
> +        # Make sure we don't lose pool processes because of encoding errors.
> +        for iteration in xrange(20):
> +
> +            scratchpad = [None]
> +            def errback(exc):
> +                scratchpad[0] = exc
> +
> +            res = p.apply_async(unpickleable_result, error_callback=errback)
> +            self.assertRaises(MaybeEncodingError, res.get)
> +            wrapped = scratchpad[0]
> +            self.assertTrue(wrapped)
Again, assertTrue is probably not what you want, and is probably redundant.
> +            self.assertIsInstance(scratchpad[0], MaybeEncodingError)
Why use scratchpad[0] rather than wrapped?
> +            self.assertIsNotNone(wrapped.exc)
> +            self.assertIsNotNone(wrapped.value)
Under what circumstances would these be None?  (Perhaps you want wrapped.exc != 'None'?)  The initializer for MaybeEncodingError enforces the invariant that exc/value are strings, right?


> +
>  class _TestPoolWorkerLifetime(BaseTestCase):
>  
>      ALLOWED_TYPES = ('processes', )
Three line breaks there seems excessive.
msg110273 - (view) Author: Ask Solem (asksol) (Python committer) Date: 2010-07-14 12:34
> To be clear, the errback change and the unpickleable result
> change are actually orthogonal, right?

Yes, it could be a separate issue. Jesse, do you think I should I open
up a separate issue for this?

> Why not add an error_callback for map_async as well?

That's a good idea!

> Any reason you chose to use a different internal name
> (errback versus error_callback)? It seems cleaner to me
> to be consistent about the name.

It was actually a mistake. The argument was ``errback`` before, so
it's just a leftover from the previous name.

> In general, I'm wary of nonessential whitespace changes...
> did you mean to include these?

Of course not.

> Using "assertTrue" seems misleading. "assertIsNotNone" is what
> really mean, right?  Although, I believe that's redundant,
> since presumably self.assertIsInstance(None, KeyError) will
> error out anyway (I haven't verified this).

bool(KeyError("foo")) is True and bool(None) is False, so it works either way. It could theoretically result in a false negative if
the exception class tested overrides __nonzero__, but that is unlikely
to happen as the target always returns KeyError anyway (and the test below ensures it) It's just a habit of mine, unless I really want to test for Noneness, I just use assertTrue, but I'm not against changing it to assertIsNotNone either.

> Under what circumstances would these be None?  (Perhaps you
> want wrapped.exc != 'None'?)  The initializer for
> MaybeEncodingError enforces the invariant that exc/value are strings
> right?
It's just to test that these are actually set to something.
Even an empty string passes with assertIsNone instead of assertTrue.
Maybe it's better to test the values set, but I didn't bother.
msg110368 - (view) Author: Ask Solem (asksol) (Python committer) Date: 2010-07-15 14:05
Updated patch with Greg's suggestions.
(multiprocessing-trunk@82502-handle_worker_encoding_errors2.patch)
msg120502 - (view) Author: Jesse Noller (jnoller) * (Python committer) Date: 2010-11-05 14:52
Fine w/ committing this Ask.
msg159800 - (view) Author: Roundup Robot (python-dev) Date: 2012-05-02 15:39
New changeset 26bbff4562a7 by Richard Oudkerk in branch '2.7':
Issue #9400: Partial backport of fix for #9244
http://hg.python.org/cpython/rev/26bbff4562a7
msg161614 - (view) Author: Richard Oudkerk (sbt) * (Python committer) Date: 2012-05-25 19:55
The patch was applied to 3.x branch in 0aa8af79359d and partially backported to 2.7 in 26bbff4562a7 - see #9400.

I will close.
History
Date User Action Args
2012-05-25 19:55:25sbtsetstatus: open -> closed

nosy: + sbt
messages: + msg161614

resolution: fixed
stage: resolved
2012-05-02 15:39:17python-devsetnosy: + python-dev
messages: + msg159800
2010-11-05 14:52:16jnollersetmessages: + msg120502
2010-07-15 14:05:29asksolsetfiles: + multiprocessing-trunk@82502-handle_worker_encoding_errors2.patch

messages: + msg110368
2010-07-14 12:34:02asksolsetmessages: + msg110273
2010-07-13 15:24:45gdbsetmessages: + msg110203
2010-07-13 12:22:34jnollersetassignee: jnoller

nosy: + gdb
2010-07-13 11:56:29asksolsettitle: multiprocessing.pool: Worker crashes if result can't be encoded result (with patch) -> multiprocessing.pool: Worker crashes if result can't be encoded
2010-07-13 11:55:09asksolsettitle: multiprocessing.pool: Pool crashes if worker can't encode result (with patch) -> multiprocessing.pool: Worker crashes if result can't be encoded result (with patch)
2010-07-13 11:51:53asksolcreate