classification
Title: Multiprocessing imap hangs when generator input errors
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.6, Python 3.5, Python 2.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Aaron Halfaker, davin, jnoller, sbt, terry.reedy
Priority: normal Keywords:

Created on 2016-02-10 19:03 by Aaron Halfaker, last changed 2016-02-12 23:16 by terry.reedy. This issue is now closed.

Messages (2)
msg260032 - Author: Aaron Halfaker (Aaron Halfaker) Date: 2016-02-10 19:03
multiprocessing.Pool.imap will hang and not raise an error if an error occurs in the generator that is being mapped over. I'd expect the error to be raised and/or the process to fail.

For example, run the following code in Python 2.7 or 3.4:

    from multiprocessing import Pool

    def add_one(v):
        return v+1

    pool = Pool(processes=2)

    values = ["1", "2", "3", "4", "foo", "5", "6", "7", "8"]
    value_iter = (int(v) for v in values)

    for new_val in pool.imap(add_one, value_iter):
        print(new_val)

The output should look something like this:

    $ python demo_hanging.py 
    2
    3
    4
    5
    Exception in thread Thread-2:
    Traceback (most recent call last):
      File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
        self.run()
      File "/usr/lib/python3.4/threading.py", line 868, in run
        self._target(*self._args, **self._kwargs)
      File "/usr/lib/python3.4/multiprocessing/pool.py", line 378, in _handle_tasks
        for i, task in enumerate(taskseq):
      File "/usr/lib/python3.4/multiprocessing/pool.py", line 286, in <genexpr>
        self._taskqueue.put((((result._job, i, func, (x,), {})
      File "demo_hanging.py", line 9, in <genexpr>
        value_iter = (int(v) for v in values)
    ValueError: invalid literal for int() with base 10: 'foo'

The script will then hang indefinitely.
msg260210 - Author: Terry J. Reedy (terry.reedy) (Python committer) Date: 2016-02-12 23:16
If you add the "if __name__ == '__main__':" guard after defining the target function, as specified in the multiprocessing doc, you will get a traceback much as you expect:

Traceback (most recent call last):
  File "F:\Python\mypy\tem.py", line 12, in <module>
    for new_val in pool.imap(add_one, value_iter):
  File "C:\Programs\Python35\lib\multiprocessing\pool.py", line 695, in next
    raise value
  File "C:\Programs\Python35\lib\multiprocessing\pool.py", line 380, in _handle_tasks
    for i, task in enumerate(taskseq):
  File "C:\Programs\Python35\lib\multiprocessing\pool.py", line 286, in <genexpr>
    self._taskqueue.put((((result._job, i, func, (x,), {})
  File "F:\Python\mypy\tem.py", line 10, in <genexpr>
    value_iter = (int(v) for v in values)
ValueError: invalid literal for int() with base 10: 'foo'

I have seen this bug of omission multiple times on Stack Overflow.
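
For reference, here is a minimal sketch of the reporter's script restructured with that guard, assuming Python 3. The main() wrapper, the "with" block, and the try/except are additions for illustration and are not part of the original report. Whether the ValueError reaches the loop or the script still hangs may depend on the Python version and platform.

```python
from multiprocessing import Pool


def add_one(v):
    """Target function, defined at module level so worker processes can import it."""
    return v + 1


def main():
    values = ["1", "2", "3", "4", "foo", "5", "6", "7", "8"]
    # Lazily parsed input: int("foo") raises ValueError while the pool
    # is pulling tasks from this generator.
    value_iter = (int(v) for v in values)

    # The context manager (Python 3.3+) terminates the pool on exit.
    with Pool(processes=2) as pool:
        try:
            for new_val in pool.imap(add_one, value_iter):
                print(new_val)
        except ValueError as exc:
            # Reached only if the pool propagates the generator's error
            # to the consumer; this may differ across Python versions.
            print("input generator failed:", exc)


if __name__ == "__main__":
    # The pool is created only when the file is run as a script, not
    # when a worker process re-imports this module.
    main()
```

Keeping add_one at module level and everything that creates the pool behind the guard follows the layout the multiprocessing documentation asks for.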
History
Date User Action Args
2016-02-12 23:16:53 terry.reedy set status: open -> closed; versions: + Python 3.6; nosy: + terry.reedy; messages: + msg260210; resolution: not a bug; stage: resolved
2016-02-12 15:13:50 davin set nosy: + davin
2016-02-10 19:04:27 SilentGhost set nosy: + jnoller, sbt; versions: + Python 3.5, - Python 3.4
2016-02-10 19:03:45 Aaron Halfaker create