Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiprocessing occasionally spits out exception during shutdown #48356

Closed
smontanaro opened this issue Oct 11, 2008 · 22 comments
Closed

multiprocessing occasionally spits out exception during shutdown #48356

smontanaro opened this issue Oct 11, 2008 · 22 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@smontanaro
Copy link
Contributor

BPO 4106
Nosy @vsajip, @pitrou, @mattheww
Files
  • test_proc.py
  • test_mult.py: Example that spits out exception
  • mpqshutdown.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2011-08-24.20:52:15.225>
    created_at = <Date 2008-10-11.03:30:21.624>
    labels = ['type-bug', 'library']
    title = 'multiprocessing occasionally spits out exception during shutdown'
    updated_at = <Date 2012-01-14.06:53:29.839>
    user = 'https://github.com/smontanaro'

    bugs.python.org fields:

    activity = <Date 2012-01-14.06:53:29.839>
    actor = 'Yaniv.Aknin'
    assignee = 'jnoller'
    closed = True
    closed_date = <Date 2011-08-24.20:52:15.225>
    closer = 'pitrou'
    components = ['Library (Lib)']
    creation = <Date 2008-10-11.03:30:21.624>
    creator = 'skip.montanaro'
    dependencies = []
    files = ['11771', '20445', '23031']
    hgrepos = []
    issue_num = 4106
    keywords = ['patch']
    message_count = 22.0
    messages = ['74656', '74657', '74658', '74659', '80415', '80417', '109575', '109579', '109580', '109584', '109589', '126502', '126525', '135106', '142896', '142898', '142903', '142907', '142911', '142919', '142920', '151237']
    nosy_count = 11.0
    nosy_names = ['vinay.sajip', 'pitrou', 'mattheww', 'jnoller', 'Thorney', 'bobbyi', 'Yaniv.Aknin', 'gdb', 'python-dev', 'dpranke', 'tobami']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue4106'
    versions = ['Python 2.7']

    @smontanaro
    Copy link
    Contributor Author

    I worked up a simple example of using the external processing module
    (0.52) for a friend at work today. I noticed some cases where it raised
    exceptions during exit. Not all the time, but not infrequently either.
    This evening I tweaked it for the 2.6 multiprocessing module's API and
    tried it out. I ran it in a large loop:

    for i in $(range 500) ; do
    echo '!'$i
    python test_proc.py
    end | egrep '!'

    Most of the time all I see are the '!' lines from the echo command.
    Every once in awhile I see a traceback though. For example:

    Exception in thread QueueFeederThread (most likely raised during 
    interpreter shutdown):
    Traceback (most recent call last):
      File "/Users/skip/local/lib/python2.7/threading.py", line 522, in 
    __bootstrap_inner
      File "/Users/skip/local/lib/python2.7/threading.py", line 477, in run
      File "/Users/skip/local/lib/python2.7/multiprocessing/queues.py", line 
    233, in _feed
    <type 'exceptions.TypeError'>: 'NoneType' object is not callable

    This occurred once in approximately 1500 runs of the script (three times
    through the above shell loop). The script used to trigger this
    exception is attached.

    @smontanaro smontanaro added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Oct 11, 2008
    @smontanaro
    Copy link
    Contributor Author

    Oh, the range command used in the shell for loop is analogous to Python's
    range() builtin function.

    @smontanaro
    Copy link
    Contributor Author

    Got another one just now, but with just the note about the exception
    in the queue feeder thread. The traceback was swallowed.

    @smontanaro
    Copy link
    Contributor Author

    Final comment before I see some feedback from the experts.

    I have this code in the worker function's loop:

        # quick pause to allow other stuff to happen a bit randomly
        t = 0.1 * random.random()
        time.sleep(t)

    If I eliminate the sleep altogether pretty much all hell breaks loose.
    As I reduce the sleep time it gets noisier and noisier. I switched to
    a fixed sleep time and reduced it as far as

        time.sleep(0.00015625)

    At that point it was complaining about killing worker processes on
    many of the runs, maybe 1 out of every 5 or 10 runs.

    I suppose the moral of the story is to not use multiprocessing except
    when you have long-running tasks.

    @jnoller jnoller mannequin self-assigned this Jan 8, 2009
    @jnoller
    Copy link
    Mannequin

    jnoller mannequin commented Jan 23, 2009

    Skip, using this:

    while ((x++ < 500)) ; do echo '!'$i ; ./python.exe test_proc.py; done |
    egrep '!'

    I don't see the exception in python-trunk, freshly compiled. It could be
    an OS thing (I'm on OS/X) - I just want to confirm that you're still
    seeing this on trunk

    @jnoller
    Copy link
    Mannequin

    jnoller mannequin commented Jan 23, 2009

    Ah ha. I see it if I run it with the loop set to 3000 - it is pretty rare.

    @gdb
    Copy link
    Mannequin

    gdb mannequin commented Jul 8, 2010

    For what it's worth, I think I have a simpler reproducer of this issue. Using freshly-compiled python-from-trunk (as well as multiprocessing-from-trunk), I get tracebacks from the following about 30% of the time:

    """
    import multiprocessing, time
    def foo(x):
    time.sleep(3)
    multiprocessing.Pool(1).apply(foo, [1])
    """

    My tracebacks are of the form:
    """
    Exception in thread Thread-1 (most likely raised during interpreter shutdown):
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/threading.py", line 530, in __bootstrap_inner
      File "/usr/local/lib/python2.7/threading.py", line 483, in run
      File "/usr/local/lib/python2.7/multiprocessing/pool.py", line 272, in _handle_workers
    <type 'exceptions.TypeError'>: 'NoneType' object is not callable
    """

    @jnoller
    Copy link
    Mannequin

    jnoller mannequin commented Jul 8, 2010

    Greg - what platform?

    @gdb
    Copy link
    Mannequin

    gdb mannequin commented Jul 8, 2010

    I'm on Ubuntu 10.04, 64 bit.

    @jnoller
    Copy link
    Mannequin

    jnoller mannequin commented Jul 8, 2010

    Greg - this is actually a different exception then the original bug report; could you please file a new issue with the information you've provided? I'm going to need to find a 64bit ubuntu box as I don't have one right now.

    @gdb
    Copy link
    Mannequin

    gdb mannequin commented Jul 8, 2010

    Sure thing. See http://bugs.python.org/issue9207.

    @Thorney
    Copy link
    Mannequin

    Thorney mannequin commented Jan 18, 2011

    With the example script attached I see the exception every time. On Ubuntu 10.10 with Python 2.6

    Since the offending line in multiprocesing/queues.py (233) is a debug statement, just commenting it out seems to stop this exception.

    Looking at the util file shows the logging functions to be all of the form:

        if _logger:
            _logger.log(...

    Could it be possible that after the check the _logger global (or the debug function) is destroyed by the exit handler? Can we convince them to stick around until such a time that they cannot be called?

    Adding a small delay before joining also seems to work, but is ugly. Why should another Process *have* to have a minimum amount of work to not throw an exception?

    @jnoller
    Copy link
    Mannequin

    jnoller mannequin commented Jan 19, 2011

    On Tue, Jan 18, 2011 at 6:23 PM, Brian Thorne <report@bugs.python.org> wrote:

    Brian Thorne <hardbyte@gmail.com> added the comment:

    With the example script attached I see the exception every time. On Ubuntu 10.10 with Python 2.6

    Since the offending line in multiprocesing/queues.py (233) is a debug statement, just commenting it out seems to stop this exception.

    Looking at the util file shows the logging functions to be all of the form:

       if _logger:
           _logger.log(...

    Could it be possible that after the check the _logger global (or the debug function) is destroyed by the exit handler? Can we convince them to stick around until such a time that they cannot be called?

    Adding a small delay before joining also seems to work, but is ugly. Why should another Process *have* to have a minimum amount of work to not throw an exception?

    See http://bugs.python.org/issue9207 - but yes, the problem is that
    the VM is nuking our imported modules before all the processes are
    shutdown.

    @tobami
    Copy link
    Mannequin

    tobami mannequin commented May 4, 2011

    I can confirm this but with Python 2.7.1 on Ubuntu 11.04 64bit

    My code was working with a queue that was being fed a two-string tuple.
    When i changed it to contain my custom Objects, it still worked correctly, but the main program doesn't end until it raises the exception in QueueFeederThread

    @pitrou
    Copy link
    Member

    pitrou commented Aug 24, 2011

    I can't seem to reproduce this under 3.3. Should it be closed?

    @jnoller
    Copy link
    Mannequin

    jnoller mannequin commented Aug 24, 2011

    On Wed, Aug 24, 2011 at 3:01 PM, Antoine Pitrou <report@bugs.python.org> wrote:

    Antoine Pitrou <pitrou@free.fr> added the comment:

    I can't seem to reproduce this under 3.3. Should it be closed?

    I don't think so; it's still applicable to 2.x, and a fix should go
    into 2.7 ideally. http://bugs.python.org/issue9207 is the source of
    the issue AFAIR

    @pitrou
    Copy link
    Member

    pitrou commented Aug 24, 2011

    Indeed, 2.7 seems still affected.

    @pitrou
    Copy link
    Member

    pitrou commented Aug 24, 2011

    Ok, I think the reason this doesn't appear in 3.2/3.3 is the fix for bpo-1856. In 2.x (and 3.1) daemon threads can continue executing after the interpreter's internal structures have started being destroyed.

    The least intrusive solution is to always join the helper thread before shutting down the interpreter. Patch attached.

    @vsajip
    Copy link
    Member

    vsajip commented Aug 24, 2011

    In Antoine's patch, ISTM that the line

    created_by_this_process = ...

    could also be deleted, as the patch no longer uses that value and it's not used anywhere later in the method.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Aug 24, 2011

    New changeset d316315a8781 by Antoine Pitrou in branch '2.7':
    Issue bpo-4106: Fix occasional exceptions printed out by multiprocessing on interpreter shutdown.
    http://hg.python.org/cpython/rev/d316315a8781

    @pitrou
    Copy link
    Member

    pitrou commented Aug 24, 2011

    This should hopefully be fixed now. Feel free to reopen if it isn't.

    @pitrou pitrou closed this as completed Aug 24, 2011
    @YanivAknin
    Copy link
    Mannequin

    YanivAknin mannequin commented Jan 14, 2012

    Ugh. Not 100% sure it's related, but I've been getting a similar traceback when running pip's test suite (python setup.py test) on OSX 10.6.8 with Python 2.7.2.

    Traceback (most recent call last):
      File "/usr/local/Cellar/python/2.7.2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
        func(*targs, **kargs)
      File "/usr/local/Cellar/python/2.7.2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/util.py", line 284, in _exit_function
        info('process shutting down')
    TypeError: 'NoneType' object is not callable

    Obviously it's not the exact same bug as fixed here, but Googling the traceback led me here and I do think it's the same genre of bug, i.e., multiprocessing's use of forking leads to issues when atexit is called (wasn't sure whether to open it here or bpo-9207). Also, see https://groups.google.com/forum/#!topic/nose-users/fnJ-kAUbYHQ, it seems other users of the nose testsuite ran into this.

    I'm afraid I won't have time to look much further into this (the reason I'm running pip's testsuite is that I'm already trying to make a contribution to pip...), but I thought it's best to at least mention it somewhere.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants