itertools.tee not thread-safe; can segfault interpreter when wrapped iterator releases GIL #78591

Closed
carlorosati mannequin opened this issue Aug 15, 2018 · 31 comments
Labels: 3.7 (EOL) (end of life), 3.8 (only security fixes), 3.9 (only security fixes), stdlib (Python modules in the Lib dir), type-crash (a hard crash of the interpreter, possibly with a core dump)

Comments

@carlorosati
Mannequin

carlorosati mannequin commented Aug 15, 2018

BPO 34410
Nosy @rhettinger, @serhiy-storchaka, @MojoVampire, @applio, @zhangyangyu, @hongweipeng, @miss-islington, @tirkarthi, @iritkatriel
PRs
  • bpo-34410: itertools.tee not thread-safe; can segfault interpreter wh… #9075
  • bpo-34410: itertools.tee not thread-safe; can segfault #9254
  • bpo-34410: Raise a RuntimeError when tee iterator is consumed from different threads. #15567
  • bpo-34410: Fix a crash in the tee iterator when re-enter it. #15625
  • Revert "bpo-34410: Raise a RuntimeError when tee iterator is consumed from different threads." #15736
  • [3.8] bpo-34410: Fix a crash in the tee iterator when re-enter it. (GH-15625) #15737
  • [3.7] bpo-34410: Fix a crash in the tee iterator when re-enter it. (GH-15625) #15738
  • [2.7] bpo-34410: Fix a crash in the tee iterator when re-enter it. (GH-15625) #15740
  • Files
  • 3.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2021-07-27.07:17:31.839>
    created_at = <Date 2018-08-15.02:08:28.419>
    labels = ['3.8', '3.7', 'library', '3.9', 'type-crash']
    title = 'itertools.tee not thread-safe; can segfault interpreter when wrapped iterator releases GIL'
    updated_at = <Date 2021-07-27.07:17:31.838>
    user = 'https://bugs.python.org/carlorosati'

    bugs.python.org fields:

    activity = <Date 2021-07-27.07:17:31.838>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2021-07-27.07:17:31.839>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2018-08-15.02:08:28.419>
    creator = 'carlorosati'
    dependencies = []
    files = ['47750']
    hgrepos = []
    issue_num = 34410
    keywords = ['patch']
    message_count = 31.0
    messages = ['323546', '323587', '323588', '323592', '323593', '323613', '323632', '323633', '323817', '323828', '323858', '323881', '323917', '324146', '324237', '324658', '324839', '350165', '350174', '350705', '350750', '350754', '350907', '351014', '351360', '351370', '351371', '351381', '375640', '398264', '398282']
    nosy_count = 10.0
    nosy_names = ['rhettinger', 'serhiy.storchaka', 'josh.r', 'davin', 'xiang.zhang', 'hongweipeng', 'miss-islington', 'xtreak', 'carlorosati', 'iritkatriel']
    pr_nums = ['9075', '9254', '15567', '15625', '15736', '15737', '15738', '15740']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue34410'
    versions = ['Python 2.7', 'Python 3.7', 'Python 3.8', 'Python 3.9']

    @carlorosati
    Mannequin Author

    carlorosati mannequin commented Aug 15, 2018

    Hello,

    When I run the attached code, I encounter a segmentation fault.

    Thanks,
    Carlo

    @carlorosati carlorosati mannequin added 3.7 (EOL) end of life type-crash A hard crash of the interpreter, possibly with a core dump labels Aug 15, 2018
    @carlorosati
    Mannequin Author

    carlorosati mannequin commented Aug 16, 2018

    I figured out that the problem is itertools.tee does not use a multiprocessing.Manager proxied object for shared state. I was able to create a workaround tee as follows.

    import multiprocessing

    def multiprocessing_tee(iterable, n=2):
        """Write a multiprocessing safe itertools.tee"""
        it = iter(iterable)
        m = multiprocessing.Manager()
        lists = [m.list() for i in range(n)]
        def gen(local_list):
            keep_m_alive = m               # hold a reference so the manager outlives this function
            while True:
                if not local_list:         # when the local list is empty
                    newval = next(it)      # fetch a new value and
                    for l in lists:        # load it to all the lists
                        l.append(newval)
                yield local_list.pop(-1)
        return tuple(gen(l) for l in lists)

    @carlorosati
    Mannequin Author

    carlorosati mannequin commented Aug 16, 2018

    Okay, I needed to use .pop(0) instead of .pop(-1), although .pop(0) is probably O(N).
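To make the ordering point concrete, here is a minimal sketch using a plain list rather than a Manager list proxy (that substitution is an assumption made only to keep the example self-contained). It shows why .pop(-1) returns buffered values in the wrong order, and how collections.deque avoids the O(N) shift for in-process code:

```python
from collections import deque

# Values are appended in arrival order, so the oldest one sits at index 0.
buffered = [1, 2, 3]
last = buffered.pop(-1)    # newest value first (LIFO): wrong order for a tee
first = buffered.pop(0)    # oldest value first (FIFO), but shifts the remaining items

# A deque gives the same FIFO order without the shift.
fifo = deque([1, 2, 3])
oldest = fifo.popleft()
```

A deque is not something the Manager hands out as a proxy by default, so this only illustrates the ordering and cost trade-off, not a drop-in replacement.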

    @carlorosati
    Mannequin Author

    carlorosati mannequin commented Aug 16, 2018

    You'll also need to lock when modifying the manager's list. Does anyone know how to do this using the multiprocessing.Queues without deadlocking?

    @tirkarthi
    Member

    Thanks for the script. I can reproduce this on master and on Python 3.6 too. Sometimes the attached script raises a timeout error instead. Running it under gdb gives the output below:

    [New Thread 0x18ab of process 10682]
    [New Thread 0x1903 of process 10682]
    [New Thread 0x1a03 of process 10682]

    Thread 2 received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x18ab of process 10682]
    0x000000010033e509 in teedataobject_getitem (tdo=0x1014361c0, i=1)
    at ./Modules/itertoolsmodule.c:454
    454 Py_INCREF(value);

    Backtrace :

    #0 0x000000010033e509 in teedataobject_getitem (tdo=0x100645640, i=1)
    at ./Modules/itertoolsmodule.c:454
    #1 0x000000010033e290 in tee_next (to=0x10308a668) at ./Modules/itertoolsmodule.c:637
    #2 0x0000000100059844 in enum_next (en=0x10308fc20) at Objects/enumobject.c:156
    #3 0x00000001002297e1 in _PyEval_EvalFrameDefault (
    f=Frame 0x100646710, for file /Users/karthikeyansingaravelan/stuff/python/cpython/Lib/multiprocessing/pool.py, line 292, in _guarded_task_generation (self=<Pool(_ctx=<ForkContext at remote 0x101ddce90>, _inqueue=<SimpleQueue(_reader=<Connection(_handle=4, _readable=True, _writable=False) at remote 0x101c236b0>, _writer=<Connection(_handle=5, _readable=False, _writable=True) at remote 0x101c23c50>, _rlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x101f53e10>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x101f53e10>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x101f53e10>) at remote 0x101f50050>, _poll=<method at remote 0x10186e2c0>, _wlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078178>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078178>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078178>) at remote 0x101f54dd0>) at remote 0x101bf5cb0>, _ou...(truncated), throwflag=0) at Python/ceval.c:2905
    #4 0x0000000100215277 in PyEval_EvalFrameEx (
    f=Frame 0x100646710, for file /Users/karthikeyansingaravelan/stuff/python/cpython/Lib/multiprocessing/pool.py, line 292, in _guarded_task_generation (self=<Pool(_ctx=<ForkContext at remote 0x101ddce90>, _inqueue=<SimpleQueue(_reader=<Connection(_handle=4, _readable=True, _writable=False) at remote 0x101c236b0>, _writer=<Connection(_handle=5, _readable=False, _writable=True) at remote 0x101c23c50>, _rlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x101f53e10>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x101f53e10>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x101f53e10>) at remote 0x101f50050>, _poll=<method at remote 0x10186e2c0>, _wlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078178>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078178>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078178>) at remote 0x101f54dd0>) at remote 0x101bf5cb0>, _ou...(truncated), throwflag=0) at Python/ceval.c:536
    #5 0x000000010006a3e2 in gen_send_ex (gen=0x1030349b0, arg=0x0, exc=0, closing=0)
    at Objects/genobject.c:221
    #6 0x000000010006ba1f in gen_iternext (gen=0x1030349b0) at Objects/genobject.c:542
    #7 0x00000001002297e1 in _PyEval_EvalFrameDefault (
    f=Frame 0x101909730, for file /Users/karthikeyansingaravelan/stuff/python/cpython/Lib/multiprocessing/pool.py, line 426, in _handle_tasks (taskqueue=<_queue.SimpleQueue at remote 0x101f1daa0>, put=<method at remote 0x101db5ae0>, outqueue=<SimpleQueue(_reader=<Connection(_handle=8, _readable=True, _writable=False) at remote 0x10306d410>, _writer=<Connection(_handle=9, _readable=False, _writable=True) at remote 0x10306d9b0>, _rlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078ba0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078ba0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078ba0>) at remote 0x10306de90>, _poll=<method at remote 0x1018eddb8>, _wlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078c08>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078c08>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078c08>) at remote 0x10306df50>) a...(truncated), throwflag=0) at Python/ceval.c:2905
    #8 0x0000000100215277 in PyEval_EvalFrameEx (
    f=Frame 0x101909730, for file /Users/karthikeyansingaravelan/stuff/python/cpython/Lib/multiprocessing/pool.py, line 426, in _handle_tasks (taskqueue=<_queue.SimpleQueue at remote 0x101f1daa0>, put=<method at remote 0x101db5ae0>, outqueue=<SimpleQueue(_reader=<Connection(_handle=8, _readable=True, _writable=False) at remote 0x10306d410>, _writer=<Connection(_handle=9, _readable=False, _writable=True) at remote 0x10306d9b0>, _rlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078ba0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078ba0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078ba0>) at remote 0x10306de90>, _poll=<method at remote 0x1018eddb8>, _wlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078c08>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078c08>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078c08>) at remote 0x10306df50>) a...(truncated), throwflag=0) at Python/ceval.c:536
    #9 0x0000000100045d63 in function_code_fastcall (co=0x101de77c0, args=0x1030693a0, nargs=5,
    globals={'__name__': 'multiprocessing.pool', '__doc__': None, '__package__': 'multiprocessing', '__loader__': <SourceFileLoader(name='multiprocessing.pool', path='/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/multiprocessing/pool.py') at remote 0x101c23b90>, '__spec__': <ModuleSpec(name='multiprocessing.pool', loader=<...>, origin='/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/multiprocessing/pool.py', loader_state=None, submodule_search_locations=None, _set_fileattr=True, _cached='/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/multiprocessing/__pycache__/pool.cpython-38.pyc', _initializing=False) at remote 0x101c23bf0>, '__file__': '/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/multiprocessing/pool.py', '__cached__': '/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/multiprocessing/__pycache__/pool.cpython-38.pyc', '__builtins__': {'__name__': 'builtins', '__doc__': "Built-in functions, exceptions, and other objects.\n\nNoteworthy: None is the `nil' object; Ellipsis repr...(truncated)) at Objects/call.c:283
    #10 0x0000000100043d91 in _PyFunction_FastCallDict (func=<function at remote 0x101f41f70>,
    args=0x103069378, nargs=5, kwargs={}) at Objects/call.c:322
    #11 0x0000000100045902 in PyObject_Call (callable=<function at remote 0x101f41f70>,
    args=(<_queue.SimpleQueue at remote 0x101f1daa0>, <method at remote 0x101db5ae0>, <SimpleQueue(_reader=<Connection(_handle=8, _readable=True, _writable=False) at remote 0x10306d410>, _writer=<Connection(_handle=9, _readable=False, _writable=True) at remote 0x10306d9b0>, _rlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078ba0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078ba0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078ba0>) at remote 0x10306de90>, _poll=<method at remote 0x1018eddb8>, _wlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078c08>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078c08>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078c08>) at remote 0x10306df50>) at remote 0x101f50530>, [<ForkProcess(_identity=(1,), _config={'authkey': <AuthenticationString at remote 0x101bfcdd0>, 'semprefix': '/mp', 'daemon': True}, _pa...(truncated), kwargs={}) at Objects/call.c:226
    #12 0x00000001002342bc in do_call_core (func=<function at remote 0x101f41f70>,
    callargs=(<_queue.SimpleQueue at remote 0x101f1daa0>, <method at remote 0x101db5ae0>, <SimpleQueue(_reader=<Connection(_handle=8, _readable=True, _writable=False) at remote 0x10306d410>, _writer=<Connection(_handle=9, _readable=False, _writable=True) at remote 0x10306d9b0>, _rlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078ba0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078ba0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078ba0>) at remote 0x10306de90>, _poll=<method at remote 0x1018eddb8>, _wlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078c08>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078c08>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078c08>) at remote 0x10306df50>) at remote 0x101f50530>, [<ForkProcess(_identity=(1,), _config={'authkey': <AuthenticationString at remote 0x101bfcdd0>, 'semprefix': '/mp', 'daemon': True}, _pa...(truncated), kwdict={}) at Python/ceval.c:4652
    #13 0x000000010022ca2b in _PyEval_EvalFrameDefault (
    f=Frame 0x10308d050, for file /Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py, line 865, in run (self=<Thread(_target=<function at remote 0x101f41f70>, _name='Thread-2', _args=(<_queue.SimpleQueue at remote 0x101f1daa0>, <method at remote 0x101db5ae0>, <SimpleQueue(_reader=<Connection(_handle=8, _readable=True, _writable=False) at remote 0x10306d410>, _writer=<Connection(_handle=9, _readable=False, _writable=True) at remote 0x10306d9b0>, _rlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078ba0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078ba0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078ba0>) at remote 0x10306de90>, _poll=<method at remote 0x1018eddb8>, _wlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078c08>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078c08>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x1...(truncated), throwflag=0) at Python/ceval.c:3267
    #14 0x0000000100215277 in PyEval_EvalFrameEx (
        f=Frame 0x10308d050, for file /Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py, line 865, in run (self=<Thread(_target=<function at remote 0x101f41f70>, _name='Thread-2', _args=(<_queue.SimpleQueue at remote 0x101f1daa0>, <method at remote 0x101db5ae0>, <SimpleQueue(_reader=<Connection(_handle=8, _readable=True, _writable=False) at remote 0x10306d410>, _writer=<Connection(_handle=9, _readable=False, _writable=True) at remote 0x10306d9b0>, _rlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078ba0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078ba0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078ba0>) at remote 0x10306de90>, _poll=<method at remote 0x1018eddb8>, _wlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078c08>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078c08>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x1...(truncated), throwflag=0) at Python/ceval.c:536
    #15 0x0000000100045d63 in function_code_fastcall (co=0x101c7aa00, args=0x10190a8f8, nargs=1,
        globals={'__name__': 'threading', '__doc__': "Thread module emulating a subset of Java's threading model.", '__package__': '', '__loader__': <SourceFileLoader(name='threading', path='/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py') at remote 0x101c6a4d0>, '__spec__': <ModuleSpec(name='threading', loader=<...>, origin='/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py', loader_state=None, submodule_search_locations=None, _set_fileattr=True, _cached='/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/__pycache__/threading.cpython-38.pyc', _initializing=False) at remote 0x101c6a530>, '__file__': '/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py', '__cached__': '/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/__pycache__/threading.cpython-38.pyc', '__builtins__': {'__name__': 'builtins', '__doc__': "Built-in functions, exceptions, and other objects.\n\nNoteworthy: None is the `nil' object; Ellipsis represents `...' in slices.", '__package__': '', '...(truncated)) at Objects/call.c:283
    #16 0x0000000100045069 in _PyFunction_FastCallKeywords (func=<function at remote 0x101d9e890>,
        stack=0x10190a8f0, nargs=1, kwnames=0x0) at Objects/call.c:408
    #17 0x0000000100233e50 in call_function (pp_stack=0x103b522b8, oparg=1, kwnames=0x0)
        at Python/ceval.c:4623
    #18 0x000000010022bcf9 in _PyEval_EvalFrameDefault (
        f=Frame 0x10190a750, for file /Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py, line 917, in _bootstrap_inner (self=<Thread(_target=<function at remote 0x101f41f70>, _name='Thread-2', _args=(<_queue.SimpleQueue at remote 0x101f1daa0>, <method at remote 0x101db5ae0>, <SimpleQueue(_reader=<Connection(_handle=8, _readable=True, _writable=False) at remote 0x10306d410>, _writer=<Connection(_handle=9, _readable=False, _writable=True) at remote 0x10306d9b0>, _rlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078ba0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078ba0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078ba0>) at remote 0x10306de90>, _poll=<method at remote 0x1018eddb8>, _wlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078c08>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078c08>, release=<built-in method release of _multiprocessing.SemLock object ...(truncated), throwflag=0) at Python/ceval.c:3186
    #19 0x0000000100215277 in PyEval_EvalFrameEx (
        f=Frame 0x10190a750, for file /Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py, line 917, in _bootstrap_inner (self=<Thread(_target=<function at remote 0x101f41f70>, _name='Thread-2', _args=(<_queue.SimpleQueue at remote 0x101f1daa0>, <method at remote 0x101db5ae0>, <SimpleQueue(_reader=<Connection(_handle=8, _readable=True, _writable=False) at remote 0x10306d410>, _writer=<Connection(_handle=9, _readable=False, _writable=True) at remote 0x10306d9b0>, _rlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078ba0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078ba0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078ba0>) at remote 0x10306de90>, _poll=<method at remote 0x1018eddb8>, _wlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078c08>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078c08>, release=<built-in method release of _multiprocessing.SemLock object ...(truncated), throwflag=0) at Python/ceval.c:536
    #20 0x0000000100045d63 in function_code_fastcall (co=0x101c7ad00, args=0x103086b68, nargs=1,
    globals={'__name__': 'threading', '__doc__': "Thread module emulating a subset of Java's threading model.", '__package__': '', '__loader__': <SourceFileLoader(name='threading', path='/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py') at remote 0x101c6a4d0>, '__spec__': <ModuleSpec(name='threading', loader=<...>, origin='/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py', loader_state=None, submodule_search_locations=None, _set_fileattr=True, _cached='/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/__pycache__/threading.cpython-38.pyc', _initializing=False) at remote 0x101c6a530>, '__file__': '/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py', '__cached__': '/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/__pycache__/threading.cpython-38.pyc', '__builtins__': {'__name__': 'builtins', '__doc__': "Built-in functions, exceptions, and other objects.\n\nNoteworthy: None is the `nil' object; Ellipsis represents `...' in slices.", '__package__': '', '...(truncated)) at Objects/call.c:283
    #21 0x0000000100045069 in _PyFunction_FastCallKeywords (func=<function at remote 0x101d9eb50>,
    stack=0x103086b60, nargs=1, kwnames=0x0) at Objects/call.c:408
    #22 0x0000000100233e50 in call_function (pp_stack=0x103b54508, oparg=1, kwnames=0x0)
    at Python/ceval.c:4623
    #23 0x000000010022bcf9 in _PyEval_EvalFrameDefault (
    f=Frame 0x1030869d8, for file /Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py, line 885, in _bootstrap (self=<Thread(_target=<function at remote 0x101f41f70>, _name='Thread-2', _args=(<_queue.SimpleQueue at remote 0x101f1daa0>, <method at remote 0x101db5ae0>, <SimpleQueue(_reader=<Connection(_handle=8, _readable=True, _writable=False) at remote 0x10306d410>, _writer=<Connection(_handle=9, _readable=False, _writable=True) at remote 0x10306d9b0>, _rlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078ba0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078ba0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078ba0>) at remote 0x10306de90>, _poll=<method at remote 0x1018eddb8>, _wlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078c08>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078c08>, release=<built-in method release of _multiprocessing.SemLock object at rem...(truncated), throwflag=0) at Python/ceval.c:3186
    #24 0x0000000100215277 in PyEval_EvalFrameEx (
    f=Frame 0x1030869d8, for file /Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py, line 885, in _bootstrap (self=<Thread(_target=<function at remote 0x101f41f70>, _name='Thread-2', _args=(<_queue.SimpleQueue at remote 0x101f1daa0>, <method at remote 0x101db5ae0>, <SimpleQueue(_reader=<Connection(_handle=8, _readable=True, _writable=False) at remote 0x10306d410>, _writer=<Connection(_handle=9, _readable=False, _writable=True) at remote 0x10306d9b0>, _rlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078ba0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078ba0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078ba0>) at remote 0x10306de90>, _poll=<method at remote 0x1018eddb8>, _wlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078c08>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078c08>, release=<built-in method release of _multiprocessing.SemLock object at rem...(truncated), throwflag=0) at Python/ceval.c:536
    #25 0x0000000100045d63 in function_code_fastcall (co=0x101c7aac0, args=0x103b55d68, nargs=1,
    globals={'__name__': 'threading', '__doc__': "Thread module emulating a subset of Java's threading model.", '__package__': '', '__loader__': <SourceFileLoader(name='threading', path='/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py') at remote 0x101c6a4d0>, '__spec__': <ModuleSpec(name='threading', loader=<...>, origin='/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py', loader_state=None, submodule_search_locations=None, _set_fileattr=True, _cached='/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/__pycache__/threading.cpython-38.pyc', _initializing=False) at remote 0x101c6a530>, '__file__': '/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/threading.py', '__cached__': '/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/__pycache__/threading.cpython-38.pyc', '__builtins__': {'__name__': 'builtins', '__doc__': "Built-in functions, exceptions, and other objects.\n\nNoteworthy: None is the `nil' object; Ellipsis represents `...' in slices.", '__package__': '', '...(truncated)) at Objects/call.c:283
    #26 0x0000000100043d91 in _PyFunction_FastCallDict (func=<function at remote 0x101d9e940>,
    args=0x103b55d60, nargs=1, kwargs=0x0) at Objects/call.c:322
    #27 0x00000001000437e4 in _PyObject_FastCallDict (callable=<function at remote 0x101d9e940>,
    args=0x103b55d60, nargs=1, kwargs=0x0) at Objects/call.c:98
    #28 0x0000000100047d29 in _PyObject_Call_Prepend (callable=<function at remote 0x101d9e940>,
    obj=<Thread(_target=<function at remote 0x101f41f70>, _name='Thread-2', _args=(<_queue.SimpleQueue at remote 0x101f1daa0>, <method at remote 0x101db5ae0>, <SimpleQueue(_reader=<Connection(_handle=8, _readable=True, _writable=False) at remote 0x10306d410>, _writer=<Connection(_handle=9, _readable=False, _writable=True) at remote 0x10306d9b0>, _rlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078ba0>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078ba0>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078ba0>) at remote 0x10306de90>, _poll=<method at remote 0x1018eddb8>, _wlock=<Lock(_semlock=<_multiprocessing.SemLock at remote 0x103078c08>, acquire=<built-in method acquire of _multiprocessing.SemLock object at remote 0x103078c08>, release=<built-in method release of _multiprocessing.SemLock object at remote 0x103078c08>) at remote 0x10306df50>) at remote 0x101f50530>, [<ForkProcess(_identity=(1,), _config={'authkey': <Authenticat...(truncated), args=(), kwargs=0x0) at Objects/call.c:904
    #29 0x000000010004b5bd in method_call (method=<method at remote 0x101ec2bb0>, args=(), kwargs=0x0)
    at Objects/classobject.c:306
    #30 0x0000000100045a02 in PyObject_Call (callable=<method at remote 0x101ec2bb0>, args=(),
    kwargs=0x0) at Objects/call.c:245
    #31 0x0000000100349e16 in t_bootstrap (boot_raw=0x10303dd30) at ./Modules/_threadmodule.c:992
    #32 0x00007fff87a53268 in _pthread_body () from /usr/lib/system/libsystem_pthread.dylib
    #33 0x00007fff87a531e5 in _pthread_start () from /usr/lib/system/libsystem_pthread.dylib
    #34 0x00007fff87a5141d in thread_start () from /usr/lib/system/libsystem_pthread.dylib
    #35 0x0000000000000000 in ?? ()

    Thanks
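For readers without gdb at hand, the standard-library faulthandler module can give a lighter-weight view of the same crash. The sketch below only enables the handler; where the crashing code goes is left as a placeholder, and the output will be a Python-level traceback, not the C frames shown above:

```python
import faulthandler

# Ask the interpreter to dump the Python traceback of every thread to
# stderr if it receives a fatal signal such as SIGSEGV.
faulthandler.enable(all_threads=True)

# ... run the reproduction script's code here ...
```

Running the script with `python -X faulthandler` has the same effect without editing the file.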

    @rhettinger
    Contributor

    Davin, is there anything itertools.tee() can do about this or is this a multiprocessing issue?

    @carlorosati
    Mannequin Author

    carlorosati mannequin commented Aug 17, 2018

    I've actually written a few workarounds that should be considered multiprocessing-specific tee functions. I need feedback/critique on these. Hopefully we can all agree on the one solution that's best. It is unfortunate that the multiprocessing manager does not provide a deque.

    The first one I wrote uses a managed list.

    import itertools
    import multiprocessing

    def multiprocessing_tee(iterable, n=2):
        """Write a multiprocessing safe itertools.tee"""
        it = iter(iterable)
        m = multiprocessing.Manager()
        mylock = m.Lock()
        lists = [m.list() for i in range(n)]
        def gen(local_list):
            for i in itertools.count():
                with mylock:                   # one consumer at a time refills the lists
                    if not local_list:         # when the local list is empty
                        newval = next(it)      # fetch a new value and
                        for l in lists:        # load it to all the lists
                            l.append(newval)
                yield local_list.pop(0)
        return tuple(gen(l) for l in lists)

    The second two implementations use queues.

    import multiprocessing
    import queue  # named Queue on Python 2

    def multiprocessing_tee_q(iterable, n=2):
        """Write a multiprocessing safe itertools.tee"""
        it = iter(iterable)
        m = multiprocessing.Manager()
        lock = m.Lock()
        queues = [m.Queue(-1) for _ in range(n)] # -1 means infinite maxsize (so puts won't block)
        def gen(myqueue):
            while True:
                with lock: # no one else touches anything
                    try:
                        newval = myqueue.get_nowait()
                    except queue.Empty:
                        try:
                            newval = next(it)
                        except StopIteration:
                            return # end the generator explicitly (PEP 479)
                        for q in queues:
                            q.put(newval)
                        newval = myqueue.get()
                yield newval
        return tuple(gen(q) for q in queues)

    class Sentinel(object):
        """used as Queue Sentinel"""

    def multiprocessing_tee_q2(iterable, n=2):
        """Write a multiprocessing safe itertools.tee"""
        it = iter(iterable)
        m = multiprocessing.Manager()
        lock = m.Lock()
        queues = [m.Queue(-1) for _ in range(n)] # -1 means infinite maxsize (so puts won't block)
        def gen(myqueue):
            while True:
                try:
                    retval = myqueue.get_nowait()
                except queue.Empty:
                    # what happens if the other process puts last item in my queue before i get lock?
                    with lock: # no one else touches anything
                        try:
                            newval = next(it)
                        except StopIteration:
                            newval = Sentinel
                        for q in queues:
                            q.put(newval)
                    retval = myqueue.get()
                if retval is Sentinel:
                    return # raising StopIteration inside a generator is a RuntimeError on 3.7+
                yield retval
        return tuple(gen(q) for q in queues)

    I'm just throwing out my sketches here. I'm hoping the more experienced here can weigh in on these implementations.

    @carlorosati
    Mannequin Author

    carlorosati mannequin commented Aug 17, 2018

    for i in itertools.count() in the first implementation I posted should be while True. I was using that for debugging.

    @zhangyangyu
    Member

    It seems to me the problem is that tee objects can hit a race condition when PyIter_Next in teedataobject_getitem releases the GIL. Other threads can then enter the same branch, since tdo->numread hasn't been updated yet. NULL slots result: two objects are read from the underlying iterator and tdo->numread is updated twice, while only one slot is set.
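The interleaving described above can be replayed deterministically in pure Python. This is only a toy model, not the C code: the class name TeeDataModel and its two methods are hypothetical, None stands in for a NULL slot, and splitting the read into two steps assumes that the thread switch happens during the call into the underlying iterator.

```python
class TeeDataModel:
    """Toy stand-in for the shared tee buffer (hypothetical names)."""

    def __init__(self, iterable, size=4):
        self.it = iter(iterable)
        self.values = [None] * size   # None models an unset (NULL) slot
        self.numread = 0

    def start_read(self, i):
        # Lead-reader branch: i == numread, so pull from the source.
        # In the model, this call is where another thread may run.
        assert i == self.numread
        return next(self.it)

    def finish_read(self, i, value):
        # Bookkeeping that runs after the source returns.
        self.numread += 1
        self.values[i] = value
        return value


data = TeeDataModel("abcd")
a = data.start_read(0)    # reader A passes the check and fetches 'a'
b = data.start_read(0)    # reader B passes the same check and fetches 'b'
data.finish_read(0, b)    # B stores into slot 0, numread becomes 1
data.finish_read(0, a)    # A overwrites slot 0, numread becomes 2
```

After this sequence numread is 2 but slot 1 was never filled, so a later reader asking for index 1 would be handed the empty slot. In the C implementation that would correspond to incrementing the reference count of a NULL pointer, which is consistent with the Py_INCREF(value) frame in the backtrace.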

    As for multiprocessing.pool, there is a background task handling thread consuming one tee object and main thread consuming another one. The underlying iterator is IMapIterator which next method would block on a condition.

    While experimenting, I found that the following snippet also crashes:

    import threading
    import itertools
    
    class C:
        def __iter__(self):
            return self
        def __next__(self):
            return 1
    
    def test(i):
        print(list(i))

    i1, i2 = itertools.tee(C())
    threading.Thread(target=test, args=(i1,)).start()
    print(list(i2))

    GDB shows that it crashes in teedataobject_dealloc -> teedataobject_clear. I don't yet understand what happened.

    @zhangyangyu
    Member

    Ahh, the infinite iterator could also release the GIL in PyIter_Next.

    @carlorosati
    Mannequin Author

    carlorosati mannequin commented Aug 22, 2018

    If what you've said is correct, would it make the most sense to create a Manager method which returns a Proxy to a tee'd iterator?

    @MojoVampire
    Mannequin

    MojoVampire mannequin commented Aug 22, 2018

    Carlo: The point of Xiang's post is that this is only tangentially related to multiprocessing; the real problem is tee-ing an iterator implemented in Python (of which pool.imap_unordered is just one example) and then using the resulting tee-ed iterators in multiple threads (which pool.imap_unordered does implicitly, as there is a thread involved in dispatching work).

    The problem is *exposed* by multiprocessing.pool.imap_unordered, but it is entirely a problem with itertools.tee, and as Xiang's repro indicates, it can be triggered easily without the complexity of multiprocessing being involved.

    I've updated the bug title to reflect this.

    @MojoVampire MojoVampire mannequin added the stdlib Python modules in the Lib dir label Aug 22, 2018
    @MojoVampire MojoVampire mannequin changed the title from "Segfault/TimeoutError: itertools.tee of multiprocessing.pool.imap_unordered" to "itertools.tee not thread-safe; can segfault interpreter when wrapped iterator releases GIL" Aug 22, 2018
    @rhettinger
    Contributor

    Xiang Zhang, would you like to submit a patch?

    @zhangyangyu
    Member

    I could. But currently I don't have a good idea how to fix it elegantly. If anyone else makes a PR, I'm willing to review it.

    @zhangyangyu zhangyangyu added the 3.8 only security fixes label Aug 27, 2018
    @tirkarthi
    Member

    This might be redundant, but I googled 'itertools tee thread safe' and came across a detailed SO answer from 2017 that explains the issue, with a similar example that segfaults the interpreter. It seems it was never reported here. The answer also proposes a possible solution that fixes the iterator but not tee itself.

    SO answer : https://stackoverflow.com/a/44638570/2610955

    Similar open issue that could be a duplicate

    https://bugs.python.org/issue24482

    Thanks

    @hongweipeng
    Mannequin

    hongweipeng mannequin commented Sep 6, 2018

    Sharing across processes requires multiprocessing.Manager, but the current problem is a thread-safety issue in the tee object. As Xiang Zhang said, PyIter_Next in teedataobject_getitem releases the GIL. So a thread lock is necessary, and only the lead iterator needs to take it when it runs PyIter_Next. Can anyone help me review it?

    @serhiy-storchaka
    Member

    I'll take this.

    @serhiy-storchaka serhiy-storchaka self-assigned this Sep 8, 2018
    @serhiy-storchaka
    Member

    I tried to solve this issue myself, and figured out that it is not so simple. It is possible to make tee() not crash, but it is not possible to guarantee the order of the output without using a lock in tee(). You can get a sequence such as 1, 2, 4, 3, ...

    So we have the following options:

    1. Document that the tee() iterator cannot be consumed from different threads, even if the underlying iterator is thread-safe.
    2. Detect the race condition and raise a RuntimeError (as happens when a dict is modified during iteration).
    3. Fix the crash, but document that the tee() iterator can return items out of order if consumed from different threads (PR 9254 basically does this). Exceptions can be raised out of order too.
    4. Add a non-reentrant lock in the tee() object and raise a RuntimeError if it is re-entered.
    5. Add a reentrant lock in the tee() object.

    We can apply different solutions in the development and maintenance versions. I would not backport options with locks.

    @rhettinger
    Contributor

    Thanks for enumerating the options. I think 1 and 2 are the best combination. It is a reasonable restriction to not tee across threads. If someone still does, then detecting it, raising an exception, and not crashing seem like a reasonable response.

    @rhettinger
    Contributor

    I've applied the patch to "master". I considered backporting but am thinking that would be risky at this stage in the 3.8 release.

    @rhettinger rhettinger added 3.9 only security fixes and removed 3.7 (EOL) end of life 3.8 only security fixes labels Aug 29, 2019
    @serhiy-storchaka
    Member

    The documentation changes should be backported.

    And I think we need a change like PR 9254, but with raising a RuntimeError instead of adding the value to the queue, to be applied in older versions.

    It may be better to apply it even to the development version. There is nothing wrong with creating the tee iterator in one thread and using it in another thread, or with using the tee iterators with external locking. I am afraid that PR 15567 can break legitimate code.
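
    As an illustration of what external locking can look like on the caller's side, here is a minimal sketch. It is not part of itertools: LockedTeeIterator and locked_tee are hypothetical names, and the design simply routes every next() call on any of the tee iterators through one shared threading.Lock.

```python
import itertools
import threading


class LockedTeeIterator:
    """Wrap one tee iterator so that every next() holds a shared lock."""

    def __init__(self, tee_iter, lock):
        self._it = tee_iter
        self._lock = lock

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may advance any of the tee iterators,
        # so the shared tee buffer is never entered concurrently.
        with self._lock:
            return next(self._it)


def locked_tee(iterable, n=2):
    """Like itertools.tee, but all returned iterators share one lock."""
    lock = threading.Lock()
    return tuple(LockedTeeIterator(t, lock) for t in itertools.tee(iterable, n))


def squares(limit):
    # A Python-level iterator, the kind that can switch threads mid-call.
    for i in range(limit):
        yield i * i


a, b = locked_tee(squares(1000))
results = {}

worker = threading.Thread(target=lambda: results.update(a=list(a)))
worker.start()
results["b"] = list(b)
worker.join()

print(results["a"] == results["b"])
```

    Because the lock is non-reentrant, this sketch would deadlock if the underlying iterator itself called back into one of the wrapped iterators. A threading.RLock would avoid that particular hang, but the inner call would still re-enter the tee object.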

    @serhiy-storchaka serhiy-storchaka added 3.7 (EOL) end of life 3.8 only security fixes labels Aug 29, 2019
    @rhettinger
    Contributor

    Go ahead and take this in any direction you want.

    @hongweipeng
    Mannequin

    hongweipeng mannequin commented Aug 31, 2019

    As far as I am concerned, I would prefer that something like PR 9254 be merged into the old versions, because it does not break legitimate code and it prevents the program from crashing. It can serve as a compatible solution for old versions.

    @zhangyangyu
    Member

    It may be better to apply it even to the development version. There is nothing wrong with creating the tee iterator in one thread and using it in another thread, or with using the tee iterators with external locking. I am afraid that PR 15567 can break legitimate code.

    +1 on this. I think it's better to also apply PR 15625 to the master branch. I don't think we should prevent creating the iterator in one thread and using it in others.

    @serhiy-storchaka
    Member

    New changeset 526a014 by Serhiy Storchaka in branch 'master':
    bpo-34410: Fix a crash in the tee iterator when re-enter it. (GH-15625)
    526a014

    @miss-islington
    Contributor

    New changeset 6e3809c by Miss Islington (bot) in branch '3.8':
    bpo-34410: Fix a crash in the tee iterator when re-enter it. (GH-15625)
    6e3809c

    @miss-islington
    Contributor

    New changeset 5190b71 by Miss Islington (bot) in branch '3.7':
    bpo-34410: Fix a crash in the tee iterator when re-enter it. (GH-15625)
    5190b71

    @serhiy-storchaka
    Member

    New changeset 2fb6921 by Serhiy Storchaka in branch '2.7':
    [2.7] bpo-34410: Fix a crash in the tee iterator when re-enter it. (GH-15625) (GH-15740)
    2fb6921

    @iritkatriel
    Member

    This seems resolved. Can it be closed?

    @iritkatriel
    Member

    The script (3.py) now gives a RuntimeError (as of 3.9), so I think the discussion about backporting is over and there is nothing more to do on this issue. If nobody objects, I will close this.
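
    The re-entry check can also be observed without threads. The following is a minimal single-threaded sketch, assuming a CPython build that includes the GH-15625 change; ReentrantSource is a hypothetical helper whose first __next__ call deliberately re-enters the tee object through the sibling iterator.

```python
import itertools


class ReentrantSource:
    """Iterator whose first __next__ call re-enters the tee it feeds."""

    def __init__(self):
        self.other = None  # sibling tee iterator, assigned after tee()
        self.first = True

    def __iter__(self):
        return self

    def __next__(self):
        if self.first:
            self.first = False
            # Re-enter the shared tee buffer while it is already fetching
            # an item from this source.
            return next(self.other)
        return 1


source = ReentrantSource()
a, b = itertools.tee(source)
source.other = b

try:
    next(a)
except RuntimeError as exc:
    outcome = f"RuntimeError: {exc}"
else:
    outcome = "no error"

print(outcome)
```

    On interpreters without the fix, the same script may behave differently, so treat it as a probe rather than a guarantee.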

    @serhiy-storchaka
    Member

    It would be nice to add a version of tee() that supports multiple threads, or to add multi-thread support to tee() itself, but that is a different issue.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022