Issue33612
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2018-05-23 07:17 by serhiy.storchaka, last changed 2022-04-11 14:59 by admin. This issue is now closed.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | |
bug.py | vstinner, 2018-05-23 10:18 |
Pull Requests | |||
---|---|---|---|
URL | Status | Linked | Edit |
PR 7069 | merged | vstinner, 2018-05-23 10:26 | |
PR 7074 | merged | miss-islington, 2018-05-23 16:01 |
Messages (15) | |||
---|---|---|---|
msg317363 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2018-05-23 07:17 | |
I got the following output when run tests in the huntrleaks mode. $ ./python -We -m test -R 3:3 -x test_builtin -x test_urlparse ... 1:24:54 load avg: 2.48 [225/414] test_multiprocessing_fork beginning 6 repetitions 123456 ..python: Python/pystate.c:589: PyThreadState_Clear: Assertion `tstate->exc_info->previous_item == NULL' failed. Fatal Python error: Aborted Current thread 0x00007fdd3e07e700 (most recent call first): File "/home/serhiy/py/cpython/Lib/multiprocessing/popen_fork.py", line 70 in _launch File "/home/serhiy/py/cpython/Lib/multiprocessing/popen_fork.py", line 20 in __init__ File "/home/serhiy/py/cpython/Lib/multiprocessing/context.py", line 277 in _Popen File "/home/serhiy/py/cpython/Lib/multiprocessing/process.py", line 112 in start File "/home/serhiy/py/cpython/Lib/multiprocessing/pool.py", line 241 in _repopulate_pool File "/home/serhiy/py/cpython/Lib/multiprocessing/pool.py", line 248 in _maintain_pool File "/home/serhiy/py/cpython/Lib/multiprocessing/pool.py", line 412 in _handle_workers File "/home/serhiy/py/cpython/Lib/threading.py", line 865 in run File "/home/serhiy/py/cpython/Lib/threading.py", line 917 in _bootstrap_inner File "/home/serhiy/py/cpython/Lib/threading.py", line 885 in _bootstrap ..python: Python/pystate.c:589: PyThreadState_Clear: Assertion `tstate->exc_info->previous_item == NULL' failed. Fatal Python error: Aborted Current thread 0x00007fdd6cb6c700 (most recent call first): File "/home/serhiy/py/cpython/Lib/multiprocessing/popen_fork.py", line 70 in _launch File "/home/serhiy/py/cpython/Lib/multiprocessing/popen_fork.py", line 20 in __init__ File "/home/serhiy/py/cpython/Lib/multiprocessing/context.py", line 277 in _Popen File "/home/serhiy/py/cpython/Lib/multiprocessing/process.py", line 112 in start File "/home/serhiy/py/cpython/Lib/multiprocessing/pool.py", line 241 in _repopulate_pool File "/home/serhiy/py/cpython/Lib/multiprocessing/pool.py", line 248 in _maintain_pool File "/home/serhiy/py/cpython/Lib/multiprocessing/pool.py", line 412 in _handle_workers File "/home/serhiy/py/cpython/Lib/threading.py", line 865 in run File "/home/serhiy/py/cpython/Lib/threading.py", line 917 in _bootstrap_inner File "/home/serhiy/py/cpython/Lib/threading.py", line 885 in _bootstrap .. 1:34:07 load avg: 1.01 [226/414] test_multiprocessing_forkserver -- test_multiprocessing_fork passed in 9 min 13 sec ... It happens after 1.5 hours of running tests. Running the test_multiprocessing_fork test only doesn't expose it. Likely there are already opened related issues, but the assertion failure in PyThreadState_Clear is new, it was added in 3.7. It may help to identify the problem. |
|||
msg317368 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-05-23 09:36 | |
The assertion has been added by bpo-25612: commit ae3087c6382011c47db82fea4d05f8bbf514265d Author: Mark Shannon <mark@hotpy.org> Date: Sun Oct 22 22:41:51 2017 +0100 Move exc state to generator. Fixes bpo-25612 (#1773) Move exception state information from frame objects to coroutine (generator/thread) object where it belongs. |
|||
msg317369 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-05-23 09:44 | |
Failing assertion: /* The stack of exception states should contain just this thread. */ assert(tstate->exc_info->previous_item == NULL); "test_multiprocessing_fork passed in 9 min 13 sec" Oh. A child process crashed on "self.pid = os.fork()" in "/Lib/multiprocessing/popen_fork.py", line 70 in _launch", but the test still pass. os.fork() calls PyOS_AfterFork_Child() which starts by calling _PyGILState_Reinit() which ends with "_PyThreadState_DeleteExcept(current_tstate);". _PyThreadState_DeleteExcept() creates a "garbage" list of thread states, and then call PyThreadState_Clear() on each thread. Finally, one call to PyThreadState_Clear() fails on the assertion. I'm not sure that the assertion takes in account the case of specific case of fork(). |
|||
msg317373 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-05-23 10:04 | |
I created bpo-33613: "test_multiprocessing_fork: test_semaphore_tracker_sigint() fails with -W error". I don't think that it's directly related. |
|||
msg317377 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-05-23 10:18 | |
Attached bug.py reproduces the bug with a daemonic thread but without fork(): $ ./python bug.py ls coredpython: Python/pystate.c:589: PyThreadState_Clear: Assertion `tstate->exc_info->previous_item == NULL' failed. Aborted (core dumped) (You might have to run the script a few times to get the crash, it's not deterministic.) vstinner@apu$ gdb ./python -c coredump-python.24130 GNU gdb (GDB) Fedora 8.0.1-36.fc27 (..) Core was generated by `./python bug.py'. Program terminated with signal SIGABRT, Aborted. #0 0x00007f5136075660 in raise () from /lib64/libc.so.6 [Current thread is 1 (Thread 0x7f5136f64040 (LWP 24130))] (gdb) where #0 0x00007f5136075660 in raise () from /lib64/libc.so.6 #1 0x00007f5136076c41 in abort () from /lib64/libc.so.6 #2 0x00007f513606df7a in __assert_fail_base () from /lib64/libc.so.6 #3 0x00007f513606dff2 in __assert_fail () from /lib64/libc.so.6 #4 0x0000000000591dde in PyThreadState_Clear (tstate=0x1b7a430) at Python/pystate.c:589 #5 0x0000000000590274 in PyInterpreterState_Clear (interp=0x1b23930) at Python/pystate.c:204 #6 0x000000000058cf4c in Py_FinalizeEx () at Python/pylifecycle.c:1153 #7 0x00000000004257af in pymain_main (pymain=0x7ffe5f82f200) at Modules/main.c:2664 #8 0x0000000000425915 in _Py_UnixMain (argc=2, argv=0x7ffe5f82f448) at Modules/main.c:2697 #9 0x000000000041f1e7 in main (argc=2, argv=0x7ffe5f82f448) at ./Programs/python.c:15 (gdb) info threads Id Target Id Frame 1 Thread 0x7f5136f64040 (LWP 24130) 0x00007f5136075660 in raise () from /lib64/libc.so.6 * 2 Thread 0x7f512e8c6700 (LWP 24138) 0x00007f5136b56be6 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 3 Thread 0x7f512e0c5700 (LWP 24139) 0x00007f5136b56be6 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 4 Thread 0x7f512d8c4700 (LWP 24147) 0x00007f5136b56be6 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 5 Thread 0x7f512f0c7700 (LWP 24137) 0x00007f5136b56be6 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 (gdb) t 2 [Switching to thread 2 (Thread 0x7f512e8c6700 (LWP 24138))] #0 0x00007f5136b56be6 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 (gdb) where #0 0x00007f5136b56be6 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x000000000053458a in PyCOND_TIMEDWAIT (cond=0xa07248 <_PyRuntime+1224>, mut=0xa07278 <_PyRuntime+1272>, us=5000) at Python/condvar.h:90 #2 0x00000000005349a3 in take_gil (tstate=0x1b87980) at Python/ceval_gil.h:208 |
|||
msg317379 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-05-23 10:30 | |
I created PR 7069 to remove the assertion. With this change, attached bug.py doesn't crash anymore, but it still logs warnings in verbose mode: vstinner@apu$ ./python -v bug.py (...) PyThreadState_Clear: warning: thread still has a frame PyThreadState_Clear: warning: thread still has a frame PyThreadState_Clear: warning: thread still has a generator PyThreadState_Clear: warning: thread still has a frame PyThreadState_Clear: warning: thread still has a generator PyThreadState_Clear: warning: thread still has a frame The root issue is that Python doesn't join deamonic threads. But it's by design no? I hate daemonic threads :-) For the specific case of a fork(), it can be possible that a thread calls os.fork() while another thread is running a generator, no? -- Since this issue seems to be a regression and I have a PR fixing it, I propose to mark this issue as a release blocker. Sorry Ned :-( |
|||
msg317385 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2018-05-23 11:27 | |
The assertion looked good to me when I reviewed this code, and I don't understand why it fails. Maybe it would be better to fix the case in which tstate->exc_info->previous_item != NULL before calling PyThreadState_Clear(). Otherwise we can leak some things. |
|||
msg317386 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2018-05-23 11:31 | |
Yes, simply removing the assertion feels more like a copout than an actual fix. (perhaps it *is* the right fix to the issue, but it would be nice to find out why :-)) |
|||
msg317388 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-05-23 11:57 | |
> Yes, simply removing the assertion feels more like a copout than an actual fix. We are very close to the 3.7rc1, so I suggest to remove the assertion, just to get more time to fix the issue. -- I agree that in a perfect world, Python should cleanup everything properly, but in the current world, daemon threads and fork are a mess full of corner cases like this one. Python doesn't join daemonic threads at shutdown, so I don't see how Python could ensure that the state of the Python state of these daemonic threads is consistent (ready to be cleared). For fork, it seems like the issue is similar. A fork removes immediately all threads except of the thread which called fork. How would it be possible for Python to ensure that the Python state of the other destroyed threads is consistent? tstate->exc_info is a new feature of Python 3.7, to fix bpo-25612. But I don't think that Python 3.6 is better to cleanup daemon threads at shutdown or to cleanup threads on fork. If you run the attached bug.py with python3.6 -v bug.py, I also see warnings: PyThreadState_Clear: warning: thread still has a frame PyThreadState_Clear: warning: thread still has a frame PyThreadState_Clear: warning: thread still has a frame PyThreadState_Clear: warning: thread still has a frame |
|||
msg317389 - (view) | Author: Antoine Pitrou (pitrou) * | Date: 2018-05-23 11:59 | |
Le 23/05/2018 à 13:57, STINNER Victor a écrit : > > I agree that in a perfect world, Python should cleanup everything properly, but in the current world, daemon threads and fork are a mess full of corner cases like this one. For now, we don't know for sure that it's a cleanup issue. What if it's something else? I agree that, if it's simply a cleanup issue with daemon threads, we can remove the assertion. But perhaps we should first diagnose the issue. |
|||
msg317418 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-05-23 16:00 | |
New changeset b6dccf54fd3bac9c87348d96f9d6b571608c15bc by Victor Stinner in branch 'master': bpo-33612: Remove PyThreadState_Clear() assertion (#7069) https://github.com/python/cpython/commit/b6dccf54fd3bac9c87348d96f9d6b571608c15bc |
|||
msg317443 - (view) | Author: Ned Deily (ned.deily) * | Date: 2018-05-23 19:58 | |
New changeset 508d7693bc09affd99fdaa4a321cc3da0638c8a0 by Ned Deily (Miss Islington (bot)) in branch '3.7': bpo-33612: Remove PyThreadState_Clear() assertion (GH-7069) (GH-7074) https://github.com/python/cpython/commit/508d7693bc09affd99fdaa4a321cc3da0638c8a0 |
|||
msg317458 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-05-23 21:06 | |
Mark Shannon, the author of the assertion, approved my PR to remove the assertion: https://github.com/python/cpython/pull/7069#issuecomment-391342700 The initial issue, the failing assertion, has been fixed. Well, to be honest, I failed to reproduce Serhiy's use case using test_multiprocessing_fork, but I'm confident that the removal of the assertion fixed the bug ;-) |
|||
msg317595 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2018-05-24 17:53 | |
Thank you for fixing this Victor! |
|||
msg317612 - (view) | Author: STINNER Victor (vstinner) * | Date: 2018-05-24 20:17 | |
> Thank you for fixing this Victor! You're welcome. It's always a pleasure to remove code! |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:59:00 | admin | set | nosy:
+ lukasz.langa github: 77793 |
2018-05-24 20:17:10 | vstinner | set | messages: + msg317612 |
2018-05-24 17:53:59 | serhiy.storchaka | set | messages: + msg317595 |
2018-05-23 21:06:54 | vstinner | set | status: open -> closed resolution: fixed messages: + msg317458 stage: patch review -> resolved |
2018-05-23 19:58:25 | ned.deily | set | messages: + msg317443 |
2018-05-23 16:01:31 | miss-islington | set | pull_requests: + pull_request6704 |
2018-05-23 16:00:58 | vstinner | set | messages: + msg317418 |
2018-05-23 11:59:49 | pitrou | set | messages: + msg317389 |
2018-05-23 11:57:11 | vstinner | set | messages: + msg317388 |
2018-05-23 11:31:04 | pitrou | set | messages: + msg317386 |
2018-05-23 11:27:45 | serhiy.storchaka | set | messages: + msg317385 |
2018-05-23 10:30:09 | vstinner | set | priority: critical -> release blocker nosy: + ned.deily messages: + msg317379 |
2018-05-23 10:26:42 | vstinner | set | keywords:
+ patch stage: patch review pull_requests: + pull_request6697 |
2018-05-23 10:18:24 | vstinner | set | files:
+ bug.py messages: + msg317377 |
2018-05-23 10:04:39 | vstinner | set | messages: + msg317373 |
2018-05-23 09:44:59 | vstinner | set | messages: + msg317369 |
2018-05-23 09:36:46 | vstinner | set | messages: + msg317368 |
2018-05-23 08:48:57 | pitrou | set | priority: normal -> critical |
2018-05-23 07:20:10 | serhiy.storchaka | set | nosy:
+ vstinner |
2018-05-23 07:17:40 | serhiy.storchaka | create |