This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: test_multiprocessing_spawn dumps core in AMD64 FreeBSD CURRENT Shared 3.x
Type: Stage: resolved
Components: Tests Versions: Python 3.8
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: eric.snow, koobs, lukasz.langa, pablogsal, pitrou, scotchka, vstinner
Priority: normal Keywords:

Created on 2019-02-26 06:09 by pablogsal, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (18)
msg336615 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-02-26 06:09
OK (skipped=32)
Warning -- files was modified by test_multiprocessing_spawn
  Before: []
  After:  ['python.core']

https://buildbot.python.org/all/#/builders/168/builds/632/steps/4/logs/stdio
msg336616 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-02-26 06:14
Another failure:

https://buildbot.python.org/all/#/builders/168/builds/633
msg337035 - (view) Author: Henry Chen (scotchka) * Date: 2019-03-03 16:29
Another example of this, same bot:

https://buildbot.python.org/all/#/builders/168/builds/669
msg337038 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-03-03 17:54
After some investigation, this turns out to be more complicated, as this 'python.core' turns to be a core dumped from some of the processes spawned by test_multiprocessing_spawn. So the real problem is that test_multiprocessing_spawn segfaults, as in https://bugs.python.org/issue36116. I think this may be the same underlying problem. I will change the title to reflect this.
msg337043 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-03-03 20:46
I managed to access the core file and this is the traceback:


Thread 3 (LWP 100629):
#0  0x00000008007d4828 in _accept4 () from /lib/libc.so.7
#1  0x0000000800672eda in ?? () from /lib/libthr.so.3
#2  0x00000008016f7b75 in ?? ()
#3  0x0000000800acca10 in ?? ()
#4  0x00007fffdfffcea0 in ?? ()
#5  0x00007fffdfffcea8 in ?? ()
#6  0x00007fffdfffce70 in ?? ()
#7  0x00007fffdfffce70 in ?? ()
#8  0x00000008025485a0 in ?? ()
#9  0x00007fffdfffcdf0 in ?? ()
#10 0x00000008016f820c in ?? ()
#11 0x00007fffdfffcdb0 in ?? ()
#12 0x0000000800384d02 in cfunction_call_varargs (func=0x8016ffb70, args=0x0, kwargs=<optimized out>) at Objects/call.c:770
#13 PyCFunction_Call (func=0x8016ffb70, args=0x0, kwargs=<optimized out>) at Objects/call.c:786
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 2 (LWP 101669):
#0  0x0000000801e0495d in CRYPTO_free () from /usr/local/lib/libcrypto.so.11
#1  0x0000000801de09bc in ?? () from /usr/local/lib/libcrypto.so.11
#2  0x0000000801e005b7 in OPENSSL_cleanup () from /usr/local/lib/libcrypto.so.11
#3  0x0000000800825ab1 in __cxa_finalize () from /lib/libc.so.7
#4  0x00000008007b2791 in exit () from /lib/libc.so.7
#5  0x00000008004ca84e in Py_Exit (sts=0) at Python/pylifecycle.c:2166
#6  0x00000008004d6ffb in handle_system_exit () at Python/pythonrun.c:641
#7  0x00000008004d6b07 in PyErr_PrintEx (set_sys_last_vars=1) at Python/pythonrun.c:651
#8  0x00000008004d698e in PyErr_Print () at Python/pythonrun.c:547
#9  PyRun_SimpleStringFlags (command=0x80150de38 "from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=4, pipe_handle=9)\n", flags=0x7fffffffe860) at Python/pythonrun.c:462
#10 0x00000008004f7a39 in pymain_run_command (command=<optimized out>, cf=0x0) at Modules/main.c:527
#11 pymain_run_python (interp=<optimized out>, exitcode=<optimized out>) at Modules/main.c:804
#12 pymain_main (args=<optimized out>) at Modules/main.c:896
#13 0x00000008004f84f7 in _Py_UnixMain (argc=<optimized out>, argv=0x801bdb848) at Modules/main.c:937
#14 0x0000000000201120 in _start (ap=<optimized out>, cleanup=<optimized out>) at /usr/src/lib/csu/amd64/crt1.c:76

Thread 1 (LWP 100922):
#0  0x000000080047ca84 in take_gil (tstate=0x80262ba10) at Python/ceval_gil.h:216
#1  0x000000080047d074 in PyEval_RestoreThread (tstate=0x80262ba10) at Python/ceval.c:281
#2  0x00000008004f3325 in _Py_write_impl (fd=5, buf=0x8025e3e00, count=29, gil_held=1) at Python/fileutils.c:1558
#3  0x00000008005051aa in os_write_impl (module=<optimized out>, fd=<optimized out>, data=<optimized out>) at ./Modules/posixmodule.c:8798
#4  os_write (module=<optimized out>, args=0x8015b3ba8, nargs=<optimized out>) at ./Modules/clinic/posixmodule.c.h:4182
#5  0x0000000800385a5d in _PyMethodDef_RawFastCallKeywords (method=<optimized out>, self=0x80127a5f0, args=<optimized out>, nargs=2, kwnames=<optimized out>) at Objects/call.c:653
#6  0x00000008003847de in _PyCFunction_FastCallKeywords (func=0x8012820c0, args=0x0, nargs=0, kwnames=0xdbdbdbdbdbdbdbdb) at Objects/call.c:732
#7  0x000000080048e1f5 in call_function (pp_stack=0x7fffdf7faf98, oparg=<optimized out>, kwnames=0x0) at Python/ceval.c:4673
#8  0x00000008004892fa in _PyEval_EvalFrameDefault (f=0x8015b3a00, throwflag=<optimized out>) at Python/ceval.c:3294
#9  0x000000080048f03a in PyEval_EvalFrameEx (f=<optimized out>, throwflag=<error reading variable: Cannot access memory at address 0x0>) at Python/ceval.c:624
#10 _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=2, kwnames=0x0, kwargs=0x8025e25f8, kwcount=0, kwstep=1, defs=0x802567798,
    defcount=1, kwdefs=0x0, closure=0x0, name=0x8017e2930, qualname=0x80255b7c0) at Python/ceval.c:4035
#11 0x0000000800384690 in _PyFunction_FastCallKeywords (func=<optimized out>, stack=<optimized out>, nargs=0, kwnames=<optimized out>) at Objects/call.c:435
#12 0x000000080048e35f in call_function (pp_stack=0x7fffdf7fb2f0, oparg=<optimized out>, kwnames=0x0) at Python/ceval.c:4721
msg337044 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-03-03 20:52
Apparently is the Thread 1 the one that is causing the core dump
msg337045 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-03-03 20:53
This is the state of the thread interpreter:

gdb) p tstate
$3 = (PyThreadState *) 0x80262ba10
(gdb) p tstate->interp
$4 = (PyInterpreterState *) 0xdbdbdbdbdbdbdbdb
msg337078 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-04 08:41
I can reproduce the crash on my FreeBSD 12 VM:

vstinner@freebsd$ ./python -m test --fail-env-changed test_multiprocessing_spawn -v

FAIL: test_mymanager_context_prestarted (test.test_multiprocessing_spawn.WithManagerTestMyManager)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/vstinner/prog/python/master/Lib/test/_test_multiprocessing.py", line 2754, in test_mymanager_context_prestarted
    self.assertEqual(manager._process.exitcode, 0)
AssertionError: -10 != 0

Warning -- files was modified by test_multiprocessing_spawn
  Before: []
  After:  ['python.8184.core']
msg337080 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-03-04 08:45
Interesting, I tried several hours to reproduce the crash on the buildbot itself manually and I could not do it.
msg337081 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-04 08:57
I recently (Feb 25) fixed the config of this slow buildbot to change the timeout from 15 min to 20 min:

https://github.com/python/buildmaster-config/commit/37cf09ca9a5b6ce14b4e9b481101d559c4e55485
https://github.com/python/buildmaster-config/commit/e4155a744819acdbe7ada03b12116f7c147a9166

It seems like there is a correlation between this buildbot config change and the buildbot starting to create a coredump file. The test started to create a core dump around build 624, Feb 25.

--

With coredump.

Warning -- files was modified by test_multiprocessing_spawn:

https://buildbot.python.org/all/#/builders/168/builds/624
# 625 was fine
https://buildbot.python.org/all/#/builders/168/builds/626
https://buildbot.python.org/all/#/builders/168/builds/628
https://buildbot.python.org/all/#/builders/168/builds/629
https://buildbot.python.org/all/#/builders/168/builds/630
https://buildbot.python.org/all/#/builders/168/builds/631

--

Without coredump: fail but no coredump.

ERROR: test_shared_memory_SharedMemoryManager_basics (test.test_multiprocessing_forkserver.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_across_processes (test.test_multiprocessing_forkserver.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_basics (test.test_multiprocessing_forkserver.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_ShareableList_basics (test.test_multiprocessing_spawn.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_ShareableList_pickling (test.test_multiprocessing_spawn.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_SharedMemoryManager_basics (test.test_multiprocessing_spawn.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_across_processes (test.test_multiprocessing_spawn.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_basics (test.test_multiprocessing_spawn.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_ShareableList_basics (test.test_multiprocessing_fork.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_ShareableList_pickling (test.test_multiprocessing_fork.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_SharedMemoryManager_basics (test.test_multiprocessing_fork.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_across_processes (test.test_multiprocessing_fork.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_basics (test.test_multiprocessing_fork.WithProcessesTestSharedMemory)
Re-running failed tests in verbose mode

https://buildbot.python.org/all/#/builders/168/builds/617
https://buildbot.python.org/all/#/builders/168/builds/618
https://buildbot.python.org/all/#/builders/168/builds/620
https://buildbot.python.org/all/#/builders/168/builds/622
https://buildbot.python.org/all/#/builders/168/builds/623
msg337082 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-04 08:58
I set the priority to release blocker to remind to fix this regression. I guess that it's the same bug than bpo-36116, but since it's a different OS (FreeBSD / Windows), I prefer to continue to use separated issues.
msg337084 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2019-03-04 09:06
Can you paste traceback of the dump that you were able to generate in your FreeBSD? I wonder if you can get the symbols that the one I pasted were missing (likely in libthr.so).
msg337090 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-04 11:03
Example of crash:

* Thread 2 exit Python: PyRun_SimpleStringFlags->PyErr_PrintEx->handle_system_exit->Py_Exit->OPENSSL_cleanup...
* Thread 1 tries to acquire the GIL to close a file
* Thread 3 is waiting in socket.socket.accept()

Problem: PyThreadState of Thread 1 is corrupted: its memory has been freed.



Thread 1 got the crash

(gdb) p tstate
$2 = (PyThreadState *) 0x8027e2050
(gdb) p *tstate
$3 = {
  prev = 0xdbdbdbdbdbdbdbdb,
  next = 0xdbdbdbdbdbdbdbdb,
  interp = 0xdbdbdbdbdbdbdbdb,
  ...
}

(gdb) p _PyRuntime.gilstate.tstate_current
$4 = {
  _value = 0
}



(gdb) thread apply all where

Thread 3 (LWP 100714):
#0  _accept4 () at _accept4.S:3
#1  0x0000000800679eaa in __thr_accept4 (s=5, addr=0x7fffdfff45f8, addrlen=0x7fffdfff45f0, flags=268435456) at /usr/src/lib/libthr/thread/thr_syscalls.c:126
#2  0x00000008016f9b55 in sock_accept_impl (s=0x801bda050, data=0x7fffdfff45c0) at /usr/home/vstinner/prog/python/master/Modules/socketmodule.c:2592
#3  0x00000008016fa1ec in sock_call_ex (s=0x801bda050, writing=0, sock_func=0x8016f9b00 <sock_accept_impl>, data=0x7fffdfff45c0, connect=0, err=0x0, timeout=-1000000000) at /usr/home/vstinner/prog/python/master/Modules/socketmodule.c:886
#4  0x00000008016f9aef in sock_call (s=0x801bda050, writing=0, func=0x8016f9b00 <sock_accept_impl>, data=0x7fffdfff45c0) at /usr/home/vstinner/prog/python/master/Modules/socketmodule.c:938
#5  0x00000008016f70e3 in sock_accept (s=0x801bda050, _unused_ignored=0x0) at /usr/home/vstinner/prog/python/master/Modules/socketmodule.c:2634
#6  0x000000000035415c in _PyMethodDef_RawFastCallKeywords (method=0x801701b70 <sock_methods>, self=<socket at remote 0x801bda050>, args=0x8027a01f8, nargs=0, kwnames=0x0) at Objects/call.c:631
#7  0x00000000004d3841 in _PyMethodDescr_FastCallKeywords (descrobj=<method_descriptor at remote 0x8016df360>, args=0x8027a01f0, nargs=1, kwnames=0x0) at Objects/descrobject.c:290
#8  0x000000000037e20e in call_function (pp_stack=0x7fffdfff4cb8, oparg=1, kwnames=0x0) at Python/ceval.c:4698
#9  0x0000000000378dec in _PyEval_EvalFrameDefault (f=Frame 0x8027a0050, for file /usr/home/vstinner/prog/python/master/Lib/socket.py, line 212, in accept (), throwflag=0) at Python/ceval.c:3280
#10 0x0000000000369595 in PyEval_EvalFrameEx (f=Frame 0x8027a0050, for file /usr/home/vstinner/prog/python/master/Lib/socket.py, line 212, in accept (), throwflag=0) at Python/ceval.c:624
...
#53 0x0000000800677776 in thread_start (curthread=0x80265a300) at /usr/src/lib/libthr/thread/thr_create.c:292

Thread 2 (LWP 100120):
#0  0x000000080076e981 in __je_tcache_event_hard (tsd=0x8005ec090, tcache=0x8005ec250) at jemalloc_tcache.c:54
#1  0x00000008007b05e3 in tcache_event (tsd=<optimized out>, tcache=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/tcache_inlines.h:37
#2  tcache_dalloc_small (tsd=<optimized out>, tcache=<optimized out>, ptr=<optimized out>, binind=<optimized out>, slow_path=false) at /usr/src/contrib/jemalloc/include/jemalloc/internal/tcache_inlines.h:185
#3  arena_dalloc (tcache=<optimized out>, slow_path=false, tsdn=<optimized out>, ptr=<optimized out>, alloc_ctx=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:224
#4  idalloctm (slow_path=false, tsdn=<optimized out>, ptr=<optimized out>, tcache=<optimized out>, alloc_ctx=<optimized out>, is_internal=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal_inlines_c.h:118
#5  ifree (tsd=<optimized out>, ptr=<optimized out>, tcache=<optimized out>, slow_path=false) at jemalloc_jemalloc.c:2226
#6  __free (ptr=<optimized out>) at jemalloc_jemalloc.c:2397
#7  0x0000000801fabbd4 in OPENSSL_LH_free (lh=0x801b6c0c0) at /usr/src/crypto/openssl/crypto/lhash/lhash.c:88
#8  0x0000000801f2d6ac in lh_ERR_STRING_DATA_free (lh=<optimized out>) at /usr/src/crypto/openssl/include/openssl/err.h:217
#9  err_cleanup () at /usr/src/crypto/openssl/crypto/err/err.c:289
#10 0x0000000801fde3a7 in OPENSSL_cleanup () at /usr/src/crypto/openssl/crypto/init.c:569
#11 0x000000080082a0c5 in __cxa_finalize (dso=0x0) at /usr/src/lib/libc/stdlib/atexit.c:239
#12 0x00000008007b9cc1 in exit (status=0) at /usr/src/lib/libc/stdlib/exit.c:74
#13 0x000000000045c6e8 in Py_Exit (sts=0) at Python/pylifecycle.c:2166
#14 0x000000000041c403 in handle_system_exit () at Python/pythonrun.c:641
#15 0x000000000041bf86 in PyErr_PrintEx (set_sys_last_vars=1) at Python/pythonrun.c:651
#16 0x000000000041b20e in PyErr_Print () at Python/pythonrun.c:547
#17 0x000000000041be48 in PyRun_SimpleStringFlags (command=0x80134b460 "from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=9)\n", flags=0x7fffffffe7f0) at Python/pythonrun.c:462
#18 0x00000000002cb691 in pymain_run_command (command=0x800acd910 L"from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=9)\n", cf=0x7fffffffe7f0) at Modules/main.c:527
#19 0x00000000002cadb5 in pymain_run_python (interp=0x800b08010, exitcode=0x7fffffffe8cc) at Modules/main.c:804
#20 0x00000000002ca6ba in pymain_main (args=0x7fffffffe928) at Modules/main.c:896
#21 0x00000000002ca788 in _Py_UnixMain (argc=4, argv=0x7fffffffe9e8) at Modules/main.c:937
#22 0x00000000002c9372 in main (argc=4, argv=0x7fffffffe9e8) at ./Programs/python.c:16

Thread 1 (LWP 100696):
#0  0x0000000000368210 in take_gil (tstate=0x8027e2050) at Python/ceval_gil.h:216
#1  0x0000000000368a94 in PyEval_RestoreThread (tstate=0x8027e2050) at Python/ceval.c:281
#2  0x000000000058fdbe in internal_close (self=0x802cb7440) at ./Modules/_io/fileio.c:121
#3  0x000000000058fcd3 in _io_FileIO_close_impl (self=0x802cb7440) at ./Modules/_io/fileio.c:163
#4  0x000000000058ece9 in _io_FileIO_close (self=0x802cb7440, _unused_ignored=0x0) at ./Modules/_io/clinic/fileio.c.h:23
#5  0x00000000003537d9 in _PyMethodDef_RawFastCallDict (method=0x6298f0 <fileio_methods+224>, self=<_io.FileIO at remote 0x802cb7440>, args=0x7fffdf3e92e0, nargs=0, kwargs=0x0) at Objects/call.c:484
#6  0x000000000035200d in _PyCFunction_FastCallDict (func=<built-in method close of _io.FileIO object at remote 0x802cb7440>, args=0x7fffdf3e92e0, nargs=0, kwargs=0x0) at Objects/call.c:584
#7  0x0000000000351501 in _PyObject_FastCallDict (callable=<built-in method close of _io.FileIO object at remote 0x802cb7440>, args=0x7fffdf3e92e0, nargs=0, kwargs=0x0) at Objects/call.c:103
#8  0x000000000035613e in object_vacall (callable=<built-in method close of _io.FileIO object at remote 0x802cb7440>, vargs=0x7fffdf3e94b0) at Objects/call.c:1200
#9  0x0000000000355f07 in PyObject_CallMethodObjArgs (callable=<built-in method close of _io.FileIO object at remote 0x802cb7440>, name='close') at Objects/call.c:1225
#10 0x0000000000597096 in buffered_close (self=0x8025e2ea8, args=0x0) at ./Modules/_io/bufferedio.c:524
#11 0x00000000003537d9 in _PyMethodDef_RawFastCallDict (method=0x62bc10 <bufferedreader_methods+64>, self=<_io.BufferedReader at remote 0x8025e2ea8>, args=0x0, nargs=0, kwargs=0x0) at Objects/call.c:484
#12 0x000000000035200d in _PyCFunction_FastCallDict (func=<built-in method close of _io.BufferedReader object at remote 0x8025e2ea8>, args=0x0, nargs=0, kwargs=0x0) at Objects/call.c:584
#13 0x0000000000351501 in _PyObject_FastCallDict (callable=<built-in method close of _io.BufferedReader object at remote 0x8025e2ea8>, args=0x0, nargs=0, kwargs=0x0) at Objects/call.c:103
#14 0x0000000000354dc6 in _PyObject_CallFunctionVa (callable=<built-in method close of _io.BufferedReader object at remote 0x8025e2ea8>, format=0x0, va=0x7fffdf3e9950, is_size_t=1) at Objects/call.c:933
#15 0x000000000035558b in callmethod (callable=<built-in method close of _io.BufferedReader object at remote 0x8025e2ea8>, format=0x0, va=0x7fffdf3e9950, is_size_t=1) at Objects/call.c:1029
#16 0x0000000000355d20 in _PyObject_CallMethodId_SizeT (obj=<_io.BufferedReader at remote 0x8025e2ea8>, name=0x62f660 <PyId_close>, format=0x0) at Objects/call.c:1147
#17 0x00000000005a5373 in _io_TextIOWrapper_close_impl (self=0x80270c750) at ./Modules/_io/textio.c:2957
#18 0x00000000005a26c9 in _io_TextIOWrapper_close (self=0x80270c750, _unused_ignored=0x0) at ./Modules/_io/clinic/textio.c.h:552
#19 0x00000000003537d9 in _PyMethodDef_RawFastCallDict (method=0x62e960 <textiowrapper_methods+192>, self=<_io.TextIOWrapper at remote 0x80270c750>, args=0x7fffdf3e9c40, nargs=0, kwargs=0x0) at Objects/call.c:484
#20 0x000000000035200d in _PyCFunction_FastCallDict (func=<built-in method close of _io.TextIOWrapper object at remote 0x80270c750>, args=0x7fffdf3e9c40, nargs=0, kwargs=0x0) at Objects/call.c:584
#21 0x0000000000351501 in _PyObject_FastCallDict (callable=<built-in method close of _io.TextIOWrapper object at remote 0x80270c750>, args=0x7fffdf3e9c40, nargs=0, kwargs=0x0) at Objects/call.c:103
#22 0x000000000035613e in object_vacall (callable=<built-in method close of _io.TextIOWrapper object at remote 0x80270c750>, vargs=0x7fffdf3e9e10) at Objects/call.c:1200
#23 0x0000000000355f07 in PyObject_CallMethodObjArgs (callable=<built-in method close of _io.TextIOWrapper object at remote 0x80270c750>, name='close') at Objects/call.c:1225
#24 0x000000000058c5d7 in iobase_exit (self=<_io.TextIOWrapper at remote 0x80270c750>, args=(None, None, None)) at ./Modules/_io/iobase.c:469
#25 0x000000000035396b in _PyMethodDef_RawFastCallDict (method=0x629270 <iobase_methods+480>, self=<_io.TextIOWrapper at remote 0x80270c750>, args=0x7fffdf3ea4e0, nargs=3, kwargs=0x0) at Objects/call.c:520
#26 0x000000000035200d in _PyCFunction_FastCallDict (func=<built-in method __exit__ of _io.TextIOWrapper object at remote 0x80270c750>, args=0x7fffdf3ea4e0, nargs=3, kwargs=0x0) at Objects/call.c:584
#27 0x0000000000351501 in _PyObject_FastCallDict (callable=<built-in method __exit__ of _io.TextIOWrapper object at remote 0x80270c750>, args=0x7fffdf3ea4e0, nargs=3, kwargs=0x0) at Objects/call.c:103
#28 0x0000000000378239 in _PyEval_EvalFrameDefault (f=Frame 0x802c20630, for file /usr/home/vstinner/prog/python/master/Lib/linecache.py, line 393, in updatecache (), throwflag=0) at Python/ceval.c:3155
...
#107 0x0000000800677776 in thread_start (curthread=0x8027e7a00) at /usr/src/lib/libthr/thread/thr_create.c:292
msg337092 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-04 11:30
Sometimes, I can reproduce the crash using:
./python -m test --matchfile=bisect5 test_multiprocessing_spawn --fail-env-changed -F

Using this file:

test.test_multiprocessing_spawn.WithThreadsTestQueue.test_timeout
test.test_multiprocessing_spawn.WithProcessesTestBarrier.test_default_timeout
test.test_multiprocessing_spawn.WithThreadsTestManagerRestart.test_rapid_restart
test.test_multiprocessing_spawn.WithProcessesTestPool.test_release_task_refs
test.test_multiprocessing_spawn.WithManagerTestLock.test_rlock

It seems like the following test is enough to creates a coredump:

test.test_multiprocessing_spawn.WithThreadsTestManagerRestart.test_rapid_restart

Problem: it's really hard to write a *reliable* script/method to trigger the crash. This race condition is very well hidden!
msg337093 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-04 12:04
Currently guilty:

commit ef4ac967e2f3a9a18330cc6abe14adb4bc3d0465
Author: Eric Snow <ericsnowcurrently@gmail.com>
Date:   Sun Feb 24 15:40:47 2019 -0800

    bpo-33608: Factor out a private, per-interpreter _Py_AddPendingCall(). (GH-11617)
    
    This involves moving the global "pending calls" state to PyInterpreterState.
msg337095 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-04 12:14
Ok, I confirm that the commit ef4ac967e2f3a9a18330cc6abe14adb4bc3d0465 introduced a regression in test.test_multiprocessing_spawn.WithThreadsTestManagerRestart.test_rapid_restart.
msg337220 - (view) Author: Eric Snow (eric.snow) * (Python committer) Date: 2019-03-05 16:16
This is resolved with gh-12159, no?
msg337264 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-03-06 01:37
> This is resolved with gh-12159, no?

I was waiting to see if buildbot workers feel better. It's the case, so I close the issue.
History
Date User Action Args
2022-04-11 14:59:11adminsetgithub: 80295
2019-03-06 01:37:08vstinnersetstatus: open -> closed
priority: release blocker -> normal
messages: + msg337264

resolution: fixed
stage: resolved
2019-03-05 16:16:30eric.snowsetnosy: + eric.snow
messages: + msg337220
2019-03-04 12:14:52vstinnersetmessages: + msg337095
2019-03-04 12:04:59vstinnersetmessages: + msg337093
2019-03-04 11:31:16vstinnersettitle: test_multiprocessing_spawn: WithProcessesTestSharedMemory dumps core in AMD64 FreeBSD CURRENT Shared 3.x -> test_multiprocessing_spawn dumps core in AMD64 FreeBSD CURRENT Shared 3.x
2019-03-04 11:30:12vstinnersetmessages: + msg337092
2019-03-04 11:03:06vstinnersetmessages: + msg337090
2019-03-04 09:25:16vstinnersettitle: test_multiprocessing_spawn dumps core in AMD64 FreeBSD CURRENT Shared 3.x -> test_multiprocessing_spawn: WithProcessesTestSharedMemory dumps core in AMD64 FreeBSD CURRENT Shared 3.x
2019-03-04 09:06:45pablogsalsetmessages: + msg337084
2019-03-04 08:58:47vstinnersetpriority: normal -> release blocker
nosy: + lukasz.langa
messages: + msg337082

2019-03-04 08:57:43vstinnersetmessages: + msg337081
2019-03-04 08:45:30pablogsalsetmessages: + msg337080
2019-03-04 08:41:57vstinnersetmessages: + msg337078
2019-03-04 08:23:44vstinnersettitle: test_multiprocessing_spawn dumps core in AMD64 FreeBSD CURRENT Shared custom -> test_multiprocessing_spawn dumps core in AMD64 FreeBSD CURRENT Shared 3.x
2019-03-03 20:53:58pablogsalsetmessages: + msg337045
2019-03-03 20:52:49pablogsalsetnosy: + pitrou, vstinner
messages: + msg337044
2019-03-03 20:46:19pablogsalsetmessages: + msg337043
2019-03-03 17:54:12pablogsalsetmessages: + msg337038
title: test_multiprocessing_spawn changes the execution environment -> test_multiprocessing_spawn dumps core in AMD64 FreeBSD CURRENT Shared custom
2019-03-03 16:29:02scotchkasetnosy: + scotchka
messages: + msg337035
2019-02-26 07:03:30pablogsalsetmessages: - msg336620
2019-02-26 07:02:56pablogsalsetmessages: + msg336620
2019-02-26 06:50:34pablogsalsetnosy: + pablogsal
2019-02-26 06:50:24pablogsalsetnosy: + koobs, - pablogsal
2019-02-26 06:14:58pablogsalsetmessages: + msg336616
2019-02-26 06:09:51pablogsalcreate