test_multiprocessing_spawn dumps core in AMD64 FreeBSD CURRENT Shared 3.x #80295

Closed
pablogsal opened this issue Feb 26, 2019 · 18 comments
Labels
3.8 (only security fixes), tests (Tests in the Lib/test dir)

Comments

@pablogsal
Member

BPO 36114
Nosy @pitrou, @vstinner, @ambv, @ericsnowcurrently, @koobs, @pablogsal, @scotchka

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2019-03-06.01:37:08.841>
created_at = <Date 2019-02-26.06:09:51.119>
labels = ['3.8', 'tests']
title = 'test_multiprocessing_spawn dumps core in AMD64 FreeBSD CURRENT Shared 3.x'
updated_at = <Date 2019-03-06.01:37:08.840>
user = 'https://github.com/pablogsal'

bugs.python.org fields:

activity = <Date 2019-03-06.01:37:08.840>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2019-03-06.01:37:08.841>
closer = 'vstinner'
components = ['Tests']
creation = <Date 2019-02-26.06:09:51.119>
creator = 'pablogsal'
dependencies = []
files = []
hgrepos = []
issue_num = 36114
keywords = []
message_count = 18.0
messages = ['336615', '336616', '337035', '337038', '337043', '337044', '337045', '337078', '337080', '337081', '337082', '337084', '337090', '337092', '337093', '337095', '337220', '337264']
nosy_count = 7.0
nosy_names = ['pitrou', 'vstinner', 'lukasz.langa', 'eric.snow', 'koobs', 'pablogsal', 'scotchka']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue36114'
versions = ['Python 3.8']

@pablogsal
Member Author

OK (skipped=32)
Warning -- files was modified by test_multiprocessing_spawn
Before: []
After: ['python.core']

https://buildbot.python.org/all/#/builders/168/builds/632/steps/4/logs/stdio
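
For context, the warning above means that a new file (`python.core`) was present in the working directory after the test ran. A minimal sketch of that kind of before/after check follows. It is an illustration only: the helper name `run_and_report_new_files` is hypothetical, and this is not the test runner's actual implementation.

```python
import os
import tempfile


def run_and_report_new_files(directory, func):
    """Run func() and return the sorted names of files that appeared in directory."""
    before = set(os.listdir(directory))
    func()
    after = set(os.listdir(directory))
    return sorted(after - before)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        def leave_core_file():
            # Stand-in for a crashing child process that leaves a core file behind.
            with open(os.path.join(tmp, "python.core"), "wb"):
                pass

        print(run_and_report_new_files(tmp, leave_core_file))
```

A check of this shape only reports that the environment changed. It says nothing about which process crashed, so the core file itself still has to be inspected.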

@pablogsal pablogsal added the 3.8 (only security fixes) and tests (Tests in the Lib/test dir) labels Feb 26, 2019
@pablogsal
Member Author

@scotchka
Mannequin

scotchka mannequin commented Mar 3, 2019

Another example of this, same bot:

https://buildbot.python.org/all/#/builders/168/builds/669

@pablogsal
Member Author

After some investigation, this turns out to be more complicated: the 'python.core' file is a core dump from one of the processes spawned by test_multiprocessing_spawn. So the real problem is that test_multiprocessing_spawn segfaults, as in https://bugs.python.org/issue36116. I think this may be the same underlying problem. I will change the title to reflect this.

@pablogsal pablogsal changed the title from "test_multiprocessing_spawn changes the execution environment" to "test_multiprocessing_spawn dumps core in AMD64 FreeBSD CURRENT Shared custom" Mar 3, 2019
@pablogsal
Member Author

I managed to access the core file and this is the traceback:

Thread 3 (LWP 100629):
#0 0x00000008007d4828 in _accept4 () from /lib/libc.so.7
#1 0x0000000800672eda in ?? () from /lib/libthr.so.3
#2 0x00000008016f7b75 in ?? ()
#3 0x0000000800acca10 in ?? ()
#4 0x00007fffdfffcea0 in ?? ()
#5 0x00007fffdfffcea8 in ?? ()
#6 0x00007fffdfffce70 in ?? ()
#7 0x00007fffdfffce70 in ?? ()
#8 0x00000008025485a0 in ?? ()
#9 0x00007fffdfffcdf0 in ?? ()
#10 0x00000008016f820c in ?? ()
#11 0x00007fffdfffcdb0 in ?? ()
#12 0x0000000800384d02 in cfunction_call_varargs (func=0x8016ffb70, args=0x0, kwargs=<optimized out>) at Objects/call.c:770
#13 PyCFunction_Call (func=0x8016ffb70, args=0x0, kwargs=<optimized out>) at Objects/call.c:786
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 2 (LWP 101669):
#0 0x0000000801e0495d in CRYPTO_free () from /usr/local/lib/libcrypto.so.11
#1 0x0000000801de09bc in ?? () from /usr/local/lib/libcrypto.so.11
#2 0x0000000801e005b7 in OPENSSL_cleanup () from /usr/local/lib/libcrypto.so.11
#3 0x0000000800825ab1 in __cxa_finalize () from /lib/libc.so.7
#4 0x00000008007b2791 in exit () from /lib/libc.so.7
#5 0x00000008004ca84e in Py_Exit (sts=0) at Python/pylifecycle.c:2166
#6 0x00000008004d6ffb in handle_system_exit () at Python/pythonrun.c:641
#7 0x00000008004d6b07 in PyErr_PrintEx (set_sys_last_vars=1) at Python/pythonrun.c:651
#8 0x00000008004d698e in PyErr_Print () at Python/pythonrun.c:547
#9 PyRun_SimpleStringFlags (command=0x80150de38 "from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=4, pipe_handle=9)\n", flags=0x7fffffffe860) at Python/pythonrun.c:462
#10 0x00000008004f7a39 in pymain_run_command (command=<optimized out>, cf=0x0) at Modules/main.c:527
#11 pymain_run_python (interp=<optimized out>, exitcode=<optimized out>) at Modules/main.c:804
#12 pymain_main (args=<optimized out>) at Modules/main.c:896
#13 0x00000008004f84f7 in _Py_UnixMain (argc=<optimized out>, argv=0x801bdb848) at Modules/main.c:937
#14 0x0000000000201120 in _start (ap=<optimized out>, cleanup=<optimized out>) at /usr/src/lib/csu/amd64/crt1.c:76

Thread 1 (LWP 100922):
#0 0x000000080047ca84 in take_gil (tstate=0x80262ba10) at Python/ceval_gil.h:216
#1 0x000000080047d074 in PyEval_RestoreThread (tstate=0x80262ba10) at Python/ceval.c:281
#2 0x00000008004f3325 in _Py_write_impl (fd=5, buf=0x8025e3e00, count=29, gil_held=1) at Python/fileutils.c:1558
#3 0x00000008005051aa in os_write_impl (module=<optimized out>, fd=<optimized out>, data=<optimized out>) at ./Modules/posixmodule.c:8798
#4 os_write (module=<optimized out>, args=0x8015b3ba8, nargs=<optimized out>) at ./Modules/clinic/posixmodule.c.h:4182
#5 0x0000000800385a5d in _PyMethodDef_RawFastCallKeywords (method=<optimized out>, self=0x80127a5f0, args=<optimized out>, nargs=2, kwnames=<optimized out>) at Objects/call.c:653
#6 0x00000008003847de in _PyCFunction_FastCallKeywords (func=0x8012820c0, args=0x0, nargs=0, kwnames=0xdbdbdbdbdbdbdbdb) at Objects/call.c:732
#7 0x000000080048e1f5 in call_function (pp_stack=0x7fffdf7faf98, oparg=<optimized out>, kwnames=0x0) at Python/ceval.c:4673
#8 0x00000008004892fa in _PyEval_EvalFrameDefault (f=0x8015b3a00, throwflag=<optimized out>) at Python/ceval.c:3294
#9 0x000000080048f03a in PyEval_EvalFrameEx (f=<optimized out>, throwflag=<error reading variable: Cannot access memory at address 0x0>) at Python/ceval.c:624
#10 _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=2, kwnames=0x0, kwargs=0x8025e25f8, kwcount=0, kwstep=1, defs=0x802567798,
defcount=1, kwdefs=0x0, closure=0x0, name=0x8017e2930, qualname=0x80255b7c0) at Python/ceval.c:4035
#11 0x0000000800384690 in _PyFunction_FastCallKeywords (func=<optimized out>, stack=<optimized out>, nargs=0, kwnames=<optimized out>) at Objects/call.c:435
#12 0x000000080048e35f in call_function (pp_stack=0x7fffdf7fb2f0, oparg=<optimized out>, kwnames=0x0) at Python/ceval.c:4721
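
When comparing several core dumps, it can help to reduce each backtrace to the innermost frame of every thread. The following is a minimal sketch of such a summary, assuming the plain `Thread N (LWP M):` / `#K address in function (...)` layout shown above. The helper name `innermost_frames` is hypothetical, and the sample text is abbreviated from the trace in this comment.

```python
import re

THREAD_RE = re.compile(r"^Thread (\d+) \(LWP (\d+)\):")
# The address is optional: gdb omits it for frames inlined into their caller.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\S+) \(")


def innermost_frames(backtrace_text):
    """Map each gdb thread number to the function name of its frame #0."""
    result = {}
    current = None
    for line in backtrace_text.splitlines():
        line = line.strip()
        match = THREAD_RE.match(line)
        if match:
            current = int(match.group(1))
            continue
        match = FRAME_RE.match(line)
        if match and current is not None and match.group(1) == "0":
            result[current] = match.group(2)
    return result


SAMPLE = """\
Thread 3 (LWP 100629):
#0 0x00000008007d4828 in _accept4 () from /lib/libc.so.7
#1 0x0000000800672eda in ?? () from /lib/libthr.so.3

Thread 2 (LWP 101669):
#0 0x0000000801e0495d in CRYPTO_free () from /usr/local/lib/libcrypto.so.11

Thread 1 (LWP 100922):
#0 0x000000080047ca84 in take_gil (tstate=0x80262ba10) at Python/ceval_gil.h:216
"""

if __name__ == "__main__":
    print(innermost_frames(SAMPLE))
```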

@pablogsal
Member Author

Apparently Thread 1 is the one causing the core dump.

@pablogsal
Member Author

This is the thread state and its interpreter pointer:

(gdb) p tstate
$3 = (PyThreadState *) 0x80262ba10
(gdb) p tstate->interp
$4 = (PyInterpreterState *) 0xdbdbdbdbdbdbdbdb
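
A pointer made of one repeated byte is rarely a real address. CPython's debug memory allocator overwrites freed blocks with a fixed fill byte, so a value like this suggests the structure was read after being freed. The sketch below checks for such a pattern. It assumes that 0xDB was the "dead byte" used by the build under test; the helper name `repeated_fill_byte` is hypothetical.

```python
def repeated_fill_byte(pointer_value, width=8):
    """Return the byte if every byte of the pointer-sized value is identical, else None."""
    raw = pointer_value.to_bytes(width, "little")
    return raw[0] if len(set(raw)) == 1 else None


# Assumption: 0xDB is the debug allocator's fill byte for freed memory in this build.
ASSUMED_DEAD_BYTE = 0xDB

if __name__ == "__main__":
    interp = 0xDBDBDBDBDBDBDBDB  # tstate->interp as printed by gdb
    tstate = 0x80262BA10         # tstate itself, an ordinary-looking heap address
    print(repeated_fill_byte(interp) == ASSUMED_DEAD_BYTE)
    print(repeated_fill_byte(tstate))
```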

@vstinner vstinner changed the title from "test_multiprocessing_spawn dumps core in AMD64 FreeBSD CURRENT Shared custom" to "test_multiprocessing_spawn dumps core in AMD64 FreeBSD CURRENT Shared 3.x" Mar 4, 2019
@vstinner
Member

vstinner commented Mar 4, 2019

I can reproduce the crash on my FreeBSD 12 VM:

vstinner@freebsd$ ./python -m test --fail-env-changed test_multiprocessing_spawn -v

FAIL: test_mymanager_context_prestarted (test.test_multiprocessing_spawn.WithManagerTestMyManager)
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/usr/home/vstinner/prog/python/master/Lib/test/_test_multiprocessing.py", line 2754, in test_mymanager_context_prestarted
    self.assertEqual(manager._process.exitcode, 0)
AssertionError: -10 != 0

Warning -- files was modified by test_multiprocessing_spawn
Before: []
After: ['python.8184.core']
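
The `-10` in the assertion is a `multiprocessing` exit code: a negative value `-N` means the child was terminated by signal `N`. Signal numbers are platform-specific (10 is typically SIGBUS on FreeBSD but SIGUSR1 on Linux), so the name has to be looked up on the machine that produced it. A minimal sketch of that lookup, with a hypothetical helper name `describe_exitcode`:

```python
import signal


def describe_exitcode(exitcode):
    """Describe a multiprocessing.Process.exitcode value in words."""
    if exitcode is None:
        return "still running"
    if exitcode >= 0:
        return f"exited with status {exitcode}"
    signum = -exitcode
    try:
        # The mapping from number to name depends on the current platform.
        name = signal.Signals(signum).name
    except ValueError:
        name = "unknown signal"
    return f"killed by signal {signum} ({name})"


if __name__ == "__main__":
    print(describe_exitcode(-10))
    print(describe_exitcode(0))
```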

@pablogsal
Member Author

Interesting. I spent several hours trying to reproduce the crash manually on the buildbot itself and could not.

@vstinner
Member

vstinner commented Mar 4, 2019

I recently (Feb 25) fixed the config of this slow buildbot to change the timeout from 15 min to 20 min:

python/buildmaster-config@37cf09c
python/buildmaster-config@e4155a7

It seems like there is a correlation between this buildbot config change and the buildbot starting to create a coredump file. The test started to create a core dump around build 624, Feb 25.

--

With coredump.

Warning -- files was modified by test_multiprocessing_spawn:

https://buildbot.python.org/all/#/builders/168/builds/624
# 625 was fine
https://buildbot.python.org/all/#/builders/168/builds/626
https://buildbot.python.org/all/#/builders/168/builds/628
https://buildbot.python.org/all/#/builders/168/builds/629
https://buildbot.python.org/all/#/builders/168/builds/630
https://buildbot.python.org/all/#/builders/168/builds/631

--

Without coredump: the tests fail, but no core file is created.

ERROR: test_shared_memory_SharedMemoryManager_basics (test.test_multiprocessing_forkserver.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_across_processes (test.test_multiprocessing_forkserver.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_basics (test.test_multiprocessing_forkserver.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_ShareableList_basics (test.test_multiprocessing_spawn.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_ShareableList_pickling (test.test_multiprocessing_spawn.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_SharedMemoryManager_basics (test.test_multiprocessing_spawn.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_across_processes (test.test_multiprocessing_spawn.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_basics (test.test_multiprocessing_spawn.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_ShareableList_basics (test.test_multiprocessing_fork.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_ShareableList_pickling (test.test_multiprocessing_fork.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_SharedMemoryManager_basics (test.test_multiprocessing_fork.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_across_processes (test.test_multiprocessing_fork.WithProcessesTestSharedMemory)
ERROR: test_shared_memory_basics (test.test_multiprocessing_fork.WithProcessesTestSharedMemory)
Re-running failed tests in verbose mode

https://buildbot.python.org/all/#/builders/168/builds/617
https://buildbot.python.org/all/#/builders/168/builds/618
https://buildbot.python.org/all/#/builders/168/builds/620
https://buildbot.python.org/all/#/builders/168/builds/622
https://buildbot.python.org/all/#/builders/168/builds/623

@vstinner
Member

vstinner commented Mar 4, 2019

I set the priority to release blocker as a reminder to fix this regression. I guess that it's the same bug as bpo-36116, but since it's a different OS (FreeBSD vs. Windows), I prefer to keep using separate issues.

@pablogsal
Member Author

Can you paste the traceback of the dump that you were able to generate on your FreeBSD VM? I wonder if you can get the symbols that were missing from the one I pasted (likely in libthr.so).

@vstinner vstinner changed the title from "test_multiprocessing_spawn dumps core in AMD64 FreeBSD CURRENT Shared 3.x" to "test_multiprocessing_spawn: WithProcessesTestSharedMemory dumps core in AMD64 FreeBSD CURRENT Shared 3.x" Mar 4, 2019
@vstinner
Member

vstinner commented Mar 4, 2019

Example of crash:

  • Thread 2 exits Python: PyRun_SimpleStringFlags->PyErr_PrintEx->handle_system_exit->Py_Exit->OPENSSL_cleanup...
  • Thread 1 tries to acquire the GIL to close a file
  • Thread 3 is waiting in socket.socket.accept()

Problem: PyThreadState of Thread 1 is corrupted: its memory has been freed.

Thread 1 is the one that crashed:

(gdb) p tstate
$2 = (PyThreadState *) 0x8027e2050
(gdb) p *tstate
$3 = {
prev = 0xdbdbdbdbdbdbdbdb,
next = 0xdbdbdbdbdbdbdbdb,
interp = 0xdbdbdbdbdbdbdbdb,
...
}

(gdb) p _PyRuntime.gilstate.tstate_current
$4 = {
_value = 0
}

(gdb) thread apply all where

Thread 3 (LWP 100714):
#0 _accept4 () at _accept4.S:3
#1 0x0000000800679eaa in __thr_accept4 (s=5, addr=0x7fffdfff45f8, addrlen=0x7fffdfff45f0, flags=268435456) at /usr/src/lib/libthr/thread/thr_syscalls.c:126
#2 0x00000008016f9b55 in sock_accept_impl (s=0x801bda050, data=0x7fffdfff45c0) at /usr/home/vstinner/prog/python/master/Modules/socketmodule.c:2592
#3 0x00000008016fa1ec in sock_call_ex (s=0x801bda050, writing=0, sock_func=0x8016f9b00 <sock_accept_impl>, data=0x7fffdfff45c0, connect=0, err=0x0, timeout=-1000000000) at /usr/home/vstinner/prog/python/master/Modules/socketmodule.c:886
#4 0x00000008016f9aef in sock_call (s=0x801bda050, writing=0, func=0x8016f9b00 <sock_accept_impl>, data=0x7fffdfff45c0) at /usr/home/vstinner/prog/python/master/Modules/socketmodule.c:938
#5 0x00000008016f70e3 in sock_accept (s=0x801bda050, _unused_ignored=0x0) at /usr/home/vstinner/prog/python/master/Modules/socketmodule.c:2634
#6 0x000000000035415c in _PyMethodDef_RawFastCallKeywords (method=0x801701b70 <sock_methods>, self=<socket at remote 0x801bda050>, args=0x8027a01f8, nargs=0, kwnames=0x0) at Objects/call.c:631
#7 0x00000000004d3841 in _PyMethodDescr_FastCallKeywords (descrobj=<method_descriptor at remote 0x8016df360>, args=0x8027a01f0, nargs=1, kwnames=0x0) at Objects/descrobject.c:290
#8 0x000000000037e20e in call_function (pp_stack=0x7fffdfff4cb8, oparg=1, kwnames=0x0) at Python/ceval.c:4698
#9 0x0000000000378dec in _PyEval_EvalFrameDefault (f=Frame 0x8027a0050, for file /usr/home/vstinner/prog/python/master/Lib/socket.py, line 212, in accept (), throwflag=0) at Python/ceval.c:3280
#10 0x0000000000369595 in PyEval_EvalFrameEx (f=Frame 0x8027a0050, for file /usr/home/vstinner/prog/python/master/Lib/socket.py, line 212, in accept (), throwflag=0) at Python/ceval.c:624
...
#53 0x0000000800677776 in thread_start (curthread=0x80265a300) at /usr/src/lib/libthr/thread/thr_create.c:292

Thread 2 (LWP 100120):
#0 0x000000080076e981 in __je_tcache_event_hard (tsd=0x8005ec090, tcache=0x8005ec250) at jemalloc_tcache.c:54
#1 0x00000008007b05e3 in tcache_event (tsd=<optimized out>, tcache=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/tcache_inlines.h:37
#2 tcache_dalloc_small (tsd=<optimized out>, tcache=<optimized out>, ptr=<optimized out>, binind=<optimized out>, slow_path=false) at /usr/src/contrib/jemalloc/include/jemalloc/internal/tcache_inlines.h:185
#3 arena_dalloc (tcache=<optimized out>, slow_path=false, tsdn=<optimized out>, ptr=<optimized out>, alloc_ctx=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:224
#4 idalloctm (slow_path=false, tsdn=<optimized out>, ptr=<optimized out>, tcache=<optimized out>, alloc_ctx=<optimized out>, is_internal=<optimized out>) at /usr/src/contrib/jemalloc/include/jemalloc/internal/jemalloc_internal_inlines_c.h:118
#5 ifree (tsd=<optimized out>, ptr=<optimized out>, tcache=<optimized out>, slow_path=false) at jemalloc_jemalloc.c:2226
#6 __free (ptr=<optimized out>) at jemalloc_jemalloc.c:2397
#7 0x0000000801fabbd4 in OPENSSL_LH_free (lh=0x801b6c0c0) at /usr/src/crypto/openssl/crypto/lhash/lhash.c:88
#8 0x0000000801f2d6ac in lh_ERR_STRING_DATA_free (lh=<optimized out>) at /usr/src/crypto/openssl/include/openssl/err.h:217
#9 err_cleanup () at /usr/src/crypto/openssl/crypto/err/err.c:289
#10 0x0000000801fde3a7 in OPENSSL_cleanup () at /usr/src/crypto/openssl/crypto/init.c:569
#11 0x000000080082a0c5 in __cxa_finalize (dso=0x0) at /usr/src/lib/libc/stdlib/atexit.c:239
#12 0x00000008007b9cc1 in exit (status=0) at /usr/src/lib/libc/stdlib/exit.c:74
#13 0x000000000045c6e8 in Py_Exit (sts=0) at Python/pylifecycle.c:2166
#14 0x000000000041c403 in handle_system_exit () at Python/pythonrun.c:641
#15 0x000000000041bf86 in PyErr_PrintEx (set_sys_last_vars=1) at Python/pythonrun.c:651
#16 0x000000000041b20e in PyErr_Print () at Python/pythonrun.c:547
#17 0x000000000041be48 in PyRun_SimpleStringFlags (command=0x80134b460 "from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=9)\n", flags=0x7fffffffe7f0) at Python/pythonrun.c:462
#18 0x00000000002cb691 in pymain_run_command (command=0x800acd910 L"from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=6, pipe_handle=9)\n", cf=0x7fffffffe7f0) at Modules/main.c:527
#19 0x00000000002cadb5 in pymain_run_python (interp=0x800b08010, exitcode=0x7fffffffe8cc) at Modules/main.c:804
#20 0x00000000002ca6ba in pymain_main (args=0x7fffffffe928) at Modules/main.c:896
#21 0x00000000002ca788 in _Py_UnixMain (argc=4, argv=0x7fffffffe9e8) at Modules/main.c:937
#22 0x00000000002c9372 in main (argc=4, argv=0x7fffffffe9e8) at ./Programs/python.c:16

Thread 1 (LWP 100696):
#0 0x0000000000368210 in take_gil (tstate=0x8027e2050) at Python/ceval_gil.h:216
#1 0x0000000000368a94 in PyEval_RestoreThread (tstate=0x8027e2050) at Python/ceval.c:281
#2 0x000000000058fdbe in internal_close (self=0x802cb7440) at ./Modules/_io/fileio.c:121
#3 0x000000000058fcd3 in _io_FileIO_close_impl (self=0x802cb7440) at ./Modules/_io/fileio.c:163
#4 0x000000000058ece9 in _io_FileIO_close (self=0x802cb7440, _unused_ignored=0x0) at ./Modules/_io/clinic/fileio.c.h:23
#5 0x00000000003537d9 in _PyMethodDef_RawFastCallDict (method=0x6298f0 <fileio_methods+224>, self=<_io.FileIO at remote 0x802cb7440>, args=0x7fffdf3e92e0, nargs=0, kwargs=0x0) at Objects/call.c:484
#6 0x000000000035200d in _PyCFunction_FastCallDict (func=<built-in method close of _io.FileIO object at remote 0x802cb7440>, args=0x7fffdf3e92e0, nargs=0, kwargs=0x0) at Objects/call.c:584
#7 0x0000000000351501 in _PyObject_FastCallDict (callable=<built-in method close of _io.FileIO object at remote 0x802cb7440>, args=0x7fffdf3e92e0, nargs=0, kwargs=0x0) at Objects/call.c:103
#8 0x000000000035613e in object_vacall (callable=<built-in method close of _io.FileIO object at remote 0x802cb7440>, vargs=0x7fffdf3e94b0) at Objects/call.c:1200
#9 0x0000000000355f07 in PyObject_CallMethodObjArgs (callable=<built-in method close of _io.FileIO object at remote 0x802cb7440>, name='close') at Objects/call.c:1225
#10 0x0000000000597096 in buffered_close (self=0x8025e2ea8, args=0x0) at ./Modules/_io/bufferedio.c:524
#11 0x00000000003537d9 in _PyMethodDef_RawFastCallDict (method=0x62bc10 <bufferedreader_methods+64>, self=<_io.BufferedReader at remote 0x8025e2ea8>, args=0x0, nargs=0, kwargs=0x0) at Objects/call.c:484
#12 0x000000000035200d in _PyCFunction_FastCallDict (func=<built-in method close of _io.BufferedReader object at remote 0x8025e2ea8>, args=0x0, nargs=0, kwargs=0x0) at Objects/call.c:584
#13 0x0000000000351501 in _PyObject_FastCallDict (callable=<built-in method close of _io.BufferedReader object at remote 0x8025e2ea8>, args=0x0, nargs=0, kwargs=0x0) at Objects/call.c:103
#14 0x0000000000354dc6 in _PyObject_CallFunctionVa (callable=<built-in method close of _io.BufferedReader object at remote 0x8025e2ea8>, format=0x0, va=0x7fffdf3e9950, is_size_t=1) at Objects/call.c:933
#15 0x000000000035558b in callmethod (callable=<built-in method close of _io.BufferedReader object at remote 0x8025e2ea8>, format=0x0, va=0x7fffdf3e9950, is_size_t=1) at Objects/call.c:1029
#16 0x0000000000355d20 in _PyObject_CallMethodId_SizeT (obj=<_io.BufferedReader at remote 0x8025e2ea8>, name=0x62f660 <PyId_close>, format=0x0) at Objects/call.c:1147
#17 0x00000000005a5373 in _io_TextIOWrapper_close_impl (self=0x80270c750) at ./Modules/_io/textio.c:2957
#18 0x00000000005a26c9 in _io_TextIOWrapper_close (self=0x80270c750, _unused_ignored=0x0) at ./Modules/_io/clinic/textio.c.h:552
#19 0x00000000003537d9 in _PyMethodDef_RawFastCallDict (method=0x62e960 <textiowrapper_methods+192>, self=<_io.TextIOWrapper at remote 0x80270c750>, args=0x7fffdf3e9c40, nargs=0, kwargs=0x0) at Objects/call.c:484
#20 0x000000000035200d in _PyCFunction_FastCallDict (func=<built-in method close of _io.TextIOWrapper object at remote 0x80270c750>, args=0x7fffdf3e9c40, nargs=0, kwargs=0x0) at Objects/call.c:584
#21 0x0000000000351501 in _PyObject_FastCallDict (callable=<built-in method close of _io.TextIOWrapper object at remote 0x80270c750>, args=0x7fffdf3e9c40, nargs=0, kwargs=0x0) at Objects/call.c:103
#22 0x000000000035613e in object_vacall (callable=<built-in method close of _io.TextIOWrapper object at remote 0x80270c750>, vargs=0x7fffdf3e9e10) at Objects/call.c:1200
#23 0x0000000000355f07 in PyObject_CallMethodObjArgs (callable=<built-in method close of _io.TextIOWrapper object at remote 0x80270c750>, name='close') at Objects/call.c:1225
#24 0x000000000058c5d7 in iobase_exit (self=<_io.TextIOWrapper at remote 0x80270c750>, args=(None, None, None)) at ./Modules/_io/iobase.c:469
#25 0x000000000035396b in _PyMethodDef_RawFastCallDict (method=0x629270 <iobase_methods+480>, self=<_io.TextIOWrapper at remote 0x80270c750>, args=0x7fffdf3ea4e0, nargs=3, kwargs=0x0) at Objects/call.c:520
#26 0x000000000035200d in _PyCFunction_FastCallDict (func=<built-in method __exit__ of _io.TextIOWrapper object at remote 0x80270c750>, args=0x7fffdf3ea4e0, nargs=3, kwargs=0x0) at Objects/call.c:584
#27 0x0000000000351501 in _PyObject_FastCallDict (callable=<built-in method __exit__ of _io.TextIOWrapper object at remote 0x80270c750>, args=0x7fffdf3ea4e0, nargs=3, kwargs=0x0) at Objects/call.c:103
#28 0x0000000000378239 in _PyEval_EvalFrameDefault (f=Frame 0x802c20630, for file /usr/home/vstinner/prog/python/master/Lib/linecache.py, line 393, in updatecache (), throwflag=0) at Python/ceval.c:3155
...
#107 0x0000000800677776 in thread_start (curthread=0x8027e7a00) at /usr/src/lib/libthr/thread/thr_create.c:292
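
The scenario summarized at the top of this comment has a simple shape: the main thread exits the interpreter while another thread is still running and will later try to re-acquire the GIL. The sketch below reproduces only that shape in a fresh child interpreter. It is not a reproducer for this crash; on an unaffected build the child is expected to exit with status 0, and a negative status would indicate that it was killed by a signal.

```python
import subprocess
import sys
import textwrap

CHILD_SCRIPT = textwrap.dedent("""
    import sys, threading, time

    def worker():
        # Each sleep releases the GIL, so the thread must re-acquire it afterwards.
        while True:
            time.sleep(0.001)

    threading.Thread(target=worker, daemon=True).start()
    time.sleep(0.05)
    sys.exit(0)  # the main thread begins finalization while the daemon thread is alive
""")


def run_child():
    """Run the scenario in a fresh interpreter and return its exit status."""
    proc = subprocess.run([sys.executable, "-c", CHILD_SCRIPT], timeout=60)
    return proc.returncode


if __name__ == "__main__":
    print("child exit status:", run_child())
```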

@vstinner
Member

vstinner commented Mar 4, 2019

Sometimes, I can reproduce the crash using:
./python -m test --matchfile=bisect5 test_multiprocessing_spawn --fail-env-changed -F

Using this file:

test.test_multiprocessing_spawn.WithThreadsTestQueue.test_timeout
test.test_multiprocessing_spawn.WithProcessesTestBarrier.test_default_timeout
test.test_multiprocessing_spawn.WithThreadsTestManagerRestart.test_rapid_restart
test.test_multiprocessing_spawn.WithProcessesTestPool.test_release_task_refs
test.test_multiprocessing_spawn.WithManagerTestLock.test_rlock

It seems like the following test alone is enough to create a coredump:

test.test_multiprocessing_spawn.WithThreadsTestManagerRestart.test_rapid_restart

Problem: it's really hard to write a *reliable* script/method to trigger the crash. This race condition is very well hidden!
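
The match file and command line above can be generated with a small helper, which makes it easier to try other subsets of tests. This is a sketch only: the function names are hypothetical, the test list and flags are copied from this comment, and `./python` assumes the command is run from a CPython build directory.

```python
from pathlib import Path

# Test identifiers copied from the match file quoted in this comment.
TEST_IDS = [
    "test.test_multiprocessing_spawn.WithThreadsTestQueue.test_timeout",
    "test.test_multiprocessing_spawn.WithProcessesTestBarrier.test_default_timeout",
    "test.test_multiprocessing_spawn.WithThreadsTestManagerRestart.test_rapid_restart",
    "test.test_multiprocessing_spawn.WithProcessesTestPool.test_release_task_refs",
    "test.test_multiprocessing_spawn.WithManagerTestLock.test_rlock",
]


def write_matchfile(path, test_ids=TEST_IDS):
    """Write one test identifier per line and return the path."""
    path = Path(path)
    path.write_text("\n".join(test_ids) + "\n")
    return path


def build_command(matchfile, python="./python"):
    """Build the command line used above; -F repeats the tests until one fails."""
    return [
        python, "-m", "test",
        f"--matchfile={matchfile}",
        "test_multiprocessing_spawn",
        "--fail-env-changed",
        "-F",
    ]
```

From a build directory, `subprocess.run(build_command(write_matchfile("bisect5")))` would then start the loop. Passing a shorter `test_ids` list is one way to narrow down which tests are needed to trigger the crash.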

@vstinner vstinner changed the title from "test_multiprocessing_spawn: WithProcessesTestSharedMemory dumps core in AMD64 FreeBSD CURRENT Shared 3.x" to "test_multiprocessing_spawn dumps core in AMD64 FreeBSD CURRENT Shared 3.x" Mar 4, 2019
@vstinner
Member

vstinner commented Mar 4, 2019

Current suspect:

commit ef4ac96
Author: Eric Snow <ericsnowcurrently@gmail.com>
Date: Sun Feb 24 15:40:47 2019 -0800

bpo-33608: Factor out a private, per-interpreter _Py_AddPendingCall(). (GH-11617)

This involves moving the global "pending calls" state to PyInterpreterState.
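
To make the commit message concrete: a "pending call" is a callback queued from any thread to be run later by the interpreter. The toy model below shows the idea of the queue being owned by an interpreter object rather than by global state. It is a conceptual Python sketch with hypothetical names, not CPython's C implementation.

```python
import threading
from collections import deque


class Interpreter:
    """Toy stand-in for per-interpreter state that owns its pending-call queue."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = deque()

    def add_pending_call(self, func, arg):
        """Queue func(arg) for later; may be called from any thread."""
        with self._lock:
            self._pending.append((func, arg))

    def make_pending_calls(self):
        """Run queued calls in FIFO order and return how many were run."""
        count = 0
        while True:
            with self._lock:
                if not self._pending:
                    return count
                func, arg = self._pending.popleft()
            func(arg)  # run outside the lock so a callback may queue more calls
            count += 1
```

In a design like this, every thread that touches the queue needs the owning object to still be alive, which is why the lifetime of the interpreter state matters when threads outlive the main thread.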

@vstinner
Member

vstinner commented Mar 4, 2019

Ok, I confirm that the commit ef4ac96 introduced a regression in test.test_multiprocessing_spawn.WithThreadsTestManagerRestart.test_rapid_restart.

@ericsnowcurrently
Member

This is resolved with #56368, no?

@vstinner
Member

vstinner commented Mar 6, 2019

This is resolved with #56368, no?

I was waiting to see whether the buildbot workers would recover. They did, so I am closing the issue.

@vstinner vstinner closed this as completed Mar 6, 2019
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022