classification
Title: repeated Py_Initialize/PyRun_SimpleString/Py_Finalize segfaults
Type: crash Stage: resolved
Components: Interpreter Core Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: Jelle Zijlstra, brett.cannon, ned.deily, python-dev, smcv, xiang.zhang
Priority: Keywords: patch

Created on 2016-08-11 08:46 by smcv, last changed 2016-08-16 04:24 by ned.deily. This issue is now closed.

Files
File name Uploaded Description Edit
issue27736.patch xiang.zhang, 2016-08-13 13:52 review
Messages (13)
msg272423 - (view) Author: Simon McVittie (smcv) Date: 2016-08-11 08:46
dbus-python has a regression test for https://bugs.freedesktop.org/show_bug.cgi?id=23831 which repeatedly initializes the interpreter, imports dbus and finalizes the interpreter. This test passes in Python up to 3.5, but is failing under Python 3.6 nightly builds on Travis-CI, and in Python 3.6.0a3 (package version 3.6.0~a3-1) on Debian.

I've been able to reproduce the crash without anything specific to dbus with this C code:

#include <stdio.h>

#include <Python.h>

int main(void)
{
    int i;

    puts("1..1");

    for (i = 0; i < 100; ++i) {
        Py_Initialize();
        if (PyRun_SimpleString("\n") != 0) {
            puts("not ok 1 - there was an exception");
            return 1;
        }
        Py_Finalize();
    }

    puts("ok 1 - was able to loop 100 times");

    return 0;
}

It appears the crash is reliably in the 10th repeat:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff783a4bb in type_dealloc (type=0x7ffff7d8ad80 <DirEntryType>)
    at ../Objects/typeobject.c:3032
3032	../Objects/typeobject.c: No such file or directory.
(gdb) bt
#0  0x00007ffff783a4bb in type_dealloc (type=0x7ffff7d8ad80 <DirEntryType>)
    at ../Objects/typeobject.c:3032
#1  0x00007ffff7817a1b in insertdict (value=<optimized out>, hash=<optimized out>, 
    key=<optimized out>, mp=<optimized out>) at ../Objects/dictobject.c:806
#2  PyDict_SetItem (
    op=op@entry={'open': None, 'O_DIRECT': None, 'chdir': None, 'O_ACCMODE': None, '__package__': None, 'WCOREDUMP': None, 'setgroups': None, 'O_CREAT': None, 'O_CLOEXEC': None, 'chown': None, 'sched_getscheduler': None, 'RTLD_NODELETE': None, 'terminal_size': None, 'EX_IOERR': None, 'sched_setaffinity': None, 'XATTR_SIZE_MAX': None, 'fstat': None, 'sched_rr_get_interval': None, 'O_LARGEFILE': None, 'times_result': None, 'get_inheritable': None, 'WIFEXITED': None, 'ST_NODEV': None, 'forkpty': None, 'ctermid': None, 'O_RSYNC': None, 'SCHED_FIFO': None, 'stat': None, 'replace': None, 'EX_NOINPUT': None, 'WUNTRACED': None, 'set_blocking': None, '_have_functions': None, 'unsetenv': None, 'setresgid': None, 'fchown': None, 'getgrouplist': None, 'openpty': None, 'lockf': None, 'chroot': None, 'readv': None, 'EX_NOHOST': None, 'error': None, 'WEXITSTATUS': None, 'WIFSIGNALED': None, 'WNOHANG': None, 'POSIX_FADV_WILLNEED': None, 'SEEK_HOLE': None, 'dup': None, 'POSIX_FADV_NOREUSE': None, 'kill': None, 'statvfs_result': None, 'WIFCON...(truncated), 
    key='DirEntry', value=value@entry=None) at ../Objects/dictobject.c:1228
#3  0x00007ffff782659c in _PyModule_ClearDict (
    d={'open': None, 'O_DIRECT': None, 'chdir': None, 'O_ACCMODE': None, '__package__': None, 'WCOREDUMP': None, 'setgroups': None, 'O_CREAT': None, 'O_CLOEXEC': None, 'chown': None, 'sched_getscheduler': None, 'RTLD_NODELETE': None, 'terminal_size': None, 'EX_IOERR': None, 'sched_setaffinity': None, 'XATTR_SIZE_MAX': None, 'fstat': None, 'sched_rr_get_interval': None, 'O_LARGEFILE': None, 'times_result': None, 'get_inheritable': None, 'WIFEXITED': None, 'ST_NODEV': None, 'forkpty': None, 'ctermid': None, 'O_RSYNC': None, 'SCHED_FIFO': None, 'stat': None, 'replace': None, 'EX_NOINPUT': None, 'WUNTRACED': None, 'set_blocking': None, '_have_functions': None, 'unsetenv': None, 'setresgid': None, 'fchown': None, 'getgrouplist': None, 'openpty': None, 'lockf': None, 'chroot': None, 'readv': None, 'EX_NOHOST': None, 'error': None, 'WEXITSTATUS': None, 'WIFSIGNALED': None, 'WNOHANG': None, 'POSIX_FADV_WILLNEED': None, 'SEEK_HOLE': None, 'dup': None, 'POSIX_FADV_NOREUSE': None, 'kill': None, 'statvfs_result': None, 'WIFCON...(truncated))
    at ../Objects/moduleobject.c:593
#4  0x00007ffff782672e in _PyModule_Clear (m=m@entry=<module at remote 0x7ffff65e6598>)
    at ../Objects/moduleobject.c:544
#5  0x00007ffff78d5874 in PyImport_Cleanup () at ../Python/import.c:452
#6  0x00007ffff78e3a38 in Py_FinalizeEx () at ../Python/pylifecycle.c:588
#7  0x0000000000400795 in main () at /home/smcv/src/dbus-python/test/import-repeatedly.c:19
(gdb) frame 7
#7  0x0000000000400795 in main () at /home/smcv/src/dbus-python/test/import-repeatedly.c:19
19	        Py_Finalize();
(gdb) p i
$1 = 10
msg272424 - (view) Author: Simon McVittie (smcv) Date: 2016-08-11 08:48
This might be a duplicate of https://bugs.python.org/issue24853 but there wasn't enough detail on that bug for me to be sure.
msg272560 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2016-08-12 19:51
Thanks for the good test, Simon.  Bisection points to b841972ed0bd for Issue27038 as the culprit here.  (It was pushed between 3.6.0a2 and a3.)  Brett, Jelle, can you please take a look at this?  I'm going to keep it as a Release Blocker for now.
msg272564 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-08-12 21:01
The revision that Ned found only changed posixmodule.c by adding PyModule_AddObject(m, "DirEntry", (PyObject *)&DirEntryType); at the end of the module initializer. Only thing I can think of is it needs to go into the `if (!initialized)` block: https://github.com/python/cpython/blob/11b48001fb8d908f6db164e9d2233e911f22d4f4/Modules/posixmodule.c#L13214 (maybe as an else clause for the PyType_Ready() call at https://github.com/python/cpython/blob/11b48001fb8d908f6db164e9d2233e911f22d4f4/Modules/posixmodule.c#L13259).
msg272586 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-13 13:11
It looks to me from the traceback Simon given that DirEntryType's dealloc function is called. From the code, shouldn't we Py_INCREF DirEntryType before PyModule_AddObject?
msg272587 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-13 13:52
Try Simon's snippet I think I am right.

With the current code, the result is:

1..1
Segmentation fault (core dumped)

With Py_INCREF:

1..1
ok 1 - was able to loop 100 times

I upload the working patch.
msg272612 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2016-08-13 20:17
Thanks for the patch.  FWIW, that does seem to solve the crash. Brett, look good to you?  If so, I can push the fix.
msg272776 - (view) Author: Brett Cannon (brett.cannon) * (Python committer) Date: 2016-08-15 16:23
Patch LGTM.
msg272791 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-08-15 18:41
New changeset a95d98086621 by Ned Deily in branch 'default':
Issue #27736: Prevent segfault after interpreter re-initialization due
https://hg.python.org/cpython/rev/a95d98086621
msg272794 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2016-08-15 19:50
I've applied the patch for 3.6.0a4 so the immediate issue is fixed.  I wonder if it would be worthwhile adapting Simon's test case to add to the embedding test cases in Lib/test/test_capi.py.  I'm going to leave the issue open for a while in case someone is interested in doing that.
msg272820 - (view) Author: Xiang Zhang (xiang.zhang) * (Python committer) Date: 2016-08-16 03:19
We don't have to change Lib/test/test_capi.py. It has already got a repeated init and fini test[1]. All we need to do is increase the loop number[2]. I have tried with 15, the test case fails reliably. And with the previous patch applied, it won't fail.

[1] https://hg.python.org/cpython/file/tip/Lib/test/test_capi.py#l386
[2] https://hg.python.org/cpython/file/tip/Programs/_testembed.c#l44
msg272824 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016-08-16 04:19
New changeset 4f69626fd923 by Ned Deily in branch 'default':
Issue #27736: Improve the existing embedded interpreter init/fini test
https://hg.python.org/cpython/rev/4f69626fd923
msg272825 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2016-08-16 04:24
Excellent!  Thanks for digging into that.  Increasing the iterations seems to work fine within the (understandable) limitations of the existing embedded tests, e.g. only work from a source tree build (not an installed build nor from a separate build directory) etc.  Case closed.
History
Date User Action Args
2016-08-16 04:24:58ned.deilysetstatus: open -> closed
resolution: fixed
messages: + msg272825

stage: test needed -> resolved
2016-08-16 04:19:11python-devsetmessages: + msg272824
2016-08-16 03:19:33xiang.zhangsetmessages: + msg272820
2016-08-15 19:50:59ned.deilysetpriority: release blocker ->
assignee: ned.deily ->
messages: + msg272794

stage: commit review -> test needed
2016-08-15 18:41:57python-devsetnosy: + python-dev
messages: + msg272791
2016-08-15 16:23:05brett.cannonsetassignee: brett.cannon -> ned.deily
messages: + msg272776
stage: patch review -> commit review
2016-08-13 20:17:05ned.deilysetmessages: + msg272612
stage: needs patch -> patch review
2016-08-13 13:52:01xiang.zhangsetfiles: + issue27736.patch
keywords: + patch
messages: + msg272587
2016-08-13 13:11:46xiang.zhangsetnosy: + xiang.zhang
messages: + msg272586
2016-08-12 21:01:18brett.cannonsetmessages: + msg272564
2016-08-12 19:51:06ned.deilysetnosy: + brett.cannon, Jelle Zijlstra
messages: + msg272560

assignee: brett.cannon
stage: needs patch
2016-08-12 06:31:07gregory.p.smithsetpriority: normal -> release blocker
nosy: + ned.deily
2016-08-11 08:48:20smcvsetmessages: + msg272424
2016-08-11 08:46:27smcvcreate