FreeBSD: test_threading: test_recursion_limit() crash with SIGSEGV and create a coredump #82087

Closed
vstinner opened this issue Aug 21, 2019 · 13 comments
Labels
3.9 only security fixes tests Tests in the Lib/test dir

Comments

@vstinner
Member

BPO 37906
Nosy @ronaldoussoren, @vstinner, @xdegaye
Files
  • stack.py
  • stack2.py
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-08-21.12:45:23.346>
    labels = ['tests', '3.9']
    title = 'FreeBSD: test_threading: test_recursion_limit() crash with SIGSEGV and create a coredump'
    updated_at = <Date 2019-11-21.10:23:28.330>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2019-11-21.10:23:28.330>
    actor = 'xdegaye'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Tests']
    creation = <Date 2019-08-21.12:45:23.346>
    creator = 'vstinner'
    dependencies = []
    files = ['48553', '48554']
    hgrepos = []
    issue_num = 37906
    keywords = []
    message_count = 7.0
    messages = ['350079', '350081', '350082', '350085', '350088', '350089', '357153']
    nosy_count = 3.0
    nosy_names = ['ronaldoussoren', 'vstinner', 'xdegaye']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue37906'
    versions = ['Python 3.9']

    @vstinner
    Member Author

    On my FreeBSD 12.0-RELEASE-p10 VM, test_threading.test_recursion_limit() crashes with SIGSEGV and creates a core dump.

    vstinner@freebsd$ ./python -m test -v test_threading -m test_recursion_limit
    == CPython 3.9.0a0 (heads/master:e0b6117e27, Aug 21 2019, 12:23:28) [Clang 6.0.1 (tags/RELEASE_601/final 335540)]
    == FreeBSD-12.0-RELEASE-p10-amd64-64bit-ELF little-endian
    == cwd: /usr/home/vstinner/python/master/build/test_python_3547
    == CPU count: 2
    == encodings: locale=UTF-8, FS=utf-8
    Run tests sequentially
    0:00:01 load avg: 4.85 [1/1] test_threading
    test_recursion_limit (test.test_threading.ThreadingExceptionTests) ... FAIL

    ======================================================================
    FAIL: test_recursion_limit (test.test_threading.ThreadingExceptionTests)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/usr/home/vstinner/python/master/Lib/test/test_threading.py", line 1086, in test_recursion_limit
        self.assertEqual(p.returncode, 0, "Unexpected error: " + stderr.decode())
    AssertionError: -11 != 0 : Unexpected error: 

    Ran 1 test in 6.017s

    FAILED (failures=1)
    Warning -- files was modified by test_threading
    Before: []
    After: ['python.core']
    test test_threading failed
    test_threading failed

    == Tests result: FAILURE ==

    1 test failed:
    test_threading

    Total duration: 7 sec 284 ms
    Tests result: FAILURE
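
The `-11` in the assertion above is how `subprocess` reports a child that was killed by a signal: on POSIX a negative return code is the negated signal number, and 11 is SIGSEGV. A minimal sketch of decoding such a value (the helper name `describe_returncode` is made up for illustration):

```python
import signal

def describe_returncode(returncode):
    """Turn a subprocess return code into a short human-readable string."""
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"unknown signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"

print(describe_returncode(-signal.SIGSEGV))  # killed by SIGSEGV
print(describe_returncode(0))                # exited with status 0
```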

    @vstinner vstinner added 3.9 only security fixes tests Tests in the Lib/test dir labels Aug 21, 2019
    @vstinner
    Member Author

    I used git bisect in the 3.8 branch and I found this change:

    commit 8399641
    Author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
    Date: Thu Aug 1 07:38:57 2019 -0700

    bpo-18049: Sync thread stack size to main thread size on macOS (GH-14748)
    
    
    This changeset increases the default size of the stack
    for threads on macOS to the size of the stack
    of the main thread and reenables the relevant
    recursion test.
    (cherry picked from commit 1a057bab0f18d6ad843ce321d1d77a4819497ae4)
    
    Co-authored-by: Ronald Oussoren <ronaldoussoren@mac.com>
    

    Before this change, the test was skipped on FreeBSD:

    ...
    test_recursion_limit (test.test_threading.ThreadingExceptionTests) ... skipped 'test macosx problem'
    ...

    @vstinner
    Member Author

    The crash starts to occur with a Python call stack deeper than 750 frames. It doesn't crash with setrecursionlimit(750). See attached stack.py. Example:

    vstinner@freebsd$ ./python stack.py 750 10240
    setrecursionlimit(750)
    stack_size: 10240.0 kiB = 10.0 MiB
    end of main thread

    It seems like calling _thread.stack_size(s) doesn't help:

    vstinner@freebsd$ ./python stack.py 1000 10240
    setrecursionlimit(1000)
    stack_size: 10240.0 kiB = 10.0 MiB
    Segmentation fault (core dumped)

    --

    I see different options:

    • Reduce the Python recursion limit in the test
    • Find a way to increase the default thread stack size on FreeBSD
    • Skip the test on FreeBSD
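
Two of the options above (a lower recursion limit, or skipping on FreeBSD) could look roughly like this in a test. This is only a sketch: the class name, the constant `REDUCED_LIMIT` and the skip message are invented for illustration, and the real test_recursion_limit runs the recursion in a thread of a child process rather than inline.

```python
import sys
import unittest

# Hypothetical value; the thread only establishes that 750 did not crash.
REDUCED_LIMIT = 500

class RecursionLimitSketch(unittest.TestCase):
    @unittest.skipIf(sys.platform.startswith("freebsd"),
                     "default thread stack size may be too small (sketch)")
    def test_recursion_limit(self):
        def recurse():
            recurse()

        old_limit = sys.getrecursionlimit()
        sys.setrecursionlimit(REDUCED_LIMIT)
        try:
            # With a low limit, RecursionError should be raised long
            # before the C stack is exhausted.
            with self.assertRaises(RecursionError):
                recurse()
        finally:
            sys.setrecursionlimit(old_limit)
```

Saved to a file, this can be run with `python -m unittest`; in practice only one of the two mitigations would be needed.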

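    On macOS, configure.ac raises the main thread's stack size to 16 MiB (the linker reads the -stack_size value as hexadecimal, so 1000000 means 0x1000000 bytes):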

                # Issue bpo-18075: the default maximum stack size (8MBytes) is too
                # small for the default recursion limit. Increase the stack size
                # to ensure that tests don't crash
                # Note: This matches the value of THREAD_STACK_SIZE in
                # thread_pthread.h
                LINKFORSHARED="-Wl,-stack_size,1000000 $LINKFORSHARED"
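
Assuming the macOS linker reads the `-stack_size` argument as a hexadecimal number (which is what its man page describes), the `1000000` above is 0x1000000 bytes rather than one million. A quick check of the arithmetic under that assumption:

```python
MIB = 1024 * 1024

as_hex = int("1000000", 16)      # value if the argument is parsed as hexadecimal
as_decimal = int("1000000", 10)  # value if it were parsed as decimal

print(as_hex, as_hex / MIB)      # 16777216 16.0
print(as_decimal < MIB)          # True: a decimal reading would be under 1 MiB
```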
    

    @vstinner
    Member Author

    Oh, my script called _thread.stack_size() to read the stack size, but that doesn't work: calling _thread.stack_size() without an argument resets the stack size to 0.

    stack2.py is the fixed script. It seems like the test works with a stack of 8 MiB and the default recursion limit of 1000 Python frames:

    vstinner@freebsd$ ./python stack.py 1000 4096
    setrecursionlimit(1000)
    stack_size: 4096.0 kiB = 4.0 MiB
    Segmentation fault (core dumped)

    vstinner@freebsd$ ./python stack.py 1000 8192
    setrecursionlimit(1000)
    stack_size: 8192.0 kiB = 8.0 MiB
    end of main thread

    So the problem is that the FreeBSD default thread stack size is too small.
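
For readers without the attachments, the experiment can be approximated as below. This is a guess at the shape of the script, not the attached stack2.py: the function names are invented, and it uses the public `threading.stack_size()` wrapper. It never calls `stack_size()` without an argument, since that would reset the size as described above.

```python
import sys
import threading

def recurse_until_limit():
    """Recurse until RecursionError is raised; return the depth reached."""
    depth = 0

    def inner():
        nonlocal depth
        depth += 1
        inner()

    try:
        inner()
    except RecursionError:
        pass
    return depth

def run_in_thread(recursion_limit, stack_kib):
    """Run the recursion in a new thread with the given stack size (KiB)."""
    old_limit = sys.getrecursionlimit()
    # stack_size(n) applies to threads created afterwards and returns the
    # previous setting, so the old value can be restored later.
    old_size = threading.stack_size(stack_kib * 1024)
    sys.setrecursionlimit(recursion_limit)
    result = {}

    def worker():
        result["depth"] = recurse_until_limit()

    try:
        thread = threading.Thread(target=worker)
        thread.start()
        thread.join()
    finally:
        sys.setrecursionlimit(old_limit)
        threading.stack_size(old_size)
    return result["depth"]
```

Calling `run_in_thread(1000, 8192)` corresponds to `stack.py 1000 8192`. If the stack is too small, the whole process dies with SIGSEGV instead of returning, so a call like this is best made from a child process.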

    @ronaldoussoren
    Contributor

    I'd increase the default stack size for FreeBSD as well. AFAIK FreeBSD uses clang as the default compiler like macOS, and it is therefore likely that the two platforms have similar stack usage for similar code.

    BTW, I can provide a PR for this, but I cannot easily test on FreeBSD.

    @vstinner
    Member Author

    On my FreeBSD 12.0 VM, /usr/bin/ld is a symbolic link to ld.lld: it's the LLVM linker called "LLD":
    https://lld.llvm.org/

    I manually modified the Makefile to give the main thread a 16 MiB stack using:

    LINKFORSHARED= -Wl,--export-dynamic "-Wl,-z" "-Wl,stack-size=0x1000000"

    I manually added the following to Python/thread_pthread.h to give threads a 16 MiB stack:

    #undef  THREAD_STACK_SIZE
    #define THREAD_STACK_SIZE       0x1000000

    With both changes, RecursionError is raised at a depth of 1000 Python function calls, before the C stack is exhausted.

    --

    The syntax used in configure.ac doesn't work with LLD. I get this error:

    cc -pthread -Wl,--export-dynamic -Wl,-stack_size,1000000 -o Programs/_testembed Programs/_testembed.o libpython3.9d.a -lcrypt -ldl -lutil -lm -lm

    /usr/bin/ld: error: cannot open 1000000: No such file or directory
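
The error is consistent with how the compiler driver handles `-Wl,`: the text after it is split on commas and each piece is passed to the linker as a separate argument, so `1000000` arrives as a bare word that LLD evidently treats as an input file. A small sketch of that splitting (the helper `split_wl` is hypothetical and only mimics the comma rule):

```python
def split_wl(driver_arg):
    """Return the linker arguments carried by one -Wl,... driver option."""
    prefix = "-Wl,"
    if not driver_arg.startswith(prefix):
        raise ValueError(f"not a -Wl option: {driver_arg!r}")
    return driver_arg[len(prefix):].split(",")

print(split_wl("-Wl,-stack_size,1000000"))      # ['-stack_size', '1000000']
print(split_wl("-Wl,-z,stack-size=0x1000000"))  # ['-z', 'stack-size=0x1000000']
```

In the second form the size stays attached to its key, so no stray argument is left over for the linker to interpret as a file name.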

    @xdegaye
    Mannequin

    xdegaye mannequin commented Nov 21, 2019

    See the android related issue bpo-38852.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @emaste
    Contributor

    emaste commented Apr 19, 2022

    > Using that, the RecursionError is raised at a depth of 1000 Python function calls: before we exhaust the C stack.

    To confirm, this is the expected/desired result?

    > The syntax used in configure.ac doesn't work with LLD.

    It looks like lld expects the = form (stack-size=<value>) for this -z option:

    static bool isKnownZFlag(StringRef s) {
      return llvm::is_contained(knownZFlags, s) ||
             s.startswith("common-page-size=") || s.startswith("bti-report=") ||
             s.startswith("cet-report=") ||
             s.startswith("dead-reloc-in-nonalloc=") ||
             s.startswith("max-page-size=") || s.startswith("stack-size=") ||
             s.startswith("start-stop-visibility=");
    }
    
    uint64_t lld::args::getZOptionValue(opt::InputArgList &args, int id,
                                        StringRef key, uint64_t Default) {
      for (auto *arg : args.filtered_reverse(id)) {
        std::pair<StringRef, StringRef> kv = StringRef(arg->getValue()).split('=');
        if (kv.first == key) {
          uint64_t result = Default;
          if (!to_integer(kv.second, result))
            error("invalid " + key + ": " + kv.second);
          return result;
        }
      }
      return Default;
    }
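
In other words, each `-z` argument is split at the first `=`, and the last matching key wins because the arguments are scanned in reverse. A rough Python rendering of that logic, for illustration only: it raises on a bad number where the C++ reports an error and keeps the default, and `int(value, 0)` is only an approximation of LLVM's `to_integer` radix detection.

```python
def get_z_option_value(z_args, key, default):
    """Return the integer value of the last `key=value` entry in z_args."""
    for arg in reversed(z_args):
        # Split at the first '=' only, like StringRef::split('=').
        name, _, value = arg.partition("=")
        if name == key:
            try:
                return int(value, 0)  # accepts decimal and 0x-prefixed hex
            except ValueError:
                raise ValueError(f"invalid {key}: {value}") from None
    return default

print(get_z_option_value(["stack-size=0x1000000"], "stack-size", 0))  # 16777216
```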
    

    @Jehops

    Jehops commented Apr 19, 2022

    Something seems to have changed in the meantime, because all tests now pass for me on both supported FreeBSD production releases (12.3 and 13.0) and the main branch (all under amd64). Here is the output.

    % uname -a
    FreeBSD 12amd64-default 12.3-RELEASE-p4 FreeBSD 12.3-RELEASE-p4 amd64
    % clang -v
    FreeBSD clang version 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611aa2)
    % python3.8 -m test -v test_threading -m test_recursion_limit
    == CPython 3.8.13 (default, Apr 2 2022, 14:51:52) [Clang 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611a
    == FreeBSD-12.3-RELEASE-p4-amd64-64bit-ELF little-endian
    == cwd: /tmp/test_python_70495
    == CPU count: 12
    == encodings: locale=UTF-8, FS=utf-8
    0:00:00 load avg: 0.18 Run tests sequentially
    0:00:00 load avg: 0.18 [1/1] test_threading
    test_recursion_limit (test.test_threading.ThreadingExceptionTests) ... ok
    
    ----------------------------------------------------------------------
    
    Ran 1 test in 0.018s
    
    OK
    
    == Tests result: SUCCESS ==
    
    1 test OK.
    
    Total duration: 60 ms
    Tests result: SUCCESS
    
    % uname -a
    FreeBSD 13amd64-default 13.0-RELEASE-p4 FreeBSD 13.0-RELEASE-p4 amd64
    % clang -v
    FreeBSD clang version 11.0.1 (git@github.com:llvm/llvm-project.git llvmorg-11.0.1-0-g43ff75f2c3fe)
    % python3.8 -m test -v test_threading -m test_recursion_limit
    == CPython 3.8.13 (default, Mar 24 2022, 18:04:08) [Clang 11.0.1 (git@github.com:llvm/llvm-project.git llvmorg-11.0.1-0-g43ff75f2c
    == FreeBSD-13.0-RELEASE-p4-amd64-64bit-ELF little-endian
    == cwd: /tmp/test_python_95250
    == CPU count: 12
    == encodings: locale=UTF-8, FS=utf-8
    0:00:00 load avg: 0.47 Run tests sequentially
    0:00:00 load avg: 0.47 [1/1] test_threading
    test_recursion_limit (test.test_threading.ThreadingExceptionTests) ... ok
    
    ----------------------------------------------------------------------
    
    Ran 1 test in 0.018s
    
    OK
    
    == Tests result: SUCCESS ==
    
    1 test OK.
    
    Total duration: 58 ms
    Tests result: SUCCESS
    
    # uname -a
    FreeBSD 14amd64-default 14.0-CURRENT FreeBSD 14.0-CURRENT 1400051 amd64
    root@14amd64-default:~ # clang -v
    FreeBSD clang version 13.0.0 (git@github.com:llvm/llvm-project.git llvmorg-13.0.0-0-gd7b669b3a303)
    Target: x86_64-unknown-freebsd14.0
    Thread model: posix
    InstalledDir: /usr/bin
    root@14amd64-default:~ # python3.8 -m test -v test_threading -m test_recursion_limit
    == CPython 3.8.13 (default, Mar 24 2022, 17:57:22) [Clang 13.0.0 (git@github.com:llvm/llvm-project.git llvmorg-13.0.0-0-gd7b669b3a
    == FreeBSD-14.0-CURRENT-amd64-64bit-ELF little-endian
    == cwd: /tmp/test_python_13854
    == CPU count: 12
    == encodings: locale=UTF-8, FS=utf-8
    0:00:00 load avg: 0.39 Run tests sequentially
    0:00:00 load avg: 0.39 [1/1] test_threading
    test_recursion_limit (test.test_threading.ThreadingExceptionTests) ... ok
    
    ----------------------------------------------------------------------
    
    Ran 1 test in 0.019s
    
    OK
    
    == Tests result: SUCCESS ==
    
    1 test OK.
    
    Total duration: 61 ms
    Tests result: SUCCESS
    

    Running stack.py with a callstack depth of 1000 and a 4 MB stack size also now works.

    % python3.8 stack.py 1000 4096
    setrecursionlimit(1000)
    stack_size: 4096.0 kiB = 4.0 MiB
    end of main thread
    

    @vstinner
    Member Author

    The crash occurred on Python built in debug mode: ./configure --with-pydebug.

    @vstinner
    Member Author

    I can no longer reproduce the issue. Maybe that's because this test only uses Python-to-Python function calls, which use far less stack memory in Python 3.11: https://docs.python.org/dev/whatsnew/3.11.html#inlined-python-function-calls

    I'm closing the issue.

    @Jehops

    Jehops commented Apr 19, 2022

    Thanks. FWIW, I have now tested with the Python 3.8 configure arguments below, and the tests pass.

    --enable-shared --without-ensurepip --with-system-ffi --with-pydebug --enable-ipv6 --with-system-libmpdec --with-lto --with-pymalloc --prefix=/usr/local ${_LATE_CONFIGURE_ARGS}

    @Jehops

    Jehops commented Apr 19, 2022

    I didn't intend to post the shell variable in the configure arguments above. The remaining arguments are almost certainly not consequential for the issue here, but for posterity, the full argument list was --enable-shared --without-ensurepip --with-system-ffi --with-pydebug --enable-ipv6 --with-system-libmpdec --with-lto --with-pymalloc --prefix=/usr/local --localstatedir=/var --mandir=/usr/local/man --infodir=/usr/local/share/info/ --build=amd64-portbld-freebsd13.0
