Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address sanitizer (gcc version) is generating false positives #88529

Closed
markshannon opened this issue Jun 9, 2021 · 21 comments
Closed

Address sanitizer (gcc version) is generating false positives #88529

markshannon opened this issue Jun 9, 2021 · 21 comments

Comments

@markshannon
Copy link
Member

BPO 44363
Nosy @markshannon, @ericsnowcurrently, @pablogsal, @miss-islington
PRs
  • bpo-44363: Get test_capi passing with address sanitizer #26639
  • [3.9] bpo-44363: Get test_capi passing with address sanitizer (GH-26639) #26641
  • [3.10] bpo-44363: Get test_capi passing with address sanitizer (GH-26639) #26642
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-06-09.11:45:08.375>
    created_at = <Date 2021-06-09.09:30:48.088>
    labels = ['invalid']
    title = 'Address sanitizer (gcc version) is generating false positives'
    updated_at = <Date 2021-06-10.12:02:25.931>
    user = 'https://github.com/markshannon'

    bugs.python.org fields:

    activity = <Date 2021-06-10.12:02:25.931>
    actor = 'miss-islington'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-06-09.11:45:08.375>
    closer = 'pablogsal'
    components = []
    creation = <Date 2021-06-09.09:30:48.088>
    creator = 'Mark.Shannon'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44363
    keywords = []
    message_count = 21.0
    messages = ['395403', '395406', '395410', '395412', '395413', '395414', '395416', '395423', '395426', '395428', '395429', '395432', '395434', '395435', '395439', '395447', '395450', '395515', '395520', '395523', '395525']
    nosy_count = 4.0
    nosy_names = ['Mark.Shannon', 'eric.snow', 'pablogsal', 'miss-islington']
    pr_nums = ['26639', '26641', '26642']
    priority = 'normal'
    resolution = 'not a bug'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue44363'
    versions = []

    @markshannon
    Copy link
    Member Author

    Running the buildbot for #26595
    results in failures:
    https://buildbot.python.org/all/#/builders/581/builds/63
    Which claim that the address calculation in LOAD_ATTR_MODULE is out of bounds.

    The tests pass with an added assert to verify that the address in question is in bounds.

    All failures seem to happen after a fork, which seems to be a longstanding weakness of the address sanitizer.

    I'd like to merge #26595. I'd also like to keep the buildbots working. Dichotomy.

    @pablogsal
    Copy link
    Member

    Buildbots have the priority, please don't merge anything that breaks the buildbots. We should figure out how to silence the false positive if indeed is one, but I don't think it is. The address sanitizer is not speculative, it literally checks if someone has written out of bounds

    @pablogsal
    Copy link
    Member

    I can reproduce that failure of address sanitizer with GCC and clang separately, so it doesn't seem like a false positive.

    @pablogsal
    Copy link
    Member

    Also, I can reproduce locally without any fork whatsoever:

    ./python -m test test_lib2to3

    SUMMARY: AddressSanitizer: heap-buffer-overflow Python/ceval.c:3549 in _PyEval_EvalFrameDefault
    Shadow bytes around the buggy address:
    0x0c1c8003ec40: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
    0x0c1c8003ec50: 00 00 00 00 00 00 00 00 00 00 00 00 fa fa fa fa
    0x0c1c8003ec60: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
    0x0c1c8003ec70: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
    0x0c1c8003ec80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    =>0x0c1c8003ec90: 00 00 00 00 fa[fa]fa fa fa fa fa fa fd fd fd fd
    0x0c1c8003eca0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
    0x0c1c8003ecb0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
    0x0c1c8003ecc0: 00 00 00 00 00 00 00 00 00 00 00 00 fa fa fa fa
    0x0c1c8003ecd0: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
    0x0c1c8003ece0: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
    Addressable: 00
    Partially addressable: 01 02 03 04 05 06 07
    Heap left redzone: fa
    Freed heap region: fd
    Stack left redzone: f1
    Stack mid redzone: f2
    Stack right redzone: f3
    Stack after return: f5
    Stack use after scope: f8
    Global redzone: f9
    Global init order: f6
    Poisoned by user: f7
    Container overflow: fc
    Array cookie: ac
    Intra object redzone: bb
    ASan internal: fe
    Left alloca redzone: ca
    Right alloca redzone: cb
    Shadow gap: cc
    ==14183==ABORTING

    @pablogsal
    Copy link
    Member

    Even valgrind sees the problem:

    ❯ valgrind ./python -m test test_lib2to3
    ==27010== Memcheck, a memory error detector
    ==27010== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
    ==27010== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
    ==27010== Command: ./python -m test test_lib2to3
    ==27010==
    0:00:00 load avg: 1.53 Run tests sequentially
    0:00:00 load avg: 1.53 [1/1] test_lib2to3
    ==27010== Invalid read of size 8
    ==27010== at 0x2363E1: _PyEval_EvalFrameDefault (ceval.c:3549)
    ==27010== by 0x23B270: _PyEval_EvalFrame (pycore_ceval.h:46)
    ==27010== by 0x23B270: _PyEval_Vector (ceval.c:5174)
    ==27010== by 0x172498: _PyFunction_Vectorcall (call.c:342)
    ==27010== by 0x2278AF: _PyObject_VectorcallTstate (abstract.h:114)
    ==27010== by 0x22C88F: PyObject_Vectorcall (abstract.h:123)
    ==27010== by 0x22C88F: call_function (ceval.c:6001)
    ==27010== by 0x239642: _PyEval_EvalFrameDefault (ceval.c:4200)
    ==27010== by 0x23B270: _PyEval_EvalFrame (pycore_ceval.h:46)
    ==27010== by 0x23B270: _PyEval_Vector (ceval.c:5174)
    ==27010== by 0x172498: _PyFunction_Vectorcall (call.c:342)
    ==27010== by 0x2278AF: _PyObject_VectorcallTstate (abstract.h:114)
    ==27010== by 0x22C88F: PyObject_Vectorcall (abstract.h:123)
    ==27010== by 0x22C88F: call_function (ceval.c:6001)
    ==27010== by 0x239642: _PyEval_EvalFrameDefault (ceval.c:4200)
    ==27010== by 0x23B270: _PyEval_EvalFrame (pycore_ceval.h:46)
    ==27010== by 0x23B270: _PyEval_Vector (ceval.c:5174)
    ==27010== Address 0x62c5e28 is 8 bytes after a block of size 160 alloc'd
    ==27010== at 0x483E7C5: malloc (vg_replace_malloc.c:380)
    ==27010== by 0x1B2911: _PyMem_RawMalloc (obmalloc.c:99)
    ==27010== by 0x1B4702: PyObject_Malloc (obmalloc.c:697)
    ==27010== by 0x1A1EF9: clone_combined_dict_keys (dictobject.c:678)
    ==27010== by 0x1A2216: dict_merge (dictobject.c:2453)
    ==27010== by 0x1A2A59: PyDict_Merge (dictobject.c:2603)
    ==27010== by 0x1A2AF1: dict_update_arg (dictobject.c:2273)
    ==27010== by 0x1A2D4E: dict_update_common (dictobject.c:2298)
    ==27010== by 0x1A2DD1: dict_update (dictobject.c:2316)
    ==27010== by 0x33FCBA: method_vectorcall_VARARGS_KEYWORDS (descrobject.c:346)
    ==27010== by 0x2278AF: _PyObject_VectorcallTstate (abstract.h:114)
    ==27010== by 0x22C88F: PyObject_Vectorcall (abstract.h:123)
    ==27010== by 0x22C88F: call_function (ceval.c:6001)
    ==27010==

    @pablogsal
    Copy link
    Member

    I'm closing this as 3 different tools complain about the same thing, with a very simple reproducer.

    @markshannon
    Copy link
    Member Author

    If I run the following on main
    ./configure --with-address-sanitizer
    make clean
    make -j12 test

    I get 22 failures.

    So something is broken.

    test_lib2to3 does fork; at least it does when I run it under gdb.

    @pablogsal
    Copy link
    Member

    As I mentioned in https://bugs.python.org/issue43693, the way to run it is the following:

    $ export ASAN_OPTIONS=detect_leaks=0:allocator_may_return_null=1:handle_segv=0
    $ ./configure --with-address-sanitizer --without-pymalloc
    $ make -j -s
    $ ./python -m test test_lib2to3

    There are known failures with things like ctypes and libcript so the following tests are ignored:

    test_ctypes test_capi test_crypt test_decimal test_faulthandler test_interpreters

    -----------------

    test_lib2to3 does one fork for this call:

    https://github.com/python/cpython/blob/main/Lib/lib2to3/tests/test_parser.py#L64

    but that doesn't matter. Comment that and keeps falining

    @markshannon
    Copy link
    Member Author

    I still get quite a few failures on the main branch.
    It seems like ceval.c:1600 upsets the sanitizer, at least for gcc.

    There isn't anything wrong with that line, but as I plan to change it anyway I guess it doesn't matter:
    #26595

    I plan to get the main branch passing the tests with address sanitization (at least on my machine with my gcc :)

    Running the tests doesn't seem to take that long (at least not compared with refleak tests).
    Do you think it would be it feasible to run the address sanitizer on all PRs, so that we can keep it passing?

    @markshannon markshannon changed the title Likely false positive for address sanitizer after fork Address sanitizer (gcc version) is generating false positives Jun 9, 2021
    @markshannon markshannon changed the title Likely false positive for address sanitizer after fork Address sanitizer (gcc version) is generating false positives Jun 9, 2021
    @pablogsal
    Copy link
    Member

    Do you think it would be it feasible to run the address sanitizer on all PRs, so that we can keep it passing?

    That's a good idea,let me see how easy doing that would be

    @pablogsal
    Copy link
    Member

    It seems like ceval.c:1600 upsets the sanitizer, at least for gcc.

    Even if you don't trust the sanitizer, you can also use valgrind, it shows the same error:

    ❯ valgrind ./python -m test test_lib2to3
    ==27010== Memcheck, a memory error detector
    ==27010== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
    ==27010== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
    ==27010== Command: ./python -m test test_lib2to3
    ==27010==
    0:00:00 load avg: 1.53 Run tests sequentially
    0:00:00 load avg: 1.53 [1/1] test_lib2to3
    ==27010== Invalid read of size 8
    ==27010== at 0x2363E1: _PyEval_EvalFrameDefault (ceval.c:3549)
    ==27010== by 0x23B270: _PyEval_EvalFrame (pycore_ceval.h:46)

    For running valgrind, make sure do disable pymalloc:

    export PYTHONMALLOC=malloc

    @pablogsal
    Copy link
    Member

    This is not a false positive. I break into gdb at the moment the sanitizer makes the report and inspecting the values that apparently are wrong. I did that by breaking into __sanitizer::ColorizeReports which is called for making the report. THis is what I did:

    $export CFLAGS='-g3 -O0'
    $export ASAN_OPTIONS=detect_leaks=0:allocator_may_return_null=1:handle_segv=0
    $./configure --with-address-sanitizer --without-pymalloc
    $make -j -s
    $gdb --args ./python -m test test_lib2to3 -v -m test_prefix_preservation
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/usr/lib/libthread_db.so.1".
    == CPython 3.11.0a0 (heads/specialize-load-attr-dirty:e858ea1571, Jun 9 2021, 17:34:14) [GCC 11.1.0]
    [Detaching after vfork from child process 18105]
    == Linux-5.12.9-arch1-1-x86_64-with-glibc2.33 little-endian
    == cwd: /home/pablogsal/github/python/master/build/test_python_18103æ
    == CPU count: 36
    == encodings: locale=UTF-8, FS=utf-8
    0:00:00 load avg: 1.64 Run tests sequentially
    0:00:00 load avg: 1.64 [1/1] test_lib2to3
    test_prefix_preservation (lib2to3.tests.test_fixers.Test_dict) ... ok
    test_prefix_preservation (lib2to3.tests.test_fixers.Test_except) ... =================================================================

    Breakpoint 1, __sanitizer::ColorizeReports ()
    at /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_symbolizer_report.cpp:62
    62 /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_symbolizer_report.cpp: No such file or directory.
    (gdb) up
    #1 0x00007ffff75f6af1 in __sanitizer::SanitizerCommonDecorator::SanitizerCommonDecorator (this=<synthetic pointer>)
    at /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_report_decorator.h:26
    26 /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_report_decorator.h: No such file or directory.
    (gdb)
    #2 __asan::Decorator::Decorator (this=<synthetic pointer>) at /build/gcc/src/gcc/libsanitizer/asan/asan_descriptions.h:45
    45 /build/gcc/src/gcc/libsanitizer/asan/asan_descriptions.h: No such file or directory.
    (gdb)
    #3 __asan::ErrorGeneric::Print (this=this@entry=0x7ffff7710908 <_asan::ScopedInErrorReport::current_error+8>)
    at /build/gcc/src/gcc/libsanitizer/asan/asan_errors.cpp:574
    574 /build/gcc/src/gcc/libsanitizer/asan/asan_errors.cpp: No such file or directory.
    (gdb)
    #4 0x00007ffff768531e in __asan::ErrorDescription::Print (
    this=this@entry=0x7ffff7710900 <_asan::ScopedInErrorReport::current_error>)
    at /build/gcc/src/gcc/libsanitizer/asan/asan_errors.h:440
    440 /build/gcc/src/gcc/libsanitizer/asan/asan_errors.h: No such file or directory.
    (gdb)
    #5 0x00007ffff7685528 in __asan::ScopedInErrorReport::~ScopedInErrorReport (this=0x7ffffffd12e6,
    __in_chrg=<optimized out>) at /build/gcc/src/gcc/libsanitizer/asan/asan_report.cpp:141
    141 /build/gcc/src/gcc/libsanitizer/asan/asan_report.cpp: No such file or directory.
    (gdb)
    #6 0x00007ffff7684d8d in __asan::ReportGenericError (pc=93824995352503, bp=bp@entry=140737488166752,
    sp=sp@entry=140737488166736, addr=106515189666344, is_write=is_write@entry=false, access_size=access_size@entry=8,
    exp=0, fatal=true) at /build/gcc/src/gcc/libsanitizer/asan/asan_report.cpp:478
    478 in /build/gcc/src/gcc/libsanitizer/asan/asan_report.cpp
    (gdb)
    #7 0x00007ffff7685c1c in __asan::__asan_report_load8 (addr=<optimized out>)
    at /build/gcc/src/gcc/libsanitizer/asan/asan_rtl.cpp:121
    121 /build/gcc/src/gcc/libsanitizer/asan/asan_rtl.cpp: No such file or directory.
    (gdb)
    #8 0x000055555584dfb7 in _PyEval_EvalFrameDefault (tstate=0x612000000040, f=0x6080000b08b0, throwflag=0)
    at Python/ceval.c:3549
    3549 DEOPT_IF(ep->me_key != name, LOAD_ATTR);
    (gdb) p ep->me_key
    $3 = (PyObject *) 0x60e0000b1250
    (gdb) p ep->me_key->ob_refcnt
    $4 = 652835033347

    That ep->me_key is obviously corrupted.

    @markshannon
    Copy link
    Member Author

    What commit are you running that on?

    @pablogsal
    Copy link
    Member

    commit e858ea1 (HEAD -> specialize-load-attr)
    Author: Mark Shannon <mark@hotpy.org>
    Date: Wed Jun 9 09:33:13 2021 +0100

    Assert that address is in bounds (ASAN thinks it might not be).
    
    diff --git a/Python/ceval.c b/Python/ceval.c
    index 06a02b40f9..784d0244e8 100644
    --- a/Python/ceval.c
    +++ b/Python/ceval.c
    @@ -3517,6 +3517,7 @@ _PyEval_EvalFrameDefault(PyThreadState *tstate, PyFrameObject *f, int throwflag)
                 DEOPT_IF(!PyModule_CheckExact(owner), LOAD_ATTR);
                 PyDictObject *dict = (PyDictObject *)((PyModuleObject *)owner)->md_dict;
                 DEOPT_IF(dict->ma_keys->dk_version != cache1->dk_version_or_hint, LOAD_ATTR);
    +            assert(cache0->index < dict->ma_keys->dk_nentries);
                 PyDictKeyEntry *ep = DK_ENTRIES(dict->ma_keys) + cache0->index;
                 res = ep->me_value;
                 DEOPT_IF(res == NULL, LOAD_ATTR);

    @markshannon
    Copy link
    Member Author

    This line seems to be responsible for most of the failures:
    https://github.com/python/cpython/blob/main/Objects/frameobject.c#L985

    Which does appear to be a true positive.

    @pablogsal
    Copy link
    Member

    From the gdb session looks to me that you are reading freed or invalid memory from the dictionary keys. Well, I'm quite sure you are doing that, but I don't know why they is happening.

    @ericsnowcurrently
    Copy link
    Member

    This was my bad. I only partially fixed original problem when un-reverting 3fa63e in #70796. I've merged the rest of the fix in #70813. Sorry about that.

    @markshannon
    Copy link
    Member Author

    It looks like I've been a bit unfair to the address sanitizer.

    It does appear to produce incorrect locations sometimes, but that's not really a false positive and the reports are generally useful.

    @pablogsal
    Copy link
    Member

    New changeset 31aa0db by Mark Shannon in branch 'main':
    bpo-44363: Get test_capi passing with address sanitizer (GH-26639)
    31aa0db

    @miss-islington
    Copy link
    Contributor

    New changeset 8652694 by Miss Islington (bot) in branch '3.9':
    bpo-44363: Get test_capi passing with address sanitizer (GH-26639)
    8652694

    @miss-islington
    Copy link
    Contributor

    New changeset 0895e62 by Miss Islington (bot) in branch '3.10':
    bpo-44363: Get test_capi passing with address sanitizer (GH-26639)
    0895e62

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    None yet
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants