[fuzzer] Parser null deref with continuation characters and generator parenthesis error #89657

Closed
ammaraskar opened this issue Oct 16, 2021 · 12 comments
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@ammaraskar
Member

BPO 45494
Nosy @gpshead, @ambv, @ammaraskar, @lysnikolaou, @pablogsal, @miss-islington
PRs
  • bpo-45494: Fix parser crash when reporting errors involving invalid continuation characters #28993
  • [3.10] bpo-45494: Fix parser crash when reporting errors involving invalid continuation characters (GH-28993) #29070
  • [3.9] bpo-45494: Fix parser crash when reporting errors involving invalid continuation characters (GH-28993) #29071
  • bpo-45494: Fix error location in EOF tokenizer errors #29108
  • [3.10] bpo-45494: Fix error location in EOF tokenizer errors (GH-29108) #29672
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2021-10-20.16:53:29.072>
    created_at = <Date 2021-10-16.14:24:17.142>
    labels = ['interpreter-core', '3.10', '3.9', 'type-crash', '3.11']
    title = '[fuzzer] Parser null deref with continuation characters and generator parenthesis error'
    updated_at = <Date 2021-11-20.17:59:41.789>
    user = 'https://github.com/ammaraskar'

    bugs.python.org fields:

    activity = <Date 2021-11-20.17:59:41.789>
    actor = 'miss-islington'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-10-20.16:53:29.072>
    closer = 'lukasz.langa'
    components = ['Parser']
    creation = <Date 2021-10-16.14:24:17.142>
    creator = 'ammar2'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 45494
    keywords = ['patch']
    message_count = 11.0
    messages = ['404082', '404099', '404117', '404119', '404341', '404349', '404359', '404494', '404497', '406677', '406678']
    nosy_count = 6.0
    nosy_names = ['gregory.p.smith', 'lukasz.langa', 'ammar2', 'lys.nikolaou', 'pablogsal', 'miss-islington']
    pr_nums = ['28993', '29070', '29071', '29108', '29672']
    priority = 'high'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue45494'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @ammaraskar
    Member Author

    Another parser crash found by the fuzzer:

    "\
    "(1for c in I,\
    \

    Recreator:

    >>> import ast
    >>> ast.literal_eval('"\\\n"(1for c in I,\\\n\\')
    [1]    17916 segmentation fault  ./python
    
    >>> import ast
    >>> ast.literal_eval(r'''
    ... "\
    ... "(1for c in I,\
    ... \ ''')
    [1]    17935 segmentation fault  ./python
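Because the recreator above takes down the interpreter that runs it, it can be convenient to probe it from a child process instead. The following is a minimal sketch and not part of the original report: the `probe` helper and its return labels are hypothetical names, and reading a negative return code as "killed by a signal" assumes a POSIX platform. A sanitizer build may instead exit with a non-zero status, which this sketch reports as `other`.

```python
import subprocess
import sys

# Reduced fuzzer input from the report, written as a regular string literal.
CRASHER = '"\\\n"(1for c in I,\\\n\\'

# Program run in the child: exit 0 on a clean SyntaxError, 2 if parsing succeeds.
CHILD = """\
import ast, sys
try:
    ast.literal_eval(sys.stdin.read())
except SyntaxError:
    sys.exit(0)
sys.exit(2)
"""


def probe(source: str) -> str:
    """Classify how a child interpreter handles `source` (hypothetical helper)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=source,
        text=True,
        capture_output=True,
    )
    if proc.returncode < 0:
        # On POSIX, a negative code means the child died from a signal (e.g. SIGSEGV).
        return "crash"
    if proc.returncode == 0:
        return "syntax-error"
    return "other"


if __name__ == "__main__":
    print(probe(CRASHER))
```

On an interpreter that includes the fix, the expectation is that the child reports a plain `SyntaxError`; on an affected build the child dies while the calling process survives.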

    Raw ASAN stacktrace
    -------------------

    ==1668==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000001 (pc 0x7f4157e5e08c bp 0x7fffbd48b300 sp 0x7fffbd48aab8 T0)
    

    ==1668==The signal is caused by a READ memory access.
    ==1668==Hint: address points to the zero page.
    #0 0x7f4157e5e08c in strchr-avx2.S:57 /build/glibc-eX1tMB/glibc-2.31/sysdeps/x86_64/multiarch/strchr-avx2.S:57
    #1 0x4d7a88 in strchr /src/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:0
    #2 0x9fa6f5 in get_error_line cpython3/Parser/pegen.c:406:25
    #3 0x9fa6f5 in _PyPegen_raise_error_known_location cpython3/Parser/pegen.c:497:26
    #4 0xa18a92 in RAISE_ERROR_KNOWN_LOCATION cpython3/Parser/pegen.h:169:5
    #5 0xa331d5 in invalid_arguments_rule cpython3/Parser/parser.c:17831:20
    #6 0xa21a87 in arguments_rule cpython3/Parser/parser.c:15462:38
    #7 0xa2056b in primary_raw cpython3/Parser/parser.c:12867:18
    #8 0xa2056b in primary_rule cpython3/Parser/parser.c:12745:22
    #9 0xa1f9cd in await_primary_rule cpython3/Parser/parser.c:12700:28
    #10 0xa1f119 in power_rule cpython3/Parser/parser.c:12578:18
    #11 0xa1eabc in factor_rule cpython3/Parser/parser.c:12530:26
    #12 0xa1dc04 in term_raw cpython3/Parser/parser.c:12373:27
    #13 0xa1dc04 in term_rule cpython3/Parser/parser.c:12138:22
    #14 0xa1c899 in sum_raw cpython3/Parser/parser.c:12093:25
    #15 0xa1c899 in sum_rule cpython3/Parser/parser.c:11975:22
    #16 0xa1bb99 in shift_expr_raw cpython3/Parser/parser.c:11936:24
    #17 0xa1bb99 in shift_expr_rule cpython3/Parser/parser.c:11818:22
    #18 0xa1af2c in bitwise_and_raw cpython3/Parser/parser.c:11779:31
    #19 0xa1af2c in bitwise_and_rule cpython3/Parser/parser.c:11700:22
    #20 0xa1a49c in bitwise_xor_raw cpython3/Parser/parser.c:11661:32
    #21 0xa1a49c in bitwise_xor_rule cpython3/Parser/parser.c:11582:22
    #22 0xa1917c in bitwise_or_raw cpython3/Parser/parser.c:11543:32
    #23 0xa1917c in bitwise_or_rule cpython3/Parser/parser.c:11464:22
    #24 0xa2cd39 in comparison_rule cpython3/Parser/parser.c:10727:18
    #25 0xa2c912 in inversion_rule cpython3/Parser/parser.c:10680:31
    #26 0xa2b951 in conjunction_rule cpython3/Parser/parser.c:10559:18
    #27 0xa258e1 in disjunction_rule cpython3/Parser/parser.c:10473:18
    #28 0xa17cb1 in invalid_expression_rule cpython3/Parser/parser.c:18253:18
    #29 0xa17cb1 in expression_rule cpython3/Parser/parser.c:9754:39
    #30 0xa56979 in expressions_rule cpython3/Parser/parser.c:9628:18
    #31 0xa0acf5 in eval_rule cpython3/Parser/parser.c:1035:18
    #32 0xa0acf5 in _PyPegen_parse cpython3/Parser/parser.c:33076:18
    #33 0xa001a5 in _PyPegen_run_parser cpython3/Parser/pegen.c:1350:9
    #34 0xa01fa5 in _PyPegen_run_parser_from_string cpython3/Parser/pegen.c:1482:14
    #35 0xa80fc9 in _PyParser_ASTFromString cpython3/Parser/peg_api.c:14:21
    #36 0x8611ca in Py_CompileStringObject cpython3/Python/pythonrun.c:1371:11
    #37 0xc04a8f in builtin_compile_impl cpython3/Python/bltinmodule.c:842:14
    #38 0xc04a8f in builtin_compile cpython3/Python/clinic/bltinmodule.c.h:249:20
    #39 0xb78ade in cfunction_vectorcall_FASTCALL_KEYWORDS cpython3/Objects/methodobject.c:446:24
    #40 0x57c0ec in _PyObject_VectorcallTstate cpython3/Include/internal/pycore_call.h:89:11
    #41 0x57c0ec in PyObject_Vectorcall cpython3/Objects/call.c:298:12
    #42 0x766191 in call_function cpython3/Python/ceval.c:6619:13
    #43 0x748137 in _PyEval_EvalFrameDefault cpython3/Python/ceval.c:4734:19
    #44 0x741ae4 in _PyEval_EvalFrame cpython3/Include/internal/pycore_ceval.h:48:16
    #45 0x741ae4 in _PyEval_Vector cpython3/Python/ceval.c:5810:24
    #46 0x57cb50 in _PyFunction_Vectorcall cpython3/Objects/call.c:0
    #47 0x57c0ec in _PyObject_VectorcallTstate cpython3/Include/internal/pycore_call.h:89:11
    #48 0x57c0ec in PyObject_Vectorcall cpython3/Objects/call.c:298:12
    #49 0x766191 in call_function cpython3/Python/ceval.c:6619:13
    #50 0x748137 in _PyEval_EvalFrameDefault cpython3/Python/ceval.c:4734:19
    #51 0x741ae4 in _PyEval_EvalFrame cpython3/Include/internal/pycore_ceval.h:48:16
    #52 0x741ae4 in _PyEval_Vector cpython3/Python/ceval.c:5810:24
    #53 0x57cb50 in _PyFunction_Vectorcall cpython3/Objects/call.c:0
    #54 0x57c920 in _PyObject_VectorcallTstate cpython3/Include/internal/pycore_call.h:89:11
    #55 0x57c920 in PyObject_CallOneArg cpython3/Objects/call.c:375:12
    #56 0x579d18 in fuzz_ast_literal_eval cpython3/Modules/_xxtestfuzz/fuzzer.c:425:25
    #57 0x579d18 in _run_fuzz cpython3/Modules/_xxtestfuzz/fuzzer.c:443:14
    #58 0x579d18 in LLVMFuzzerTestOneInput cpython3/Modules/_xxtestfuzz/fuzzer.c:565:11
    #59 0x472623 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) cxa_noexception.cpp:0
    #60 0x45ded2 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerDriver.cpp:324:6
    #61 0x463985 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) cxa_noexception.cpp:0
    #62 0x48c672 in main /src/llvm-project/compiler-rt/lib/fuzzer/FuzzerMain.cpp:20:10
    #63 0x7f4157cfa0b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/libc-start.c:308:16
    #64 0x43b16d in _start
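To find the first CPython frame in a trace like the one above (here `get_error_line` in `Parser/pegen.c`), the frames can be parsed mechanically. This is a rough sketch: the regular expression is an assumption based only on the frame layout visible in this trace, and `Frame`, `parse_frames`, and `first_frame_under` are hypothetical names.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    index: int
    function: str
    location: Optional[str]  # "path:line[:col]", or None when the frame has none


# "#<n> 0x<pc> in <function> [<path>:<line>[:<col>]]"
FRAME_RE = re.compile(
    r"^\s*#(\d+)\s+0x[0-9a-fA-F]+\s+in\s+(.*?)(?:\s+(\S+:\d+(?::\d+)?))?\s*$"
)


def parse_frames(trace: str) -> List[Frame]:
    """Extract stack frames from sanitizer output, skipping non-frame lines."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(Frame(int(match.group(1)), match.group(2), match.group(3)))
    return frames


def first_frame_under(frames: List[Frame], prefix: str) -> Optional[Frame]:
    """Return the innermost frame whose source path starts with `prefix`."""
    for frame in frames:
        if frame.location and frame.location.startswith(prefix):
            return frame
    return None


SAMPLE = """\
    #0 0x7f4157e5e08c in strchr-avx2.S:57 /build/glibc-eX1tMB/glibc-2.31/sysdeps/x86_64/multiarch/strchr-avx2.S:57
    #2 0x9fa6f5 in get_error_line cpython3/Parser/pegen.c:406:25
    #64 0x43b16d in _start
"""

if __name__ == "__main__":
    print(first_frame_under(parse_frames(SAMPLE), "cpython3/"))
```

Filtering on the `cpython3/` path prefix skips the libc and sanitizer interceptor frames at the top of the trace.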

    @ammaraskar ammaraskar added 3.11 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels Oct 16, 2021
    @pablogsal
    Member

    Presto!! PR 28993

    @gpshead
    Member

    gpshead commented Oct 16, 2021

    I confirmed that 3.9 does NOT seem to have the problem:

    Python 3.9.5 (default, May 19 2021, 11:32:47) 
    [GCC 9.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> x = r'''
    ... "\
    ... "(1for c in I,\
    ... \ '''
    >>> import ast
    >>> ast.literal_eval(x)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.9/ast.py", line 62, in literal_eval
        node_or_string = parse(node_or_string, mode='eval')
      File "/usr/lib/python3.9/ast.py", line 50, in parse
        return compile(source, filename, mode, flags,
      File "<unknown>", line 3
        "\
          ^
    SyntaxError: Generator expression must be parenthesized

    @gpshead gpshead added 3.10 only security fixes labels Oct 16, 2021
    @pablogsal
    Member

    > I confirmed that 3.9 does NOT seem to have the problem:

    It does; it's just that it's not a crash. The location the error message points to is totally wrong.
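One way to look at the reported location directly, rather than through the rendered traceback, is to read it off the `SyntaxError` object. This is only an illustrative sketch: `error_location` is a hypothetical helper, and it uses `ast.parse(..., mode="eval")` because that is the call `ast.literal_eval` makes in the traceback above. It should only be pointed at the crashing input on a build that includes the fix.

```python
import ast
from typing import Optional, Tuple

Location = Tuple[Optional[int], Optional[int], Optional[str]]


def error_location(source: str) -> Optional[Location]:
    """Return (lineno, offset, text) of the SyntaxError, or None if `source` parses."""
    try:
        ast.parse(source, mode="eval")
    except SyntaxError as exc:
        return exc.lineno, exc.offset, exc.text
    return None


if __name__ == "__main__":
    # A harmless input that triggers the same class of error as the report.
    print(error_location("f(x for x in y, 1)"))
```

Comparing the returned `lineno` and `text` against the source lines makes a mismatch like the one in the 3.9 traceback easy to spot.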

    @gpshead gpshead added 3.9 only security fixes labels Oct 17, 2021
    @ambv
    Contributor

    ambv commented Oct 19, 2021

    New changeset a106343 by Pablo Galindo Salgado in branch 'main':
    bpo-45494: Fix parser crash when reporting errors involving invalid continuation characters (GH-28993)
    a106343

    @ambv
    Contributor

    ambv commented Oct 19, 2021

    New changeset 5c9cab5 by Łukasz Langa in branch '3.10':
    [3.10] bpo-45494: Fix parser crash when reporting errors involving invalid continuation characters (GH-28993) (GH-29070)
    5c9cab5

    @ambv
    Contributor

    ambv commented Oct 19, 2021

    Note: this *does* fail on 3.9, too. Even if it doesn't crash the production build, it does fail an assertion in a pydebug build:

    test_error_offset_continuation_characters (test.test_exceptions.ExceptionTests) ... Assertion failed: (!_PyErr_Occurred(tstate)), function _PyObject_Call, file Objects/call.c, line 261.
    Fatal Python error: Aborted

    Current thread 0x00000001184d1dc0 (most recent call first):
    File "/private/tmp/cpy/Lib/test/test_exceptions.py", line 187 in check
    File "/private/tmp/cpy/Lib/test/test_exceptions.py", line 198 in test_error_offset_continuation_characters

    @ambv
    Contributor

    ambv commented Oct 20, 2021

    New changeset 88f4ec8 by Łukasz Langa in branch '3.9':
    [3.9] bpo-45494: Fix parser crash when reporting errors involving invalid continuation characters (GH-28993) (GH-29071)
    88f4ec8

    @ambv
    Contributor

    ambv commented Oct 20, 2021

    Thanks for the fix, Pablo! ✨ 🍰 ✨

    @ambv ambv closed this as completed Oct 20, 2021
    @pablogsal
    Member

    New changeset 79ff0d1 by Pablo Galindo Salgado in branch 'main':
    bpo-45494: Fix error location in EOF tokenizer errors (GH-29108)
    79ff0d1

    @miss-islington
    Contributor

    New changeset a427eb8 by Miss Islington (bot) in branch '3.10':
    bpo-45494: Fix error location in EOF tokenizer errors (GH-29108)
    a427eb8

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @vincedani

    Hello @ammaraskar, it looks like you are (or were) fuzzing this repository, and you’ve found some interesting bugs. 🥇

    I would like to create a Python-based test case reduction test suite that contains fuzzer-generated outputs, and benchmark how automatic test case reducers perform on Python inputs. It looks to me like you opened this issue with an already reduced input that caused the malfunction. Is it possible that you still have the original fuzzer output, before any reduction?

    I’m also interested in these issues of yours:

    with the same motivation.

    Thanks in advance,
    Daniel
