Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compiling recursive Python ASTs crash the interpreter #55314

Closed
benjaminp opened this issue Feb 3, 2011 · 27 comments
Closed

Compiling recursive Python ASTs crash the interpreter #55314

benjaminp opened this issue Feb 3, 2011 · 27 comments
Labels
3.9 only security fixes 3.10 only security fixes 3.11 bug and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@benjaminp
Copy link
Contributor

BPO 11105
Nosy @terryjreedy, @gpshead, @abalkin, @benjaminp, @jkloth, @meadori, @ambv, @ericsnowcurrently, @pablogsal, @miss-islington, @isidentical, @Fidget-Spinner
PRs
  • bpo-11105: Do not crash when compiling recursive ASTs #20594
  • [3.10] bpo-11105: Do not crash when compiling recursive ASTs (GH-20594) #26521
  • [3.9] bpo-11105: Do not crash when compiling recursive ASTs (GH-20594) #26522
  • bpo-11105: use a lower recursion limit for infinite recursion tests #26550
  • [TEMP] bpo-11105: use PY_LOCAL()s for AST converter functions #26554
  • bpo-44336: Prevent hanging on child process handles #26578
  • bpo-11105: document the new test.support.infinite_recursion context m… #26604
  • [3.9] bpo-11105: reduce the recursion limit for tests (GH-26550). #26605
  • [3.10] bpo-11105: reduce the recursion limit for tests (GH-26550). #26607
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-06-08.17:42:44.965>
    created_at = <Date 2011-02-03.05:02:39.444>
    labels = ['interpreter-core', '3.10', '3.9', 'type-crash', '3.11']
    title = 'Compiling recursive Python ASTs crash the interpreter'
    updated_at = <Date 2021-06-08.17:42:44.964>
    user = 'https://github.com/benjaminp'

    bugs.python.org fields:

    activity = <Date 2021-06-08.17:42:44.964>
    actor = 'BTaskaya'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-06-08.17:42:44.965>
    closer = 'BTaskaya'
    components = ['Interpreter Core']
    creation = <Date 2011-02-03.05:02:39.444>
    creator = 'benjamin.peterson'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 11105
    keywords = ['patch']
    message_count = 27.0
    messages = ['127783', '127797', '127798', '127799', '127800', '127801', '127816', '155978', '155986', '358952', '359122', '373078', '373084', '395040', '395046', '395050', '395172', '395173', '395174', '395177', '395189', '395190', '395341', '395342', '395343', '395344', '395345']
    nosy_count = 12.0
    nosy_names = ['terry.reedy', 'gregory.p.smith', 'belopolsky', 'benjamin.peterson', 'jkloth', 'meador.inge', 'lukasz.langa', 'eric.snow', 'pablogsal', 'miss-islington', 'BTaskaya', 'kj']
    pr_nums = ['20594', '26521', '26522', '26550', '26554', '26578', '26604', '26605', '26607']
    priority = None
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue11105'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @benjaminp
    Copy link
    Contributor Author

    You don't want to know why I was thinking about this...

    $ ./python 
    Python 3.2rc2+ (py3k:88302, Feb  1 2011, 19:02:10) 
    [GCC 4.4.4] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import ast
    >>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
    >>> e.operand = e
    >>> compile(ast.Expression(e), "<test>", "eval")
    Segmentation fault

    @benjaminp benjaminp added the type-crash A hard crash of the interpreter, possibly with a core dump label Feb 3, 2011
    @abalkin
    Copy link
    Member

    abalkin commented Feb 3, 2011

    Looks like a stack overflow caused by an infinite recursion. I am not sure if it is possible to add cycle detection code without sacrificing performance or setting some arbitrary limits.

    I wonder: Why ast nodes need to be mutable?

    @benjaminp
    Copy link
    Contributor Author

    2011/2/3 Alexander Belopolsky <report@bugs.python.org>:

    Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:

    Looks like a stack overflow caused by an infinite recursion.  I am not sure if it is possible to add cycle detection code without sacrificing performance or setting some arbitrary limits.

    Yes, it's definitely low priority. It's probably easier to crash the
    interpreter by producing differently malformed ast anyway.

    I wonder: Why ast nodes need to be mutable?

    So people can change them.

    @abalkin
    Copy link
    Member

    abalkin commented Feb 3, 2011

    On Thu, Feb 3, 2011 at 12:08 PM, Benjamin Peterson
    <report@bugs.python.org> wrote:
    ..

    > I wonder: Why ast nodes need to be mutable?

    So people can change them.

    Well, they are hashable, so this needs to be done carefully. Is this
    necessary for AST-based optimizations? Does Python actually change
    AST after it has been created? Note that for some optimizations it
    may be more appropriate to build a new tree rather than mutate the old
    one. Depending on the algorithm, you may or may not need to change
    the nodes after they have been created in the process.

    @benjaminp
    Copy link
    Contributor Author

    2011/2/3 Alexander Belopolsky <report@bugs.python.org>:
    >
    > Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
    >
    > On Thu, Feb 3, 2011 at 12:08 PM, Benjamin Peterson
    > <report@bugs.python.org> wrote:
    > ..
    >>> I wonder: Why ast nodes need to be mutable?
    >>
    >> So people can change them.
    >
    > Well, they are hashable, so this needs to be done carefully.  Is this
    > necessary for AST-based optimizations?  Does Python actually change
    > AST after it has been created?  Note that for some optimizations it
    > may be more appropriate to build a new tree rather than mutate the old
    > one.  Depending on the algorithm, you may or may not need to change
    > the nodes after they have been created in the process.

    Other people are, though. The hash is by identity anyway.

    1 similar comment
    @benjaminp
    Copy link
    Contributor Author

    2011/2/3 Alexander Belopolsky <report@bugs.python.org>:
    >
    > Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
    >
    > On Thu, Feb 3, 2011 at 12:08 PM, Benjamin Peterson
    > <report@bugs.python.org> wrote:
    > ..
    >>> I wonder: Why ast nodes need to be mutable?
    >>
    >> So people can change them.
    >
    > Well, they are hashable, so this needs to be done carefully.  Is this
    > necessary for AST-based optimizations?  Does Python actually change
    > AST after it has been created?  Note that for some optimizations it
    > may be more appropriate to build a new tree rather than mutate the old
    > one.  Depending on the algorithm, you may or may not need to change
    > the nodes after they have been created in the process.

    Other people are, though. The hash is by identity anyway.

    @birkenfeld
    Copy link
    Member

    Alex: If the node attributes were not mutable, it would be extremely awkward, not to say inefficient, to mutate an already existing AST as returned by ast.parse().

    The AST objects in the _ast module aren't what Python works with internally, anyway. When calling ast.parse(), the AST is converted to Python objects (these are defined in Python-ast.c), and compile()ing such an object converts them back to the internal tree representation. This conversion is where recursions would need to be handled.

    @gpshead
    Copy link
    Member

    gpshead commented Mar 16, 2012

    i haven't confirmed if it is this exact bug but I believe a coworker just ran into something similar. he wrote code to use the ast to remove docstrings from code before passing it to compile() (as that saves a noticable amount of memory). in the case the ast for code like:

    def foo():
      """this is a docstring."""

    Removing the docstring and passing such a thing to compile triggers a problem. A workaround was to add a pass in such cases:

    if (node.body and isinstance(node.body[0], ast.Expr) and
    isinstance(node.body[0].value, ast.Str)):
    docstring = node.body.pop(0)
    if len(node.body) == 0:
    # An empty body will sometimes provoke a segfault when you call
    # compile on the code object.
    node.body.append(ast.Pass(lineno=docstring.lineno,
    col_offset=docstring.col_offset))

    regardless, it'd be better if compile() *never* crashed on strange input.

    @benjaminp
    Copy link
    Contributor Author

    Have him try on 3.3. This should be less of an issue there where there is an AST validator. It doesn't fix this bug, but it does fix most accidental AST construction bugs.

    @isidentical
    Copy link
    Sponsor Member

    We can probably implement something like this to prevent this happening
    diff --git a/Parser/asdl_c.py b/Parser/asdl_c.py
    index daac0966f5..f9da52da7f 100755
    --- a/Parser/asdl_c.py
    +++ b/Parser/asdl_c.py
    @@ -559,6 +559,11 @@ class Obj2ModVisitor(PickleVisitor):
                 self.emit("asdl_seq_SET(%s, i, val);" % field.name, depth+2)
                 self.emit("}", depth+1)
             else:
    +            self.emit("if (obj == tmp) {", depth+1)
    +            self.emit("PyErr_SetString(PyExc_RuntimeError, \"Recursing fields "
    +                      "are not supported for AST nodes.\");", depth+2, reflow=False)
    +            self.emit("goto failed;", depth+2)
    +            self.emit("}", depth+1)
                 self.emit("res = obj2ast_%s(tmp, &%s, arena);" %
                           (field.type, field.name), depth+1)
                 self.emit("if (res != 0) goto failed;", depth+1)

    @isidentical isidentical added the 3.9 only security fixes label Dec 28, 2019
    @pppery
    Copy link
    Mannequin

    pppery mannequin commented Dec 31, 2019

    What about indirect cycles like below:

    >> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
    >> f = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
    >> e.operand = f
    >> f.operand = e
    >> compile(ast.Expression(e), "<test>", "eval")

    (I tested, this also crashes)

    @terryjreedy
    Copy link
    Member

    With 3.9 on Windows, using Benjamin's example, I do not get the Windows equivalent of a seg fault. However, execution stops at compile with no exception, including SystemExit.

    These examples amount to limited fuzz testing of compile(). I think it should raise something like "SyntaxError: recursive ast" or even 'bad ast' if malformed non-recursive asts have the same issue.

    @terryjreedy terryjreedy added 3.10 only security fixes and removed 3.9 only security fixes labels Jul 6, 2020
    @isidentical
    Copy link
    Sponsor Member

    With 3.9 on Windows, using Benjamin's example, I do not get the Windows equivalent of a seg fault. However, execution stops at compile with no exception, including SystemExit.

    I can still reproduce on Linux,

     $ python
    Python 3.10.0a0 (heads/bpo-xxxxx:f2947e354c, May 21 2020, 18:54:57) 
    [GCC 9.2.1 20191008] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import ast
    >>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
    >>> e.operand = e
    >>> compile(ast.Expression(e), "<test>", "eval")
    [1]    15320 segmentation fault (core dumped)  python

    These examples amount to limited fuzz testing of compile(). I think it should raise something like "SyntaxError: recursive ast" or even 'bad ast' if malformed non-recursive asts have the same issue.

    I dont think it would be easy to locate such errors before they happen, instead I propose (actually already proposed in PR 20594) to add recursion guards to places where this might happen. This can prevent crashes on both direct and indirect cycles

    >>> import ast
    >>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
    >>> e.operand = e
    >>> compile(ast.Expression(e), "<test>", "eval")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    RecursionError: maximum recursion depth exceeded while traversing 'expr' node
    >>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
    >>> f = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
    >>> e.operand = f
    >>> f.operand = e
    >>> compile(ast.Expression(e), "<test>", "eval")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    RecursionError: maximum recursion depth exceeded while traversing 'expr' node

    @pablogsal
    Copy link
    Member

    New changeset f349124 by Batuhan Taskaya in branch 'main':
    bpo-11105: Do not crash when compiling recursive ASTs (GH-20594)
    f349124

    @miss-islington
    Copy link
    Contributor

    New changeset 976598d by Miss Islington (bot) in branch '3.10':
    bpo-11105: Do not crash when compiling recursive ASTs (GH-20594)
    976598d

    @isidentical isidentical added 3.9 only security fixes 3.11 bug and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Jun 3, 2021
    @isidentical isidentical changed the title Compiling evil ast crashes interpreter Compiling recursive Python ASTs crash the interpreter Jun 3, 2021
    @pablogsal
    Copy link
    Member

    New changeset de58b31 by Batuhan Taskaya in branch '3.9':
    [3.9] bpo-11105: Do not crash when compiling recursive ASTs (GH-20594) (GH-26522)
    de58b31

    @Fidget-Spinner
    Copy link
    Member

    The newly added test test_recursion_direct seems to trigger a stack overflow on windows in debug mode instead of a RecursionError. Release mode isn't affected and the test passes there.

    One of the buildbots reflects this too: https://buildbot.python.org/all/#/builders/146/builds/337/steps/4/logs/stdio

    I can avoid the crash by lowering the recursion limit in Python from 1000 to 500. The stack size for a window build is currently set to 2MB, which is usually lesser than *nix 8MB. So I think an easy solution is to increase the stack size for windows builds.

    I'm guessing release builds aren't affected because some of the Py_EnterRecursiveCall helper functions are probably inlined and thus use less of the stack.

    Opinions are greatly appreciated.

    @pablogsal
    Copy link
    Member

    Batuhan, can you take a look?

    @isidentical
    Copy link
    Sponsor Member

    Batuhan, can you take a look?

    Yes.

    @isidentical
    Copy link
    Sponsor Member

    The stack size for a window build is currently set to 2MB, which is usually lesser than *nix 8MB. So I think an easy solution is to increase the stack size for windows builds.

    I'm guessing release builds aren't affected because some of the Py_EnterRecursiveCall helper functions are probably inlined and thus use less of the stack.

    Opinions are greatly appreciated.

    I don't think that we should make a global change for this case, AFAIK some of the core parts of the interpreter maintain their own recursion checks with different handling of windows limits. E.g;

    cpython/Python/marshal.c

    Lines 25 to 40 in fa106a6

    /* High water mark to determine when the marshalled object is dangerously deep
    * and risks coring the interpreter. When the object stack gets this deep,
    * raise an exception instead of continuing.
    * On Windows debug builds, reduce this value.
    *
    * BUG: https://bugs.python.org/issue33720
    * On Windows PGO builds, the r_object function overallocates its stack and
    * can cause a stack overflow. We reduce the maximum depth for all Windows
    * releases to protect against this.
    * #if defined(MS_WINDOWS) && defined(_DEBUG)
    */
    #if defined(MS_WINDOWS)
    #define MAX_MARSHAL_STACK_DEPTH 1000
    #else
    #define MAX_MARSHAL_STACK_DEPTH 2000
    #endif

    We might need to end up with the same motion and do the handling by ourselves. Wdyt @pablogsal @kj?

    @isidentical
    Copy link
    Sponsor Member

    After playing with it for a couple hours and without much success of creating a test environment (only using buildbots), I decided not to introduce hard limits. Even though they make the original tests to pass, they don't solve the problem overall and also more important part is that the 'hard limits' might cause regressions for people who do compile(<AST>) calls.

    For normal windows builds (as @kj noted) something might work in the current revision and we might just break it with introducing hard limits. Since the trees are tend to get really branchy, I don't think it is a good idea.

    I'm open to any proposals/plans

    Extra: In the worst case that we can't come up with something (the AST converter functions are really long 2000+ LoC C functions so it is possible that there might be stuff that eats a lot of space on the stack), we can either
    a) revert => not a good option, this is not a regression on the python itself. It is a fix for other os's and windows release builds
    b) always skip the test on windows => we can do that but it might be counterintuitive for the future
    c) use a really low recursion limit for the test_recursion_* for windows => I'm open to fallback to this if nothing comes up.

    we might need to revert this though as is it is not a regression. It used to crash with the same exact error, just outside of the test suite, and now since it works for linux/macos/others + windows for release builds I wonder whether can just skip the test on windows and keep it as is in the worst scenario).

    @pablogsal
    Copy link
    Member

    b) always skip the test on windows => we can do that but it might be counterintuitive for the future

    Well, is not that the test is flaky technically, this means that the feature doesn't work on Windows (non release builds). So the reasoning has to be why we want/need to not support this on Windows. Otherwise we need to customize the limit on debug builds.

    @isidentical
    Copy link
    Sponsor Member

    New changeset e58d762 by Batuhan Taskaya in branch 'main':
    bpo-11105: reduce the recursion limit for tests (GH-26550)
    e58d762

    @isidentical
    Copy link
    Sponsor Member

    New changeset 8004c45 by Batuhan Taskaya in branch 'main':
    bpo-11105: document the new test.support.infinite_recursion context manager (GH-26604)
    8004c45

    @isidentical
    Copy link
    Sponsor Member

    New changeset bd6f0d3 by Batuhan Taskaya in branch '3.10':
    [3.10] bpo-11105: reduce the recursion limit for tests. (GH-26607)
    bd6f0d3

    @isidentical
    Copy link
    Sponsor Member

    New changeset 87f5022 by Batuhan Taskaya in branch '3.9':
    [3.9] bpo-11105: reduce the recursion limit for tests. (GH-26605)
    87f5022

    @isidentical
    Copy link
    Sponsor Member

    The issue has been solved, all buildbots should now pass. Will continue to monitor the situation. Thanks for the report @kj!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes 3.10 only security fixes 3.11 bug and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    9 participants