-
-
Notifications
You must be signed in to change notification settings - Fork 29.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stack overflow when parse long expression to AST #76939
Comments
Python 2 can crash when compile long expression. >>> x = eval('""' + '+chr(33)'*100000)
Segmentation fault (core dumped) This was fixed in Python 3. RecursionError is raised now. >>> x = eval('""' + '+chr(33)'*100000)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RecursionError: maximum recursion depth exceeded during compilation
>>> x = eval('+chr(33)'*1000000)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RecursionError: maximum recursion depth exceeded during compilation But compiling to AST still can crash. >>> import ast
>>> x = ast.parse('+chr(33)'*1000000)
Segmentation fault (core dumped) |
There is also a regression in 3.7 when compile a long expression. In 3.6: >>> compile('+a'*1000000, '?', 'eval')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RecursionError: maximum recursion depth exceeded during compilation In 3.7: >>> compile('+a'*1000000, '?', 'eval')
Segmentation fault (core dumped) |
Experimenting on Windows with IDLE's current 3.7 Shell, where a user process crash restarts Shell, compile('+a'*31365, '?', 'eval') consistently gives RecursionError, values a bit larger sometimes crash, and values much larger (32000, at max) consistently crash. Perhaps it is not just a coincidence that the regressed lower limit for crashing is near 2**15. |
Ditto as to the limit for ast.parse. >>> import ast; ast.parse('+chr(33)'*32000)
============================ RESTART: Shell =========================
>>> import ast; ast.parse('+chr(33)'*31000)
<_ast.Module object at 0x000001E7920F34A8> I see the same in 3.6 and 3.5, so this is not a 3.7 regression. Rather, was the ast.parse behavior somehow copied into compile for 3.7, perhaps by 'simplifying' code? |
On Linux the limit is much larger that 2**15. This depends on the stack size, it is smaller on Windows. The stack is overflowed by recursive call of ast2obj_expr in Python/Python-ast.c. The same problem exists in other recursive AST processing code: optimizing, unparsing. This is why 3.7 can crash in cases when 3.6 was not crashed. I don't see an easy way of fixing this. The common way is surrounding recursive calls with Py_EnterRecursiveCall()/Py_LeaveRecursiveCall(). But this make the limit too small (1000 by default). In other cases it is enough. The data structures with 1000 levels of nesting are rare. In any case the Python compiler has lower limit on nesting indented blocks. But in this case the linear sequence of binary operators is compiled to deeply nested structure. I see two hard ways of properly fixing this.
|
The other option is to simply not worry about it and acknowledge you can On Sat, Feb 10, 2018, 14:30 Serhiy Storchaka, <report@bugs.python.org>
|
Agreed, this is the most practical option. Do you want to write such acknowledgement? |
If ast_parse returns, a correct tree (A) rather than a buggy tree is a hard requirement. If ast_parse does not return, an exception (B) rather than a crash is strongly desired. We should not risk A to get B. I presume that Serhiy is suggesting that option 1 either has such a risk or would consume developer resources that might be better spent on other improvements. For Python, option 2, seems pretty useless for real code because there are much better ways to sum a long sequence: sum(iterable_of_numbers), seq.append, ''.join(iterable_of_string). Possible addition to the ast.parse entry. |
On Sun, Feb 11, 2018, 16:26 Serhiy Storchaka, <report@bugs.python.org>
I think adding a test to test_crashers and an appropriate comment pointing
|
A consequence of this is that ast.literal_eval() can crash. >>> import ast
>>> ast.literal_eval("+0"*200000)
Segmentation fault (core dumped) It should be documented that ast.literal_eval() is not safe. |
The PR adds the documentation warnings. Serhiy, can you double-check that I have the appropriate functions and the comment is acceptable? |
Thank you Brett! The comment LGTM. Is it worth to add warnings to other functions?
|
You're probably right and it's worth propagating the warning a bit wider. |
I have the changes in for Python 3 for the ast module. Updated TODO list:
|
Actually, the TODO list is:
|
I think we can ignore the inspect module. It is unlikely that it will cause a crash unintentionally, and it is hard to use this for attacks. The attacker needs to create an extension function with malicious __text_signature__, but if he is able to execute arbitrary binary code, there is a much larger problem. And perhaps there is no need to repeat the warning for exec() and eval(). They are considered more dangerous than compile(). |
Thanks for the feedback, Serhiy! Based on that, the new TODO list is:
|
In 3.7 compile() can crash not only when compiling to an AST object (due to recursive AST optimization). |
@serhiy: Correct, which is what the warning says: https://github.com/python/cpython/pull/6043/files . |
At this point there is only Python 2 stuff and I'm not bothered enough to see those changed, so I will leave this open for anyone who wants to put in the effort to backport the warnings. |
This seems like an easy issue that the warnings in docs for ast need to be manually backported to 2.7 since miss Islington cannot cherry pick two PRs for 2.7 in https://bugs.python.org/issue32758#msg313511 . |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: