msg295771 - (view) |
Author: Hrvoje Nikšić (hniksic) * |
Date: 2017-06-12 12:37 |
Our application compiles snippets of user-specified code using the compile built-in with the ast.PyCF_ONLY_AST flag. At this stage we catch syntax errors and perform some sanity checks on the AST. The AST is then compiled into actual code using compile() and run using further guards.
We found that using a bare "return" in the code works with ast.PyCF_ONLY_AST, but raises SyntaxError when compiled without the flag:
>>> import ast
>>> compile('return', '', 'exec', ast.PyCF_ONLY_AST, True)
<_ast.Module object at 0x7f35df872310>
>>> compile('return', '', 'exec', 0, True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "", line 1
SyntaxError: 'return' outside function
Is this intended behavior? It doesn't seem to be documented anywhere.
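A minimal sketch of the two-stage workflow described above. The helper name `check_snippet` and the filename `<snippet>` are hypothetical and not taken from the application in question; the sketch only reports which stage rejected the source.

```python
import ast


def check_snippet(source, filename="<snippet>"):
    """Return "parse", "compile", or "ok" depending on where source is rejected."""
    try:
        # Stage 1: build the AST only (what ast.parse() does).
        tree = compile(source, filename, "exec", ast.PyCF_ONLY_AST, True)
    except SyntaxError:
        return "parse"
    try:
        # Stage 2: turn the AST into a code object; further checks run here.
        compile(tree, filename, "exec", 0, True)
    except SyntaxError:
        return "compile"
    return "ok"


print(check_snippet("x = 1"))
print(check_snippet("x = "))
print(check_snippet("return"))
```

With this split, a bare "return" passes the first stage and is only rejected by the second, which matches the session shown above.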
|
msg295908 - (view) |
Author: Michał Radwański (enedil) * |
Date: 2017-06-13 11:53 |
Docs mention:
ast.parse(source, filename='<unknown>', mode='exec')
Parse the source into an AST node. Equivalent to compile(source, filename, mode, ast.PyCF_ONLY_AST).
If you only parse code into an AST, you are checking whether the source can be turned into a Python syntax tree. Here it obviously can, since you can imagine a function that returns nothing:
def func():
    return
If, however, you try to turn the source into executable code, the compiler also checks whether the constructs make sense in the given context. As you may imagine, top-level code with a return statement is not valid, hence the error.
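As a rough illustration of that distinction, the sketch below parses and compiles the same return statement in both contexts. The variable names and the filenames passed to compile() are made up for the example.

```python
import ast

inside = "def func():\n    return\n"
top_level = "return\n"

# Both sources parse: the grammar accepts a return statement wherever
# a statement is allowed.
ast.parse(inside)
ast.parse(top_level)

# Only the version wrapped in a function compiles to a code object.
compile(inside, "<inside>", "exec")
try:
    compile(top_level, "<top_level>", "exec")
except SyntaxError as exc:
    top_level_error = exc.msg
else:
    top_level_error = None

print(top_level_error)
```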
|
msg295909 - (view) |
Author: Nick Coghlan (ncoghlan) * |
Date: 2017-06-13 12:10 |
It's intended behaviour, but you're right that we don't explicitly document anywhere that SyntaxError can be reported from three different places:
- the initial parsing based on the language Grammar
- the conversion of the parse tree into the AST
- the conversion of the AST into a runtime code object
It isn't possible to separate the first two from pure Python code, but ast.parse() (aka the ast.PyCF_ONLY_AST compile flag) skips the last one.
As Michał noted, it's usually that last stage which checks for "higher level" constructs related to lexical structure, where certain statements can only be meaningfully executed when used inside a suitable compound statement, but can still be parsed outside it:
```
>>> ast.dump(ast.parse("break"))
'Module(body=[Break()])'
>>> ast.dump(ast.parse("continue"))
'Module(body=[Continue()])'
>>> ast.dump(ast.parse("return"))
'Module(body=[Return(value=None)])'
>>> ast.dump(ast.parse("yield"))
'Module(body=[Expr(value=Yield(value=None))])'
```
(`await` currently isn't in that category, but that's specifically due to the parser hacks used to enable it without needing a __future__ import)
The appropriate fix would probably be to add a sentence to the `ast.PyCF_ONLY_AST` documentation to say that some syntax errors are only detected when compiling the AST to a code object.
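A small sketch of the point made above, assuming a hypothetical helper `compile_error` that is not part of the ast module: each statement is parsed first, then the resulting AST is handed to compile() to see whether the last stage objects.

```python
import ast


def compile_error(source):
    """Parse source, then return the compile-stage error message, or None."""
    tree = ast.parse(source)  # succeeds for every example below
    try:
        compile(tree, "<demo>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return None


statements = ["break", "continue", "return", "yield"]
errors = {stmt: compile_error(stmt) for stmt in statements}
for stmt, msg in errors.items():
    print(stmt, "->", msg)

# The same statements compile once they sit inside suitable compound statements.
wrapped = (
    "def f():\n"
    "    for _ in ():\n"
    "        continue\n"
    "    for _ in ():\n"
    "        break\n"
    "    yield\n"
    "    return\n"
)
print(compile_error(wrapped))
```

The exact wording of each message may differ between Python versions, so the sketch prints the messages rather than relying on their text.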
|
msg295910 - (view) |
Author: Hrvoje Nikšić (hniksic) * |
Date: 2017-06-13 12:18 |
> The appropriate fix would probably be to add a sentence to the
> `ast.PyCF_ONLY_AST` documentation to say that some syntax errors
> are only detected when compiling the AST to a code object.
Yes, please. I'm not saying the current behavior is wrong (it makes sense that some constructs are legal as AST, but can't be converted into code), I just found it surprising. In other words, we would have found it very useful for the documentation to mention that code generation performs additional checks on the AST that are not performed during the ast.PyCF_ONLY_AST compilation.
|
msg296225 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2017-06-17 01:45 |
Python grammar is constrained to be, I believe, LL(1). Whatever the constraint is, there are a few syntax rules that either cannot be written or would be difficult to write within the constraint. So they are checked during compilation. Can you suggest a couple of sentences you would have liked to have seen, and where?
I might also note that not all exceptions raised by compile are literally 'SyntaxError's.
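To illustrate that last remark, here is a hedged sketch with a hypothetical helper `classify` that reports the concrete exception type raised by compile(). The compile() documentation describes ValueError for source containing null bytes; the sketch also accepts SyntaxError for that case, on the assumption that some interpreter versions report it that way.

```python
def classify(source):
    """Return the name of the exception type raised by compile(), or None."""
    try:
        compile(source, "<demo>", "exec")
    except (SyntaxError, ValueError) as exc:
        return type(exc).__name__
    return None


print(classify("x = 1"))
# A missing indented block raises IndentationError, a SyntaxError subclass.
print(classify("if True:\nx"))
# A null character is rejected; the exception type is version-dependent.
print(classify("x = 1\0"))
```

Code that guards compile() may therefore want to catch both SyntaxError and ValueError rather than SyntaxError alone.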
|
msg301001 - (view) |
Author: Hrvoje Nikšić (hniksic) * |
Date: 2017-08-29 21:39 |
> Can you suggest a couple of sentences you would have liked to have
> seen, and where?
Thanks, I would suggest adding something like this to the documentation of ast.parse:
"""
``parse`` raises ``SyntaxError`` if the compiled source is invalid, and ``ValueError`` if the source contains null bytes. Note that a successful parse does not guarantee correct syntax of ``source``. Further syntax errors can be detected, and ``SyntaxError`` raised, when the source is compiled to a code object using ``compile`` without the ``ast.PyCF_ONLY_AST`` flag, or executed with ``exec``. For example, a lone ``break`` statement can be parsed, but not converted into a code object or executed.
"""
I don't think the ``compile`` docs need to be changed, partly because they're already sizable, and partly because they don't document individual flags at all. (A reference to the ``ast`` module regarding the flags, like the one for AST objects in the first paragraph, might be a useful addition.)
|
msg401136 - (view) |
Author: Irit Katriel (iritkatriel) * |
Date: 2021-09-06 13:45 |
Adding Pablo, a.k.a. The King of Errors.
|
msg401172 - (view) |
Author: Terry J. Reedy (terry.reedy) * |
Date: 2021-09-06 18:18 |
The doc section in question is
https://docs.python.org/3/library/ast.html#ast.parse
I confirmed that 'break', 'continue', 'yield', and 'return' still parse, with the results now having "type_ignores=[]" added.
'Module(body=[Expr(value=Yield())], type_ignores=[])'
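Since the text produced by ast.dump() has changed between releases, a check that does not depend on the dump format can inspect node types instead. The sketch below is one way to do that; the dictionary name `node_types` is made up for the example.

```python
import ast

node_types = {}
for source in ("break", "continue", "return", "yield"):
    node = ast.parse(source).body[0]
    # A bare yield is an expression statement wrapping a Yield node.
    if isinstance(node, ast.Expr):
        node = node.value
    node_types[source] = type(node).__name__

print(node_types)
```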
I do not understand Nick's comment about 'await', as 'await' is not a legal statement.
The current initial paragraph says:
"Parse the source into an AST node. Equivalent to compile(source, filename, mode, ast.PyCF_ONLY_AST)."
I suggest following this with:
"If the AST node is compiled to a code object, there are additional syntax checks that can raise errors. For example, 'return' parses to a node, but is not legal outside of a def statement."
Hrvoje suggested adding something like
"If source contains a null character ('\0'), ValueError is raised."
I don't think that this is needed as ast.parse is explicitly noted as equivalent to a compile call, and the compile doc says "This function raises SyntaxError if the compiled source is invalid, and ValueError if the source contains null bytes." (Should not that be 'null characters' given that *source* is now unicode?) This statement need not be repeated here.
|
msg401174 - (view) |
Author: Irit Katriel (iritkatriel) * |
Date: 2021-09-06 18:40 |
Re null in source code, see issue20115.
|
msg402177 - (view) |
Author: Pablo Galindo Salgado (pablogsal) * |
Date: 2021-09-19 22:45 |
New changeset e6d05a4092b4176a30d1d1596585df13c2ab676d by Pablo Galindo Salgado in branch 'main':
bpo-30637: Improve the docs of ast.parse regarding differences with compile() (GH-28459)
https://github.com/python/cpython/commit/e6d05a4092b4176a30d1d1596585df13c2ab676d
|
msg402182 - (view) |
Author: miss-islington (miss-islington) |
Date: 2021-09-19 23:07 |
New changeset f17c979d909f05916e354ae54c82bff71dbede35 by Miss Islington (bot) in branch '3.10':
bpo-30637: Improve the docs of ast.parse regarding differences with compile() (GH-28459)
https://github.com/python/cpython/commit/f17c979d909f05916e354ae54c82bff71dbede35
|
msg402184 - (view) |
Author: miss-islington (miss-islington) |
Date: 2021-09-19 23:14 |
New changeset 41e2a31c13ba73e2c30e9bf0be9417fd17e8ace2 by Miss Islington (bot) in branch '3.9':
bpo-30637: Improve the docs of ast.parse regarding differences with compile() (GH-28459)
https://github.com/python/cpython/commit/41e2a31c13ba73e2c30e9bf0be9417fd17e8ace2
|
msg403152 - (view) |
Author: Pablo Galindo Salgado (pablogsal) * |
Date: 2021-10-04 19:18 |
New changeset f025ea23210173b42303360aca05132e4ffdfed3 by Pablo Galindo (Miss Islington (bot)) in branch '3.10':
bpo-30637: Improve the docs of ast.parse regarding differences with compile() (GH-28459)
https://github.com/python/cpython/commit/f025ea23210173b42303360aca05132e4ffdfed3
|
Date | User | Action | Args |
2022-04-11 14:58:47 | admin | set | github: 74822 |
2021-10-04 19:18:41 | pablogsal | set | messages: + msg403152 |
2021-09-19 23:32:16 | pablogsal | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved |
2021-09-19 23:14:01 | miss-islington | set | messages: + msg402184 |
2021-09-19 23:07:22 | miss-islington | set | messages: + msg402182 |
2021-09-19 22:45:17 | miss-islington | set | pull_requests: + pull_request26862 |
2021-09-19 22:45:12 | miss-islington | set | nosy: + miss-islington; pull_requests: + pull_request26861 |
2021-09-19 22:45:02 | pablogsal | set | messages: + msg402177 |
2021-09-19 19:39:59 | pablogsal | set | keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request26860 |
2021-09-06 18:40:05 | iritkatriel | set | messages: + msg401174 |
2021-09-06 18:18:38 | terry.reedy | set | messages: + msg401172; versions: + Python 3.11, - Python 2.7, Python 3.7 |
2021-09-06 13:45:28 | iritkatriel | set | nosy: + iritkatriel, pablogsal; messages: + msg401136 |
2017-08-29 21:39:59 | hniksic | set | messages: + msg301001 |
2017-06-17 01:45:01 | terry.reedy | set | nosy: + terry.reedy; messages: + msg296225 |
2017-06-13 12:18:03 | hniksic | set | messages: + msg295910 |
2017-06-13 12:10:56 | ncoghlan | set | assignee: docs@python; type: enhancement; components: + Documentation, - Interpreter Core; nosy: + docs@python, ncoghlan; messages: + msg295909; stage: needs patch |
2017-06-13 11:53:12 | enedil | set | nosy: + enedil; messages: + msg295908 |
2017-06-12 12:37:36 | hniksic | create | |