Title: Syntax error reported on compile(...), but not on compile(..., ast.PyCF_ONLY_AST)
Type: enhancement Stage: needs patch
Components: Documentation Versions: Python 3.7, Python 2.7
Status: open Resolution:
Dependencies: Superseder:
Assigned To: docs@python Nosy List: docs@python, enedil, hniksic, ncoghlan, terry.reedy
Priority: normal Keywords:

Created on 2017-06-12 12:37 by hniksic, last changed 2017-08-29 21:39 by hniksic.

Messages (6)
msg295771 - (view) Author: Hrvoje Nikšić (hniksic) Date: 2017-06-12 12:37
Our application compiles snippets of user-specified code using the compile built-in with the ast.PyCF_ONLY_AST flag. At this stage we catch syntax errors and perform some sanity checks on the AST. The AST is then compiled into actual code using compile() and run under further guards.

We found that using a bare "return" in the code works with ast.PyCF_ONLY_AST, but raises SyntaxError when compiled without the flag:

>>> import ast
>>> compile('return', '', 'exec', ast.PyCF_ONLY_AST, True)
<_ast.Module object at 0x7f35df872310>

>>> compile('return', '', 'exec', 0, True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "", line 1
SyntaxError: 'return' outside function

Is this intended behavior? It doesn't seem to be documented anywhere.
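
For readers following along, the workflow described above can be sketched roughly as follows. This is a minimal illustration, not the reporter's actual code: the helper name `load_snippet`, the `<snippet>` filename, and the "no imports" sanity check are all hypothetical.

```python
import ast


def load_snippet(source, filename="<snippet>"):
    """Parse, vet, and compile user code (hypothetical helper)."""
    # Stage 1: source -> AST. Grammar-level errors surface here.
    tree = compile(source, filename, "exec", ast.PyCF_ONLY_AST, True)

    # Hypothetical sanity check on the AST: forbid import statements.
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            raise ValueError("imports are not allowed in snippets")

    # Stage 2: AST -> code object. Context-dependent errors such as
    # "'return' outside function" surface only here, so this call
    # can raise SyntaxError too.
    return compile(tree, filename, "exec", 0, True)


for src in ("x = 1 + 1", "return", "import os"):
    try:
        load_snippet(src)
    except (SyntaxError, ValueError) as exc:
        print(f"{src!r}: rejected ({type(exc).__name__}: {exc})")
    else:
        print(f"{src!r}: accepted")
```

The point of the sketch is that SyntaxError has to be handled around both compile() calls, not only around the first one.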
msg295908 - (view) Author: Michał Radwański (enedil) * Date: 2017-06-13 11:53
Docs mention:

ast.parse(source, filename='<unknown>', mode='exec')
    Parse the source into an AST node. Equivalent to compile(source, filename, mode, ast.PyCF_ONLY_AST).

If you only parse code into an AST, the only check is whether the source can be turned into a Python syntax tree. Here it obviously can, since a bare return is valid inside a function that returns nothing:

def func():
    return

If, however, you try to turn the source into executable code, the compiler also checks whether the constructs make sense in the given context. A return statement at the top level is not valid, hence the error.
msg295909 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-13 12:10
It's intended behaviour, but you're right that we don't explicitly document anywhere that SyntaxError can be reported from three different places:

- the initial parsing based on the language Grammar
- the conversion of the parse tree into the AST
- the conversion of the AST into a runtime code object

It isn't possible to separate the first two from pure Python code, but ast.parse() (aka the ast.PyCF_ONLY_AST compile flag) skips the last one.

As Michał noted, it's usually that last stage which checks for "higher level" constructs related to lexical structure, where certain statements can only be meaningfully executed when used inside a suitable compound statement, but can still be parsed outside it:

>>> ast.dump(ast.parse("break"))
>>> ast.dump(ast.parse("continue"))
>>> ast.dump(ast.parse("return"))
>>> ast.dump(ast.parse("yield"))

(`await` currently isn't in that category, but that's specifically due to the parser hacks used to enable it without needing a __future__ import)
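
A small probe along these lines shows which stage rejects a given snippet. It is only a sketch: the function name `failing_stage` and the `<snippet>` filename are made up for illustration, and since the first two stages cannot be separated from pure Python, both are reported together as "parse".

```python
import ast


def failing_stage(source):
    """Return "parse", "compile", or None, depending on where SyntaxError is raised."""
    try:
        # Grammar parsing plus parse-tree-to-AST conversion.
        tree = ast.parse(source)
    except SyntaxError:
        return "parse"
    try:
        # AST-to-code-object conversion.
        compile(tree, "<snippet>", "exec")
    except SyntaxError:
        return "compile"
    return None


for src in ("break", "continue", "return", "yield",
            "x = (", "def f():\n    return"):
    print(repr(src), "->", failing_stage(src))
```

The four bare statements are rejected only at the "compile" stage, the unbalanced parenthesis is rejected at "parse", and the return placed inside a function passes both.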

The appropriate fix would probably be to add a sentence to the `ast.PyCF_ONLY_AST` documentation to say that some syntax errors are only detected when compiling the AST to a code object.
msg295910 - (view) Author: Hrvoje Nikšić (hniksic) Date: 2017-06-13 12:18
> The appropriate fix would probably be to add a sentence to the
> `ast.PyCF_ONLY_AST` documentation to say that some syntax errors
> are only detected when compiling the AST to a code object.

Yes, please. I'm not saying the current behavior is wrong (it makes sense that some constructs are legal as AST, but can't be converted into code), I just found it surprising. In other words, we would have found it very useful for the documentation to mention that code generation performs additional checks on the AST that are not performed during the ast.PyCF_ONLY_AST compilation.
msg296225 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-06-17 01:45
Python grammar is constrained to be, I believe, LL(1).  Whatever the constraint is, there are a few syntax rules that either cannot be written or would be difficult to write within the constraint.  So they are checked during compilation.  Can you suggest a couple of sentences you would have liked to have seen, and where?

I might also note that not all exceptions raised by compile are literally 'SyntaxError's.
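
As a rough illustration of that last remark, the sketch below reports the exception type compile() raises for a few inputs. The helper name `compile_error_type` is hypothetical. The null-byte case is an assumption-laden one: depending on the Python version it is reported as either ValueError or SyntaxError, so no single outcome is claimed for it here.

```python
def compile_error_type(source, mode="exec"):
    """Return the name of the exception compile() raises, or None on success."""
    try:
        compile(source, "<snippet>", mode)
    except (SyntaxError, ValueError) as exc:
        return type(exc).__name__
    return None


print(compile_error_type("x = 1"))                # None: compiles cleanly
print(compile_error_type("if True:\nx = 1\n"))    # IndentationError, a SyntaxError subclass
print(compile_error_type("x = 1", mode="bogus"))  # ValueError: unsupported mode
print(compile_error_type("x = 1\0"))              # ValueError or SyntaxError, version-dependent
```

Code that guards a compile() call against bad user input may therefore want to catch ValueError alongside SyntaxError.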
msg301001 - (view) Author: Hrvoje Nikšić (hniksic) Date: 2017-08-29 21:39
> Can you suggest a couple of sentences you would have liked to have
> seen, and where?

Thanks, I would suggest adding something like this to the documentation of ast.parse:

``parse`` raises ``SyntaxError`` if the compiled source is invalid, and ``ValueError`` if the source contains null bytes. Note that a successful parse does not guarantee correct syntax of ``source``. Further syntax errors can be detected, and ``SyntaxError`` raised, when the source is compiled to a code object using ``compile`` without the ``ast.PyCF_ONLY_AST`` flag, or executed with ``exec``. For example, a lone ``break`` statement can be parsed, but not converted into a code object or executed.

I don't think the ``compile`` docs need to be changed, partly because they're already sizable, and partly because they don't document individual flags at all. (A reference to the ``ast`` module regarding the flags, like the one for AST objects in the first paragraph, might be a useful addition.)
History
Date                 User         Action  Args
2017-08-29 21:39:59  hniksic      set     messages: + msg301001
2017-06-17 01:45:01  terry.reedy  set     nosy: + terry.reedy; messages: + msg296225
2017-06-13 12:18:03  hniksic      set     messages: + msg295910
2017-06-13 12:10:56  ncoghlan     set     assignee: docs@python; type: enhancement; components: + Documentation, - Interpreter Core; nosy: + docs@python, ncoghlan; messages: + msg295909; stage: needs patch
2017-06-13 11:53:12  enedil       set     nosy: + enedil; messages: + msg295908
2017-06-12 12:37:36  hniksic      create