classification
Title: Syntax error reported on compile(...), but not on compile(..., ast.PyCF_ONLY_AST)
Type: enhancement Stage: resolved
Components: Documentation Versions: Python 3.11
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: docs@python Nosy List: docs@python, enedil, hniksic, iritkatriel, miss-islington, ncoghlan, pablogsal, terry.reedy
Priority: normal Keywords: patch

Created on 2017-06-12 12:37 by hniksic, last changed 2021-10-04 19:18 by pablogsal. This issue is now closed.

Pull Requests
URL Status Linked Edit
PR 28459 merged pablogsal, 2021-09-19 19:39
PR 28460 merged miss-islington, 2021-09-19 22:45
PR 28461 merged miss-islington, 2021-09-19 22:45
Messages (13)
msg295771 - (view) Author: Hrvoje Nikšić (hniksic) * Date: 2017-06-12 12:37
Our application compiles snippets of user-specified code using the compile built-in with the ast.PyCF_ONLY_AST flag. At this stage we catch syntax errors and perform some sanity checks on the AST. The AST is then compiled into actual code using compile() and run under further guards.

We found that using a bare "return" in the code works with ast.PyCF_ONLY_AST, but raises SyntaxError when compiled without the flag:

>>> import ast
>>> compile('return', '', 'exec', ast.PyCF_ONLY_AST, True)
<_ast.Module object at 0x7f35df872310>

>>> compile('return', '', 'exec', 0, True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "", line 1
SyntaxError: 'return' outside function

Is this intended behavior? It doesn't seem to be documented anywhere.
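
For illustration, the two-stage workflow described above might be sketched as follows. The helper name `compile_snippet` and the filename `"<snippet>"` are hypothetical and not taken from the application in question; the point is only that SyntaxError has to be handled at both stages.

```python
import ast


def compile_snippet(source, filename="<snippet>"):
    """Hypothetical helper: parse, inspect, then compile user code.

    Returns a (code, error) pair; exactly one of the two is None.
    """
    try:
        # Stage 1: only errors detectable while building the AST show up here.
        tree = ast.parse(source, filename=filename, mode="exec")
    except SyntaxError as exc:
        return None, f"parse error: {exc.msg}"

    # Application-specific sanity checks on `tree` would go here.

    try:
        # Stage 2: code generation can still reject a tree that parsed cleanly.
        code = compile(tree, filename, "exec")
    except SyntaxError as exc:
        return None, f"compile error: {exc.msg}"
    return code, None
```

With this shape, a bare `return` passes the first stage and is reported by the second one, rather than surfacing as an unexpected exception later.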
msg295908 - (view) Author: Michał Radwański (enedil) * Date: 2017-06-13 11:53
Docs mention:


ast.parse(source, filename='<unknown>', mode='exec')
    Parse the source into an AST node. Equivalent to compile(source, filename, mode, ast.PyCF_ONLY_AST).

If you only parse code into an AST, you just check whether the source can be turned into a Python syntax tree. In this case it obviously can, since the same statement is valid inside a function that returns nothing:

def func():
    return

If, however, you try to turn the source into executable code, the compiler also checks whether each construct makes sense in the context where it appears. A return statement in top-level code is not valid, hence the error.
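
A small sketch of that difference, assuming nothing beyond the built-in compile() and ast.parse(); the helper name `compiles` and the filename `"<example>"` are made up for the example:

```python
import ast

source_top_level = "return"
source_in_function = "def func():\n    return\n"


def compiles(source):
    """Return True if `source` can be compiled all the way to a code object."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Both sources can be parsed into an AST...
ast.parse(source_top_level)
ast.parse(source_in_function)

# ...but only the one with `return` inside a function compiles to code.
top_level_ok = compiles(source_top_level)
in_function_ok = compiles(source_in_function)
```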
msg295909 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2017-06-13 12:10
It's intended behaviour, but you're right that we don't explicitly document anywhere that SyntaxError can be reported from three different places:

- the initial parsing based on the language Grammar
- the conversion of the parse tree into the AST
- the conversion of the AST into a runtime code object

It isn't possible to separate the first two from pure Python code, but ast.parse() (aka the ast.PyCF_ONLY_AST compile flag) skips the last one.

As Michał noted, it's usually that last stage which checks for "higher level" constructs related to lexical structure, where certain statements can only be meaningfully executed when used inside a suitable compound statement, but can still be parsed outside it:

```
>>> ast.dump(ast.parse("break"))
'Module(body=[Break()])'
>>> ast.dump(ast.parse("continue"))
'Module(body=[Continue()])'
>>> ast.dump(ast.parse("return"))
'Module(body=[Return(value=None)])'
>>> ast.dump(ast.parse("yield"))
'Module(body=[Expr(value=Yield(value=None))])'
```

(`await` currently isn't in that category, but that's specifically due to the parser hacks used to enable it without needing a __future__ import)

The appropriate fix would probably be to add a sentence to the `ast.PyCF_ONLY_AST` documentation to say that some syntax errors are only detected when compiling the AST to a code object.
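
As a rough sketch of the point above, the following feeds the same four statements through both steps and records what the code-generation step reports. The function name `late_errors` is hypothetical, and the exact error messages are not shown because they may differ between Python versions.

```python
import ast


def late_errors(statements):
    """Map each statement to the SyntaxError message raised when its AST
    is compiled to a code object, or to None if compilation succeeds."""
    results = {}
    for stmt in statements:
        tree = ast.parse(stmt)  # parsing alone is expected to succeed
        try:
            compile(tree, "<example>", "exec")
        except SyntaxError as exc:
            results[stmt] = exc.msg
        else:
            results[stmt] = None
    return results


results = late_errors(["break", "continue", "return", "yield"])
for stmt, msg in results.items():
    print(f"{stmt!r}: {msg}")
```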
msg295910 - (view) Author: Hrvoje Nikšić (hniksic) * Date: 2017-06-13 12:18
> The appropriate fix would probably be to add a sentence to the
> `ast.PyCF_ONLY_AST` documentation to say that some syntax errors
> are only detected when compiling the AST to a code object.

Yes, please. I'm not saying the current behavior is wrong (it makes sense that some constructs are legal as AST, but can't be converted into code), I just found it surprising. In other words, we would have found it very useful for the documentation to mention that code generation performs additional checks on the AST that are not performed during the ast.PyCF_ONLY_AST compilation.
msg296225 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2017-06-17 01:45
Python grammar is constrained to be, I believe, LL(1).  Whatever the constraint is, there are a few syntax rules that either cannot be written or would be difficult to write within the constraint.  So they are checked during compilation.  Can you suggest a couple of sentences you would have liked to have seen, and where?

I might also note that not all exceptions raised by compile are literally 'SyntaxError's.
msg301001 - (view) Author: Hrvoje Nikšić (hniksic) * Date: 2017-08-29 21:39
> Can you suggest a couple of sentences you would have liked to have
> seen, and where?

Thanks, I would suggest adding something like this to the documentation of ast.parse:

"""
``parse`` raises ``SyntaxError`` if the compiled source is invalid, and ``ValueError`` if the source contains null bytes. Note that a successful parse does not guarantee correct syntax of ``source``. Further syntax errors can be detected, and ``SyntaxError`` raised, when the source is compiled to a code object using ``compile`` without the ``ast.PyCF_ONLY_AST`` flag, or executed with ``exec``. For example, a lone ``break`` statement can be parsed, but not converted into a code object or executed.
"""

I don't think the ``compile`` docs need to be changed, partly because they're already sizable, and partly because they don't document individual flags at all. (A reference to the ``ast`` module regarding the flags, like the one for AST objects in the first paragraph, might be a useful addition.)
msg401136 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-09-06 13:45
Adding Pablo, a.k.a. The King of Errors.
msg401172 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-09-06 18:18
The doc section in question is
https://docs.python.org/3/library/ast.html#ast.parse

I confirmed that 'break', 'continue', 'yield', and 'return' still parse, with the results now having "type_ignores=[]" added.
  'Module(body=[Expr(value=Yield())], type_ignores=[])'
I do not understand Nick's comment about 'await', as 'await' is not a legal statement.

The current initial paragraph says:
  "Parse the source into an AST node. Equivalent to compile(source, filename, mode, ast.PyCF_ONLY_AST)."

I suggest following this with:
  "If the AST node is compiled to a code object, there are additional syntax checks that can raise errors.  For example, 'return' parses to a node, but is not legal outside of a def statement."

Hrvoje suggested adding something like
  "If source contains a null character ('\0'), ValueError is raised."

I don't think that this is needed as ast.parse is explicitly noted as equivalent to a compile call, and the compile doc says "This function raises SyntaxError if the compiled source is invalid, and ValueError if the source contains null bytes."  (Should not that be 'null characters' given that *source* is now unicode?) This statement need not be repeated here.
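
For anyone who needs to handle this case in code, a cautious sketch is below. It assumes only that ast.parse() rejects source containing a null character; which exception type is raised has, as far as I know, differed between Python versions, so the sketch catches both. The helper name `parse_or_reason` is made up.

```python
import ast


def parse_or_reason(source):
    """Return the parsed Module, or the name of the exception type raised.

    Both SyntaxError and ValueError are caught, because the exception used
    for a null character in the source is assumed to vary by Python version.
    """
    try:
        return ast.parse(source)
    except (SyntaxError, ValueError) as exc:
        return type(exc).__name__


outcome_with_null = parse_or_reason("x = 1\0")
outcome_clean = parse_or_reason("x = 1")
```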
msg401174 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-09-06 18:40
Re null in source code, see issue20115.
msg402177 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-09-19 22:45
New changeset e6d05a4092b4176a30d1d1596585df13c2ab676d by Pablo Galindo Salgado in branch 'main':
bpo-30637: Improve the docs of ast.parse regarding differences with compile() (GH-28459)
https://github.com/python/cpython/commit/e6d05a4092b4176a30d1d1596585df13c2ab676d
msg402182 - (view) Author: miss-islington (miss-islington) Date: 2021-09-19 23:07
New changeset f17c979d909f05916e354ae54c82bff71dbede35 by Miss Islington (bot) in branch '3.10':
bpo-30637: Improve the docs of ast.parse regarding differences with compile() (GH-28459)
https://github.com/python/cpython/commit/f17c979d909f05916e354ae54c82bff71dbede35
msg402184 - (view) Author: miss-islington (miss-islington) Date: 2021-09-19 23:14
New changeset 41e2a31c13ba73e2c30e9bf0be9417fd17e8ace2 by Miss Islington (bot) in branch '3.9':
bpo-30637: Improve the docs of ast.parse regarding differences with compile() (GH-28459)
https://github.com/python/cpython/commit/41e2a31c13ba73e2c30e9bf0be9417fd17e8ace2
msg403152 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-10-04 19:18
New changeset f025ea23210173b42303360aca05132e4ffdfed3 by Pablo Galindo (Miss Islington (bot)) in branch '3.10':
bpo-30637: Improve the docs of ast.parse regarding differences with compile() (GH-28459)
https://github.com/python/cpython/commit/f025ea23210173b42303360aca05132e4ffdfed3
History
Date User Action Args
2021-10-04 19:18:41 pablogsal set messages: + msg403152
2021-09-19 23:32:16 pablogsal set status: open -> closed
resolution: fixed
stage: patch review -> resolved
2021-09-19 23:14:01 miss-islington set messages: + msg402184
2021-09-19 23:07:22 miss-islington set messages: + msg402182
2021-09-19 22:45:17 miss-islington set pull_requests: + pull_request26862
2021-09-19 22:45:12 miss-islington set nosy: + miss-islington
pull_requests: + pull_request26861
2021-09-19 22:45:02 pablogsal set messages: + msg402177
2021-09-19 19:39:59 pablogsal set keywords: + patch
stage: needs patch -> patch review
pull_requests: + pull_request26860
2021-09-06 18:40:05 iritkatriel set messages: + msg401174
2021-09-06 18:18:38 terry.reedy set messages: + msg401172
versions: + Python 3.11, - Python 2.7, Python 3.7
2021-09-06 13:45:28 iritkatriel set nosy: + iritkatriel, pablogsal
messages: + msg401136
2017-08-29 21:39:59 hniksic set messages: + msg301001
2017-06-17 01:45:01 terry.reedy set nosy: + terry.reedy
messages: + msg296225
2017-06-13 12:18:03 hniksic set messages: + msg295910
2017-06-13 12:10:56 ncoghlan set assignee: docs@python
type: enhancement
components: + Documentation, - Interpreter Core
nosy: + docs@python, ncoghlan
messages: + msg295909
stage: needs patch
2017-06-13 11:53:12 enedil set nosy: + enedil
messages: + msg295908
2017-06-12 12:37:36 hniksic create