classification
Title: add an "expected expression" syntax error
Type: enhancement Stage: patch review
Components: Parser Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: CCLDArjun, aroberge, eamanu, lys.nikolaou, pablogsal
Priority: normal Keywords: patch

Created on 2021-06-06 20:38 by CCLDArjun, last changed 2021-06-17 13:01 by aroberge.

Pull Requests
URL Status Linked Edit
PR 26592 closed CCLDArjun, 2021-06-08 06:36
Messages (7)
msg395213 - (view) Author: Arjun (CCLDArjun) * Date: 2021-06-06 20:38
Recently, CPython got a new syntax error, "SyntaxError: expression expected after dictionary key and ':'". I propose adding a general "expected expression" error for consistency. I would like it to apply to all the "equals" tokens (e.g. EQUAL, PLUSEQUAL, MINEQUAL, etc).


>>> x =
  File "<stdin>", line 1
    x =
        ^
SyntaxError: invalid syntax

Would be enhanced by:

>>> x +=
  File "<stdin>", line 1
    x +=
         ^
SyntaxError: expected expression
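
For anyone who wants to see what their own interpreter reports for these inputs, here is a minimal sketch using only the built-in compile(). The helper name syntax_error_for is made up for illustration, and the message text differs between Python versions, so no particular output is claimed.

```python
def syntax_error_for(source):
    """Compile `source` and return the SyntaxError it raises, or None."""
    try:
        compile(source, "<stdin>", "exec")
    except SyntaxError as exc:
        return exc
    return None


# Inputs taken from this issue; the reported message depends on the
# interpreter version, so it is printed rather than asserted.
for snippet in ("x =\n", "x +=\n", "a = {x: for x in {}}\n"):
    err = syntax_error_for(snippet)
    print(repr(snippet), "->", err.msg, "at line", err.lineno, "offset", err.offset)
```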
msg395215 - (view) Author: Arjun (CCLDArjun) * Date: 2021-06-06 20:46
I forgot to add, I would be willing to make the necessary changes, if accepted
msg395225 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-06 22:28
This one will be very tricky to do correctly because the '=' is very context-sensitive and the parser can be confused when backtracking, so this *may* be quite delicate/complex. 

I need to play a bit with this to know how feasible this would be.
msg395226 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-06 22:42
I suspect this is going to be a pain for malformed expressions on the right. For instance:

>>> a = {x: for x in {}}
  File "<stdin>", line 1
    a = {x: for x in {}}
            ^^^
SyntaxError: expected an expression
msg395233 - (view) Author: Arjun (CCLDArjun) * Date: 2021-06-06 23:42
> This one will be very tricky to do correctly because the '=' is very context-sensitive and the parser can be confused when backtracking, so this *may* be quite delicate/complex

Well, I was thinking we could just do a simple check in the _PyPegen_check_tokenizer_errors or _PyPegen_run_parser functions. If the last three tokens in the Parser object's tokens array are NAME, EQUAL/MINEQUAL/etc and NEWLINE, we raise the special error. Is this the right way to do it? I saw that the special error for unclosed parentheses is checked in the same place.
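
The check described above can be sketched in Python with the standard tokenize module. This is only an illustration of the idea, not the C code in pegen: the helper ends_with_dangling_assignment and the ASSIGN_TYPES set are hypothetical names, and the real Parser object's token array is not touched here.

```python
import io
import token
import tokenize

# Exact token types for "=", "+=", "-=" and the other augmented assignments.
ASSIGN_TYPES = {
    token.EQUAL, token.PLUSEQUAL, token.MINEQUAL, token.STAREQUAL,
    token.SLASHEQUAL, token.PERCENTEQUAL, token.AMPEREQUAL, token.VBAREQUAL,
    token.CIRCUMFLEXEQUAL, token.LEFTSHIFTEQUAL, token.RIGHTSHIFTEQUAL,
    token.DOUBLESTAREQUAL, token.DOUBLESLASHEQUAL, token.ATEQUAL,
}


def ends_with_dangling_assignment(source):
    """Return True if the token stream ends with NAME, <assign op>, NEWLINE."""
    readline = io.StringIO(source).readline
    # Drop ENDMARKER so the last three tokens are the ones of interest.
    toks = [t for t in tokenize.generate_tokens(readline)
            if t.type != token.ENDMARKER]
    if len(toks) < 3:
        return False
    name, op, newline = toks[-3:]
    return (name.type == token.NAME
            and op.exact_type in ASSIGN_TYPES
            and newline.type == token.NEWLINE)


print(ends_with_dangling_assignment("x =\n"))
print(ends_with_dangling_assignment("x += 1\n"))
```

Note that keywords are also NAME tokens at this level, so an input such as "if =" matches the same pattern even though nothing about it is an assignment. A token-only check cannot tell the two cases apart.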

> I suspect this is going to be a pain for malformed expressions on the right

Yea, I realized that the "expected an expression" error can be used in multiple places. Could be added one by one?
msg395234 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-06 23:52
> Well, I was thinking we could just do a simple check in the _PyPegen_check_tokenizer_errors or _PyPegen_run_parser functions. If the last three tokens in the Parser object's tokens array are NAME, EQUAL/MINEQUAL/etc and NEWLINE, we raise the special error. Is this the right way to do it? I saw that the special error for unclosed parentheses is checked in the same place.

I find that quite inelegant and error-prone. A PEG parser is not assured to finish when you think it will finish, as it can backtrack and expand to parse left-recursive rules. Incorrect syntax must be handled by the parser itself, as part of the parsing process, not after the fact. Once the parser has finished, the semantic information is gone and you are left with unstructured tokens.


> Yea, I realized that the "expected an expression" error can be used in multiple places. Could be added one by one?

Not sure what you mean here, but this should be added in a single place related to the assignment rule; otherwise it is going to be quite difficult to maintain.
msg395235 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-06-06 23:57
> I saw that the special error for unclosed parentheses is checked in the same place.

Just to clarify: an unclosed parenthesis is a tokenizer error, not a parser error, and it is handled by checking the tokenizer status once the parse has already failed. The check is done after the failed parse because our tokenizer is lazy: to check for unclosed parentheses you need to fully parse everything, and this needs a driver. There is no semantic analysis here, just a check of the lexer status; that's why it is handled separately.
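
A rough way to see this laziness from Python is the standard tokenize module, which also yields tokens one at a time and only reports an unclosed parenthesis when it reaches the end of the input. This is a sketch under that assumption and says nothing about the internal C tokenizer status fields; tokens_until_error is a hypothetical helper name.

```python
import io
import tokenize


def tokens_until_error(source):
    """Consume tokens lazily; return (tokens seen so far, error or None)."""
    seen = []
    error = None
    try:
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            seen.append(tok)
    except (tokenize.TokenError, SyntaxError) as exc:
        # The failure is raised by the tokenizer itself, with no parsing
        # or semantic analysis involved.
        error = exc
    return seen, error


seen, error = tokens_until_error("x = (1, 2\n")
print([tok.string for tok in seen])
print(type(error).__name__, error)
```

Several tokens are produced before the error appears, which is why the unclosed-parenthesis condition can only be confirmed after the whole input has been consumed.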
History
Date User Action Args
2021-06-17 13:01:31 aroberge set nosy: + aroberge
2021-06-17 11:51:33 eamanu set nosy: + eamanu
2021-06-08 06:36:08 CCLDArjun set keywords: + patch; stage: patch review; pull_requests: + pull_request25176
2021-06-06 23:57:09 pablogsal set messages: + msg395235
2021-06-06 23:52:11 pablogsal set messages: + msg395234
2021-06-06 23:42:37 CCLDArjun set messages: + msg395233
2021-06-06 22:42:57 pablogsal set messages: + msg395226
2021-06-06 22:28:19 pablogsal set messages: + msg395225
2021-06-06 20:46:09 CCLDArjun set messages: + msg395215
2021-06-06 20:38:43 CCLDArjun create