classification
Title: Make errors due to full parser stack identifiable
Type: enhancement Stage: needs patch
Components: Interpreter Core Versions: Python 3.11
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: Arfrever, Trundle, aroberge, benjamin.peterson, eric.snow, georg.brandl, iritkatriel, loewis, marienz, ncoghlan, orsenthil, pablogsal, pitrou, r.david.murray, terry.reedy
Priority: normal Keywords: patch

Created on 2011-02-26 22:39 by Trundle, last changed 2021-07-08 16:46 by pablogsal.

Files
File name Uploaded Description Edit
issue11343.patch Trundle, 2011-03-25 01:05 review
Messages (29)
msg129604 - (view) Author: Andreas Stührk (Trundle) * Date: 2011-02-26 22:39
Currently, if the parser's internal stack is full (as it happens when
the parser tries to parse a deeply nested structure), the parser
writes an error message to stderr and a bare MemoryError is
raised. That way, it is really hard for REPLs do handle input that
causes the parser to exhaust its stack correctly (see e.g. ``python -c
"print '(' * 100 + ')' * 100" | python -c "import code;
code.interact()"``).

In my opinion, a SyntaxError would be great because that way one can
easily see where in the source code the error is caused. One reason
against that (stated by MvL in msg1687) is that it's really
syntactically correct Python and only the parser can't parse it. But
then, the same is also true when there are too many indentation levels
in which case an `IndentationError` is raised (which is a subclass of
`SyntaxError`).

I have attached two patches: The first still raises a `MemoryError`
(for the sake of backward compatibility) but adds a message ("too many
opening parens") to it. The second raises a `SyntaxError` instead.
msg130076 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-03-04 21:30
I agree that compile-time stack exhaustion is different from runtime object-heap exhaustion and could/should have a different error.

I agree with Martin (from 2000) that SyntaxError is not right either. Perhaps a new ParseError subclass thereof. I believe IndentationError was added since 2000. Its doc is "Base class for syntax errors related to incorrect indentation." and it has TabError as a subclass, so its use for too many indents (not really 'incorrect') is not obvious. But having the position marked (if it would be) would be a plus.

I presume REPL == read-eval-print-loop (from Google). Would a new error help such programs (like code.interact, or IDLE)?
msg132063 - (view) Author: Andreas Stührk (Trundle) * Date: 2011-03-25 01:05
On Fri, Mar 4, 2011 at 9:30 PM, Terry J. Reedy <report@bugs.python.org> wrote:
> I agree with Martin (from 2000) that SyntaxError is not right either. Perhaps a new ParseError subclass thereof.

I added a new `ParserError` that inherits from `SyntaxError`.

> I presume REPL == read-eval-print-loop (from Google). Would a new error help such programs (like code.interact, or IDLE)?

Yes, I meant a read-eval-print-loop. A new error would help and if
it's a subclass of SyntaxError, they most likely need no changes at
all to handle that new error, as they have to deal with SyntaxErrors
anyway.
msg134778 - (view) Author: Andreas Stührk (Trundle) * Date: 2011-04-29 16:04
FWIW, this also affects `ast.literal_eval()`.
msg140456 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-07-15 18:53
This looks like a rather good idea, although I'm not sure it deserves a new global exception type.
msg140469 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-07-15 19:47
Just make it a SyntaxError.
msg140470 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-07-15 19:48
Oh, I see Martin disagrees. SyntaxError is raised for anything which Python won't compile. For example, too many arguments gets a SyntaxError.
msg140479 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-07-16 07:56
+1 for a new exception type to indicate "this may technically be legal Python, but this Python implementation cannot handle it correctly"

Whatever exception type we add, it would be nice to also be able to use it if someone eventually fixes the compiler recursion crasher, so I'd like to paint this particular bikeshed as the more general "SyntaxLimitError".

Possible docs phrasing:

exception:: SyntaxLimitError

Raised when the compilation and code generation process encounters an error due to internal limitations of the current implementation (e.g. the CPython parser has an upper limit on the number of nested sets of parentheses it can handle in a single expression, but no such limitation exists in the Python language reference).
This is a subclass of :exc:`SyntaxError`.
msg140480 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2011-07-16 08:24
I like SyntaxLimitError much better than ParserError.
msg140496 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-07-16 14:43
2011/7/16 Nick Coghlan <report@bugs.python.org>:
>
> Nick Coghlan <ncoghlan@gmail.com> added the comment:
>
> +1 for a new exception type to indicate "this may technically be legal Python, but this Python implementation cannot handle it correctly"
>
> Whatever exception type we add, it would be nice to also be able to use it if someone eventually fixes the compiler recursion crasher, so I'd like to paint this particular bikeshed as the more general "SyntaxLimitError".

What is the point of a new exception? When would you ever want to
catch it as opposed to a regular SyntaxError? You're going to have to
change the code either way.

Moreover, all Python implementations are going to have to place some
limits on stack depth and recursion, so it's not really an
implementation detail.

The Python language doesn't make an mention of limited memory, but no
one is going to suggest a MemoryLimitError, which is an
"implementation detail".
msg140498 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-07-16 15:10
It's important to remember that other implementations treat CPython as
the "gold standard" for compatibility purposes. If we declare
something to be an ordinary SyntaxError, then that carries strong
implications for what other implementations should do.

Some kinds of errors are inherently implementation specific
(MemoryError and SystemError spring to mind). No sane implementor is
going to try to match CPython like-for-like when it comes to those.
SyntaxError, however, is typically defined by the language definition,
not the CPython implementation of it. By introducing a new exception
type, we're explicitly telling other implementations "Look, this is
our problem that we don't plan to fix as it doesn't typically arise in
real code, but please don't deliberately cripple your own
implementation just to match this behaviour". I'm strongly with MvL on
this one - SyntaxError itself should never be raised for legal Python
code just because the CPython bytecode generation toolchain isn't able
to handle it properly.
msg140499 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-07-16 15:16
2011/7/16 Nick Coghlan <report@bugs.python.org>:
>
> Nick Coghlan <ncoghlan@gmail.com> added the comment:
>
> It's important to remember that other implementations treat CPython as
> the "gold standard" for compatibility purposes. If we declare
> something to be an ordinary SyntaxError, then that carries strong
> implications for what other implementations should do.
>
> Some kinds of errors are inherently implementation specific
> (MemoryError and SystemError spring to mind). No sane implementor is
> going to try to match CPython like-for-like when it comes to those.
> SyntaxError, however, is typically defined by the language definition,
> not the CPython implementation of it. By introducing a new exception
> type, we're explicitly telling other implementations "Look, this is
> our problem that we don't plan to fix as it doesn't typically arise in
> real code, but please don't deliberately cripple your own
> implementation just to match this behaviour". I'm strongly with MvL on
> this one - SyntaxError itself should never be raised for legal Python
> code just because the CPython bytecode generation toolchain isn't able
> to handle it properly.

Implementations are quite good at figuring this stuff out themselves.
We don't need a whole new exception (new docs, one more thing in the
builtin namespace) to "send a message" to other implementations. If
some implementation is not part of the language, then its test just
just be marked cpython_only. It shouldn't be brought into public API
and the language.
msg140500 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-07-16 15:22
It also makes it clear to users whether they've just run up against a
limitation of the implementation they're using or whether what they've
written is genuinely illegal code. They are NOT the same thing.
Attempting to conflate them into one exception for the mere sake of
not adding a new exception type is a false economy. Take it to
python-dev if you want, but I will strongly oppose this check going in
with it raising an ordinary SyntaxError instead of a new subclass.
msg140501 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-07-16 15:33
2011/7/16 Nick Coghlan <report@bugs.python.org>:
>
> Nick Coghlan <ncoghlan@gmail.com> added the comment:
>
> It also makes it clear to users whether they've just run up against a
> limitation of the implementation they're using or whether what they've
> written is genuinely illegal code. They are NOT the same thing.
> Attempting to conflate them into one exception for the mere sake of
> not adding a new exception type is a false economy. Take it to
> python-dev if you want, but I will strongly oppose this check going in
> with it raising an ordinary SyntaxError instead of a new subclass.

Why would it matter to users what variety of limitation they're
running up against? As I said, they're still going to have to change
their code.
msg140502 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-07-16 15:53
It matters, because Python "users" are programmers, and most
programmers want to know *why* they're being told something is wrong.
Raising MemoryError is technically incorrect at this point, but at
least gives the right flavour of the user not doing anything
specifically wrong, the program just ran out of resources trying to do
as they asked. I see SyntaxError as significantly worse, however, as
instead of telling the developer "sorry, that's a reasonable request,
but I can't handle it", the interpreter is now telling them "you did
that wrong, change it". The semantics of that are all wrong. There's
nothing wrong with the syntax of the user's code, it's just that the
compiler can't handle it.

Similarly, the eventual fix to the recursive crash in the compiler
itself is likely to be an arbitrary limit on expression nesting.
Again, nothing wrong with their syntax, but they'll need to adjust
their (legal!) code to work around implementation limitations.
msg140504 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2011-07-16 16:50
2011/7/16 Nick Coghlan <report@bugs.python.org>:
>
> Nick Coghlan <ncoghlan@gmail.com> added the comment:
>
> It matters, because Python "users" are programmers, and most
> programmers want to know *why* they're being told something is wrong.
> Raising MemoryError is technically incorrect at this point, but at
> least gives the right flavour of the user not doing anything
> specifically wrong, the program just ran out of resources trying to do
> as they asked. I see SyntaxError as significantly worse, however, as
> instead of telling the developer "sorry, that's a reasonable request,
> but I can't handle it", the interpreter is now telling them "you did
> that wrong, change it". The semantics of that are all wrong. There's
> nothing wrong with the syntax of the user's code, it's just that the
> compiler can't handle it.

I content that in normal code, it is so extremely rare as to be
unheard of, to get exceptions about the parser stack overflowing or
segfault the compiler by too deep nesting. People who are doing this
(generally to prove the point about limitations of the compiler) are
smart enough to know exactly what "parser stack overflowed" means and
separate that from Python as the language. Adding a new exception is
solving a non-existent problem.
msg140506 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-07-16 22:28
> I content that in normal code, it is so extremely rare as to be
> unheard of, to get exceptions about the parser stack overflowing or
> segfault the compiler by too deep nesting. People who are doing this
> (generally to prove the point about limitations of the compiler) are
> smart enough to know exactly what "parser stack overflowed" means and
> separate that from Python as the language. Adding a new exception is
> solving a non-existent problem.

Agreed with Benjamin. Also, "why something is wrong" can simply be told
in the exception message.
msg140507 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-07-16 23:13
This is a purity versus practicality argument.  I agree with both sides :)
msg140517 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-07-17 06:32
After further reflection, I (eventually) came to the same conclusion as Benjamin and Antoine - using SyntaxError for this is fine.

However, the exception *message* should make it clear that this is an implementation limit rather than a language limit.
msg140631 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2011-07-18 23:28
Oh, I thought we never rely on exception "message" for anything
important. However this seems to be an exception for that exception.
:-)
msg140635 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2011-07-18 23:34
> Oh, I thought we never rely on exception "message" for anything
> important. However this seems to be an exception for that exception.
> :-)

I think you're missing the point. People usually don't catch SyntaxError
and run fallback code in this case. And even if they do, they are
unlikely to care about whether the ErreurDeSyntaxe is due to a genuinely
faulty piece of code, or a parser limitation ;)
msg140646 - (view) Author: Nick Coghlan (ncoghlan) * (Python committer) Date: 2011-07-19 02:11
Yeah, at the level of *code* the origin doesn't really matter, so re-using the exception type is actually OK. However, for *people* seeing the stack trace, it may be useful to know what's genuinely illegal syntax and what's a limitation of the current implementation. The exception message is a good place for that additional detail.
msg396883 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021-07-02 23:10
The patch relates to the old parser. 

With the new parser the 100*(+100*) example works. If we go to 1000 instead of 100 we get "SyntaxError: too many nested parentheses".

From the discussion it seems that the idea of a new exception type for implementation-limits vs "real" SyntaxErrors was rejected as not useful.

Is there anything left to do on this issue?
msg397125 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2021-07-08 03:41
I retract my original comment as I now agree with SyntaxError + distinct message.

Given Irit's report, I think the remaining question is whether 

A. The implication that 'too many nested parentheses' is compiler specific is strong enough, and any thing further should be left to 3rd parties to explain, and this issue should be closed as 'out of date'.

B. We should be explicit and add something like 'for this compiler'.

Since Pablo has been working on improving error messages, I would give his opinion high weight.

To find the actual limit, I did the following with Win 10, 3.10.0b3, IDLE:

>>> def f(n):
...     print(eval('('*n + ')'*n"))
... 
And after some trials, found
>>> f(200)
()
>>> f(201)
Traceback (most recent call last):
  File "<pyshell#56>", line 1, in <module>
    f(201)
  File "<pyshell#42>", line 2, in f
    print(eval('('*n + ')'*n))
  File "<string>", line 1
    ((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((()))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
                                                                                                                                                                                                                                                                                                                                                                                                                        ^
SyntaxError: too many nested parentheses

The error is detected in 'string' line 1, 1-based offset 201, at the final '('.  There are 200 spaces before the caret but they are not obvious in this box, which does not wrap spaces properly. The end_lineno and end_offset are None, None rather than the 1,202 I expected.
msg397133 - (view) Author: Andre Roberge (aroberge) * Date: 2021-07-08 09:45
For information: I created an actual .py file from the last example with 200 parens and got the following error message:

> python example.py
s_push: parser stack overflow
MemoryError

For consistency and greater clarity, I think that having messages such as

- SyntaxError: parser stack overflow (too many nested parenthesis)
- SyntaxError: parser stack overflow (too many statistically nested blocks)

and possibly others (too many arguments?), would be preferable, with
the possibility of simply having

- SyntaxError: parser stack overflow

if the exact cause cannot be identified at the parsing stage.
msg397151 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-07-08 16:38
> > python example.py
> s_push: parser stack overflow
>  MemoryError

That error can only happen with the old parser, so you must be using Python3.8 or 3.9 with the old parser. Can you confirm?
msg397152 - (view) Author: Andre Roberge (aroberge) * Date: 2021-07-08 16:43
@pablo: Sincere apologies ... I tested this with the wrong virtual environment active locally.  With 3.10, it is indeed SyntaxError: too many nested parentheses
msg397153 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-07-08 16:43
As a piece of extra information, this limit has been there for a while:

https://github.com/python/cpython/blob/91a8f8c16ca9a7e2466a8241d9b41769ef97d094/Parser/tokenizer.h#L14

the problem is that the old parser died before reaching it in many situations.
msg397154 - (view) Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-07-08 16:46
Regarding what to do, what I would love to optimize is that the message is clear. One of the challenges people may have with too specific messages is that they may not know what "a compiler" or "a parser" means (the first may be especially confusing since many people think Python is not "compiled".

On the other hand, is more informative to say "too many nested parentheses" than "parser stack overflow" since many people don't know what that means, especially in Python.

I
History
Date User Action Args
2021-07-08 16:46:57pablogsalsetmessages: + msg397154
2021-07-08 16:43:38pablogsalsetmessages: + msg397153
2021-07-08 16:43:04arobergesetmessages: + msg397152
2021-07-08 16:38:49pablogsalsetmessages: + msg397151
2021-07-08 09:45:22arobergesetmessages: + msg397133
2021-07-08 03:41:02terry.reedysetversions: + Python 3.11, - Python 3.4
nosy: + aroberge, pablogsal

messages: + msg397125

stage: patch review -> needs patch
2021-07-02 23:10:59iritkatrielsetnosy: + iritkatriel
messages: + msg396883
2013-06-30 05:58:25terry.reedysetversions: + Python 3.4, - Python 3.3
2011-07-19 02:11:21ncoghlansetmessages: + msg140646
2011-07-18 23:34:38pitrousetmessages: + msg140635
2011-07-18 23:28:02orsenthilsetnosy: + orsenthil
messages: + msg140631
2011-07-17 06:32:24ncoghlansetmessages: + msg140517
2011-07-16 23:13:20r.david.murraysetnosy: + r.david.murray
messages: + msg140507
2011-07-16 22:28:55pitrousetmessages: + msg140506
2011-07-16 16:50:34benjamin.petersonsetmessages: + msg140504
2011-07-16 15:53:44ncoghlansetmessages: + msg140502
2011-07-16 15:52:27Arfreversetnosy: + Arfrever
2011-07-16 15:33:25benjamin.petersonsetmessages: + msg140501
2011-07-16 15:22:45ncoghlansetmessages: + msg140500
2011-07-16 15:16:04benjamin.petersonsetmessages: + msg140499
2011-07-16 15:10:41ncoghlansetmessages: + msg140498
2011-07-16 14:43:34benjamin.petersonsetmessages: + msg140496
2011-07-16 08:24:36georg.brandlsetnosy: + georg.brandl
messages: + msg140480
2011-07-16 07:56:41ncoghlansetmessages: + msg140479
2011-07-15 20:38:39eric.snowsetnosy: + eric.snow
2011-07-15 19:48:27benjamin.petersonsetmessages: + msg140470
2011-07-15 19:47:05benjamin.petersonsetmessages: + msg140469
2011-07-15 18:53:11pitrousetnosy: + pitrou, benjamin.peterson, ncoghlan

messages: + msg140456
stage: patch review
2011-04-29 16:04:17Trundlesetmessages: + msg134778
2011-03-25 01:07:09Trundlesetfiles: - parser_nested_MemoryError.patch
2011-03-25 01:07:03Trundlesetfiles: - parser_nested_SyntaxError.patch
2011-03-25 01:05:27Trundlesetfiles: + issue11343.patch

messages: + msg132063
2011-03-04 21:30:10terry.reedysetnosy: + loewis, terry.reedy
messages: + msg130076
2011-02-26 22:40:29Trundlesetfiles: + parser_nested_SyntaxError.patch
nosy: marienz, Trundle
2011-02-26 22:39:50Trundlecreate