Issue25550
This issue tracker has been migrated to GitHub,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2015-11-04 09:02 by The Compiler, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Messages (7) | |||
---|---|---|---|
msg254040 - (view) | Author: Florian Bruhin (The Compiler) * | Date: 2015-11-04 09:02 | |
I just found this thanks to Hypothesis[1]: >>> import re >>> re.compile('(' * 500) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/usr/lib/python3.5/re.py", line 224, in compile return _compile(pattern, flags) File "/usr/lib/python3.5/re.py", line 293, in _compile p = sre_compile.compile(pattern, flags) File "/usr/lib/python3.5/sre_compile.py", line 536, in compile p = sre_parse.parse(p, flags) File "/usr/lib/python3.5/sre_parse.py", line 829, in parse p = _parse_sub(source, pattern, 0) File "/usr/lib/python3.5/sre_parse.py", line 437, in _parse_sub itemsappend(_parse(source, state)) File "/usr/lib/python3.5/sre_parse.py", line 778, in _parse p = _parse_sub(source, state) File "/usr/lib/python3.5/sre_parse.py", line 437, in _parse_sub itemsappend(_parse(source, state)) File "/usr/lib/python3.5/sre_parse.py", line 778, in _parse p = _parse_sub(source, state) File "/usr/lib/python3.5/sre_parse.py", line 437, in _parse_sub itemsappend(_parse(source, state)) [...] File "/usr/lib/python3.5/sre_parse.py", line 493, in _parse subpattern = SubPattern(state) RecursionError: maximum recursion depth exceeded It seems a maximum recursion error has been treated as a bug before (like in issue401612), so I'm guessing this isn't intended here either. [1] https://hypothesis.readthedocs.org/ |
|||
msg254042 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-11-04 09:27 | |
This isn't a bug, this is a limitation of the implementation. Regular expression parser is recursive and Python has a limit for recursion depth. You can increase this limit if needed. >>> import sys, re >>> sys.setrecursionlimit(2000) >>> re.compile('(' * 500) Traceback (most recent call last): File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 437, in _parse_sub itemsappend(_parse(source, state)) File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 778, in _parse p = _parse_sub(source, state) File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 437, in _parse_sub itemsappend(_parse(source, state)) ... File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 778, in _parse p = _parse_sub(source, state) File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 437, in _parse_sub itemsappend(_parse(source, state)) File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 781, in _parse source.tell() - start) sre_constants.error: missing ), unterminated subpattern at position 499 |
|||
msg254046 - (view) | Author: Florian Bruhin (The Compiler) * | Date: 2015-11-04 10:26 | |
I see how it's not possible to compile that pattern as it's using recursion - I don't mind that. However, I think this should be handled and re-raised as a re.error ("Exception raised [...] when some other error occurs during compilation or matching."). I think no matter what the pattern is, it's quite unexpected to get anything other than a re.error (or a TypeError) from re.compile. |
|||
msg254060 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-11-04 15:54 | |
re.compile can also raise ValueError and OverflowError. Agree that may be these exceptions can be converted to re.error. But it can also raise MemoryError, and KeyboardInterrupt. And these exceptions can't be converted to re.error. RecursionError is closer to the latter case. It depends not only on the pattern itself, but on the place where it is called. A pattern that is compiled successfully at top level can produce a RecursionError if it is compiled deeply in the recursion. >>> import re >>> def r(n): ... if not n: ... return re.compile('x') ... return r(n-1) ... >>> r(990) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 4, in r File "<stdin>", line 4, in r ... File "<stdin>", line 4, in r File "<stdin>", line 3, in r File "/home/serhiy/py/cpython/Lib/re.py", line 224, in compile return _compile(pattern, flags) File "/home/serhiy/py/cpython/Lib/re.py", line 293, in _compile p = sre_compile.compile(pattern, flags) File "/home/serhiy/py/cpython/Lib/sre_compile.py", line 555, in compile p = sre_parse.parse(p, flags) File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 822, in parse source = Tokenizer(str) File "/home/serhiy/py/cpython/Lib/sre_parse.py", line 225, in __init__ self.__next() RecursionError: maximum recursion depth exceeded >>> re.compile('x') re.compile('x') |
|||
msg254236 - (view) | Author: Terry J. Reedy (terry.reedy) * | Date: 2015-11-06 22:22 | |
Is the recursion limit hit enough in real-world problems that we should consider replacing recursive calls and the implicit C call stack with loops and an explicit stack? |
|||
msg254261 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2015-11-07 07:52 | |
I think it is possible to make the parser non-recursive, but without reports about such real-world problems it looks premature and just will complicate the code. For years before 3.5 there was a limitation on only 100 capturing groups. |
|||
msg278752 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * | Date: 2016-10-16 08:25 | |
Closed as not a bug. Unlike to issue401612 this is not an infinite recursion. Raising RecursionError depends on the context (it is more likely in deeply nested call) and there is a common workaround -- just increase maximum recursion depth using sys.setrecursionlimit(). There is nothing specific to regular expressions. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:58:23 | admin | set | github: 69736 |
2016-10-16 08:25:41 | serhiy.storchaka | set | status: open -> closed resolution: not a bug messages: + msg278752 stage: resolved |
2015-11-07 07:52:41 | serhiy.storchaka | set | messages: + msg254261 |
2015-11-06 22:22:05 | terry.reedy | set | nosy:
+ terry.reedy messages: + msg254236 |
2015-11-04 15:54:55 | serhiy.storchaka | set | messages: + msg254060 |
2015-11-04 10:26:25 | The Compiler | set | messages: + msg254046 |
2015-11-04 09:27:11 | serhiy.storchaka | set | messages: + msg254042 |
2015-11-04 09:07:10 | SilentGhost | set | nosy:
+ mrabarnett components: + Regular Expressions versions: + Python 3.6 |
2015-11-04 09:02:56 | The Compiler | create |