classification
Title: Wrong or missing exception when compiling regexes with recursive named backreferences
Type: behavior Stage: resolved
Components: Regular Expressions Versions: Python 3.6, Python 3.4, Python 3.5, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: benjamin.peterson, dhaffey, ezio.melotti, mrabarnett, python-dev, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2015-07-07 01:25 by dhaffey, last changed 2015-12-19 21:26 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded
re_open_group_symbolic_ref.patch serhiy.storchaka, 2015-07-07 09:35
re_open_group_symbolic_ref-3.4.patch serhiy.storchaka, 2015-07-07 09:37
Messages (5)
msg246390 - Author: Dan Haffey (dhaffey) Date: 2015-07-07 01:25
Error reporting for recursive backreferences in regexes isn't consistent across both types of backref. Here's the exception for a recursive numeric backref:

>>> import re
>>> re.compile(r'(\1)')
Traceback (most recent call last):
    ...
sre_constants.error: cannot refer to an open group at position 1

Here's what I'm seeing on the 3.5 branch for a named backref:

>>> re.compile(r'(?P<spam>(?P=spam))')
Traceback (most recent call last):
    ...
RecursionError: maximum recursion depth exceeded

Which is an improvement over 3.4 and below, where compilation succeeds and appears to treat (?P=spam) as valid but unmatchable.
msg246396 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-07-07 09:35
Here is a patch that forbids symbolic references to opened groups in 3.5+.
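For readers who want to check the post-fix behavior, here is a minimal sketch, assuming Python 3.5 or later. The helper name compile_error is hypothetical and not part of the patch; it only uses re.compile and re.error from the standard library. It compares a recursive numeric backreference, a recursive named backreference, and a named backreference that appears after its group has closed.

```python
import re


def compile_error(pattern):
    """Return the re.error raised while compiling pattern, or None.

    Hypothetical helper for illustration only; not part of the patch.
    """
    try:
        re.compile(pattern)
    except re.error as exc:
        return exc
    return None


# Backreference to group 1 while group 1 is still open (numeric form).
numeric = compile_error(r'(\1)')

# Backreference to group "spam" while it is still open (named form).
named = compile_error(r'(?P<spam>(?P=spam))')

# Backreference placed after the group has closed: this is valid.
closed = compile_error(r'(?P<spam>a)(?P=spam)')

for label, err in [("numeric", numeric), ("named", named), ("closed", closed)]:
    print(label, "->", err)
```

On an interpreter that includes the fix, the first two patterns should be rejected at compile time with re.error, while the third compiles normally.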
msg246397 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-07-07 09:37
It is questionable whether an exception should be raised in older Python versions. Here is a patch for 3.4 that just issues a warning.
msg246913 - Author: Roundup Robot (python-dev) (Python triager) Date: 2015-07-18 20:38
New changeset 361d7af9396e by Serhiy Storchaka in branch '3.5':
Issue #24580: Symbolic group references to open group in re patterns now are
https://hg.python.org/cpython/rev/361d7af9396e

New changeset 4d3557500019 by Serhiy Storchaka in branch 'default':
Issue #24580: Symbolic group references to open group in re patterns now are
https://hg.python.org/cpython/rev/4d3557500019
msg256740 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-12-19 21:26
It is too late for 3.4, and I left 2.7 as is. Reopen the issue if you think it is worth adding a warning to 2.7.
History
Date User Action Args
2015-12-19 21:26:32 serhiy.storchaka set status: open -> closed; nosy: + benjamin.peterson; messages: + msg256740; resolution: fixed; stage: patch review -> resolved
2015-07-18 20:38:14 python-dev set nosy: + python-dev; messages: + msg246913
2015-07-07 09:37:33 serhiy.storchaka set versions: + Python 2.7, Python 3.4
2015-07-07 09:37:20 serhiy.storchaka set files: + re_open_group_symbolic_ref-3.4.patch; messages: + msg246397
2015-07-07 09:35:19 serhiy.storchaka set files: + re_open_group_symbolic_ref.patch; keywords: + patch; messages: + msg246396; stage: needs patch -> patch review
2015-07-07 04:31:20 serhiy.storchaka set assignee: serhiy.storchaka; stage: needs patch; nosy: + serhiy.storchaka; versions: + Python 3.5, Python 3.6
2015-07-07 01:25:41 dhaffey create