Message306479
Currently `{m}`, `{m,n}`, `{m,}` and `{,n}`, where m and n are non-negative decimal numbers, are accepted in regular expressions as quantifiers that repeat the previous RE from m (0 by default) to n (infinity by default) times.
But if the opening brace '{' is not followed by one of the above patterns, it is treated as just the literal '{'.
>>> import re
>>> re.search('(foobar){e}', 'xirefoabralfobarxie')
>>> re.search('(foobar){e}', 'foobar{e}')
<re.Match object; span=(0, 9), match='foobar{e}'>
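A minimal sketch of the same rule applied to a few more patterns, including the typo examples mentioned in this message. The helper name `treated_as_literal` is hypothetical. It relies on the observation that, for these particular samples, a pattern whose braces are literal matches its own text verbatim, while a pattern with a real quantifier does not.

```python
import re

# Sample patterns: the first four use recognized quantifier forms,
# the last three do not.
PATTERNS = ['-{2}', '-{1,2}', '-{1,}', '-{,2}', '-{1.2}', '-{1, 2}', '-{e}']


def treated_as_literal(pattern):
    """Return True if re parses the braces in `pattern` as literal text.

    Heuristic for these samples only: when '{' is literal, the pattern
    matches its own source text; when it is a quantifier, it does not.
    """
    return re.fullmatch(pattern, pattern) is not None


for p in PATTERNS:
    kind = 'literal' if treated_as_literal(p) else 'quantifier'
    print(f'{p!r}: {kind}')
```

This heuristic is only meant for simple samples like the ones above; it is not a general way to inspect how a pattern was parsed.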
Treating an unrecognized '{' as a literal conflicts with the regex module, which uses braces to define "fuzzy" matching.
>>> import regex
>>> regex.search('(foobar){e}', 'xirefoabralfobarxie')
<regex.Match object; span=(0, 6), match='xirefo', fuzzy_counts=(6, 0, 0)>
>>> regex.search('(foobar){e}', 'foobar{e}')
<regex.Match object; span=(0, 6), match='foobar'>
I don't think it is worth adding support for fuzzy matching to the re module, but for compatibility it would be better to raise an error or a warning when '{' is not followed by one of the recognized patterns. This could also help to catch typos and errors in regular expressions, e.g. '-{1.2}' or '-{1, 2}' instead of '-{1,2}'.
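To make the proposal concrete, here is a rough sketch of what such a check could look like if done outside the parser. It is not how the re module is implemented. The names `find_unrecognized_braces` and `compile_checked` are hypothetical, and the scanner is deliberately simplified: it skips backslash escapes and `[...]` classes, but does not handle a literal `]` placed first in a class or comments in verbose mode.

```python
import re
import warnings

# The four quantifier forms listed in this message: {m}, {m,n}, {m,} and {,n}.
_QUANTIFIER = re.compile(r'\{(?:[0-9]+(?:,[0-9]*)?|,[0-9]+)\}')


def find_unrecognized_braces(pattern):
    """Return the offsets of each '{' that does not start a quantifier."""
    offsets = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == '\\':
            i += 2  # skip the backslash and the escaped character
            continue
        if in_class:
            if ch == ']':
                in_class = False
        elif ch == '[':
            in_class = True
        elif ch == '{':
            m = _QUANTIFIER.match(pattern, i)
            if m:
                i = m.end()  # a recognized quantifier, jump past it
                continue
            offsets.append(i)
        i += 1
    return offsets


def compile_checked(pattern, flags=0):
    """Hypothetical wrapper: warn about suspicious braces, then compile."""
    for offset in find_unrecognized_braces(pattern):
        warnings.warn(
            f"'{{' at position {offset} is not a quantifier in {pattern!r}",
            DeprecationWarning,
            stacklevel=2,
        )
    return re.compile(pattern, flags)
```

With this sketch, `compile_checked('-{1.2}')` emits a warning and still compiles the pattern, while `compile_checked('-{1,2}')` stays silent. Running with `-W error::DeprecationWarning` would turn the warning into an exception, which approximates the later stage of the deprecation variants.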
Possible variants:
1. Emit a DeprecationWarning in 3.7 (and 2.7.15 with the -3 option), raise a re.error in 3.8 or 3.9.
2. Emit a PendingDeprecationWarning in 3.7, a DeprecationWarning in 3.8, and raise a re.error in 3.9 or 3.10.
3. Emit a RuntimeWarning or SyntaxWarning in 3.7 and forever.
4. Emit a FutureWarning in 3.7, and implement fuzzy matching or replace re with regex sometime in the future. Unlikely.
Date | User | Action | Args
2017-11-18 11:12:27 | serhiy.storchaka | set | recipients: + serhiy.storchaka, ezio.melotti, mrabarnett
2017-11-18 11:12:27 | serhiy.storchaka | set | messageid: <1511003547.89.0.213398074469.issue32067@psf.upfronthosting.co.za>
2017-11-18 11:12:27 | serhiy.storchaka | link | issue32067 messages
2017-11-18 11:12:26 | serhiy.storchaka | create |