Message405327
Since Python 3.7, sre_parse.parse() no longer creates SubPattern instances from which the original expression can be reproduced when it contains non-capturing groups.
In Python 3.6:
>>> import sre_parse
>>> sre_parse.parse("(?:foo (?:bar) | (?:baz))").dump()
SUBPATTERN None 0 0
  BRANCH
    LITERAL 102
    LITERAL 111
    LITERAL 111
    LITERAL 32
    SUBPATTERN None 0 0
      LITERAL 98
      LITERAL 97
      LITERAL 114
    LITERAL 32
  OR
    LITERAL 32
    SUBPATTERN None 0 0
      LITERAL 98
      LITERAL 97
      LITERAL 122
In Python 3.7 and beyond:
>>> import sre_parse
>>> sre_parse.parse("(?:foo (?:bar) | (?:baz))").dump()
BRANCH
  LITERAL 102
  LITERAL 111
  LITERAL 111
  LITERAL 32
  LITERAL 98
  LITERAL 97
  LITERAL 114
  LITERAL 32
OR
  LITERAL 32
  LITERAL 98
  LITERAL 97
  LITERAL 122
This behaviour makes it impossible to write a correct colorizer for regular expressions using the sre_parse module from Python 3.7 onwards. I'm not a regex expert, so I cannot say whether this change has any effect on the matching itself, but if I trust regex101, it will add a capturing group in place of the non-capturing group.
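One way to check both points is sketched below, under some assumptions. The helper count_subpatterns is hypothetical (it is not part of sre_parse) and simply walks the parsed tree counting SUBPATTERN nodes. The fallback import of re._parser assumes Python 3.11+, where the parser lives in a private module that may change without notice. The number of capturing groups the compiled pattern actually exposes is read from the public Pattern.groups attribute.

```python
import re

try:
    # Assumption: on Python 3.11+ the parser lives in the private module
    # re._parser (an implementation detail, not a public API).
    from re import _parser as sre_parse
except ImportError:
    import sre_parse


def count_subpatterns(node):
    """Hypothetical helper: count SUBPATTERN nodes in a parsed tree."""
    if isinstance(node, sre_parse.SubPattern):
        total = 0
        for op, av in node.data:
            if op.name == "SUBPATTERN":
                total += 1
            # The argument may hold nested SubPattern objects.
            total += count_subpatterns(av)
        return total
    if isinstance(node, (tuple, list)):
        return sum(count_subpatterns(item) for item in node)
    return 0


PATTERN = "(?:foo (?:bar) | (?:baz))"

# How many group nodes survive in the parsed tree (what a colorizer sees).
subpattern_count = count_subpatterns(sre_parse.parse(PATTERN))

# How many capturing groups the compiled pattern really has.
compiled = re.compile(PATTERN)
group_count = compiled.groups

print("SUBPATTERN nodes in parsed tree:", subpattern_count)
print("capturing groups in compiled pattern:", group_count)
```

Since Pattern.groups counts only capturing groups, it reports 0 for this pattern, which suggests the unpacking affects the parsed tree rather than the matching; the loss of group boundaries is still a problem for tools that rebuild or colorize the source expression from that tree.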
Date | User | Action | Args
2021-10-29 18:35:49 | tristanlatr | set | recipients: + tristanlatr, ezio.melotti, mrabarnett
2021-10-29 18:35:49 | tristanlatr | set | messageid: <1635532549.63.0.469467756484.issue45674@roundup.psfhosted.org>
2021-10-29 18:35:49 | tristanlatr | link | issue45674 messages
2021-10-29 18:35:49 | tristanlatr | create |