
classification
Title: Regex compilation crashed if I change order of alternatives under quantifier
Type: behavior Stage: resolved
Components: Regular Expressions Versions: Python 3.7
process
Status: closed Resolution: not a bug
Dependencies: Superseder:
Assigned To: Nosy List: Renji, ezio.melotti, mrabarnett
Priority: normal Keywords:

Created on 2021-01-09 01:25 by Renji, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (5)
msg384703 - (view) Author: (Renji) Date: 2021-01-09 01:25
I can compile the expression "((a)|b\2)*", and it successfully returns captures from the first repetition and the second repetition at the same time. But if I write the expression (b\2|(a))*, I get an "invalid group reference 2 at position 3" error. Either the first or the second behavior is incorrect.
python3 --version
Python 3.7.3

import re

text = "aba"
# Does not work: "invalid group reference 2 at position 3"
# match = re.search(r"(b\2|(a))*", text)
match = re.search(r"((a)|b\2)*", text)
if match:
    # Prints: aba ba a
    print(match.group(0) + " " + match.group(1) + " " + match.group(2))
msg384708 - (view) Author: Matthew Barnett (mrabarnett) * (Python triager) Date: 2021-01-09 02:00
It's not a crash. It's complaining that you're referring to group 2 before defining it. The re module doesn't support forward references to groups, but only backward references to them.
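
A minimal sketch of the compile-time difference described above, using only the standard `re` module. The helper name `compile_error` is illustrative and not part of any API.

```python
import re


def compile_error(pattern):
    """Return None if `pattern` compiles, otherwise the re.error message."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


# Group 2 is closed before \2 appears: a backward reference, accepted.
print(compile_error(r"((a)|b\2)*"))
# \2 appears before group 2 is defined: a forward reference, rejected.
print(compile_error(r"(b\2|(a))*"))
```

The first call should print `None`; the second should print the "invalid group reference 2" message quoted in the report. Nothing is matched in either case, since the error is raised while the pattern is being parsed.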
msg384709 - (view) Author: (Renji) Date: 2021-01-09 02:11
In my example, the reference and the capture group are in two different alternatives. They don't follow each other; they are executed in an arbitrary order. If this isn't supported in one case, why is it supported in the other?
msg384711 - (view) Author: Matthew Barnett (mrabarnett) * (Python triager) Date: 2021-01-09 02:38
Example 1:

    ((a)|b\2)*
     ^^^       Group 2

    ((a)|b\2)*
          ^^   Reference to group 2

    The reference refers backwards to the group.

Example 2:

    (b\2|(a))*
         ^^^   Group 2

    (b\2|(a))*
      ^^       Reference to group 2

    The reference refers forwards to the group.

As I said, the re module doesn't support forward references to groups.

If you have a regex where forward references are unavoidable, try the 3rd-party 'regex' module instead. It's available on PyPI.
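
For the accepted pattern in Example 1, the following sketch shows what the backward reference does at match time. The helper `describe` is a hypothetical name introduced only for this illustration.

```python
import re

pattern = re.compile(r"((a)|b\2)*")


def describe(text):
    """Return (whole match, group 1, group 2) for a match anchored at the start."""
    m = pattern.match(text)  # always matches: '*' allows zero repetitions
    return m.group(0), m.group(1), m.group(2)


# Repetition 1 takes the (a) branch and sets group 2; repetition 2 takes
# the b\2 branch and reuses the value captured by repetition 1.
print(describe("aba"))
# Group 2 is still unset when b\2 is tried, so the reference fails to match
# and the quantifier settles for zero repetitions.
print(describe("ba"))
```

The first call reproduces the "aba ba a" result from the original report. The second shows that a reference to a group that has not participated yet is not an error at match time; it simply does not match.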
msg384712 - (view) Author: (Renji) Date: 2021-01-09 03:04
I thought a "forward reference" was "\1 (abcd)", not "some sort of reference in the second repetition to data from the first repetition".

OK. In other words, references from one repetition to another are supported, but with purely formal restrictions, and there are no plans to remove these restrictions. Then this issue may be closed.
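
As a rough illustration of how formal the restriction is, the sketch below compiles a few patterns and records the outcome. The rule appears to depend only on where the group is closed relative to the reference in the pattern text, not on the order in which alternatives run. The helper `check` and the pattern list are assumptions made for this example.

```python
import re


def check(pattern):
    """Return "ok" if `pattern` compiles, otherwise the re.error message."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return "ok"


patterns = [
    r"(abcd)\1",    # reference after the closed group
    r"\1(abcd)",    # reference before the group exists
    r"(a\1)",       # reference inside the still-open group
    r"((a)|b\2)*",  # group 2 closed to the left of \2
    r"(b\2|(a))*",  # group 2 defined to the right of \2
]
results = {p: check(p) for p in patterns}
for p, outcome in results.items():
    print(f"{p:14} {outcome}")
```

The plain "\1(abcd)" case and the reordered alternation are rejected by the same check, which is why both count as forward references for the re module.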
History
Date                 User              Action  Args
2022-04-11 14:59:40  admin             set     github: 87037
2022-03-20 19:42:16  serhiy.storchaka  set     status: open -> closed; resolution: not a bug; stage: resolved
2021-01-09 03:04:21  Renji             set     messages: + msg384712
2021-01-09 02:38:37  mrabarnett        set     messages: + msg384711
2021-01-09 02:11:40  Renji             set     messages: + msg384709
2021-01-09 02:00:02  mrabarnett        set     messages: + msg384708
2021-01-09 01:25:30  Renji             create