Grouprefs in lookbehind assertions #39322

Closed
glchapman mannequin opened this issue Sep 29, 2003 · 11 comments

@glchapman
Mannequin

glchapman mannequin commented Sep 29, 2003

BPO 814253
Nosy @serhiy-storchaka
Superseder
  • bpo-9179: Lookback with group references incorrect (two issues?)
Files
  • sre_parse.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2014-11-07.21:29:40.275>
    created_at = <Date 2003-09-29.03:31:55.000>
    labels = ['expert-regex', 'type-bug']
    title = 'Grouprefs in lookbehind assertions'
    updated_at = <Date 2015-02-21.10:12:03.181>
    user = 'https://bugs.python.org/glchapman'

    bugs.python.org fields:

    activity = <Date 2015-02-21.10:12:03.181>
    actor = 'python-dev'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2014-11-07.21:29:40.275>
    closer = 'serhiy.storchaka'
    components = ['Regular Expressions']
    creation = <Date 2003-09-29.03:31:55.000>
    creator = 'glchapman'
    dependencies = []
    files = ['1049']
    hgrepos = []
    issue_num = 814253
    keywords = []
    message_count = 11.0
    messages = ['18411', '18412', '83234', '114290', '190042', '190044', '229918', '230824', '230827', '230829', '236357']
    nosy_count = 5.0
    nosy_names = ['effbot', 'glchapman', 'mrabarnett', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = '9179'
    type = 'behavior'
    url = 'https://bugs.python.org/issue814253'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

    @glchapman
    Mannequin Author

    glchapman mannequin commented Sep 29, 2003

    I was trying to get a pattern like this to work:

       pat = re.compile(r'(?<=(...)\1)abc')
       pat.match('jkljklabc', 6)

    Unfortunately, that doesn't work. The problem is that
    sre_parse.Subpattern.getwidth() ignores GROUPREFs
    when calculating the width, so the subpattern in the
    assertion is deemed to have a width of 3 rather than 6
    (I was hoping that sre could detect that group 1 has a
    fixed length, so that the reference to it would also
    have a fixed length).

    I've since discovered that both Perl and PerlRE cannot
    handle the above pattern, but they both generate
    exceptions indicating that the assertion has a variable
    length pattern. I think it would be a good idea if sre
    generated an exception as well (rather than silently
    ignoring GROUPREFs).

    @glchapman glchapman mannequin assigned effbot Sep 29, 2003
    @glchapman glchapman mannequin added the topic-regex label Sep 29, 2003
    @glchapman
    Mannequin Author

    glchapman mannequin commented Nov 2, 2003

    Logged In: YES
    user_id=86307

    Attached is a patch which gives GROUPREFs an arbitrary
    variable width, so that they raise an exception if used in a
    lookbehind assertion. Obviously, it would be better if
    GROUPREFs returned the length of the group to which they
    refer, but I don't see any obvious way for getwidth() to get
    that information (perhaps I missed something?).
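The two options discussed in this comment (rejecting group references outright, or looking up the width of the referenced group) can be illustrated with a toy sketch. The node tuples and the names `getwidth`, `check_lookbehind`, and `MAXWIDTH` below are hypothetical stand-ins chosen for illustration; they are not the real sre_parse data structures, nor the contents of the attached patch.

```python
# Toy model only: node shapes and names are hypothetical, not sre_parse's.
MAXWIDTH = 1 << 16  # stand-in for "unbounded width"


def getwidth(nodes, groupwidths=None):
    """Return (lo, hi) for a sequence of toy pattern nodes.

    groupwidths maps group id -> (lo, hi). When it is None, a group
    reference is given an arbitrary variable width, which is the
    approach the comment above describes for the patch.
    """
    lo = hi = 0
    for node in nodes:
        kind = node[0]
        if kind in ("literal", "any"):
            lo += 1
            hi += 1
        elif kind == "group":
            _, gid, body = node
            glo, ghi = getwidth(body, groupwidths)
            if groupwidths is not None:
                groupwidths[gid] = (glo, ghi)  # remember for later references
            lo += glo
            hi += ghi
        elif kind == "groupref":
            if groupwidths is None:
                hi += MAXWIDTH  # width unknown: forces lo != hi
            else:
                glo, ghi = groupwidths[node[1]]
                lo += glo
                hi += ghi
        else:
            raise ValueError(f"unknown node kind: {kind!r}")
    return lo, min(hi, MAXWIDTH)


def check_lookbehind(nodes, groupwidths=None):
    """Return the fixed width of a lookbehind body, or raise ValueError."""
    lo, hi = getwidth(nodes, groupwidths)
    if lo != hi:
        raise ValueError("look-behind requires fixed-width pattern")
    return lo


# (...)\1 expressed as toy nodes
pattern = [("group", 1, [("any",)] * 3), ("groupref", 1)]
```

Calling `check_lookbehind(pattern)` raises, mirroring the "treat references as variable width" approach, while `check_lookbehind(pattern, {})` records the width of group 1 as it is parsed and so can report a fixed total width.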

    @effbot effbot mannequin added the type-bug label Sep 11, 2007
    @mrabarnett
    Mannequin

    mrabarnett mannequin commented Mar 6, 2009

    As part of issue bpo-2636 group references now work in lookbehinds.

    However, your example:

    (?<=(...)\1)abc
    

    will fail but:

    (?<=\1(...))abc
    

    will succeed.

    Why? Well, in lookbehinds it searches backwards. In the first regex it
    sees the group reference before the capture, whereas in the second it
    sees the group reference after the capture. (Hope that's clear! :-))

    @BreamoreBoy
    Mannequin

    BreamoreBoy mannequin commented Aug 18, 2010

    I've deliberately changed the stage to patch review and the version to 3.2 to highlight the fact that a lot of work will be needed to get the new regex engine into the standard library. Feel free to change these as you see fit.

    @BreamoreBoy
    Mannequin

    BreamoreBoy mannequin commented May 26, 2013

    Can this be closed as a result of work done via bpo-2636 or must it remain open?

    @mrabarnett
    Mannequin

    mrabarnett mannequin commented May 26, 2013

    Issue bpo-2636 resulted in the regex module, which supports variable-length look-behinds.

    I don't know how much work it would take even to put a limited fixed-length look-behind fix for this into the re module, so I'm afraid the issue must remain open.

    @serhiy-storchaka
    Member

    The patch for bpo-9179 fixes this issue too.

    @python-dev
    Mannequin

    python-dev mannequin commented Nov 7, 2014

    New changeset fac649bf2d10 by Serhiy Storchaka in branch '2.7':
    Issues bpo-814253, bpo-9179: Group references and conditional group references now
    https://hg.python.org/cpython/rev/fac649bf2d10

    New changeset 9fcf4008b626 by Serhiy Storchaka in branch '3.4':
    Issues bpo-814253, bpo-9179: Group references and conditional group references now
    https://hg.python.org/cpython/rev/9fcf4008b626

    New changeset 60fccf0aad83 by Serhiy Storchaka in branch 'default':
    Issues bpo-814253, bpo-9179: Group references and conditional group references now
    https://hg.python.org/cpython/rev/60fccf0aad83

    @python-dev
    Mannequin

    python-dev mannequin commented Nov 7, 2014

    New changeset 0e2c7d774df3 by Serhiy Storchaka in branch '2.7':
    Silence the failure of test_pyclbr after adding a property in sre_parse
    https://hg.python.org/cpython/rev/0e2c7d774df3

    New changeset 246c9570a757 by Serhiy Storchaka in branch '3.4':
    Silence the failure of test_pyclbr after adding a property in sre_parse
    https://hg.python.org/cpython/rev/246c9570a757

    New changeset b2c17681404f by Serhiy Storchaka in branch 'default':
    Silence the failure of test_pyclbr after adding a property in sre_parse
    https://hg.python.org/cpython/rev/b2c17681404f

    @serhiy-storchaka
    Member

    Now group references to groups with fixed width are supported in lookbehind assertions.
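A minimal sketch of what this looks like in practice, assuming CPython 3.5 or later and the standard `re` module. The test strings are made up for illustration; the behavior shown depends on the referenced group being closed before the lookbehind starts, which is an assumption about how the fix is scoped rather than something stated in this comment.

```python
import re

# Group 1 is closed before the lookbehind begins and always has width 1,
# so the reference to it inside the lookbehind has a known, fixed width.
m = re.match(r'(a)a(?<=\1)c', 'aac')
matched_simple = m is not None

# A variation on the original report with the group moved outside the
# lookbehind: the group consumes three characters, and the lookbehind then
# checks that the six characters before 'abc' are that sequence twice.
m2 = re.search(r'(...)(?<=\1\1)abc', 'jkljklabc')
span = m2.span() if m2 else None

# The pattern exactly as first reported defines the group inside the same
# lookbehind that refers to it; compiling it raises re.error.
try:
    re.compile(r'(?<=(...)\1)abc')
    rejected = False
except re.error:
    rejected = True

print(matched_simple, span, rejected)
```

Note that the second pattern is not equivalent to the one in the original report: the match itself begins at the second copy of the repeated sequence rather than at `abc`.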

    @python-dev
    Mannequin

    python-dev mannequin commented Feb 21, 2015

    New changeset b78195af96f5 by Serhiy Storchaka in branch 'default':
    Issues bpo-814253, bpo-9179: Group references and conditional group references now
    https://hg.python.org/cpython/rev/b78195af96f5

    New changeset 5387095b8675 by Serhiy Storchaka in branch '2.7':
    Issues bpo-814253, bpo-9179: Warnings now are raised when group references and
    https://hg.python.org/cpython/rev/5387095b8675

    New changeset e295ad9be16d by Serhiy Storchaka in branch '3.4':
    Issues bpo-814253, bpo-9179: Warnings now are raised when group references and
    https://hg.python.org/cpython/rev/e295ad9be16d

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022