internal error in regular expression engine #62198

Closed
jdemeyer opened this issue May 17, 2013 · 19 comments
Labels: deferred-blocker, stdlib (Python modules in the Lib dir), topic-regex, type-bug (An unexpected behavior, bug, or error)

Comments

@jdemeyer
Contributor

BPO 17998
Nosy @birkenfeld, @doko42, @pitrou, @larryhastings, @tiran, @benjaminp, @ezio-melotti, @dhellmann, @cedk, @bitdancer, @serhiy-storchaka, @jdemeyer
Files
  • re_unsigned_ptrdiff.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2013-08-19.18:45:44.717>
    created_at = <Date 2013-05-17.15:09:14.788>
    labels = ['expert-regex', 'deferred-blocker', 'type-bug', 'library']
    title = 'internal error in regular expression engine'
    updated_at = <Date 2013-08-19.18:45:44.715>
    user = 'https://github.com/jdemeyer'

    bugs.python.org fields:

    activity = <Date 2013-08-19.18:45:44.715>
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2013-08-19.18:45:44.717>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)', 'Regular Expressions']
    creation = <Date 2013-05-17.15:09:14.788>
    creator = 'jdemeyer'
    dependencies = []
    files = ['30299']
    hgrepos = []
    issue_num = 17998
    keywords = ['patch']
    message_count = 19.0
    messages = ['189461', '189466', '189467', '189469', '189473', '189475', '189476', '189530', '190485', '190486', '193979', '194037', '194095', '194270', '194275', '194276', '194286', '194299', '195655']
    nosy_count = 16.0
    nosy_names = ['georg.brandl', 'doko', 'pitrou', 'larry', 'christian.heimes', 'benjamin.peterson', 'ezio.melotti', 'mrabarnett', 'Arfrever', 'doughellmann', 'ced', 'r.david.murray', 'python-dev', 'serhiy.storchaka', 'bkabrda', 'jdemeyer']
    pr_nums = []
    priority = 'deferred blocker'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue17998'
    versions = ['Python 2.7', 'Python 3.3', 'Python 3.4']

    @jdemeyer
    Contributor Author

    On Linux Ubuntu 13.04, i686:

    $ uname -a
    Linux arando 3.5.0-26-generic #42-Ubuntu SMP Fri Mar 8 23:20:06 UTC 2013 i686 i686 i686 GNU/Linux
    $ python
    Python 2.7.5 (default, May 17 2013, 18:43:24) 
    [GCC 4.7.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re
    >>> re.compile('(.*)\.[0-9]*\.[0-9]*$', re.I|re.S).findall('3.0.0')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    RuntimeError: internal error in regular expression engine

    This is a 2.7.5 regression, 2.7.4 worked fine.
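The report above can be turned into a small self-checking snippet. This is only a sketch: the helper name `check_regression` is hypothetical, and the expected value `['3']` is what an unaffected interpreter returns for this pattern and input.

```python
import re

# Pattern and input copied from the report; a raw string is used so the
# backslashes reach the regex engine unchanged on any Python version.
PATTERN = r'(.*)\.[0-9]*\.[0-9]*$'


def check_regression(text='3.0.0'):
    """Return the findall() result, or a message if the engine raises RuntimeError."""
    try:
        return re.compile(PATTERN, re.I | re.S).findall(text)
    except RuntimeError as exc:
        # Affected builds raise "internal error in regular expression engine".
        return 'engine error: %s' % exc


print(check_regression())  # ['3'] on an unaffected build
```

On an affected build the function returns the error message instead of a list, which makes the two cases easy to tell apart in a log.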

    @jdemeyer added the stdlib label May 17, 2013
    @ezio-melotti added the topic-regex and type-bug labels May 17, 2013
    @benjaminp
    Contributor

    27162465316f

    @benjaminp
    Contributor

    Also, note this particular case only reproduces on 32-bit.
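Since the failure depends on the build's word size, it can help to confirm which kind of interpreter is running before trying to reproduce. A minimal sketch, assuming only the standard library; the helper name `pointer_bits` is hypothetical.

```python
import struct
import sys


def pointer_bits():
    """Return the size of a C pointer in bits for the running interpreter."""
    # 'P' is the struct format code for a C void pointer.
    return struct.calcsize('P') * 8


# sys.maxsize is 2**31 - 1 on a 32-bit build and 2**63 - 1 on a 64-bit build.
print(pointer_bits(), sys.maxsize)
```

A result of 32 means the interpreter matches the configuration where this issue was reported.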

    @tiran
    Member

    tiran commented May 17, 2013

    I'm able to confirm Benjamin's notes. The regexp works on 64-bit Linux but fails with a 32-bit build:

    $ CFLAGS="-m32" LDFLAGS="-m32" ./configure
    $ make -j10
    $ ./python -c "import re; print(re.compile('(.*)\.[0-9]*\.[0-9]*$', re.I|re.S).findall('3.0.0'))"
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    RuntimeError: internal error in regular expression engine

    @mrabarnett
    Mannequin

    mrabarnett mannequin commented May 17, 2013

    Here are some simpler examples of the bug:

    re.compile('.*yz', re.S).findall('xyz')
    re.compile('.?yz', re.S).findall('xyz')
    re.compile('.+yz', re.S).findall('xyz')

    Unfortunately I find it difficult to see what's happening when single-stepping through the code because of the macros. :-(
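The three reduced cases above can be run in one pass. This is a sketch rather than the test that was eventually committed; `SIMPLE_PATTERNS` and `run_simple_cases` are hypothetical names. On an unaffected interpreter each pattern should yield `['xyz']`.

```python
import re

# The three reduced patterns from the comment above.
SIMPLE_PATTERNS = ['.*yz', '.?yz', '.+yz']


def run_simple_cases(text='xyz'):
    """Map each pattern to its findall() result, or to the RuntimeError it raised."""
    results = {}
    for pattern in SIMPLE_PATTERNS:
        try:
            results[pattern] = re.compile(pattern, re.S).findall(text)
        except RuntimeError as exc:
            # Affected builds fail here with the internal engine error.
            results[pattern] = exc
    return results


for pattern, outcome in run_simple_cases().items():
    print(pattern, outcome)
```

Storing the exception instead of re-raising it lets all three cases report even when the first one fails.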

    @serhiy-storchaka
    Member

    Here is a patch which should fix this bug. I still have to look for similar bugs and write tests.

    @serhiy-storchaka
    Member

    Thank you, Matthew, for the simpler examples. They helped, and I'll use them in the tests.

    @pitrou
    Member

    pitrou commented May 18, 2013

    Perhaps it would be safer to revert the original commit in bugfix branches, and just commit the better patch in default?

    @doko42
    Member

    doko42 commented Jun 2, 2013

    What's the status on this one? Can the proposed patch be applied until it is decided whether or not to back out the original change?

    @serhiy-storchaka
    Member

    I'm working on tests. No need to rush.

    @larryhastings
    Contributor

    There is now a need to rush. I'm hoping to cut the release in about two days, so we can have Python 3.4a1 on time. Can we resolve this in the next day or two? Sorry for the short notice.

    @larryhastings
    Contributor

    Can I downgrade this to "deferred blocker"? That means we still need to deal with it before the release, but we don't have to hold up Python 3.4a1 for it.

    @larryhastings
    Contributor

    I'm downgrading this to "deferred blocker". I'm sure we'll get it fixed for Python 3.4 final, but in the meantime there's no sense in holding up Python 3.4a1 for it.

    @python-dev
    Mannequin

    python-dev mannequin commented Aug 3, 2013

    New changeset 86b8b035529b by Serhiy Storchaka in branch '3.3':
    Issue bpo-17998: Fix an internal error in regular expression engine.
    http://hg.python.org/cpython/rev/86b8b035529b

    New changeset 36702442ffe0 by Serhiy Storchaka in branch 'default':
    Issue bpo-17998: Fix an internal error in regular expression engine.
    http://hg.python.org/cpython/rev/36702442ffe0

    New changeset e5e425fd1e4f by Serhiy Storchaka in branch '2.7':
    Issue bpo-17998: Fix an internal error in regular expression engine.
    http://hg.python.org/cpython/rev/e5e425fd1e4f

    @serhiy-storchaka
    Member

    Sorry for the delay. I have committed a simple patch which fixes this bug. I'm not closing the issue yet because there are other related issues.

    @bitdancer
    Member

    This appears to have turned the buildbots red.

    @larryhastings
    Contributor

    This broke the test suite on all the 32-bit Linux buildbots. Sample output is here:

    http://buildbot.python.org/all/builders/x86%20Ubuntu%20Shared%203.x/builds/8349/steps/test/logs/stdio

    There's no obvious fix, and I want to cut 3.4a1 right about now, so I'm going to tag the version in trunk just before this checkin.

    @serhiy-storchaka
    Member

    See bpo-18647.

    @serhiy-storchaka
    Member

    For other related issues see bpo-18672 and bpo-18684.

    @ezio-melotti transferred this issue from another repository Apr 10, 2022

    9 participants