classification
Title: internal error in regular expression engine
Type: behavior
Stage: resolved
Components: Library (Lib), Regular Expressions
Versions: Python 3.3, Python 3.4, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: serhiy.storchaka
Nosy List: Arfrever, benjamin.peterson, bkabrda, ced, christian.heimes, doko, doughellmann, ezio.melotti, georg.brandl, jdemeyer, larry, mrabarnett, pitrou, python-dev, r.david.murray, serhiy.storchaka
Priority: deferred blocker
Keywords: patch

Created on 2013-05-17 15:09 by jdemeyer, last changed 2013-08-19 18:45 by serhiy.storchaka. This issue is now closed.

Files
File name Uploaded Description
re_unsigned_ptrdiff.patch serhiy.storchaka, 2013-05-17 18:07
Messages (19)
msg189461 - Author: Jeroen Demeyer (jdemeyer) * Date: 2013-05-17 15:09
On Linux Ubuntu 13.04, i686:

$ uname -a
Linux arando 3.5.0-26-generic #42-Ubuntu SMP Fri Mar 8 23:20:06 UTC 2013 i686 i686 i686 GNU/Linux

$ python
Python 2.7.5 (default, May 17 2013, 18:43:24) 
[GCC 4.7.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.compile('(.*)\.[0-9]*\.[0-9]*$', re.I|re.S).findall('3.0.0')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: internal error in regular expression engine

This is a 2.7.5 regression; 2.7.4 worked fine.
msg189466 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013-05-17 16:25
27162465316f
msg189467 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2013-05-17 16:29
Also, note that this particular case only reproduces on 32-bit builds.
msg189469 - Author: Christian Heimes (christian.heimes) * (Python committer) Date: 2013-05-17 16:33
I can confirm Benjamin's notes. The regexp works on 64-bit Linux but fails with a 32-bit build:

$ CFLAGS="-m32" LDFLAGS="-m32" ./configure
$ make -j10
$ ./python -c "import re; print(re.compile('(.*)\.[0-9]*\.[0-9]*$', re.I|re.S).findall('3.0.0'))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
RuntimeError: internal error in regular expression engine
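
Since the failure depends on the build's word size, it can help to confirm which kind of interpreter is being tested. The following is a minimal sketch, not part of the original report; the helper names `pointer_bits` and `is_32bit_build` are hypothetical, and only standard-library calls are used.

```python
import struct
import sys


def pointer_bits():
    """Return the width of a C pointer in this interpreter, in bits."""
    # The "P" format code is the size of a C `void *`.
    return struct.calcsize("P") * 8


def is_32bit_build():
    """Return True when the interpreter was built with 32-bit pointers."""
    return pointer_bits() == 32


if __name__ == "__main__":
    print("pointer width:", pointer_bits(), "bits")
    # sys.maxsize is typically 2**31 - 1 on 32-bit builds.
    print("sys.maxsize:", sys.maxsize)
    print("32-bit build:", is_32bit_build())
```

Running this with the `./python` binary produced by the `-m32` build above should report a 32-bit build before the regexp is retried.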
msg189473 - Author: Matthew Barnett (mrabarnett) * Date: 2013-05-17 17:14
Here are some simpler examples of the bug:

re.compile('.*yz', re.S).findall('xyz')
re.compile('.?yz', re.S).findall('xyz')
re.compile('.+yz', re.S).findall('xyz')

Unfortunately I find it difficult to see what's happening when single-stepping through the code because of the macros. :-(
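
The pattern from the original report and the three simpler examples can be collected into one small check. This is only a sketch under stated assumptions, not the test that was eventually committed; `CASES` and `run_cases` are hypothetical names, and raw strings are used so the backslashes stay literal on newer Python versions.

```python
import re

# (pattern, flags, input text) for the original report and the simpler examples.
CASES = [
    (r'(.*)\.[0-9]*\.[0-9]*$', re.I | re.S, '3.0.0'),
    (r'.*yz', re.S, 'xyz'),
    (r'.?yz', re.S, 'xyz'),
    (r'.+yz', re.S, 'xyz'),
]


def run_cases(cases=CASES):
    """Run findall() for each case; return a list of (pattern, outcome).

    The outcome is the list returned by findall(), or the RuntimeError
    instance if the engine raised one.
    """
    results = []
    for pattern, flags, text in cases:
        try:
            outcome = re.compile(pattern, flags).findall(text)
        except RuntimeError as exc:
            outcome = exc
        results.append((pattern, outcome))
    return results


if __name__ == "__main__":
    for pattern, outcome in run_cases():
        print(pattern, "->", outcome)
```

On an affected 32-bit build the outcomes would be expected to be `RuntimeError` instances; on a working build each call should return a list of matches instead.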
msg189475 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-05-17 18:07
Here is a patch which should fix this bug. I still have to look for similar bugs and write tests.
msg189476 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-05-17 18:10
Thank you, Matthew, for the simpler examples. They helped, and I'll use them in the tests.
msg189530 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013-05-18 16:08
Perhaps it would be safer to revert the original commit in bugfix branches, and just commit the better patch in default?
msg190485 - Author: Matthias Klose (doko) * (Python committer) Date: 2013-06-02 12:45
What's the status on this one? Can the proposed patch be applied until a decision is made on whether to back out the original change?
msg190486 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-06-02 12:47
I'm working on tests. No need to rush.
msg193979 - Author: Larry Hastings (larry) * (Python committer) Date: 2013-07-31 07:07
There is now a need to rush.  I'm hoping to cut the release in about two days, so we can have Python 3.4a1 on time.  Can we resolve this in the next day or two?  Sorry for the short notice.
msg194037 - Author: Larry Hastings (larry) * (Python committer) Date: 2013-08-01 08:25
Can I downgrade this to "deferred blocker"?  That means we still need to deal with it before the release, but we don't have to hold up Python 3.4a1 for it.
msg194095 - Author: Larry Hastings (larry) * (Python committer) Date: 2013-08-01 18:06
I'm downgrading this to "deferred blocker".  I'm sure we'll get it fixed for Python 3.4 final, but in the meantime there's no sense in holding up Python 3.4a1 for it.
msg194270 - Author: Roundup Robot (python-dev) Date: 2013-08-03 16:31
New changeset 86b8b035529b by Serhiy Storchaka in branch '3.3':
Issue #17998: Fix an internal error in regular expression engine.
http://hg.python.org/cpython/rev/86b8b035529b

New changeset 36702442ffe0 by Serhiy Storchaka in branch 'default':
Issue #17998: Fix an internal error in regular expression engine.
http://hg.python.org/cpython/rev/36702442ffe0

New changeset e5e425fd1e4f by Serhiy Storchaka in branch '2.7':
Issue #17998: Fix an internal error in regular expression engine.
http://hg.python.org/cpython/rev/e5e425fd1e4f
msg194275 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-03 17:30
Sorry for the delay. I have committed a simple patch which fixes this bug. I am not closing the issue yet because there are other related issues.
msg194276 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2013-08-03 17:39
This appears to have turned the buildbots red.
msg194286 - Author: Larry Hastings (larry) * (Python committer) Date: 2013-08-03 18:29
This broke the test suite on all the 32-bit Linux buildbots.  Sample output is here:

http://buildbot.python.org/all/builders/x86%20Ubuntu%20Shared%203.x/builds/8349/steps/test/logs/stdio

There's no obvious fix, and I want to cut 3.4a1 right about now, so I'm going to tag the version in trunk just before this checkin.
msg194299 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-03 20:50
See issue18647.
msg195655 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-08-19 18:45
For other related issues see issue18672 and issue18684.
History
Date User Action Args
2013-08-19 18:45:44 serhiy.storchaka set status: open -> closed; resolution: fixed; messages: + msg195655; stage: test needed -> resolved
2013-08-03 20:50:35 serhiy.storchaka set messages: + msg194299
2013-08-03 18:29:35 larry set messages: + msg194286
2013-08-03 17:39:59 r.david.murray set nosy: + r.david.murray; messages: + msg194276
2013-08-03 17:30:51 serhiy.storchaka set messages: + msg194275
2013-08-03 16:31:37 python-dev set nosy: + python-dev; messages: + msg194270
2013-08-01 18:06:04 larry set priority: release blocker -> deferred blocker; messages: + msg194095
2013-08-01 08:25:03 larry set messages: + msg194037
2013-08-01 07:44:55 ced set nosy: + ced
2013-08-01 05:58:41 bkabrda set nosy: + bkabrda
2013-07-31 07:07:00 larry set messages: + msg193979
2013-07-20 10:56:22 doughellmann set nosy: + doughellmann
2013-06-11 17:49:53 serhiy.storchaka link issue18190 superseder
2013-06-02 12:47:45 serhiy.storchaka set messages: + msg190486; stage: patch review -> test needed
2013-06-02 12:45:36 doko set priority: normal -> release blocker; nosy: + larry, doko, georg.brandl; messages: + msg190485
2013-05-18 16:08:49 pitrou set nosy: + pitrou; messages: + msg189530
2013-05-18 07:25:07 Arfrever set nosy: + Arfrever
2013-05-17 18:10:42 serhiy.storchaka set messages: + msg189476
2013-05-17 18:07:29 serhiy.storchaka set files: + re_unsigned_ptrdiff.patch; versions: + Python 3.3, Python 3.4; messages: + msg189475; keywords: + patch; stage: patch review
2013-05-17 17:14:11 mrabarnett set messages: + msg189473
2013-05-17 16:33:49 christian.heimes set messages: + msg189469
2013-05-17 16:29:13 benjamin.peterson set messages: + msg189467
2013-05-17 16:28:35 christian.heimes set nosy: + christian.heimes
2013-05-17 16:25:54 benjamin.peterson set assignee: serhiy.storchaka; messages: + msg189466; nosy: + serhiy.storchaka
2013-05-17 15:15:11 ezio.melotti set nosy: + ezio.melotti, benjamin.peterson, mrabarnett; type: behavior; components: + Regular Expressions
2013-05-17 15:09:14 jdemeyer create