classification
Title: re.match raises MemoryError
Type: resource usage
Stage: resolved
Components: Regular Expressions
Versions: Python 3.1, Python 3.2, Python 3.3, Python 2.7, Python 2.6
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder: regexp: zero-width matches in MIN_UNTIL (issue9669)
Assigned To:
Nosy List: EungJun.Yi, Matthew.Boehm, ezio.melotti, mrabarnett, pitrou, serhiy.storchaka, skrah
Priority: normal
Keywords:

Created on 2011-05-25 18:04 by EungJun.Yi, last changed 2013-02-05 17:31 by serhiy.storchaka. This issue is now closed.

Messages (6)
msg136880 - Author: EungJun Yi (EungJun.Yi) * Date: 2011-05-25 18:04
re.match raises MemoryError when matching the pattern r'()+?1' against 'a1', as shown below.

~$ python
Python 2.7.1+ (r271:86832, Apr 11 2011, 18:05:24) 
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.match(r'()+?1', 'a1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/re.py", line 137, in match
    return _compile(pattern, flags).match(string)
MemoryError
>>>

~$ python3
Python 3.2 (r32:88445, Mar 25 2011, 19:28:28) 
[GCC 4.5.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.match(r'()+?1', 'a1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.2/re.py", line 153, in match
    return _compile(pattern, flags).match(string)
MemoryError
>>>
msg136906 - Author: Stefan Krah (skrah) * (Python committer) Date: 2011-05-25 21:49
Confirmed. The test case quickly uses 8GB of memory.
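
To reproduce this without exhausting the machine's memory, the match can be run in a child process with a capped address space. The following is a minimal sketch, not part of the original report: the helper name probe, the CHILD script, the 2 GiB cap and the 30-second timeout are all assumptions chosen for illustration, and the resource module it relies on is only available on Unix-like systems.

```python
import subprocess
import sys

# Child script (hypothetical helper): cap the address space, then try the match.
CHILD = r"""
import re, resource, sys
cap = 2 << 30  # 2 GiB, an arbitrary illustrative cap
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
if hard != resource.RLIM_INFINITY:
    cap = min(cap, hard)
try:
    resource.setrlimit(resource.RLIMIT_AS, (cap, hard))
except (ValueError, OSError):
    pass  # platform refuses the limit; continue uncapped
try:
    m = re.match(sys.argv[1], sys.argv[2])
except MemoryError:
    print("MemoryError")
else:
    print("match" if m else "no match")
"""


def probe(pattern, subject, timeout=30):
    """Run re.match in a child process and return the outcome as a string."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", CHILD, pattern, subject],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    # Empty output means the child died before it could report anything.
    return proc.stdout.strip() or "crashed"


if __name__ == "__main__":
    print(probe(r"()+?1", "a1"))
```

On an affected interpreter the child should hit the cap and report MemoryError quickly instead of growing to several gigabytes; the outcome on other interpreter versions may differ.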
msg136913 - Author: Matthew Barnett (mrabarnett) * (Python triager) Date: 2011-05-25 22:25
This also raises MemoryError:

    re.match(r'()*?1', 'a1')

but none of these do:

    re.match(r'()+1', 'a1')
    re.match(r'()*1', 'a1')
msg136929 - Author: EungJun Yi (EungJun.Yi) * Date: 2011-05-26 04:46
This also raises MemoryError in 2.6.5:

Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.match('()+?1', 'a1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/re.py", line 137, in match
    return _compile(pattern, flags).match(string)
MemoryError
msg137147 - Author: Matthew Boehm (Matthew.Boehm) Date: 2011-05-28 19:43
Here are some Windows results with Python 2.7:

>>> import re
>>> re.match("()*?1", "1")
<_sre.SRE_Match object at 0x025C0E60>
>>> re.match("()+?1", "1")
>>> re.match("()+?1", "11")
<_sre.SRE_Match object at 0x025C0E60>
>>> re.match("()*?1", "11")
<_sre.SRE_Match object at 0x025C3C60>
>>> re.match("()*?1", "a1")

Traceback (most recent call last):
  File "<pyshell#12>", line 1, in <module>
    re.match("()*?1", "a1")
  File "C:\Python27\lib\re.py", line 137, in match
    return _compile(pattern, flags).match(string)
MemoryError
>>> re.match("()+?1", "a1")

Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    re.match("()+?1", "a1")
  File "C:\Python27\lib\re.py", line 137, in match
    return _compile(pattern, flags).match(string)
MemoryError

Note that when the string being matched starts with "1", the matcher does not raise MemoryError.
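
The combinations tried in this thread can be collected into one loop. This is a sketch only: the names PATTERNS, SUBJECTS, classify and results are made up for illustration, and the outcomes depend on the interpreter version. On an affected interpreter the "a1" rows for the lazy patterns may consume a large amount of memory before MemoryError is raised, so it is safer to run this under a memory limit there.

```python
import re

# Lazy and greedy repeats of an empty group, as tried in this thread.
PATTERNS = [r"()*?1", r"()+?1", r"()*1", r"()+1"]
SUBJECTS = ["1", "11", "a1"]


def classify(pattern, subject):
    """Return 'match', 'no match' or 'MemoryError' for one combination."""
    try:
        m = re.match(pattern, subject)
    except MemoryError:
        return "MemoryError"
    return "match" if m else "no match"


# Map each (pattern, subject) pair to its observed outcome.
results = {(p, s): classify(p, s) for p in PATTERNS for s in SUBJECTS}

for (p, s), outcome in results.items():
    print(f"{p!r:10} {s!r:6} {outcome}")
```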
msg181464 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013-02-05 17:31
This is a duplicate of issue9669.
History
Date                 User              Action  Args
2013-02-05 17:31:08  serhiy.storchaka  set     status: open -> closed; superseder: regexp: zero-width matches in MIN_UNTIL; nosy: + serhiy.storchaka; messages: + msg181464; resolution: duplicate; stage: needs patch -> resolved
2011-05-28 19:43:55  Matthew.Boehm     set     nosy: + Matthew.Boehm; messages: + msg137147
2011-05-26 04:46:06  EungJun.Yi        set     messages: + msg136929; versions: + Python 2.6
2011-05-25 22:25:26  mrabarnett        set     nosy: + mrabarnett; messages: + msg136913
2011-05-25 21:49:45  skrah             set     versions: + Python 3.1, Python 3.3; nosy: + ezio.melotti, skrah, pitrou; messages: + msg136906; stage: needs patch
2011-05-25 18:04:24  EungJun.Yi        create