classification
Title: re.search hangs on this
Type:
Stage:
Components: Regular Expressions
Versions: Python 2.5
process
Status: closed
Resolution: works for me
Dependencies:
Superseder:
Assigned To:
Nosy List: facundobatista, itsadok, orivej
Priority: normal
Keywords:

Created on 2008-01-27 15:59 by itsadok, last changed 2008-02-05 18:19 by facundobatista. This issue is now closed.

Messages (2)
msg61739 - Author: Israel Tsadok (itsadok) Date: 2008-01-27 15:59
import re
re.search(r'a(b[^b]*b|[^c])*cxxx',
'abbcacabbbbcabbbbbbcabbbbbbbbbbbbbbcacabbbbbbbbbbbbbbcabbbbcac')

Perl seems to handle this just fine.

(The original problem was trying to translate some html to text:
re.sub(r'<p(?:"[^"]*"|[^>])*>(.*?)</p>', r'\1\n')

This hung on several files. Changing [^>] to [^">] resolved my
problem, but the general case remains.)
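
A minimal sketch of why that change helps, assuming a modern Python 3 interpreter rather than the 2.5 from the report. In the original pattern a double quote can be consumed either by the quoted-string alternative or by [^>], so the two alternatives overlap; excluding the quote from the second alternative removes the overlap. The names AMBIGUOUS, UNAMBIGUOUS and html are made up for this illustration.

```python
import re

# Original pattern: a '"' can be matched by "[^"]*" or by [^>],
# so the regex engine has two ways to consume the same characters.
AMBIGUOUS = re.compile(r'<p(?:"[^"]*"|[^>])*>(.*?)</p>')

# Changed pattern: [^">] can no longer match a quote, so every quote
# must belong to a quoted string and the alternatives do not overlap.
UNAMBIGUOUS = re.compile(r'<p(?:"[^"]*"|[^">])*>(.*?)</p>')

# Hypothetical sample input, not taken from the original files.
html = '<p class="intro" id="first">Hello</p>'

print(repr(AMBIGUOUS.sub(r'\1\n', html)))
print(repr(UNAMBIGUOUS.sub(r'\1\n', html)))
```

Both patterns give the same result on well-formed input like this. They differ on tags with an unbalanced quote: the changed pattern does not match '<p class="x>Hi</p>' at all, which may or may not be acceptable for a given set of files.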

This might be a dupe of http://bugs.python.org/issue1297193
msg62074 - Author: Facundo Batista (facundobatista) * (Python committer) Date: 2008-02-05 18:19
facundo@virtub:~$ time python -c "import re;re.search(r'a(b[^b]*b|[^c])*cxxx','abbcacabbbbcabbbbbbcabbbbbbbbbbbbbbcacabbbbbbbbbbbbbbcabbbbcac')"

real    0m2.510s
user    0m2.308s
sys     0m0.028s
facundo@virtub:~$ 

This is Python 2.5.1 (r251:54863, Oct  5 2007, 13:36:32) [GCC 4.1.3
20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2

Note that it took a couple of seconds on my fairly fast computer. Could
the problem be that the search takes a long time, rather than that it hangs?
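
One way to probe that question is to time the same pattern on failing inputs of growing length. The sketch below assumes Python 3 (time.perf_counter is not available in 2.5), and the helper time_failed_search and the simplified subject 'a' + 'b' * n + 'c' are made up for illustration. Within a run of n b's, each b can be consumed alone by [^c] or paired by b[^b]*b, so the number of ways to split the run grows like the Fibonacci numbers; a backtracking engine may try all of them before giving up.

```python
import re
import time

PATTERN = re.compile(r'a(b[^b]*b|[^c])*cxxx')

def time_failed_search(n):
    """Search a subject with n consecutive b's that can never match.

    Returns the (match, elapsed_seconds) pair; match is expected to be
    None because the subject does not contain 'cxxx'.
    """
    subject = 'a' + 'b' * n + 'c'
    start = time.perf_counter()
    match = PATTERN.search(subject)
    return match, time.perf_counter() - start

# Sizes are kept small on purpose; larger values can take very long.
for n in (10, 14, 18, 22):
    match, elapsed = time_failed_search(n)
    print(n, match, '%.4f s' % elapsed)
```

If the elapsed time climbs steeply with each step in n, that is consistent with a search that is very slow on long inputs rather than one that is stuck in an infinite loop.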
History
Date                 User            Action  Args
2008-02-05 18:19:33  facundobatista  set     status: open -> closed
                                             resolution: works for me
                                             messages: + msg62074
                                             nosy: + facundobatista
2008-01-29 12:33:45  orivej          set     nosy: + orivej
2008-01-27 15:59:03  itsadok         create