classification
Title: re.match blocking and taking 100% CPU
Type:
Stage: resolved
Components: Regular Expressions
Versions: Python 2.7, Python 2.6
process
Status: closed
Resolution: not a bug
Dependencies:
Superseder: the re module can perform poorly: O(2**n) versus O(n**2) (issue 1662581)
Assigned To:
Nosy List: Sebastien.Estienne, ezio.melotti, mark.dickinson, mrabarnett, serhiy.storchaka
Priority: normal
Keywords:

Created on 2012-11-07 17:00 by Sebastien.Estienne, last changed 2012-11-07 17:35 by mark.dickinson. This issue is now closed.

Files
File name Uploaded Description
re_bug.py Sebastien.Estienne, 2012-11-07 17:00 Example of the bug
Messages (3)
msg175109 - (view) Author: Sebastien Estienne (Sebastien.Estienne) Date: 2012-11-07 17:00
Hello,

re.match blocks and uses 100% CPU indefinitely.

re_bug.py is an example of the bug.

Thanks
msg175111 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012-11-07 17:34
This is not a Python bug.

Your regexp is wrong.  Remove "$" at the end or add ".*" before "$".

It would also be better to use '(?P<date>\S+)\s' and '"(?P<method_uri>[^"]*)"' instead of '(?P<date>.*?)\s' and '"(?P<method_uri>.*?)"'.
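
A minimal sketch of these suggestions follows. The reporter's actual pattern and input are in re_bug.py and are not reproduced here, so the log line and the surrounding pattern are hypothetical; only the two group fragments come from the message above. The "$" handling assumes the original pattern ended in "$" without consuming the rest of the line.

```python
import re

# Hypothetical log line; the real input from re_bug.py is not shown in this issue.
LINE = '2012-11-07T17:00:35 "GET /index.html HTTP/1.1" 200 512'

# Loose fragments as quoted in the message: '.*?' may match anything, spaces included.
LOOSE = re.compile(r'(?P<date>.*?)\s"(?P<method_uri>.*?)"')

# Tightened fragments: each group can only match characters that belong to its field.
TIGHT = re.compile(r'(?P<date>\S+)\s"(?P<method_uri>[^"]*)"')

loose = LOOSE.match(LINE)
tight = TIGHT.match(LINE)

# Anchoring with "$" right after the closing quote fails, because the line continues.
anchored = re.match(TIGHT.pattern + r'$', LINE)

# Adding ".*" before "$" lets the pattern consume the rest of the line.
anchored_rest = re.match(TIGHT.pattern + r'.*$', LINE)

print(tight.group('date'))
print(tight.group('method_uri'))
print(anchored is None)
print(anchored_rest is not None)
```

On a well-formed line both forms capture the same fields. The difference shows up on lines that do not match: the tightened groups leave the engine far fewer ways to split the text, so it gives up quickly.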
msg175112 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2012-11-07 17:35
This is a known issue: there are a good few duplicates in the tracker.  Issue #1662581 is one, for example.

In this particular case, you can probably fix things by tightening up your regex.  Part of the problem is that '.*' is going to match any sequence of characters, including spaces.  Judicious use of '\S' to match non-whitespace characters might help. There's not much point to the '?' in '.*?', either.
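
As a rough illustration of the point about spaces (the input and both patterns are hypothetical, not taken from re_bug.py): when a later part of the pattern fails, a '.*?' group is free to grow across whitespace and swallow neighbouring fields, while '\S+' cannot.

```python
import re

# Hypothetical input with three space-separated fields.
TEXT = 'a b c'

# The patterns expect exactly two fields. With '.*?', the first group
# expands across the space until the rest of the pattern can match.
lazy = re.match(r'(?P<first>.*?)\s(?P<last>\S+)$', TEXT)

# With '\S+', the first group cannot cross a space, so the match fails at once.
strict = re.match(r'(?P<first>\S+)\s(?P<last>\S+)$', TEXT)

print(repr(lazy.group('first')))  # 'a b'
print(strict)                     # None
```

With several such groups in one pattern, every group can trade characters with its neighbours, and the number of splits the engine tries on a non-matching line grows rapidly.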
History
Date User Action Args
2012-11-07 17:35:54 mark.dickinson set superseder: the re module can perform poorly: O(2**n) versus O(n**2); messages: + msg175112; nosy: + mark.dickinson
2012-11-07 17:34:32 serhiy.storchaka set status: open -> closed; nosy: + serhiy.storchaka; messages: + msg175111; resolution: not a bug; stage: resolved
2012-11-07 17:00:35 Sebastien.Estienne create