
Author computercrustie
Recipients computercrustie
Date 2008-06-17.07:15:48
Message-id <1213686983.49.0.329999493672.issue3128@psf.upfronthosting.co.za>
In-reply-to
Content
After struggling with my code for nearly an hour, I found that one of
my regular expressions, applied to a particular string, causes Python
to hang. It is not a true hang: processor usage stays at nearly 100%,
so I think the regex engine is looping forever.

Here is the regex:

import re

re_exc_line = re.compile(
        # ignore everything before the first match
        r'^.*' +
        # first group (includes second | third)
        r'(?:' +
         # second group "(line) (file)"
         r'(?:' +
          # (text to ignore, line [number])
          r'\([^,]+\s*,\s*line\s+(?P<line1>\d+)\)' +
          # any text ([filename]) any text
          r'.*\((?:(?P<file1>[^)]+))*\).*' +
         # end of second group
         r')' +
        # or
        r'|' +
         # third group "(file) (line)"
         r'(?:' +
          # ([filename])
          r'\((?:(?P<file2>[^)]+))*\)' +
          # any text (text to ignore, line [number]) any text
          r'.*\([^,]+\s*,\s*line\s+(?P<line2>\d+)\).*' +
          # end of third group
         r')' +
        # end of first group
        r')' +
        # any text after it
        r'.*$'
        , re.I
    )

It should match either of these constructs:

1. """some optional text (text to ignore, line [12]) ([any_filename])
followed by optional text"""

or:

2. """some optional text ([any_filename]) (text to ignore, line [12])
followed by optional text"""

If the first construct matches, the values are stored in 'line1' and
'file1' of the groupdict; if the second one matches, they are stored in
'line2' and 'file2'.
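
The intended behaviour described above can be sketched with a much smaller pattern. This is not the regex from the report: `SKETCH` and `found_groups` are hypothetical names, each parenthesised part uses a single quantifier, and the text between the two parts is assumed to contain no parentheses.

```python
import re

# Hypothetical simplified pattern, not the reporter's original: each
# parenthesised part is matched with a single quantifier, and the text
# between the two parts is assumed to contain no parentheses.
SKETCH = re.compile(
    r'(?:'
    # construct 1: (text to ignore, line N) ... (filename)
    r'\([^,()]+,\s*line\s+(?P<line1>\d+)\)[^()]*\((?P<file1>[^()]+)\)'
    r'|'
    # construct 2: (filename) ... (text to ignore, line N)
    r'\((?P<file2>[^()]+)\)[^()]*\([^,()]+,\s*line\s+(?P<line2>\d+)\)'
    r')',
    re.I,
)


def found_groups(text):
    """Return only the named groups that took part in the match."""
    m = SKETCH.search(text)
    if m is None:
        return {}
    return {k: v for k, v in m.groupdict().items() if v is not None}


first = found_groups(
    "some optional text (text to ignore, line 12) (any_filename) "
    "followed by optional text"
)
second = found_groups(
    "some optional text (any_filename) (text to ignore, line 12) "
    "followed by optional text"
)
print(first)
print(second)
```

With these two inputs, the first call should report only 'line1' and 'file1', and the second only 'line2' and 'file2'.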

Both examples above work fine, but the following string (I had to
change some pathnames because they contained customer names):
msg = (
r"Error: Error during parsing: invalid syntax " +
r"(D:\Projects\retest\ver_700\lib\_test\test_sapekl.py, line 14) " +
r"-- Error during parsing: invalid syntax " + 
r"(D:\projects\retest\ver_700\modules\sapekl\__init__.py, line 21) " +
r"-- Attempted relative import in non-package, or beyond toplevel " +
r"package")

used with the regex above:

re_exc_line.match(msg)

has been running for two hours now (on a 3 GHz machine)!
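
A likely explanation, offered here as an assumption rather than a confirmed diagnosis, is exponential backtracking: in `(?:(?P<file2>[^)]+))*` a `+` is nested inside a `*`, so when the rest of the pattern cannot match, the engine may try every way of splitting the parenthesised text into chunks. The sketch below reproduces that shape in isolation with reduced, hypothetical patterns (`NESTED`, `FLAT`) and deliberately short inputs so that it finishes quickly.

```python
import re
import time

# Reduced, hypothetical patterns; only the filename part of the report's
# regex is kept. The trailing 'X' stands in for "whatever must follow".
NESTED = re.compile(r'\((?:(?P<file>[^)]+))*\)X')   # '+' nested inside '*'
FLAT = re.compile(r'\((?P<file>[^)]*)\)X')          # single quantifier


def time_failed_match(pattern, text):
    """Return (match result, seconds spent) for one match attempt."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


# Lengths are kept small on purpose so the demonstration finishes quickly.
for n in (10, 14, 18):
    text = "(" + "a" * n + ")"      # no trailing 'X', so the match must fail
    _, nested_s = time_failed_match(NESTED, text)
    _, flat_s = time_failed_match(FLAT, text)
    print(f"n={n:2d}  nested: {nested_s:.5f}s  flat: {flat_s:.5f}s")
```

On a backtracking engine such as CPython's `re`, the nested variant tends to take roughly twice as long for each extra character, while the flat variant stays nearly constant. The two patterns accept the same strings; they differ only in that the flat one captures an empty string for `()` where the nested one leaves the group unset.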

I've attached everything as an example file and hope this helps.