Author jjlee
Date 2006-08-17.01:51:00
Content
Looks like revision 47154 introduced a regexp that
hangs Python (Ctrl-C won't kill the process, CPU usage
sits near 100%) under some circumstances.  A test case
is attached (sgmllib.html and hang_sgmllib.py).

The problem isn't seen if you read the whole file (or
nearly the whole file) at once.  But that doesn't make
it a non-bug, AFAICS.

I'm not sure what the problem is, but presumably the
relevant part of the patch is this:

+starttag = re.compile(r'<[a-zA-Z][-_.:a-zA-Z0-9]*\s*('
+        r'\s*([a-zA-Z_][-:.a-zA-Z_0-9]*)(\s*=\s*'
+        r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~@]'
+        r'[][\-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~\'"@]*(?=[\s>/<])))?'
+    r')*\s*/?\s*(?=[<>])')
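
One possible explanation, not confirmed against the attached
test case: the repeated attribute group does not require
whitespace between attributes, so a run of name characters
can be split into "attributes" in exponentially many ways.
If the buffer ends before any '<' or '>', the final lookahead
fails and the engine tries every split. That would fit the
hang only showing up when the file is fed in pieces. Below is
a minimal sketch using only the re module. The input string
and the helper time_truncated are made up for illustration,
and the sizes are kept small so the script finishes quickly.

```python
import re
import time

# Pattern copied from the diff quoted above.
starttag = re.compile(r'<[a-zA-Z][-_.:a-zA-Z0-9]*\s*('
        r'\s*([a-zA-Z_][-:.a-zA-Z_0-9]*)(\s*=\s*'
        r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~@]'
        r'[][\-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~\'"@]*(?=[\s>/<])))?'
    r')*\s*/?\s*(?=[<>])')


def time_truncated(n):
    """Match against '<' plus n letters, with no closing '<' or '>'.

    Hypothetical stand-in for a buffer that was cut off mid-tag.
    Returns (match result, elapsed seconds).
    """
    data = '<' + 'a' * n
    start = time.perf_counter()
    result = starttag.match(data)
    return result, time.perf_counter() - start


# A complete tag matches right away; the match stops before '>'.
complete = starttag.match('<a href="x">')

if __name__ == '__main__':
    print(complete.group(0))
    # Keep n small: the time is expected to grow very quickly with n.
    for n in (10, 14, 18):
        result, elapsed = time_truncated(n)
        print(n, result, '%.4f s' % elapsed)
```

If the timings grow by a roughly constant factor for each
extra character, that points at backtracking in the pattern
rather than at the chunked reading itself.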


The patch attached to bug 1515142 (also from Sam Ruby;
it claims to fix a regression introduced by his recent
sgmllib patches, and has not yet been applied) does NOT
fix the problem.
History
Date User Action Args
2007-08-23 14:42:03 admin link issue1541697 messages
2007-08-23 14:42:03 admin create