Issue1541697
Created on 2006-08-17 01:51 by jjlee, last changed 2006-09-11 04:25 by nnorwitz.
|
msg29529 - (view) |
Author: John J Lee (jjlee) |
Date: 2006-08-17 01:51 |
|
Looks like revision 47154 introduced a regexp that
hangs Python (Ctrl-C won't kill the process, CPU usage
sits near 100%) under some circumstances. A test case
is attached (sgmllib.html and hang_sgmllib.py).
The problem isn't seen if you read the whole file (or
nearly the whole file) at once. But that doesn't make
it a non-bug, AFAICS.
I'm not sure what the problem is, but presumably the
relevant part of the patch is this:
+starttag = re.compile(r'<[a-zA-Z][-_.:a-zA-Z0-9]*\s*('
+ r'\s*([a-zA-Z_][-:.a-zA-Z_0-9]*)(\s*=\s*'
+ r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~@]'
+ r'[][\-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~\'"@]*(?=[\s>/<])))?'
+ r')*\s*/?\s*(?=[<>])')
The patch attached to bug 1515142 (also from Sam Ruby
-- claims to fix a regression introduced by his recent
sgmllib patches, and has not yet been applied) does NOT
fix the problem.
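One possible explanation, not confirmed anywhere in this report, is catastrophic backtracking: the attribute group is optional-valued and repeated, so a long run of name characters can be split into "attributes" in exponentially many ways when the final lookahead for `<` or `>` fails (as it would on a buffer that ends mid-tag). The sketch below uses a hypothetical, simplified stand-in for the pattern (`SIMPLIFIED_STARTTAG`, with the attribute-value part dropped) and deliberately small inputs so that it terminates quickly; it is an illustration of the suspected mechanism, not the attached test case.

```python
import re
import time

# Hypothetical, simplified stand-in for the starttag pattern quoted above:
# a tag name, then a repeated group of optional whitespace plus an
# attribute name, then a lookahead for '<' or '>'.
SIMPLIFIED_STARTTAG = re.compile(
    r'<[a-zA-Z][-_.:a-zA-Z0-9]*\s*'
    r'(\s*[a-zA-Z_][-:.a-zA-Z_0-9]*)*'
    r'\s*(?=[<>])'
)


def time_truncated_match(name_length):
    """Match against a tag that is cut off before its closing '>'.

    Returns (match_result, elapsed_seconds). Keep name_length small:
    the number of ways to split the run of letters grows roughly as
    2 ** name_length.
    """
    data = '<a ' + 'b' * name_length  # no '>' or '<' follows
    start = time.perf_counter()
    result = SIMPLIFIED_STARTTAG.match(data)
    return result, time.perf_counter() - start


if __name__ == '__main__':
    # A complete tag matches immediately.
    print(SIMPLIFIED_STARTTAG.match('<a bbbb>') is not None)
    # Truncated input fails only after trying every split of the letters.
    for n in (10, 14, 18):
        result, elapsed = time_truncated_match(n)
        print(n, result, '%.4fs' % elapsed)
```

If this is indeed the mechanism, it would also be consistent with the observation that reading the whole file at once avoids the hang, since the closing `>` is then always present in the buffer.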
|
|
msg29530 - (view) |
Author: George Yoshida (quiver) |
Date: 2006-08-18 04:55 |
|
Logged In: YES
user_id=671362
A slimmed-down test case is attached. (It uses the regex pattern in
question.)
FYI, r47154 was backported to the 2.4 branch (r47155).
|
|
msg29531 - (view) |
Author: kovan (kovan) |
Date: 2006-09-05 21:04 |
|
Logged In: YES
user_id=1426755
I've been testing quiver's test case:
- With Eclipse's QuickREx plugin: it hangs. The plugin was
configured in PCRE mode (which uses the Jakarta-ORO Perl 5
regular expression implementation), with no additional options.
- With grep: grep exits with a fatal error and dumps a stack
trace. grep was also run in Perl mode, with the command:
grep -P -f regexp.txt test.txt
I can't find an explanation for this, but I don't know much
about regexps. I hope it has some utility for the resolution
of this bug nevertheless.
|
|
msg29532 - (view) |
Author: kovan (kovan) |
Date: 2006-09-05 21:24 |
|
Logged In: YES
user_id=1426755
Again FYI, here's the diff where presumably the bug was
introduced:
http://svn.python.org/view/python/trunk/Lib/sgmllib.py?rev=47080&r1=46996&r2=47080
|
|
msg29533 - (view) |
Author: kovan (kovan) |
Date: 2006-09-05 21:40 |
|
Logged In: YES
user_id=1426755
Sorry, the correct URL is
http://svn.python.org/view/python/trunk/Lib/sgmllib.py?rev=47154&r1=47080&r2=47154
|
|
msg29534 - (view) |
Author: Neal Norwitz (nnorwitz) |
Date: 2006-09-11 04:25 |
|
Logged In: YES
user_id=33168
I reverted the patch and added the test case for sgml so the
infinite loop doesn't recur.
Committed revision 51854. (head)
Committed revision 51850. (2.5)
Committed revision 51853. (2.4)
I will add the hang_re test case to test_crashers or somewhere.
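Since the original report notes that Ctrl-C does not interrupt the hang, a check for this kind of regression cannot rely on an in-process timeout. A minimal sketch of one way to guard against it, assuming a modern Python 3 interpreter, is to run the suspect code in a child process and kill it after a deadline. The helper name `finishes_within` and the snippets passed to it are hypothetical; this is not the test that was committed in the revisions above.

```python
import subprocess
import sys


def finishes_within(code, timeout_seconds):
    """Run `code` in a fresh interpreter; return True if it exits cleanly
    before the deadline, False if it had to be killed or failed.

    subprocess.run() kills the child when the timeout expires, which works
    even if the child is stuck inside the regex engine and would ignore
    KeyboardInterrupt.
    """
    try:
        completed = subprocess.run(
            [sys.executable, '-c', code],
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return False
    return completed.returncode == 0


if __name__ == '__main__':
    # A well-behaved pattern returns promptly.
    print(finishes_within("import re; re.match(r'a+', 'aaa')", 60))
    # A stand-in for a hang: an endless loop that must be killed.
    print(finishes_within("while True: pass", 1))
```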
|
|
| Date | User | Action | Args |
| 2006-08-17 01:51:00 | jjlee | create | |