Message44032
HTML examples seen in the wild that cause parse errors
in HTMLParser include:
<a width="100%"cellspacing=0>
-- note the lack of a space between the attribute value and the next attribute name
<a foo=>
-- trailing attribute has no value after =
<a href=javascript:popup('/popup/html.html')>
-- unquoted attribute value containing a JavaScript fragment with embedded quotes
My patch contains improvements to the 'attrfind' and
'locatestarttagend' regexps that allow these examples
to parse.
The existing test_htmlparser.py unit test continues to
pass, except for the one test case where it considers
<a foo=> to be an error.
I commented out that case and added new test cases to
cover the examples above.
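For anyone who wants to try the three examples against their own interpreter, here is a minimal sketch. It assumes the Python 3 `html.parser` module rather than the patched regexps from this issue, and `StartTagCollector` and `parse_start_tags` are hypothetical helper names, not part of the patch or of the existing test suite.

```python
from html.parser import HTMLParser


class StartTagCollector(HTMLParser):
    """Hypothetical helper: records each start tag with its attribute list."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs as reported by the parser
        self.start_tags.append((tag, attrs))


def parse_start_tags(markup):
    """Feed markup to a fresh parser and return the start tags it reported."""
    parser = StartTagCollector()
    parser.feed(markup)
    parser.close()
    return parser.start_tags


# The three examples from this message, copied verbatim.
EXAMPLES = [
    '<a width="100%"cellspacing=0>',
    '<a foo=>',
    "<a href=javascript:popup('/popup/html.html')>",
]

if __name__ == "__main__":
    for markup in EXAMPLES:
        print(markup, "->", parse_start_tags(markup))
```

Each example should come back as a single `a` start tag if the parser accepts it; a parser that still rejects the input will raise or report nothing, depending on the Python version.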
Date | User | Action | Args
2007-08-23 15:27:46 | admin | link | issue755670 messages
2007-08-23 15:27:46 | admin | create |