html.HTMLParser raises UnboundLocalError: #62002
When trying to parse the string 'a&b', calling close() raises an UnboundLocalError:
{{{
>>> from html.parser import HTMLParser
>>> p = HTMLParser()
>>> p.feed('a&b')
>>> p.close()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.3/html/parser.py", line 149, in close
    self.goahead(1)
  File "/usr/lib/python3.3/html/parser.py", line 252, in goahead
    if k <= i:
UnboundLocalError: local variable 'k' referenced before assignment
}}}
Granted, the HTML is invalid, but this error looks like it might have been an oversight.
Thanks for the report. Yes, that's in a complicated bit of error recovery code, and clearly you found a path through it that doesn't have a corresponding test :)
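For readers unfamiliar with this class of error, here is a minimal sketch of how an untested branch can read a local variable before any path has assigned it. The function `resolve_end` and its branch layout are hypothetical and written only for illustration; this is not the actual code in `html/parser.py`.

```python
def resolve_end(rawdata, i, end):
    """Hypothetical toy function, not the real html.parser code."""
    n = len(rawdata)
    if rawdata.startswith("<", i):
        # Only this branch binds k before it is read.
        k = rawdata.find(">", i) + 1
    elif end:
        # Recovery path: k is read here but was never assigned on this route.
        if k <= i:
            k = n
    else:
        k = i
    return k


def hits_unbound_local(rawdata, i, end):
    """Return True if the call fails with UnboundLocalError."""
    try:
        resolve_end(rawdata, i, end)
    except UnboundLocalError:
        return True
    return False


print(hits_unbound_local("<a>", 0, True))
print(hits_unbound_local("&b", 0, True))
```

Because `k` is assigned somewhere in the function, Python treats it as local everywhere in that function, so reading it on a path that skipped the assignment fails at runtime rather than at compile time.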
Just adding a patch here with a few unit tests to demonstrate the issue; comments are welcome. This is my first patch, and I believe I have put the tests in the correct place. The problem appears to occur only when there is an incomplete XML entity, that is, when a sequence of characters that are valid in an entity name runs up to the end of the input. The test case for "a&b " passes because the parser detects the space as an illegal character for the entity name.
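The two cases described above can be exercised with a short script along these lines. This is a rough sketch and not the tests from the attached patch: the `TextCollector` class and `parse_text` helper are names made up for this example, and it assumes a current Python 3 where `HTMLParser` defaults to `convert_charrefs=True`, which differs from the 3.3 configuration in the report.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: records every chunk passed to handle_data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parse_text(source):
    """Feed source, close the parser, and return the collected text."""
    parser = TextCollector()
    parser.feed(source)
    parser.close()  # the call that raised UnboundLocalError in the report
    return "".join(parser.chunks)


# Entity-like text running to the end of the input, the case from the report.
print(repr(parse_text("a&b")))
# The same text followed by a space, the case that already passed.
print(repr(parse_text("a&b ")))
```

On a fixed interpreter both calls should return without raising, and the text should come back unchanged since `&b` is not a recognized entity.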
Thanks for the patch, Thomas!
New changeset 9cb90c1a1a46 by Ezio Melotti in branch '3.3'.
New changeset 20be90a3a714 by Ezio Melotti in branch 'default'.
Fixed, thanks for the report!