Author hanno
Recipients hanno
Date 2018-02-19.19:52:16
I noticed that HTMLParser raises an exception on some inputs.
I'm not sure what the expected behavior is, but given that real-world HTML often contains all kinds of broken content, I would expect an HTML parser to always try to parse a document and not be interrupted by an exception when it encounters an error.

Here's a minified example:
#!/usr/bin/env python3
import html.parser

However, I actually stumbled upon HTMLParser failing on a real webpage:

Exception raised by the minified example:

Traceback (most recent call last):
  File "./", line 5, in <module>
  File "/usr/lib64/python3.6/html/", line 111, in feed
  File "/usr/lib64/python3.6/html/", line 179, in goahead
    k = self.parse_html_declaration(i)
  File "/usr/lib64/python3.6/html/", line 264, in parse_html_declaration
    return self.parse_marked_section(i)
  File "/usr/lib64/python3.6/", line 149, in parse_marked_section
    sectName, j = self._scan_name( i+3, i )
  File "/usr/lib64/python3.6/", line 391, in _scan_name
    % rawdata[declstartpos:declstartpos+20])
  File "/usr/lib64/python3.6/", line 34, in error
    "subclasses of ParserBase must override error()")
NotImplementedError: subclasses of ParserBase must override error()
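
For anyone who wants to check their own interpreter, here is a minimal sketch, not the script from this report. The helper name try_feed and the input "<![\n" are assumptions made for illustration: the input was picked only because the traceback shows that markup starting with "<![" is routed to parse_marked_section() and then _scan_name(). Whether an exception is raised at all, and of which type, may differ between Python versions, so the sketch reports the outcome instead of assuming one.

```python
import html.parser


def try_feed(markup):
    """Feed markup to a fresh HTMLParser.

    Returns the exception instance if parsing raised one, otherwise None.
    (try_feed is a hypothetical helper, not part of html.parser.)
    """
    parser = html.parser.HTMLParser()
    try:
        parser.feed(markup)
        parser.close()
    except Exception as exc:  # type of exception, if any, depends on the Python version
        return exc
    return None


# Assumed input: an unterminated marked section with no name after "<![".
result = try_feed("<![\n")
if result is None:
    print("parsed without exception")
else:
    print("raised", type(result).__name__)
```

Wrapping feed() like this only works around the problem in calling code; it does not change how the parser itself treats broken markup.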
Date                 User   Action  Args
2018-02-19 19:52:16  hanno  set     recipients: + hanno
2018-02-19 19:52:16  hanno  set     messageid: <>
2018-02-19 19:52:16  hanno  link    issue32876 messages
2018-02-19 19:52:16  hanno  create