classification
Title: Not Implemented Error in stdLib HTMLParser
Type: behavior
Stage:
Components: Library (Lib)
Versions: Python 3.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: berker.peksag, ezio.melotti, iritkatriel, xtreak, yevgenyp
Priority: normal
Keywords:

Created on 2019-10-24 07:29 by yevgenyp, last changed 2021-09-09 18:07 by iritkatriel.

Messages (3)
msg355287 - Author: Yevgeny Pats (yevgenyp) Date: 2019-10-24 07:29
HTMLParser in the standard library raises NotImplementedError on the following input:

from html.parser import HTMLParser
parser = HTMLParser()
parser.feed(bytearray.fromhex('3c215b63612121').decode('ascii'))

# This raises NotImplementedError (found by https://github.com/fuzzitdev/pythonfuzz):
Traceback (most recent call last):
  File "/Users/yevgenyp/fuzzitdev/pythonfuzz/pythonfuzz/fuzzer.py", line 21, in worker
    target(buf)
  File "examples/htmlparser/fuzz.py", line 12, in fuzz
    pass
  File "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/html/parser.py", line 111, in feed
    self.goahead(0)
  File "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/html/parser.py", line 179, in goahead
    k = self.parse_html_declaration(i)
  File "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/html/parser.py", line 264, in parse_html_declaration
    return self.parse_marked_section(i)
  File "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/_markupbase.py", line 159, in parse_marked_section
    self.error('unknown status keyword %r in marked section' % rawdata[i+3:j])
  File "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/_markupbase.py", line 34, in error
    "subclasses of ParserBase must override error()")
NotImplementedError: subclasses of ParserBase must override error()
msg355295 - Author: Karthikeyan Singaravelan (xtreak) (Python committer) Date: 2019-10-24 07:54
See also https://bugs.python.org/issue32876 and https://bugs.python.org/issue31844
msg401506 - Author: Irit Katriel (iritkatriel) (Python committer) Date: 2021-09-09 18:07
Changing type since this is an exception and not a crash.

I get a different error now:

>>> parser.feed(bytearray.fromhex('3c215b63612121').decode('ascii'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/iritkatriel/src/cpython-1/Lib/html/parser.py", line 110, in feed
    self.goahead(0)
    ^^^^^^^^^^^^^^^
  File "/Users/iritkatriel/src/cpython-1/Lib/html/parser.py", line 178, in goahead
    k = self.parse_html_declaration(i)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/iritkatriel/src/cpython-1/Lib/html/parser.py", line 263, in parse_html_declaration
    return self.parse_marked_section(i)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/iritkatriel/src/cpython-1/Lib/_markupbase.py", line 154, in parse_marked_section
    raise AssertionError(
    ^^^^^^^^^^^^^^^^^^^^^
AssertionError: unknown status keyword 'ca' in marked section
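
For reference, the hex payload in the report decodes to the string '<![ca!!', which is why the marked-section keyword in the message above is 'ca'. The following is a minimal sketch, not a fix, of how calling code might guard against either exception seen in this thread. The helper name feed_tolerantly is hypothetical, and whether any exception is raised at all depends on the Python version in use.

```python
from html.parser import HTMLParser

# The payload from the report, decoded from its hex form: '<![ca!!'
PAYLOAD = bytearray.fromhex('3c215b63612121').decode('ascii')


def feed_tolerantly(data):
    """Hypothetical helper: feed data to a fresh HTMLParser.

    Returns the exception instance if the parser raised one of the two
    errors reported in this issue, otherwise None.
    """
    parser = HTMLParser()
    try:
        parser.feed(data)
    except (NotImplementedError, AssertionError) as exc:
        # NotImplementedError was reported on 3.7; AssertionError was
        # reported on a later development build.
        return exc
    return None


if __name__ == "__main__":
    err = feed_tolerantly(PAYLOAD)
    outcome = type(err).__name__ if err is not None else "no exception"
    print(repr(PAYLOAD), "->", outcome)
```

Catching both exception types keeps the caller independent of which interpreter version it runs on; it does not change what the parser itself does with the malformed declaration.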
History
Date User Action Args
2021-09-09 18:07:17 iritkatriel set type: crash -> behavior; messages: + msg401506; nosy: + iritkatriel
2021-08-20 21:46:41 terry.reedy link issue44918 superseder
2019-10-24 07:54:30 xtreak set nosy: + ezio.melotti, berker.peksag, xtreak; messages: + msg355295
2019-10-24 07:29:59 yevgenyp create