
Author AbcSxyZ
Recipients AbcSxyZ
Date 2020-08-05.20:03:53
Message-id <1596657865.41.0.326756245182.issue41489@roundup.psfhosted.org>
In-reply-to
Content
This comes from a deprecated feature. I am using Python 3.7.3.

Related to, and probably fixed by, https://bugs.python.org/issue31844
Reporting it just in case.

I have two related problems; the first one causes the second.

Using the linked file and this class:
```
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    """ DOM parser to retrieve href of all <a> elements """

    def parse_links(self, html_content):
        self.links = []
        self.feed(html_content)
        return self.links

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attrs = {key.lower():value for key, *value in attrs}
            urls = attrs.get("href", None)
            if urls and urls[0]:
                self.links.append(urls[0])

    # def error(self, *args, **kwargs):
    #     pass

if __name__ == "__main__":
    with open("error.txt") as File:
        LinkParser().parse_links(File.read())

```

With the `error` method commented out, it produces:
```
  File "scanner/link.py", line 8, in parse_links
    self.feed(html_content)
  File "/usr/lib/python3.7/html/parser.py", line 111, in feed
    self.goahead(0)
  File "/usr/lib/python3.7/html/parser.py", line 179, in goahead
    k = self.parse_html_declaration(i)
  File "/usr/lib/python3.7/html/parser.py", line 264, in parse_html_declaration
    return self.parse_marked_section(i)
  File "/usr/lib/python3.7/_markupbase.py", line 159, in parse_marked_section
    self.error('unknown status keyword %r in marked section' % rawdata[i+3:j])
  File "/usr/lib/python3.7/_markupbase.py", line 34, in error
    "subclasses of ParserBase must override error()")
NotImplementedError: subclasses of ParserBase must override error()
```

If the `error` method does not raise anything (its body is only `pass`), it produces:
```
  File "/home/simon/Documents/radio-parser/scanner/link.py", line 8, in parse_links
    self.feed(html_content)
  File "/usr/lib/python3.7/html/parser.py", line 111, in feed
    self.goahead(0)
  File "/usr/lib/python3.7/html/parser.py", line 179, in goahead
    k = self.parse_html_declaration(i)
  File "/usr/lib/python3.7/html/parser.py", line 264, in parse_html_declaration
    return self.parse_marked_section(i)
  File "/usr/lib/python3.7/_markupbase.py", line 160, in parse_marked_section
    if not match:
UnboundLocalError: local variable 'match' referenced before assignment
```
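
One possible workaround, sketched here under assumptions rather than taken from the standard library, is to have `error()` raise instead of `pass`, and to catch that exception in `parse_links`. `LinkParseError` is a hypothetical name, not part of `html.parser`. Links collected before the failure are kept, and the rest of the input is skipped. If the running Python version never calls `error()`, the override is simply unused.

```python
from html.parser import HTMLParser


class LinkParseError(Exception):
    """Hypothetical exception type, not part of html.parser."""


class LinkParser(HTMLParser):
    """DOM parser to retrieve the href of all <a> elements."""

    def parse_links(self, html_content):
        self.links = []
        try:
            self.feed(html_content)
        except LinkParseError:
            # Parsing stopped at malformed markup; keep what was found so far.
            pass
        return self.links

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs; value may be None.
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def error(self, message):
        # Raise rather than return, so the caller never continues
        # past the error() call with an unassigned local variable.
        raise LinkParseError(message)
```

The trade-off is that parsing stops at the first malformed section, so any links after it are not collected.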

Here the `match` variable is never assigned when `self.error` is called,
and because `error` does not raise an exception, execution continues to `if not match` and raises UnboundLocalError:

```
    def parse_marked_section(self, i, report=1):
        rawdata= self.rawdata
        assert rawdata[i:i+3] == '<![', "unexpected call to parse_marked_section()"
        sectName, j = self._scan_name( i+3, i )
        if j < 0:
            return j
        if sectName in {"temp", "cdata", "ignore", "include", "rcdata"}:
            # look for standard ]]> ending
            match= _markedsectionclose.search(rawdata, i+3)
        elif sectName in {"if", "else", "endif"}:
            # look for MS Office ]> ending
            match= _msmarkedsectionclose.search(rawdata, i+3)
        else:
            self.error('unknown status keyword %r in marked section' % rawdata[i+3:j])
        if not match:
            return -1
        if report:
            j = match.start(0)
            self.unknown_decl(rawdata[i+3: j])
        return match.end(0)

```
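
The same control flow can be reproduced outside the parser. The sketch below is a standalone function, not the actual `_markupbase` code: `find_marked_section_end` and `UnknownKeywordError` are hypothetical names, and the three regular expressions are assumptions standing in for the module's own patterns. It shows one way to avoid the unbound variable, by raising directly in the `else` branch instead of relying on a callback that might return.

```python
import re

# Assumed stand-ins for the module-level patterns used by _markupbase.
MARKED_SECTION_CLOSE = re.compile(r"]\s*]\s*>")
MS_MARKED_SECTION_CLOSE = re.compile(r"]\s*>")
NAME = re.compile(r"[a-zA-Z][-_.a-zA-Z0-9]*")


class UnknownKeywordError(ValueError):
    """Hypothetical exception for an unrecognised status keyword."""


def find_marked_section_end(rawdata, i=0):
    """Return the index just past the marked section at i, or -1 if unclosed."""
    if rawdata[i:i + 3] != "<![":
        raise ValueError("not a marked section")
    name_match = NAME.match(rawdata, i + 3)
    if name_match is None:
        return -1
    keyword = name_match.group().lower()
    if keyword in {"temp", "cdata", "ignore", "include", "rcdata"}:
        # Standard ]]> ending.
        match = MARKED_SECTION_CLOSE.search(rawdata, i + 3)
    elif keyword in {"if", "else", "endif"}:
        # MS Office ]> ending.
        match = MS_MARKED_SECTION_CLOSE.search(rawdata, i + 3)
    else:
        # Raise here so `match` is never read while unassigned.
        raise UnknownKeywordError(
            f"unknown status keyword {keyword!r} in marked section"
        )
    return match.end(0) if match else -1
```

Because every branch either assigns `match` or leaves the function, the final line can no longer hit an UnboundLocalError, whatever the caller does with the exception.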
History
Date                 User     Action  Args
2020-08-05 20:04:25  AbcSxyZ  set     recipients: + AbcSxyZ
2020-08-05 20:04:25  AbcSxyZ  set     messageid: <1596657865.41.0.326756245182.issue41489@roundup.psfhosted.org>
2020-08-05 20:04:25  AbcSxyZ  link    issue41489 messages
2020-08-05 20:04:25  AbcSxyZ  create