This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

Author nowasky.jr
Recipients nowasky.jr
Date 2020-09-08.21:59:29
Message-id <1599602370.01.0.842812753139.issue41748@roundup.psfhosted.org>
In-reply-to
Content
HTML tags that have an attribute name starting with a comma character aren't parsed, and they break subsequent calls to feed().

The problem occurs when such an attribute is the second or a later attribute in the HTML tag. It doesn't seem to occur when it is the first attribute.

# POC:

from html.parser import HTMLParser

class MyHTMLParser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        print("Encountered a start tag:", tag)

parser = MyHTMLParser()

# This is OK: the comma trails the attribute name
parser.feed('<yyy id="poc" a,="">')

# This breaks: the comma leads the second attribute's name
parser.feed('<zzz id="poc" ,a="">')

# Subsequent calls to feed() will not work
parser.feed('<img id="poc" src=x>')
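A variant of the POC that records the reported tags instead of printing them may make it easier to check a given interpreter. This is only a sketch: `TagCollector` and `reported_tags` are hypothetical names that are not part of the original report, and the result of the second call depends on whether the Python version in use shows the behaviour described above.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records every start tag the parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def reported_tags(chunks):
    """Feed each chunk to a single parser and return the start tags seen."""
    parser = TagCollector()
    for chunk in chunks:
        parser.feed(chunk)
    return parser.tags


# Trailing comma in the attribute name: reported as OK above.
ok = reported_tags(['<yyy id="poc" a,="">'])

# Leading comma in the second attribute name, followed by a normal tag.
suspect = reported_tags(['<zzz id="poc" ,a="">', '<img id="poc" src=x>'])

print("trailing comma:", ok)
print("leading comma, then <img>:", suspect)
```

If the behaviour described above is present, the second list comes back without the two expected tags, `zzz` and `img`.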
History
Date | User | Action | Args
2020-09-08 21:59:30 | nowasky.jr | set | recipients: + nowasky.jr
2020-09-08 21:59:30 | nowasky.jr | set | messageid: <1599602370.01.0.842812753139.issue41748@roundup.psfhosted.org>
2020-09-08 21:59:30 | nowasky.jr | link | issue41748 messages
2020-09-08 21:59:29 | nowasky.jr | create |