HTMLParser ParseError in start tag #40065
Comments
When this (obviously correct) HTML is parsed:
<a href=mailto:xyz@domain.com>xyz</a>
a ParseError is raised in the start tag. I work around this by adding '@' to the unquoted-value character class in attrfind:
import re
import HTMLParser

HTMLParser.attrfind = re.compile(
    r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*'
    r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~@]*))?')
myparser = HTMLParser.HTMLParser()
myparser.feed('<a ... ')
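A minimal sketch of why the start tag trips the parser, using only the `re` module so it does not depend on any particular HTMLParser version. `PATCHED` is the pattern from the report; `UNPATCHED` is an assumption (the same pattern with '@' removed from the unquoted-value class), and `attr_value` is a hypothetical helper introduced here for illustration.

```python
import re

# Pattern from the report: '@' is allowed in unquoted attribute values.
PATCHED = re.compile(
    r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*'
    r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~@]*))?')

# Assumption: the unpatched pattern is identical except that '@' is missing.
UNPATCHED = re.compile(
    r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)(\s*=\s*'
    r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~]*))?')


def attr_value(pattern, text):
    """Return the attribute value (group 3) matched at the start of text."""
    match = pattern.match(text)
    return match.group(3)


attr = ' href=mailto:xyz@domain.com'

# Without '@' in the class, the value match stops just before '@'.
unpatched_value = attr_value(UNPATCHED, attr)
# With '@' in the class, the whole address is consumed.
patched_value = attr_value(PATCHED, attr)

print(repr(unpatched_value))
print(repr(patched_value))
```

With the assumed unpatched pattern, the value match ends before the '@', leaving `@domain.com` unconsumed inside the tag, which is consistent with an error being reported in the start tag.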
Logged In: YES
I don't believe this HTML is obviously correct. "In certain cases, authors may specify the value of an ..." The regex is already more liberal than this, allowing slashes ...
Logged In: YES
Committed to the CVS HEAD; thanks!
Logged In: YES
See request bpo-1046092 to fix it.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.