Message 384475 - Python tracker

Author	ezio.melotti
Recipients	ezio.melotti, karlcow
Date	2021-01-06.06:46:27
Message-id	<1609915587.98.0.579924133337.issue42821@roundup.psfhosted.org>

Content
If we follow the behavior of the browser, we will have to pick one of the two values and discard the other, making the discarded value inaccessible.  If we provide both, scripts and libraries that use HTMLParser will have access to both and can decide what to do.

For example, BeautifulSoup already does the right thing:
>>> import bs4
>>> bs4.BeautifulSoup('<!doctype html><div class="bar" class="foo">text</div>')
<!DOCTYPE html>
<html><body><div class="bar">text</div></body></html>
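
To illustrate the point about HTMLParser exposing both values, here is a minimal sketch using only the standard library. The names AttrCollector and first_wins are hypothetical helpers invented for this example, not part of HTMLParser; first_wins is just one possible policy (keep the first value, as BeautifulSoup does in the output above).

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: records the attribute list of every start tag."""

    def __init__(self):
        super().__init__()
        self.seen = []  # list of (tag, attrs) pairs, in document order

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples in source order,
        # so a repeated attribute name appears once per occurrence.
        self.seen.append((tag, attrs))


def first_wins(attrs):
    """Hypothetical policy: keep the first value seen for each attribute."""
    result = {}
    for name, value in attrs:
        result.setdefault(name, value)
    return result


parser = AttrCollector()
parser.feed('<!doctype html><div class="bar" class="foo">text</div>')
parser.close()

tag, attrs = parser.seen[0]
print(attrs)              # [('class', 'bar'), ('class', 'foo')]
print(first_wins(attrs))  # {'class': 'bar'}
```

Because the parser hands over the full list, a caller that prefers a different policy (last value wins, or merging the values) can implement it without any change to HTMLParser.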

Changing this might also break code that relies on this behavior.  I'm therefore going to close this as "not a bug".

History
Date	User	Action	Args
2021-01-06 06:46:28	ezio.melotti	set	recipients: + ezio.melotti, karlcow
2021-01-06 06:46:27	ezio.melotti	set	messageid: <1609915587.98.0.579924133337.issue42821@roundup.psfhosted.org>
2021-01-06 06:46:27	ezio.melotti	link	issue42821 messages
2021-01-06 06:46:27	ezio.melotti	create