Issue 975556: HTMLParser lukewarm on bogus bare attribute chars

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/40417

classification

Title:	HTMLParser lukewarm on bogus bare attribute chars
Type:	enhancement	Stage:	resolved
Components:	Library (Lib)	Versions:	Python 2.7

process

Status:	closed	Resolution:	accepted
Dependencies:		Superseder:	HTMLParser : A auto-tolerant parsing mode (#1486713)
Assigned To:		Nosy List:	Neil Muller, ajaksu2, mkc, nnseva, r.david.murray
Priority:	normal	Keywords:

Created on 2004-06-18 19:33 by mkc, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (5)
msg60515	Author: Mike Coleman (mkc)	Date: 2004-06-18 19:33
I tripped over the same problem mentioned in bug #921657 (HTMLParser.py), except that my bogus attribute char is '|' instead of '@'. May I suggest that HTMLParser either require strict compliance with the HTML spec, or alternatively that it accept everything reasonable? The latter approach would be much more useful, and it would also be valuable to have this decision documented. In particular, 'attrfind' needs to be changed to accept (following the '=\s*') something like the subpattern given for 'locatestarttagend' (see the "bare value" line).
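The mismatch described above can be sketched with two regular expressions. Both patterns are hypothetical illustrations written for this note, not copies of the `attrfind` or `locatestarttagend` patterns in any release of `HTMLParser.py`: one restricts a bare attribute value to an allow-list of characters, the other accepts any run of characters that is not a quote, `>`, or whitespace.

```python
import re

# Hypothetical patterns for illustration only; they are NOT the
# standard library's actual attrfind / locatestarttagend patterns.

# Allow-list style: a bare value may only contain the listed characters,
# so matching stops at the first character outside the list (e.g. '|').
ALLOWLIST_ATTR = re.compile(
    r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)'        # attribute name
    r'(?:\s*=\s*'                              # value indicator
    r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$()_#=~]*))?'
)

# Tolerant style: a bare value is any run without quotes, '>' or whitespace.
TOLERANT_ATTR = re.compile(
    r'\s*([a-zA-Z_][-.:a-zA-Z_0-9]*)'        # attribute name
    r'(?:\s*=\s*'                              # value indicator
    r'(\'[^\']*\'|"[^"]*"|[^\'">\s]+))?'
)


def attr_value(pattern, text):
    """Return the attribute value captured by `pattern`, or None."""
    match = pattern.match(text)
    return match.group(2) if match else None


if __name__ == "__main__":
    sample = " href=foo|bar"
    print(attr_value(ALLOWLIST_ATTR, sample))  # value is cut short at '|'
    print(attr_value(TOLERANT_ATTR, sample))   # full bare value is captured
```

When a tag-end locator uses the tolerant form but the attribute matcher uses the allow-list form, the two disagree about where the attribute ends, which is the kind of inconsistency this report describes.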
msg60516	Author: Vsevolod Novikov (nnseva)	Date: 2004-10-13 10:15
See request #1046092 to fix it.
msg81438	Author: Daniel Diniz (ajaksu2) *	Date: 2009-02-09 06:12
Per #921657, looks like the current behavior is correct.
msg121676	Author: Neil Muller (Neil Muller)	Date: 2010-11-20 16:30
This should probably be solved as part of #1486713.
msg123175	Author: R. David Murray (r.david.murray) *	Date: 2010-12-03 04:14
The new strict=False mode from #1486713 handles this case.
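For readers checking this on a current interpreter, here is a minimal sketch. It assumes a modern Python 3, where the `strict` argument has since been removed from `html.parser.HTMLParser` and tolerant parsing is the only mode. The `AttrCollector` class and `collect_attrs` helper are names invented for this example.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Record every start tag together with its parsed attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs supplied by the parser
        self.seen.append((tag, attrs))


def collect_attrs(markup):
    """Parse `markup` and return a list of (tag, attrs) tuples."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


if __name__ == "__main__":
    # Bare (unquoted) attribute values containing '|' and '@'
    print(collect_attrs("<a href=foo|bar>"))
    print(collect_attrs("<a href=user@example>"))
```

On such an interpreter the bare values should come through intact rather than being truncated or raising a parse error.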

History
Date	User	Action	Args
2022-04-11 14:56:04	admin	set	github: 40417
2010-12-03 04:14:13	r.david.murray	set	status: open -> closed superseder: HTMLParser : A auto-tolerant parsing mode nosy: + r.david.murray messages: + msg123175 resolution: accepted stage: resolved
2010-11-20 16:30:56	Neil Muller	set	nosy: + Neil Muller messages: + msg121676
2009-02-09 06:12:41	ajaksu2	set	nosy: + ajaksu2 type: enhancement messages: + msg81438 versions: + Python 2.7, - Python 2.3
2004-06-18 19:33:18	mkc	create