classification
Title: HTMLParser lukewarm on bogus bare attribute chars
Type: feature request
Stage:
Components: Library (Lib)
Versions: Python 2.7
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: ajaksu2, mkc, nnseva (3)
Priority: normal
Keywords:

Created on 2004-06-18 19:33 by mkc, last changed 2009-02-09 06:12 by ajaksu2.

Messages (3)
msg60515 - Author: Mike Coleman (mkc) Date: 2004-06-18 19:33
I tripped over the same problem mentioned in bug
#921657 (HTMLParser.py), except that my bogus attribute
char is '|' instead of '@'.

May I suggest that HTMLParser either require strict
compliance with the HTML spec, or alternatively that it
accept everything reasonable?  The latter approach
would be much more useful, and it would also be
valuable to have this decision documented.

In particular, 'attrfind' needs to be changed to accept
(following the '=\s*') something like the subpattern
given for 'locatestarttagend' (see the "bare value" line).
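
A minimal sketch of the difference being described, assuming simplified stand-in patterns (the names `strict_attr`, `lenient_attr`, and `bare_value` are hypothetical and are not copied from HTMLParser.py). The strict pattern limits bare values to a fixed set of characters, while the lenient one accepts any run of characters other than quotes, '>' and whitespace:

```python
import re

# Hypothetical, simplified stand-ins for illustration only;
# these are NOT the actual patterns from HTMLParser.py.
NAME = r'[a-zA-Z_][-.:a-zA-Z_0-9]*'
QUOTED = r"'[^']*'|\"[^\"]*\""

# Strict: a bare (unquoted) value may only use a fixed whitelist of characters.
strict_attr = re.compile(
    r'\s*(' + NAME + r')(\s*=\s*(' + QUOTED
    + r'|[-a-zA-Z0-9./,:;+*%?!&$()_#=~]*))?')

# Lenient: a bare value is any run of characters except quotes, '>' and whitespace.
lenient_attr = re.compile(
    r'\s*(' + NAME + r')(\s*=\s*(' + QUOTED + r'|[^\'">\s]+))?')


def bare_value(pattern, text):
    """Return the attribute value matched at the start of text, or None."""
    m = pattern.match(text)
    return m.group(3) if m else None


sample = ' href=foo|bar>'
strict_value = bare_value(strict_attr, sample)    # match stops before '|'
lenient_value = bare_value(lenient_attr, sample)  # match includes '|'
print(repr(strict_value), repr(lenient_value))
```

With the strict pattern the value is cut off at the first character outside the whitelist, so the rest of the tag no longer parses as expected; the lenient pattern takes the whole bare value.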
msg60516 - Author: Vsevolod Novikov (nnseva) Date: 2004-10-13 10:15
See request #1046092 for a fix.
msg81438 - Author: Daniel Diniz (ajaksu2) Date: 2009-02-09 06:12
Per #921657, it looks like the current behavior is correct.
History
Date                 User     Action  Args
2009-02-09 06:12:41  ajaksu2  set     nosy: + ajaksu2; type: feature request; messages: + msg81438; versions: + Python 2.7, - Python 2.3
2004-06-18 19:33:18  mkc      create