classification
Title: handling comments with markupbase and HTMLParser
Type: Stage:
Components: Library (Lib) Versions: Python 2.4
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: danielx_, georg.brandl, jimjjewett
Priority: low Keywords:

Created on 2006-03-04 03:15 by danielx_, last changed 2006-03-09 13:28 by georg.brandl. This issue is now closed.

Files
File name Uploaded Description Edit
attach danielx_, 2006-03-04 03:17
Messages (4)
msg27674 - (view) Author: Daniel (danielx_) Date: 2006-03-04 03:15
If the following webpage is correct about the
definition of a comment, HTMLParser.HTMLParser reports
valid (albeit strange) comments as being erroneous:

http://www.htmlhelp.com/reference/wilbur/misc/comment.html

This site gives '<!>' as an example of a valid HTML
comment. See attachment for what happens at the
console. A similar thing happens with other
(pathological) forms of comments.
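
Since the console session is not attached, a rough reproduction might look like the sketch below. It assumes Python 3, where the parser lives in html.parser (the report concerns the Python 2.4 HTMLParser module). CommentLogger and parse_fragment are made-up names for illustration. The result depends on the Python version, so no expected output is shown.

```python
from html.parser import HTMLParser  # Python 3 location; Python 2 used the HTMLParser module


class CommentLogger(HTMLParser):
    """Hypothetical helper that records comments and text as the parser reports them."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self.text = []

    def handle_comment(self, data):
        # Called by the parser for each comment it recognizes.
        self.comments.append(data)

    def handle_data(self, data):
        # Called by the parser for plain text between markup.
        self.text.append(data)


def parse_fragment(fragment):
    """Feed one fragment to a fresh parser and return (comments, text)."""
    parser = CommentLogger()
    parser.feed(fragment)
    parser.close()  # flush any buffered text
    return parser.comments, parser.text


if __name__ == "__main__":
    # '<!>' is the empty comment cited in this report.
    print(parse_fragment("before<!>after"))
```

On an affected version, the feed() call is where the parse error would be expected to surface.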
msg27675 - (view) Author: Daniel (danielx_) Date: 2006-03-04 03:17
Sorry, I'm unfamiliar with the bug reporting system and my
attachment doesn't seem to have attached.
msg27676 - (view) Author: Jim Jewett (jimjjewett) Date: 2006-03-06 20:41
I recommend this as a wontfix.  

As the page itself notes, browsers generally got this 
wrong, and existing webpages rely on this buggy behavior.  
Even today, Opera is going back and forth on how right they 
can afford to be without breaking too many pages.

The suggestion at the bottom of the page notes that if you 
keep your comments sane, you won't have problems on your 
own pages.  Realistically, anything not following that rule 
(no embedded -- or >) is effectively buggy, and HTMLParser 
can only guess at the real intention.
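
The "sane comment" rule above (no embedded -- or >) can be checked mechanically. The following is only a sketch of that rule as stated here, not of anything HTMLParser does; is_sane_comment is a hypothetical name.

```python
def is_sane_comment(markup):
    """Return True if markup looks like '<!--' + body + '-->' and the body
    contains neither '--' nor '>' (the rule of thumb quoted above)."""
    # Need both delimiters, and at least 7 characters so they do not overlap.
    if len(markup) < 7 or not (markup.startswith("<!--") and markup.endswith("-->")):
        return False
    body = markup[4:-3]
    return "--" not in body and ">" not in body


if __name__ == "__main__":
    for sample in ("<!-- fine -->", "<!>", "<!-- a -- b -->", "<!-- a > b -->"):
        print(repr(sample), is_sane_comment(sample))
```

Anything that fails this check falls into the category where a parser can only guess at the intent.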
msg27677 - (view) Author: Georg Brandl (georg.brandl) * (Python committer) Date: 2006-03-09 13:28
Updated markupbase to cope with "<!>" in rev. 42938.
History
Date User Action Args
2006-03-04 03:15:14 danielx_ create