classification
Title: incorrect handle of declaration in markupbase
Type: behavior
Stage: test needed
Components: Library (Lib)
Versions: Python 2.6
process
Status: closed
Resolution: duplicate
Dependencies:
Superseder:
Assigned To:
Nosy List: BreamoreBoy, tungwaiyip
Priority: normal
Keywords: easy

Created on 2005-02-15 07:04 by tungwaiyip, last changed 2010-08-21 12:08 by BreamoreBoy. This issue is now closed.

Messages (3)
msg60652 - Author: Wai Yip Tung (tungwaiyip) Date: 2005-02-15 07:04
When parsing the document below using sgmllib:

<html>
<!-BAD COMMENT->hello
</html>

The malformed declaration is returned together with hello as a
single piece of character data:

  "<!-BAD COMMENT->hello"

markupbase should have treated it as an error (to be 
consistent with its strict treatment in _scan_name).

I believe the line 73 of markupbase.py should be

        if rawdata[j:j+2] in ("-", ""):

instead of 

        if rawdata[j:j+1] in ("-", ""):

Also note that the condition in line 79 can never be true, because a
one-character slice cannot equal the two-character string '--':

    if rawdata[j:j+1] == '--'
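
The slice comparisons discussed above can be checked in isolation. The snippet below is only an illustrative sketch, not the markupbase source: the variable names `existing_check`, `proposed_check`, and `comment_check` are made up for this example, and `j = 2` assumes the scan starts right after the `<!` prefix.

```python
# Sample input from the report; j points just past the "<!" prefix.
rawdata = "<!-BAD COMMENT->hello"
j = 2

one_char = rawdata[j:j+1]  # "-"
two_char = rawdata[j:j+2]  # "-B"

# Check as quoted for line 73: true for anything that starts with "<!-",
# not only when the buffer actually ends there.
existing_check = one_char in ("-", "")

# Proposed check: true only if the buffer ends right after "<!" or "<!-".
proposed_check = two_char in ("-", "")

# Comparison as quoted for line 79: a slice of at most one character
# can never equal the two-character string "--".
comment_check = one_char == "--"

print(existing_check, proposed_check, comment_check)
```

With the proposed two-character slice, the check still fires at a genuine buffer boundary (`"<!"` or `"<!-"` with nothing after it), but no longer for `<!-BAD COMMENT->`.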
msg60653 - Author: Wai Yip Tung (tungwaiyip) Date: 2005-02-15 17:09
To clarify the symptom: everything after the <!- is actually
returned as a single piece of character data:

"<!-BAD COMMENT->hello\r\n</html>"

This means that once a <!- appears, all subsequent tags such as
</html> are parsed as character data rather than as tags. That's
why I think it is a significant bug to report.
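
A rough way to picture the symptom is a toy model of the boundary check. This is a sketch under assumptions, not the sgmllib or markupbase implementation: `scan_declaration_start` and `leftover_data` are hypothetical helpers, and the model assumes that a return value of -1 makes the parser stop and wait for more input, with whatever remains in the buffer handed out as character data when the parser is closed.

```python
def scan_declaration_start(rawdata, i, width):
    """Hypothetical stand-in for the first check in the declaration parser.

    `width` is the slice length used in the check (1 as reported,
    2 as proposed). Returns -1 when the check decides more input is needed.
    """
    j = i + 2  # skip the "<!" prefix
    if rawdata[j:j+width] in ("-", ""):
        return -1
    return j


def leftover_data(rawdata, width):
    """Return the text that would be left unparsed in this toy model."""
    i = rawdata.find("<!")
    if i >= 0 and scan_declaration_start(rawdata, i, width) < 0:
        # Assumed behavior: scanning stops here, so the rest of the
        # buffer ends up as one piece of character data at close.
        return rawdata[i:]
    return None


sample = "<html>\r\n<!-BAD COMMENT->hello\r\n</html>"
print(repr(leftover_data(sample, 1)))
print(repr(leftover_data(sample, 2)))
```

In this model the one-character check leaves everything from `<!-` onward unparsed, including `</html>`, while the two-character check lets scanning continue into the declaration, where the malformed name could then be reported as an error.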

msg114488 - Author: Mark Lawrence (BreamoreBoy) * Date: 2010-08-21 12:08
Fixed in #1442874.
History
Date                 User         Action  Args
2010-08-21 12:08:33  BreamoreBoy  set     status: open -> closed; nosy: + BreamoreBoy; messages: + msg114488; resolution: duplicate
2009-04-22 14:44:22  ajaksu2      set     keywords: + easy
2009-02-16 01:01:07  ajaksu2      set     title: incorrect handle of declaration in markupbase -> incorrect handle of declaration in markupbase; stage: test needed; type: behavior; versions: + Python 2.6, - Python 2.4
2005-02-15 07:04:13  tungwaiyip   create