Author halfjuice
Recipients halfjuice
Date 2010-10-06.04:27:52
Message-id <1286339282.85.0.467383816638.issue10035@psf.upfronthosting.co.za>
Content
When parsing HTML that contains the following malformed comment tag:
... <!- ie6 doesn't allow empty div. -> ...
SGMLParser stops parsing the content that follows, without any warning. When the tag is removed, everything works fine.
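
As a stopgap, the input could be cleaned before it is fed to the parser. This is only a minimal sketch: the pattern and the name strip_malformed_comments are hypothetical, and it assumes the broken tags always look like the one above (a single dash after "<!" and a closing "->").

```python
import re

# Hypothetical pattern: "<!-" not followed by a second dash, up to the first "->".
# Well-formed "<!-- ... -->" comments are left alone by the negative lookahead.
MALFORMED_COMMENT = re.compile(r"<!-(?!-).*?->", re.DOTALL)


def strip_malformed_comments(html):
    """Return html with single-dash pseudo-comments removed."""
    return MALFORMED_COMMENT.sub("", html)


sample = "<p>a</p><!- ie6 doesn't allow empty div. -><p>b</p>"
print(strip_malformed_comments(sample))
```

The cleaned string can then be passed to the parser's feed() method as usual.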

Looking into sgmllib.py, I found the statement below:

    if rawdata.startswith("<!", i):
        # This is some sort of declaration; in "HTML as
        # deployed," this should only be the document type
        # declaration ("<!DOCTYPE html...>").

I think this is why things go wrong here: the tag above starts with "<!", so it is handled as a declaration rather than as a comment.
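
To illustrate the suspicion, here is a simplified sketch of prefix-based dispatch. It is not the actual sgmllib code: classify_markup is a hypothetical helper, and it assumes that a check for "<!--" runs before the "<!" check quoted above.

```python
def classify_markup(rawdata, i):
    """Hypothetical stand-in for the parser's prefix checks at position i."""
    if rawdata.startswith("<!--", i):
        # Well-formed comment opener.
        return "comment"
    if rawdata.startswith("<!", i):
        # Anything else starting with "<!" is taken to be a declaration.
        return "declaration"
    if rawdata.startswith("<", i):
        return "tag"
    return "text"


print(classify_markup("<!-- a real comment -->", 0))
print(classify_markup("<!- ie6 doesn't allow empty div. ->", 0))
```

Under that assumption the single-dash tag is never seen as a comment, which would match the behaviour described above.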
History
Date User Action Args
2010-10-06 04:28:04 halfjuice set recipients: + halfjuice
2010-10-06 04:28:02 halfjuice set messageid: <1286339282.85.0.467383816638.issue10035@psf.upfronthosting.co.za>
2010-10-06 04:28:00 halfjuice link issue10035 messages
2010-10-06 04:27:54 halfjuice create