
Author eric.araujo
Recipients Hunanyan, Matt.Basta, cpalmer, eric.araujo, ezio.melotti, fantoozler, fdrake, friday, georg.brandl, gsf, momat, orsenthil, r.david.murray, yotam
Date 2011-07-27.15:12:06
Message-id <1311779527.62.0.34688829476.issue670664@psf.upfronthosting.co.za>
Content
Ezio wrote:
  >>> myhp.feed('<script><p>foo</p></script>')
  data: '<p>foo'  # where's the </p>?

http://www.w3.org/TR/html4/types#type-cdata says:
  Although the STYLE and SCRIPT elements use CDATA for their data
  model, for these elements, CDATA must be handled differently by user
  agents. Markup and entities must be treated as raw text and passed to
  the application as is. The first occurrence of the character sequence
  "</" (end-tag open delimiter) is treated as terminating the end of
  the element's content. In valid documents, this would be the end tag
  for the element.

So I think the example is invalid (the "</" in "</p>" should be escaped, since it is the first "</" and therefore ends the script content), and that HTMLParser is not buggy.
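
To make the quoted rule concrete, here is a minimal sketch of it in isolation. The helper name html4_script_content is hypothetical and is not part of HTMLParser. It only models the HTML 4 wording above (content ends at the first "</") and makes no claim about what any particular HTMLParser version does internally.

```python
def html4_script_content(text):
    """Return the part of `text` that HTML 4 treats as element content.

    `text` is everything that follows the opening <script> or <style> tag.
    Hypothetical helper: it models only the quoted rule, under which the
    content ends at the first occurrence of "</".
    """
    end = text.find("</")
    # No "</" at all: the whole string counts as content.
    return text if end == -1 else text[:end]


# Ezio's example: the content stops before "</p>", not before "</script>".
print(html4_script_content("<p>foo</p></script>"))    # <p>foo

# With "</" written as "<\/" (a JavaScript string escape), the first "</"
# is the one in "</script>", so the content is kept whole.
print(html4_script_content(r"<p>foo<\/p></script>"))  # <p>foo<\/p>
```

Under this reading, the '<p>foo' reported above is exactly what the rule predicts for the unescaped input.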
History
Date                 User         Action  Args
2011-07-27 15:12:07  eric.araujo  set     recipients: + eric.araujo, fdrake, georg.brandl, yotam, orsenthil, fantoozler, gsf, cpalmer, ezio.melotti, r.david.murray, momat, Hunanyan, friday, Matt.Basta
2011-07-27 15:12:07  eric.araujo  set     messageid: <1311779527.62.0.34688829476.issue670664@psf.upfronthosting.co.za>
2011-07-27 15:12:07  eric.araujo  link    issue670664 messages
2011-07-27 15:12:07  eric.araujo  create