
Author neptune235
Date 2004-10-28.04:59:36
Content
HTMLParser has a problem: it doesn't seem to comply with
the spec for XHTML. What I am referring to is described here:
http://www.w3.org/TR/xhtml1/#h-4.8
In a nutshell, HTMLParser doesn't treat data inside
'script' or 'style' elements as #PCDATA, but rather
behaves like an HTML 4 parser even for XHTML documents,
recognizing only end tags. As a result, entity references
in JavaScript are not converted as they should be.
XHTML authors writing to spec can expect entities in
script sections of XHTML documents to be converted if
the script is not explicitly escaped as a CDATA
section.

This brings up problem two: sections explicitly escaped
as CDATA are also parsed as HTML 4 'script' and 'style'
sections, so end tags inside them are still parsed.
My understanding is that this is bad as well:
http://www.w3.org/TR/2004/REC-xml-20040204/#dt-cdsection
because CDEnd is the only thing that's supposed to be
recognized in a CDATA section in any XML document.
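
A minimal sketch of both problems follows. It assumes the Python 3
`html.parser.HTMLParser` class rather than the 2004 `HTMLParser` module
this report was filed against, so the details may differ from the
original behaviour. The `DataCollector` class, the `collect` helper and
the sample markup are hypothetical names made up for illustration.

```python
from html.parser import HTMLParser


class DataCollector(HTMLParser):
    """Hypothetical helper: records text data per enclosing tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.current = None   # tag most recently opened and not yet closed
        self.data = {}        # tag name -> concatenated text data
        self.end_tags = []    # end tags in the order they were reported

    def handle_starttag(self, tag, attrs):
        self.current = tag

    def handle_endtag(self, tag):
        self.end_tags.append(tag)
        self.current = None

    def handle_data(self, data):
        # Data may arrive in several chunks, so concatenate per tag.
        self.data[self.current] = self.data.get(self.current, "") + data


def collect(markup):
    parser = DataCollector()
    parser.feed(markup)
    parser.close()
    return parser


# Problem one: the entity is converted inside <p> but not inside <script>.
entity_doc = collect("<p>a &lt; b</p><script>if (a &lt; b) { run(); }</script>")
print(repr(entity_doc.data["p"]))
print(repr(entity_doc.data["script"]))

# Problem two: an end tag inside an explicit CDATA section still
# terminates the script element.
cdata_doc = collect('<script><![CDATA[ var s = "</script>"; ]]></script>')
print(repr(cdata_doc.data["script"]))
print(cdata_doc.end_tags)
```

If the assumptions hold, the script data keeps the literal `&lt;`, and
the CDATA example is cut off at the embedded end tag instead of at `]]>`.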

History
Date User Action Args
2007-08-23 14:27:09 admin link issue1055864 messages
2007-08-23 14:27:09 admin create