Author ezio.melotti
Recipients ezio.melotti, rednaks
Date 2012-03-11.02:32:19
Message-id <1331433141.08.0.968892785523.issue14251@psf.upfronthosting.co.za>
Content
Can you provide a minimal example to reproduce this error?

On Python 2 it's always better to decode the HTML first and then pass unicode to the parser.  Even though the Python 2 parser also accepts byte strings, there are a few corner cases where it fails on them.
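
For reference, a minimal sketch of the "decode first, then feed unicode" approach.  The LinkCollector class, the collect_links helper, and the default UTF-8 encoding are assumptions made up for illustration; they are not taken from the reporter's code, and the real encoding should come from the HTTP headers or the document itself.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2


class LinkCollector(HTMLParser):
    """Hypothetical parser that records the href of every <a> tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


def collect_links(raw_bytes, encoding="utf-8"):
    # Assumption: the caller knows the encoding; UTF-8 is only a default.
    # Decode first so the parser only ever sees unicode text, never bytes.
    text = raw_bytes.decode(encoding)
    parser = LinkCollector()
    parser.feed(text)
    parser.close()
    return parser.links
```

Calling collect_links() on the raw bytes of a page returns the href values as unicode strings, with any non-ASCII characters already decoded.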

On Python 3 the parser only accepts unicode, and it should work fine with it (especially if you have an updated clone of cpython).  Can you show what failure you get with Python 3?  Also, can you reproduce the error if you use strict=False?
History
Date                 User          Action  Args
2012-03-11 02:32:21  ezio.melotti  set     recipients: + ezio.melotti, rednaks
2012-03-11 02:32:21  ezio.melotti  set     messageid: <1331433141.08.0.968892785523.issue14251@psf.upfronthosting.co.za>
2012-03-11 02:32:20  ezio.melotti  link    issue14251 messages
2012-03-11 02:32:19  ezio.melotti  create