
Author r.david.murray
Recipients ezio.melotti, flox, r.david.murray, stefan.schweizer
Date 2010-01-05.20:13:30
Message-id <1262722412.63.0.419740189183.issue7626@psf.upfronthosting.co.za>
In-reply-to
Content
w3m (a text-mode browser) does not treat &eacute without the trailing ; as an entity ref (it puts &eacute literally into the display), while Firefox turns it into an e-acute (é) with or without the ;.  I'm sure somebody somewhere has a table listing which browsers have which behavior.

Firefox does render, e.g., &test without a trailing semicolon as &test.  If you want to mirror that result in code using HTMLParser, you can implement the behavior in your entityref handler (handle_entityref).
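
A minimal sketch of such a handler, written against the Python 3 html.parser module rather than the Python 2 HTMLParser module under discussion. The class name LenientTextExtractor and the helper extract_text are hypothetical, chosen only for illustration. Known entity names are replaced by their character; unknown ones such as &test are put back into the output literally. Note that convert_charrefs=False is needed in current Python 3, since otherwise the parser converts references itself and does not call handle_entityref for ordinary text.

```python
from html.entities import name2codepoint
from html.parser import HTMLParser


class LenientTextExtractor(HTMLParser):
    """Hypothetical example class: collects text, resolving entity refs leniently."""

    def __init__(self):
        # convert_charrefs=False so that handle_entityref is actually called.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        codepoint = name2codepoint.get(name)
        if codepoint is not None:
            # Known entity: emit the character, with or without a trailing ";".
            self.parts.append(chr(codepoint))
        else:
            # Unknown entity: emit "&name" literally. The handler cannot tell
            # whether the source had a ";" after the name, so none is restored.
            self.parts.append("&" + name)


def extract_text(markup):
    """Hypothetical helper: run the extractor over markup and return the text."""
    parser = LenientTextExtractor()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)
```

For example, extract_text("caf&eacute and &test here") gives "café and &test here".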

However, this brings up an interesting issue.  Firefox also renders "&test;" literally.  You can't implement that full behavior using HTMLParser, as far as I can see, since you lose the information as to whether the entity ref was terminated by a semicolon or not.  So there may be a legitimate feature request with respect to that issue.
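
To illustrate the lost information, here is a small sketch using the Python 3 html.parser module (again with convert_charrefs=False so that the handler is called). EntityRefRecorder and entity_names are hypothetical names for this example only. The handler receives just the entity name, so the two inputs are indistinguishable from inside it.

```python
from html.parser import HTMLParser


class EntityRefRecorder(HTMLParser):
    """Hypothetical example class: records the names passed to handle_entityref."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.names = []

    def handle_entityref(self, name):
        # Only the bare name arrives here; the terminator is not passed along.
        self.names.append(name)


def entity_names(markup):
    """Hypothetical helper: return the entity names seen while parsing markup."""
    parser = EntityRefRecorder()
    parser.feed(markup)
    parser.close()
    return parser.names


with_semicolon = entity_names("a &test; b")
without_semicolon = entity_names("a &test b")
print(with_semicolon == without_semicolon)
```

Both calls report the same single name, "test", which is why a handler on its own cannot reproduce "&test;" versus "&test" faithfully.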
History
Date                 User            Action  Args
2010-01-05 20:13:32  r.david.murray  set     recipients: + r.david.murray, ezio.melotti, flox, stefan.schweizer
2010-01-05 20:13:32  r.david.murray  set     messageid: <1262722412.63.0.419740189183.issue7626@psf.upfronthosting.co.za>
2010-01-05 20:13:30  r.david.murray  link    issue7626 messages
2010-01-05 20:13:30  r.david.murray  create