Message 127865 - Python tracker


Author	Brian.Jones
Recipients	Brian.Jones
Date	2011-02-04.03:43:53
Message-id	<1296791035.1.0.458454998569.issue11113@psf.upfronthosting.co.za>
In-reply-to

Content
In Python 3.2b2, html.entities.codepoint2name and html.entities.name2codepoint support only the 252 entity names defined in the HTML 4 spec from 1997. I'm wondering if there's a reason not to support the W3C Recommendation 'XML Entity Definitions for Characters':

http://www.w3.org/TR/xml-entity-names/

This standard defines entity names for significantly more characters, and it notes that the HTML 5 drafts use its entities. You can see the current HTML 5 'Named character references' here:

http://www.w3.org/TR/html5/named-character-references.html#named-character-references
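To illustrate the limitation described above, here is a minimal sketch of the two lookups. The name 'check' (U+2713) is only an illustrative pick: it appears in the HTML 5 named character references but not in the HTML 4 set, so the HTML 4 table is not expected to contain it.

```python
from html.entities import codepoint2name, name2codepoint

# How many entity names the HTML 4 based table knows about.
supported = len(name2codepoint)

# Name -> code point, and the reverse lookup.
eacute_codepoint = name2codepoint["eacute"]          # 233 (U+00E9)
eacute_name = codepoint2name[eacute_codepoint]       # 'eacute'

# A name defined for HTML 5 but absent from the HTML 4 set.
has_check = "check" in name2codepoint                # False

print(supported, eacute_codepoint, eacute_name, has_check)
```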

If this is just a matter of somebody going in to do the grunt work, let me know. 

If startup costs associated with importing a huge dictionary are a concern, perhaps a more efficient type that offers the same lookup interface could be defined.

If other reasons exist to not move in this direction, please do let me know!

History
Date	User	Action	Args
2011-02-04 03:43:55	Brian.Jones	set	recipients: + Brian.Jones
2011-02-04 03:43:55	Brian.Jones	set	messageid: <1296791035.1.0.458454998569.issue11113@psf.upfronthosting.co.za>
2011-02-04 03:43:54	Brian.Jones	link	issue11113 messages
2011-02-04 03:43:53	Brian.Jones	create