Author ezio.melotti
Recipients ezio.melotti
Date 2012-10-16.09:41:50
Message-id <1350380511.49.0.860998010837.issue16245@psf.upfronthosting.co.za>
In-reply-to
Content
A JSON file containing all the HTML5 entities is now available at http://dev.w3.org/html5/spec/entities.json.
I tested from the interpreter to see whether it matches the values in html.entities.html5, and there are a dozen entities that need to be updated:

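>>> import json
>>> from html.entities import html5
>>> with open('entities.json', encoding='utf-8') as f:
...     s = json.load(f)
... 
>>> # entities.json keys carry a leading '&'; the pairing relies on both key sets sorting identically
>>> for (k1, i1), (k2, i2) in zip(sorted(s.items()), sorted(html5.items())):
...     if i1['characters'] != i2:
...         (k1, k2, i1['characters'], i2, i1['codepoints'], list(map(ord, i2)))
... 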
('&DotDot;', 'DotDot;', '⃜', '◌⃜', [8412], [9676, 8412])
('&DownBreve;', 'DownBreve;', '̑', '◌̑', [785], [9676, 785])
('&LeftAngleBracket;', 'LeftAngleBracket;', '⟨', '〈', [10216], [9001])
('&NewLine;', 'NewLine;', '\n', '␊', [10], [9226])
('&RightAngleBracket;', 'RightAngleBracket;', '⟩', '〉', [10217], [9002])
('&Tab;', 'Tab;', '\t', '␉', [9], [9225])
('&TripleDot;', 'TripleDot;', '⃛', '◌⃛', [8411], [9676, 8411])
('&lang;', 'lang;', '⟨', '〈', [10216], [9001])
('&langle;', 'langle;', '⟨', '〈', [10216], [9001])
('&rang;', 'rang;', '⟩', '〉', [10217], [9002])
('&rangle;', 'rangle;', '⟩', '〉', [10217], [9002])
('&tdot;', 'tdot;', '⃛', '◌⃛', [8411], [9676, 8411])
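
The zip-based loop above only pairs entries correctly if both mappings contain exactly the same names. A minimal sketch of a comparison keyed by entity name, which would also report names present on only one side, might look like this. The function name diff_entities is hypothetical, and the input shapes are assumed to match entities.json and html.entities.html5 as shown above.

```python
def diff_entities(spec, current):
    """Return sorted (name, spec_chars, current_chars) tuples that differ.

    spec:    mapping shaped like entities.json,
             '&name;' -> {'characters': ..., 'codepoints': [...]}
    current: mapping shaped like html.entities.html5, 'name;' -> characters
    A value of None means the name is missing from that side.
    """
    # Strip the leading '&' so both mappings use the same key form.
    expected = {name.lstrip('&'): info['characters']
                for name, info in spec.items()}
    diffs = []
    for name in sorted(expected.keys() | current.keys()):
        a = expected.get(name)
        b = current.get(name)
        if a != b:
            diffs.append((name, a, b))
    return diffs
```

Calling diff_entities(s, html5) with the objects from the session above should list the same mismatches, plus any names that exist in only one of the two mappings.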

The Tools/scripts/parseentities.py script should also be updated (or possibly a new, separate script should be added) so that it can be used to generate the html5 dict.  I'm setting this as a release blocker so that the update gets done before the release (other values might change in the meantime).
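
As a rough illustration of what such a generator could do, here is a sketch that turns an entities.json-style mapping into source text for the html5 dict. It is not the current behaviour of parseentities.py; the function names build_html5_dict and format_html5_dict are hypothetical, and the output layout is an assumption.

```python
def build_html5_dict(entities):
    """Map 'name;' -> characters from an entities.json-style mapping."""
    return {name.lstrip('&'): info['characters']
            for name, info in entities.items()}


def format_html5_dict(mapping):
    """Render the mapping as Python source for an 'html5 = {...}' assignment."""
    lines = ['html5 = {']
    for name in sorted(mapping):
        # ascii() escapes non-ASCII characters, so the generated source
        # stays pure ASCII regardless of the entity values.
        lines.append('    %r: %s,' % (name, ascii(mapping[name])))
    lines.append('}')
    return '\n'.join(lines) + '\n'
```

Usage could be as simple as loading entities.json with json.load and printing format_html5_dict(build_html5_dict(data)); the result would then be pasted or written into the module that defines html5.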
History
Date User Action Args
2012-10-16 09:41:51 ezio.melotti set recipients: + ezio.melotti
2012-10-16 09:41:51 ezio.melotti set messageid: <1350380511.49.0.860998010837.issue16245@psf.upfronthosting.co.za>
2012-10-16 09:41:51 ezio.melotti link issue16245 messages
2012-10-16 09:41:50 ezio.melotti create