classification
Title: Refactor HTMLParser.unescape to use html.entities.html5
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: ezio.melotti Nosy List: eric.araujo, ezio.melotti, python-dev, r.david.murray
Priority: normal Keywords: patch

Created on 2012-06-24 02:45 by ezio.melotti, last changed 2012-06-24 20:05 by ezio.melotti. This issue is now closed.

Files
File name Uploaded Description
issue15156.diff ezio.melotti, 2012-06-24 14:26
issue15156-2.diff ezio.melotti, 2012-06-24 17:35
Messages (5)
msg163702 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2012-06-24 02:45
HTMLParser has an internal method called unescape [0], used to convert named character references to the equivalent characters. It does so by using html.entities.name2codepoint to recreate the equivalent of html.entities.entitydefs, with the addition of &apos;.
Now that the HTML5 entities have been added to html.entities, the parser should use them instead of name2codepoint.

[0]: see Lib/html/parser.py:500
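
For illustration only (this is not the committed patch), the difference between the two tables can be sketched as follows. The helper name unescape_named and its regular expression are hypothetical; only html.entities.html5 and html.entities.name2codepoint come from the standard library. The sketch assumes references are terminated by a semicolon and ignores numeric references.

```python
import re
from html.entities import html5, name2codepoint

# name2codepoint maps bare names to code points and has no entry for "apos",
# which is why the old code had to add it by hand.
assert "apos" not in name2codepoint
# html5 maps names (normally including the trailing ";") straight to strings.
assert html5["apos;"] == "'"


def unescape_named(text):
    """Hypothetical helper: replace semicolon-terminated named references.

    Unknown names are left untouched. Numeric references (&#38;) and
    references without a trailing ";" are deliberately not handled here.
    """
    def replace(match):
        name = match.group(1)  # e.g. "amp;" (the key format used by html5)
        return html5.get(name, match.group(0))

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*;)", replace, text)


print(unescape_named("&lt;p&gt; &apos; &amp; &notanentity;"))
```

Because html5 already contains &apos; and stores replacement strings rather than code points, a lookup like this needs neither the extra entry nor a chr() conversion.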
msg163790 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2012-06-24 14:26
Here's a patch, please review.
msg163811 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2012-06-24 17:35
Patch updated after the review.
msg163837 - Author: Roundup Robot (python-dev) Date: 2012-06-24 20:04
New changeset 0d53703b1a99 by Ezio Melotti in branch 'default':
#15156: HTMLParser now uses the new "html.entities.html5" dictionary.
http://hg.python.org/cpython/rev/0d53703b1a99
msg163838 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2012-06-24 20:05
Fixed, thanks for the reviews!
History
Date User Action Args
2012-06-24 20:05:35 ezio.melotti set status: open -> closed
resolution: fixed
messages: + msg163838
stage: patch review -> resolved
2012-06-24 20:04:09 python-dev set nosy: + python-dev
messages: + msg163837
2012-06-24 17:35:31 ezio.melotti set files: + issue15156-2.diff
messages: + msg163811
2012-06-24 14:26:50 ezio.melotti set files: + issue15156.diff
keywords: + patch
messages: + msg163790
stage: needs patch -> patch review
2012-06-24 02:45:42 ezio.melotti create