Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(88814)

#15156: Refactor HTMLParser.unescape to use html.entities.html5

Can't Edit
Can't Publish+Mail
Start Review
Created:
7 years, 2 months ago by ezio.melotti
Modified:
7 years, 2 months ago
Reviewers:
georg, storchaka
CC:
ezio.melotti, eric.araujo, r.david.murray, devnull_psf.upfronthosting.co.za
Visibility:
Public.

Patch Set 1 #

Total comments: 13

Patch Set 2 #

Unified diffs Side-by-side diffs Delta from patch set Stats Patch
Doc/library/html.entities.rst View 1 1 chunk +0 lines, -4 lines 0 comments Download
Lib/html/parser.py View 1 2 chunks +14 lines, -16 lines 0 comments Download
Lib/test/test_htmlparser.py View 1 2 chunks +6 lines, -1 line 0 comments Download

Messages

Total messages: 4
Georg
http://bugs.python.org/review/15156/diff/5223/Doc/library/html.entities.rst File Doc/library/html.entities.rst (left): http://bugs.python.org/review/15156/diff/5223/Doc/library/html.entities.rst#oldcode17 Doc/library/html.entities.rst:17: simple textual substitution in the Latin-1 character set (ISO-8859-1). ...
7 years, 2 months ago #1
storchaka_gmail.com
http://bugs.python.org/review/15156/diff/5223/Lib/html/parser.py File Lib/html/parser.py (right): http://bugs.python.org/review/15156/diff/5223/Lib/html/parser.py#newcode522 Lib/html/parser.py:522: for x in range(2, len(s)): On 2012/06/24 18:05:22, Georg ...
7 years, 2 months ago #2
ezio.melotti
http://bugs.python.org/review/15156/diff/5223/Doc/library/html.entities.rst File Doc/library/html.entities.rst (left): http://bugs.python.org/review/15156/diff/5223/Doc/library/html.entities.rst#oldcode17 Doc/library/html.entities.rst:17: simple textual substitution in the Latin-1 character set (ISO-8859-1). ...
7 years, 2 months ago #3
storchaka_gmail.com
7 years, 2 months ago #4
Patch looks good for me.

http://bugs.python.org/review/15156/diff/5223/Lib/html/parser.py
File Lib/html/parser.py (right):

http://bugs.python.org/review/15156/diff/5223/Lib/html/parser.py#newcode519
Lib/html/parser.py:519: from html.entities import html5
On 2012/06/24 19:31:15, ezio.melotti wrote:
> On 2012/06/24 18:05:22, Georg wrote:
> > Why the local import?
> It kept it local like the previous version, but I was planning to make it
global
> and define an __all__ as a separate commit.

If you are going to make refactoring, I see a lot of targets in this function.

http://bugs.python.org/review/15156/diff/5223/Lib/html/parser.py#newcode522
Lib/html/parser.py:522: for x in range(2, len(s)):
On 2012/06/24 19:31:15, ezio.melotti wrote:
> On 2012/06/24 18:39:57, storchaka wrote:
> > for x in range(len(s) - 1, 1, -1):
> I'm not sure which version is better.  For cases like &copyleft you have to
do
> the same number of iterations for both the versions, in other cases you'll
have
> to do more or less.  I find my version more readable.

I was worried about the other issue, that prefix of some entity can be other
entity. But now I see that html.entities.html5 does not contain such conflicts.
Sign in to reply to this message.

RSS Feeds Recent Issues | This issue
This is Rietveld 894c83f36cb7+