Title: html.parser.HTMLParser.unescape works only with the first 128 entities
Created on 2011-09-02 21:08

unescape_bug.patch peter.otten, 2011-09-03 10:15
Author: Yves Dorfsman Date: 2011-09-02 21:08
html.parser.HTMLParser.unescape works only with the first 128 entities; it leaves the rest unchanged.
Author: Yves Dorfsman Date: 2011-09-03 08:35
Added a test case:

If you set the loop bound below 128, the test passes (it is set to 1000 right now).
Author: Peter Otten Date: 2011-09-03 10:15
The unescape() method calls re.sub(regex, sub, s, re.ASCII), but the fourth positional argument of re.sub() is count, not flags, so re.ASCII is taken as the maximum number of replacements. The fix is easy: use

re.sub(regex, sub, s, flags=re.ASCII).
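
A minimal sketch of the diagnosis, not the code from html.parser: ENTITY and decode are hypothetical stand-ins that only handle decimal character references. It shows that a flag passed positionally after the string is silently used as count, so replacements stop after int(re.ASCII) matches, while the keyword form replaces every match. The warnings filter is there only because newer Python versions warn about a positional count.

```python
import re
import warnings

# Hypothetical, simplified pattern: decimal character references only.
ENTITY = r"&#(\d+);"

def decode(match):
    # Turn "&#65;" into "A".
    return chr(int(match.group(1)))

# Build an input with 10 more references than the integer value of re.ASCII.
limit = int(re.ASCII)
text = "&#65;" * (limit + 10)

with warnings.catch_warnings():
    # Newer Python versions emit a DeprecationWarning for a positional count.
    warnings.simplefilter("ignore")
    # Buggy call: re.ASCII lands in the count slot, capping the replacements.
    buggy = re.sub(ENTITY, decode, text, re.ASCII)

# Fixed call: the flag is passed by keyword, so count keeps its default of 0.
fixed = re.sub(ENTITY, decode, text, flags=re.ASCII)

print(buggy.count("&#65;"))  # 10 references left untouched
print(fixed.count("&#65;"))  # 0 references left
```

Passing flags by keyword is the safer habit for re.sub(), re.subn() and re.split(), since each of them takes an integer argument (count or maxsplit) before flags.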
Author: Roundup Robot Date: 2011-09-05 14:16
New changeset 9896fc2a8167 by Ezio Melotti in branch '3.2':
#12888: Fix a bug in HTMLParser.unescape that prevented it from unescaping more than 128 entities.  Patch by Peter Otten.

New changeset 7b6096852665 by Ezio Melotti in branch 'default':
#12888: merge with 3.2.
Author: Ezio Melotti Date: 2011-09-05 14:26
Fixed, thanks for the report and the patch!
