Message 155366 - Python tracker



Author	rednaks
Recipients	rednaks
Date	2012-03-11.02:23:13
SpamBayes Score	1.0655421e-11
Marked as misclassified	No
Message-id	<1331432595.42.0.705024671549.issue14251@psf.upfronthosting.co.za>
In-reply-to

Content

Hello!
While parsing some HTML code I got a UnicodeDecodeError (traceback below).

This issue can be fixed by replacing the last string argument, s, with
s.decode(), as in the attached diff file.
I also tried running my script under Python 3.2, and it does not parse anything.

  File "/usr/lib/python2.7/HTMLParser.py", line 111, in feed
    self.goahead(0)
  File "/usr/lib/python2.7/HTMLParser.py", line 155, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.7/HTMLParser.py", line 260, in parse_starttag
    attrvalue = self.unescape(attrvalue)
  File "/usr/lib/python2.7/HTMLParser.py", line 410, in unescape
    return re.sub(r"&(#?[xX]?(?:[0-9a-fA-F]+|\w{1,8}));", replaceEntities, s)
  File "/usr/lib/python2.7/re.py", line 151, in sub
    return _compile(pattern, flags).sub(repl, string, count)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x97 in position 1: ordinal not in range(128)
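
For illustration only, here is a minimal sketch of the usual workaround: decode the raw bytes to text before handing them to the parser, so that entity unescaping never has to mix byte strings with Unicode. It is written for Python 3 (html.parser) and is not the attached diff. The helper names AttrCollector and parse_attrs are hypothetical, and the cp1252 encoding is an assumption (byte 0x97 is an em dash in that code page); the real encoding of the input is not stated in this report.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: collects (name, value) pairs from start tags."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        # Attribute values arrive with character references already unescaped.
        self.attrs.extend(attrs)


def parse_attrs(raw, encoding="cp1252"):
    """Decode raw bytes first, then parse the resulting text.

    The default encoding is an assumption made for this sketch.
    """
    text = raw.decode(encoding)  # explicit decode instead of an implicit ASCII one
    parser = AttrCollector()
    parser.feed(text)
    parser.close()
    return parser.attrs


# Sample input invented for this sketch: 0x97 is not valid ASCII.
raw = b'<a title="a\x97b &amp; c">link</a>'
print(parse_attrs(raw))
```

Decoding with an explicit encoding matters here: a bare decode using the ASCII codec would fail on byte 0x97 in the same way as the traceback above.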

History
Date	User	Action	Args
2012-03-11 02:23:15	rednaks	set	recipients: + rednaks
2012-03-11 02:23:15	rednaks	set	messageid: <1331432595.42.0.705024671549.issue14251@psf.upfronthosting.co.za>
2012-03-11 02:23:14	rednaks	link	issue14251 messages
2012-03-11 02:23:13	rednaks	create