
Author rednaks
Recipients rednaks
Date 2012-03-11.02:23:13
Message-id <1331432595.42.0.705024671549.issue14251@psf.upfronthosting.co.za>
In-reply-to
Content
Hello!
While parsing some HTML, I got a UnicodeDecodeError (traceback below).

This can be fixed by replacing the last string (s) with s.decode(), as in
the diff file.
I also tried running my script under Python 3.2, and there it does not parse anything.

  File "/usr/lib/python2.7/HTMLParser.py", line 111, in feed
    self.goahead(0)
  File "/usr/lib/python2.7/HTMLParser.py", line 155, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.7/HTMLParser.py", line 260, in parse_starttag
    attrvalue = self.unescape(attrvalue)
  File "/usr/lib/python2.7/HTMLParser.py", line 410, in unescape
    return re.sub(r"&(#?[xX]?(?:[0-9a-fA-F]+|\w{1,8}));", replaceEntities, s)
  File "/usr/lib/python2.7/re.py", line 151, in sub
    return _compile(pattern, flags).sub(repl, string, count)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x97 in position 1: ordinal not in range(128)
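
For illustration, here is a minimal sketch of the decode-before-parsing idea. It is written for Python 3 (html.parser) rather than the Python 2.7 HTMLParser module in the traceback, and it is not the attached diff. The sample markup, the cp1252 encoding, and the names AttrCollector and parse_attrs are all assumptions made for the example; the report does not show the input or its encoding.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: records (tag, name, value) for every attribute."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        # Attribute values arrive with entities such as &amp; already unescaped.
        for name, value in attrs:
            self.attrs.append((tag, name, value))


def parse_attrs(raw, encoding="cp1252"):
    """Decode the raw bytes to text first, then feed the text to the parser.

    The encoding default is an assumption; use the real encoding of the page.
    """
    parser = AttrCollector()
    parser.feed(raw.decode(encoding))
    parser.close()
    return parser.attrs


# Assumed sample input: byte 0x97 is not valid ASCII (it is an em dash in cp1252).
SAMPLE = b'<a title="\x97 &amp; more">link</a>'

if __name__ == "__main__":
    print(parse_attrs(SAMPLE))
```

Decoding once at the boundary keeps byte strings and text from being mixed inside the parser, which is the situation that triggers the implicit ASCII decode shown in the traceback.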
History
Date                 User     Action  Args
2012-03-11 02:23:15  rednaks  set     recipients: + rednaks
2012-03-11 02:23:15  rednaks  set     messageid: <1331432595.42.0.705024671549.issue14251@psf.upfronthosting.co.za>
2012-03-11 02:23:14  rednaks  link    issue14251 messages
2012-03-11 02:23:13  rednaks  create