
Author vstinner
Recipients ezio.melotti, jinz, vstinner
Date 2016-02-01.16:54:25
> PAYLOAD.decode('utf8')  passes in P2.7.* and fails in P3.4

Well, the Python 2 decoder didn't respect the Unicode standard. Please see:

Python 3 is now stricter. You can still decode surrogate characters if you need them *for a good reason* using:

>>> b'\xed\xa0\x80'.decode('utf-8', 'surrogatepass')

By the way, there is also:

>>> b'\xed\xa0\x80'.decode('utf-8', 'surrogateescape')

which is very different but may also help.
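
To make the difference concrete, here is a minimal sketch comparing the three behaviours on the same bytes. The variable names (`data`, `strict_ok`, `passed`, `escaped`) are only illustrative and not part of the original report.

```python
# Bytes from the examples above: a UTF-8-style encoding of the lone surrogate U+D800.
data = b'\xed\xa0\x80'

# Strict decoding (the Python 3 default) rejects the sequence.
try:
    data.decode('utf-8')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# 'surrogatepass' decodes the three bytes as a single surrogate code point.
passed = data.decode('utf-8', 'surrogatepass')

# 'surrogateescape' maps each undecodable byte to U+DC00 + the byte value.
escaped = data.decode('utf-8', 'surrogateescape')

print(strict_ok)        # False
print(ascii(passed))    # '\ud800'
print(ascii(escaped))   # '\udced\udca0\udc80'

# Each result encodes back to the original bytes with the matching handler.
assert passed.encode('utf-8', 'surrogatepass') == data
assert escaped.encode('utf-8', 'surrogateescape') == data
```

So 'surrogatepass' yields the surrogate that the bytes appear to encode, whereas 'surrogateescape' only smuggles the raw bytes through as three separate code points so they can be written back unchanged later.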

I suggest closing the issue as NOT A BUG.

Date                 User      Action  Args
2016-02-01 16:54:25  vstinner  set     recipients: + vstinner, ezio.melotti, jinz
2016-02-01 16:54:25  vstinner  set     messageid: <>
2016-02-01 16:54:25  vstinner  link    issue26260 messages
2016-02-01 16:54:25  vstinner  create