
Author Quentin.Pradet
Recipients Quentin.Pradet, docs@python
Date 2016-01-27.17:15:08
Message-id <1453914908.97.0.691892593639.issue26220@psf.upfronthosting.co.za>
In-reply-to
Content
From https://docs.python.org/3.6/howto/unicode.html#the-string-type:

> The following examples show the differences::
>
>     >>> b'\x80abc'.decode("utf-8", "strict")  #doctest: +NORMALIZE_WHITESPACE
>     Traceback (most recent call last):
>         ...
>     UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0:
>       invalid start byte
>     >>> b'\x80abc'.decode("utf-8", "replace")
>     '\ufffdabc'
>     >>> b'\x80abc'.decode("utf-8", "backslashreplace")
>     '\\x80abc'
>     >>> b'\x80abc'.decode("utf-8", "ignore")
>     'abc'
>
> (In this code example, the Unicode replacement character has been replaced by
> a question mark because it may not be displayed on some systems.)

I think the whole sentence after the snippet can be removed: the snippet already shows '\ufffdabc' rather than a question mark, which is exactly what Python 3.2+ outputs. It looks like the commit that added this sentence dates from Python 3.1: https://github.com/python/cpython/commit/34d4c82af56ebc1b65514a118f0ec7feeb8e172f, but another commit around Python 3.3 removed it: https://github.com/python/cpython/commit/63172c46706ae9b2a3bc80d639504a57fff4e716.
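
A quick way to check this, as a minimal sketch (the names `data` and `replaced` are mine, not from the docs; `ascii()` is used so the printed form does not depend on the terminal's encoding):

```python
data = b'\x80abc'

# "replace" substitutes U+FFFD (REPLACEMENT CHARACTER) for the invalid byte.
replaced = data.decode("utf-8", "replace")

# ascii() escapes non-ASCII characters, so this prints '\ufffdabc'.
print(ascii(replaced))

# No question mark is substituted anywhere in the result.
print('?' in replaced)  # False
```

If that holds, the parenthetical about the question mark no longer describes the example.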
History
Date                 User            Action  Args
2016-01-27 17:15:09  Quentin.Pradet  set     recipients: + Quentin.Pradet, docs@python
2016-01-27 17:15:08  Quentin.Pradet  set     messageid: <1453914908.97.0.691892593639.issue26220@psf.upfronthosting.co.za>
2016-01-27 17:15:08  Quentin.Pradet  link    issue26220 messages
2016-01-27 17:15:08  Quentin.Pradet  create