Author tmp12342
Recipients ezio.melotti, gvanrossum, kennyluck, lemburg, loewis, mjpieters, pitrou, python-dev, serhiy.storchaka, tchrist, tmp12342, vstinner
Date 2015-08-11.11:28:27
Message-id <1439292508.02.0.41876431185.issue12892@psf.upfronthosting.co.za>
In-reply-to
Content
Serhiy, I understand the first reason, but https://docs.python.org/3/library/codecs.html says
> applicable to text encodings:
> [...]
> This code will then be turned back into the same byte when the 'surrogateescape' error handler is used when encoding the data.
Shouldn't the documentation be corrected? A text encoding is defined there as "A codec which encodes Unicode strings to bytes."
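For reference, here is a minimal sketch of the round trip that the quoted documentation describes, using UTF-8 as the text encoding. The sample bytes are made up for illustration; the only assumption is that 0xFF can never appear in valid UTF-8.

```python
# Hypothetical input: 0xFF is never a valid byte in UTF-8.
raw = b"caf\xff"

# Decoding with 'surrogateescape' maps the undecodable byte 0xFF
# to the lone low surrogate U+DCFF instead of raising.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler turns U+DCFF back into byte 0xFF.
restored = text.encode("utf-8", "surrogateescape")

print(ascii(text))      # 'caf\udcff'
print(restored == raw)  # True
```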


As for the second one, could you explain a bit more? I don't know how to interpret it.

You say b'\xD8\x00' are invalid ASCII bytes, but of these two only 0xD8 is invalid. Also, we are talking about encoding here, str -> bytes, so who cares whether the resulting bytes are ASCII-compatible or not?
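To illustrate the point about 0xD8 and 0x00, a small sketch with the ASCII codec (not the UTF-16 codec this issue is about) shows which of the two bytes the 'surrogateescape' handler actually escapes. The variable names are only for illustration.

```python
data = b"\xd8\x00"

# 0xD8 is outside the ASCII range, so it is escaped to U+DCD8.
# 0x00 is a valid ASCII byte (NUL), so it decodes normally.
text = data.decode("ascii", "surrogateescape")

# Collect the characters that came from escaped bytes (U+DC80..U+DCFF).
escaped = [ch for ch in text if 0xDC80 <= ord(ch) <= 0xDCFF]

print(ascii(text))   # '\udcd8\x00'
print(len(escaped))  # 1

# Encoding with the same handler restores the original bytes.
roundtrip = text.encode("ascii", "surrogateescape")
print(roundtrip == data)  # True
```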