Author vstinner
Recipients ezio.melotti, inada.naoki, python-dev, r.david.murray, serhiy.storchaka, vstinner
Date 2015-09-24.14:26:48
Message-id <1443104809.07.0.136340877996.issue24870@psf.upfronthosting.co.za>
Content
Serhiy wrote: "All other error handlers lose information and can't be used per se for transcoding bytes as string or string as bytes."

Well, it was very simple to implement the "replace" and "ignore" error handlers in decoders, and I believe that these error handlers are commonly used.
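For reference, here is a small sketch of what these handlers do at the Python level when decoding invalid UTF-8. It only uses the standard bytes.decode() and str.encode() methods; the sample bytes are made up for illustration. It also shows Serhiy's point: "replace" and "ignore" lose the original byte, while "surrogateescape" can be round-tripped.

```python
# 0xff can never appear in well-formed UTF-8, so this input is invalid.
data = b"abc\xffdef"

# "replace" substitutes U+FFFD for the undecodable byte.
replaced = data.decode("utf-8", errors="replace")

# "ignore" silently drops the undecodable byte.
ignored = data.decode("utf-8", errors="ignore")

# "surrogateescape" maps byte 0xff to the lone surrogate U+DCFF.
escaped = data.decode("utf-8", errors="surrogateescape")

# Only surrogateescape keeps enough information to rebuild the input bytes.
roundtrip = escaped.encode("utf-8", errors="surrogateescape")

print(ascii(replaced), ascii(ignored), ascii(escaped), roundtrip)
```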

"(...) adding it can slow down common case (no errors). That is why I limit my patch for "surrogateescape" and "surrogatepass" only."

We can start with benchmarks and see if modifying Objects/stringlib/ has a real impact on performance, or if modifying the "slower" decoder in Objects/unicodeobject.c is enough. IMHO it's fine to implement many error handlers in Objects/unicodeobject.c: it's the "slow" path, taken only when at least one error occurred, so it doesn't impact the path that decodes valid UTF-8 strings.
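A rough sketch of such a benchmark, written at the Python level with the standard timeit module. This is not the benchmark used in this issue: the helper names (bench, run, HANDLERS) and the sample inputs are hypothetical, and the timings depend on the build and the machine. The idea is to time valid input (the common case, which should not get slower) separately from invalid input (the slow path).

```python
import timeit

# Error handlers to compare; "strict" is only meaningful for valid input.
HANDLERS = ("strict", "replace", "ignore", "surrogateescape")

# Valid UTF-8 mixing ASCII and a 3-byte character (the common case).
valid = ("abc\u20ac" * 1000).encode("utf-8")
# Invalid UTF-8: every fourth byte is undecodable (the slow path).
invalid = b"abc\xff" * 1000


def bench(data, handler, number=1000):
    """Return the seconds taken to decode `data` `number` times."""
    return timeit.timeit(lambda: data.decode("utf-8", handler), number=number)


def run(number=1000):
    """Time every handler on valid input, and the lossy ones on invalid input."""
    results = {}
    for handler in HANDLERS:
        results[("valid", handler)] = bench(valid, handler, number)
        if handler != "strict":  # strict would raise UnicodeDecodeError
            results[("invalid", handler)] = bench(invalid, handler, number)
    return results
```

Calling run() before and after a patch and comparing the ("valid", ...) entries gives a first idea of whether the no-error path is affected; a serious measurement would need several repetitions and a quiet machine.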