
Author ncoghlan
Recipients eric.araujo, eric.smith, ncoghlan, orsenthil, pitrou, r.david.murray, vstinner
Date 2010-10-05.10:32:17
Message-id <AANLkTin2f-+DZGai3mBcH7f7DX-iPet6BpqSJrC42082@mail.gmail.com>
In-reply-to <201010050931.54349.victor.stinner@haypocalc.com>
Content
On Tue, Oct 5, 2010 at 5:32 PM, STINNER Victor <report@bugs.python.org> wrote:
>
> STINNER Victor <victor.stinner@haypocalc.com> added the comment:
>
>> If you were worried about performance, then surrogateescape is certainly
>> much slower than latin1.
>
> If you were really worried about performance, the bytes type is maybe faster
> than: decode bytes to str using latin-1, process str strings, encode str to
> bytes using latin-1.

I'm fairly resigned to the fact that I'm going to need some kind of
micro-benchmark to compare the different approaches. For example, the
bytes-based approach has a lot of extra assignments to local variables
that the str-based approach doesn't need.
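
A minimal sketch of what such a micro-benchmark could look like, using
the standard timeit module. The sample URL, the toy split functions and
the bench() helper are all hypothetical stand-ins for illustration; none
of them come from the actual patches, and the numbers will vary by
machine and Python version.

```python
import timeit

# Hypothetical sample input, not taken from the real patch or test suite.
RAW = b"HTTP://www.example.com/some/path?query=1#frag"


def split_bytes(url):
    """Toy scheme split that works on bytes directly."""
    scheme, _, rest = url.partition(b":")
    return scheme.lower(), rest


def split_via_latin1(url):
    """Same toy split: decode as latin-1, process str, encode back."""
    text = url.decode("latin-1")
    scheme, _, rest = text.partition(":")
    return scheme.lower().encode("latin-1"), rest.encode("latin-1")


def split_via_surrogateescape(url):
    """Same toy split, using ascii with the surrogateescape handler."""
    text = url.decode("ascii", "surrogateescape")
    scheme, _, rest = text.partition(":")
    return (scheme.lower().encode("ascii", "surrogateescape"),
            rest.encode("ascii", "surrogateescape"))


def bench(func, number=100_000):
    """Return the best total time in seconds over three runs."""
    return min(timeit.repeat(lambda: func(RAW), number=number, repeat=3))


if __name__ == "__main__":
    for func in (split_bytes, split_via_latin1, split_via_surrogateescape):
        print(f"{func.__name__}: {bench(func):.4f}s")
```

Taking the minimum of several repeats is the usual way to reduce noise
from other processes; a real comparison would time the patched
urllib.parse functions rather than these toy helpers.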

The first step is to actually have a str-based patch to compare to the
existing bytes-based patch. If the code ends up significantly clearer
(as I expect it will), we can probably sacrifice a certain amount of
speed for that benefit.
History
Date                 User      Action  Args
2010-10-05 10:32:20  ncoghlan  set     recipients: + ncoghlan, orsenthil, pitrou, vstinner, eric.smith, eric.araujo, r.david.murray
2010-10-05 10:32:18  ncoghlan  link    issue9873 messages
2010-10-05 10:32:17  ncoghlan  create