On Tue, Oct 5, 2010 at 5:32 PM, STINNER Victor wrote:
> STINNER Victor added the comment:
>> If you were worried about performance, then surrogateescape is certainly
>> much slower than latin1.
> If you were really worried about performance, the bytes type is maybe faster
> than: decode bytes to str using latin-1, process str strings, encode str to
> bytes using latin-1.

I'm fairly resigned to the fact that I'm going to need some kind of
micro-benchmark to compare the different approaches. For example, the
bytes-based approach has a lot of extra assignments to local variables
that the str-based approach doesn't need.
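
As a rough sketch of what such a micro-benchmark might look like, using
timeit: the function names, the sample input and the trivial
"lowercase the part before the first colon" step below are all made up
for illustration and are not taken from either patch. It only compares
the three strategies mentioned in this thread (pure bytes, a latin-1
round trip, and an ascii + surrogateescape round trip) on one small
input, so the numbers would say little about real code.

```python
import timeit

# Hypothetical sample input, chosen only for illustration.
SAMPLE = b"HTTP://www.example.com/some/path?query=1#fragment"


def process_bytes(data):
    """Work on the bytes object directly."""
    head, sep, rest = data.partition(b":")
    return head.lower() + sep + rest


def process_via_str(data, encoding="latin-1", errors="strict"):
    """Decode to str, do the same work on str, encode back to bytes."""
    text = data.decode(encoding, errors)
    head, sep, rest = text.partition(":")
    return (head.lower() + sep + rest).encode(encoding, errors)


def run_benchmark(number=100000):
    """Return total seconds for `number` calls of each strategy."""
    candidates = {
        "bytes": lambda: process_bytes(SAMPLE),
        "latin-1": lambda: process_via_str(SAMPLE),
        "surrogateescape": lambda: process_via_str(
            SAMPLE, "ascii", "surrogateescape"
        ),
    }
    return {
        name: timeit.timeit(func, number=number)
        for name, func in candidates.items()
    }


if __name__ == "__main__":
    for name, seconds in run_benchmark().items():
        print("%-16s %.3f s" % (name, seconds))
```

All three variants should return the same bytes for a given input, so
any difference in the timings would come from the decode/encode
overhead and from the str versus bytes operations themselves.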

The first step is to actually have a str-based patch to compare to the
existing bytes-based patch. If the code ends up significantly clearer
(as I expect it will), we can probably sacrifice a certain amount of
speed for that benefit.
