This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: Optimize ascii and latin1 decoder with surrogateescape and surrogatepass error handlers
Type: performance Stage:
Components: Unicode Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: ezio.melotti, methane, python-dev, r.david.murray, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2015-08-15 03:08 by methane, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
faster_surrogates_hadling.patch serhiy.storchaka, 2015-08-15 13:06 Preliminary patch review
faster-decode-ascii-surrogateescape.patch methane, 2015-08-17 06:42
faster-decode-ascii-surrogateescape.patch vstinner, 2015-09-10 19:18 review
bench.py vstinner, 2015-09-21 22:13
utf8.patch vstinner, 2015-09-21 23:06 review
utf8-2.patch vstinner, 2015-09-22 09:06 review
utf8-3.patch vstinner, 2015-09-22 09:37 review
bench_utf8.py vstinner, 2015-09-22 09:37
Messages (27)
msg248631 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2015-08-15 03:08
surrogateescape is recommended way to mix binary data in string protocol.
But surrogateescape is too slow and it cause usability problem.

One actual problem is: https://github.com/PyMySQL/PyMySQL/issues/366

surrogateescape is slow because errorhandler is called with UnicodeError object.
bs.decode('utf-8', 'surrogateescape') may produce len(bs)/2 error objects internally when bs is random bytes.

surrogateescape is used with ASCII and UTF-8 encoding in ordinal.
Specialized implementation can make it faster.

I want to Python 3.4 and Python 3.5 solve this issue since it's critical problem
for some people.
msg248632 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2015-08-15 03:13
On MacBook Pro (Core i5 2.6GHz), surrogateescape 1MB data takes 250ms.

In [1]: bs = bytes(range(256)) * (4 * 1024)

In [2]: len(bs)
Out[2]: 1048576

In [3]: %timeit x = bs.decode('ascii', 'surrogateescape')
1 loops, best of 3: 249 ms per loop
msg248636 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-08-15 12:09
Why are bytes being escaped in a binary blob? The reason to use surrogateescape is when you have data that is mostly text, should be processed as text, but can have occasional binary data.  That wouldn't seem to apply to a database binary blob.

But that aside, if you want to submit a patch to speed up surrogateescape without changing its functionality, I'm sure it would be considered.  It would certainly be useful for the email library, which currently does do the stupid thing of encoding binary message attachments using surrogateescape (and I'm guessing the reason pymysql does it is something similar to why email does it: the code would need to be significantly reorganized to do things right).
msg248638 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-08-15 12:56
Few months ago I wrote a patch that drastically speeds up encoding and decoding with surrogateescape and surrogatepass error handlers. However it causes 25% regression in decoding some UTF-8 data (U+0100-U+07FF if I remember correct) with strict error handler, so it needs some work. I hope that it is possible to rewrite UTF-8 decoder so that avoid a regression. The patch was postponed until 3.5 is released. In any case the patch is large and complex enough to be new feature that can appear only in 3.6.
msg248658 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-08-15 21:24
Serhiy: maybe we can start with ascii?
msg248660 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-08-15 21:26
Oh. I restored the old title because i replied by email with an old email.
msg248717 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2015-08-17 06:42
I've stripped Serhiy's patch for ascii.

Here is benchmark result:
https://gist.github.com/methane/2376ac5d20642c05a8b6#file-result-md

Is there chance for applying this patch to 3.5.1?
msg248820 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2015-08-19 08:12
> Why are bytes being escaped in a binary blob? The reason to use surrogateescape is when you have data that is mostly text, should be processed as text, but can have occasional binary data.  That wouldn't seem to apply to a database binary blob.

Since SQL may contain binary data.

data = bytes(range(256))
cursor.execute(u"INSERT INTO t (blob_col) values (%(data)s)", {"data": data})

DB driver should escape properly for SQL syntax, then decode with surrogateescape for % operator.

bytes in Python 3.5 supports % operator so I can use it instead of unicode %.
But I'll continue to support Python 3.4 for some years.
msg248832 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015-08-19 12:55
Since you already have to rewrite the string to do the escaping, I would judge it worth the extra effort to piece string together as binary, but I can understand wanting to use % notation.  The performance issue seems to prevent that, though, and there's no guarantee the proposed optimization will get applied to 3.4, or even 3.5.
msg250409 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-09-10 19:18
I rebased patch because faster-decode-ascii-surrogateescape.patch was generated in git format, and the git format is not accepted by Rietveld (the Review button).
msg251269 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-09-21 22:11
New changeset 3c430259873e by Victor Stinner in branch 'default':
Issue #24870: Optimize the ASCII decoder for error handlers: surrogateescape,
https://hg.python.org/cpython/rev/3c430259873e
msg251270 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-09-21 22:13
I pushed a change to optimize the ASCII decoder.

Attached bench.py script: microbenchmark on the ASCII decoder. My results follows.

Common platform:
Platform: Linux-4.1.5-200.fc22.x86_64-x86_64-with-fedora-22-Twenty_Two
Bits: int=32, long=64, long long=64, size_t=64, void*=64
Timer: time.perf_counter
CFLAGS: -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
CPU model: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
Python unicode implementation: PEP 393
Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)

Platform of campaign before:
Timer precision: 57 ns
Date: 2015-09-21 23:48:00
SCM: hg revision=73867bf953e6 branch=default date="2015-09-21 22:40 +0200"
Python version: 3.6.0a0 (default:73867bf953e6, Sep 21 2015, 23:47:35) [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]

Platform of campaign after:
Timer precision: 55 ns
Date: 2015-09-21 23:52:55
Python version: 3.6.0a0 (default:bb0f55f1ec22+, Sep 21 2015, 23:52:43) [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
SCM: hg revision=bb0f55f1ec22+ tag=tip branch=default date="2015-09-21 23:06 +0200"

------------------+-------------+---------------
ignore            |      before |          after
------------------+-------------+---------------
256 x 10**1 bytes |  314 us (*) | 5.25 us (-98%)
256 x 10**3 bytes | 31.9 ms (*) |  509 us (-98%)
256 x 10**2 bytes | 3.13 ms (*) | 50.2 us (-98%)
256 x 10**4 bytes |  315 ms (*) | 5.12 ms (-98%)
------------------+-------------+---------------
Total             |  351 ms (*) | 5.69 ms (-98%)
------------------+-------------+---------------

------------------+-------------+---------------
replace           |      before |          after
------------------+-------------+---------------
256 x 10**1 bytes |  352 us (*) | 6.32 us (-98%)
256 x 10**3 bytes | 35.2 ms (*) |  608 us (-98%)
256 x 10**2 bytes | 3.52 ms (*) | 60.9 us (-98%)
256 x 10**4 bytes |  354 ms (*) | 6.19 ms (-98%)
------------------+-------------+---------------
Total             |  393 ms (*) | 6.87 ms (-98%)
------------------+-------------+---------------

------------------+-------------+---------------
surrogateescape   |      before |          after
------------------+-------------+---------------
256 x 10**1 bytes |  369 us (*) | 5.92 us (-98%)
256 x 10**3 bytes | 36.8 ms (*) |  570 us (-98%)
256 x 10**2 bytes | 3.69 ms (*) | 56.9 us (-98%)
256 x 10**4 bytes |  371 ms (*) | 5.79 ms (-98%)
------------------+-------------+---------------
Total             |  412 ms (*) | 6.43 ms (-98%)
------------------+-------------+---------------

------------------+-------------+--------
backslashreplace  |      before |   after
------------------+-------------+--------
256 x 10**1 bytes |  357 us (*) |  361 us
256 x 10**3 bytes | 35.1 ms (*) | 36.1 ms
256 x 10**2 bytes | 3.52 ms (*) | 3.59 ms
256 x 10**4 bytes |  357 ms (*) |  365 ms
------------------+-------------+--------
Total             |  396 ms (*) |  405 ms
------------------+-------------+--------

-----------------+--------------+---------------
Summary          |       before |          after
-----------------+--------------+---------------
ignore           |   351 ms (*) | 5.69 ms (-98%)
replace          |   393 ms (*) | 6.87 ms (-98%)
surrogateescape  |   412 ms (*) | 6.43 ms (-98%)
backslashreplace |   396 ms (*) |         405 ms
-----------------+--------------+---------------
Total            | 1.55 sec (*) |  424 ms (-73%)
-----------------+--------------+---------------
msg251276 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-09-21 23:00
New changeset 2cf85e2834c2 by Victor Stinner in branch 'default':
Issue #24870: Reuse the new _Py_error_handler enum
https://hg.python.org/cpython/rev/2cf85e2834c2

New changeset aa247150a8b1 by Victor Stinner in branch 'default':
Issue #24870: Add _PyUnicodeWriter_PrepareKind() macro
https://hg.python.org/cpython/rev/aa247150a8b1
msg251277 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-09-21 23:06
Ok, I prepared the code for the UTF-8 optimization.

@Serhiy: would you like to rebase your patch  faster_surrogates_hadling.patch?

Attached utf8.patch is a less optimal implementation which only changes PyUnicode_DecodeUTF8Stateful(). Maybe it's enough?

I would like to see a benchmark here to choose the good compromise between performance and code complexity.
msg251294 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-09-22 08:50
New changeset 8317796ca004 by Victor Stinner in branch 'default':
Issue #24870: revert unwanted change
https://hg.python.org/cpython/rev/8317796ca004
msg251296 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-09-22 09:06
I pushed utf8.patch by mistake :-/ The advantage is that buildbots found bugs. Attached utf8-2.patch fixed bugs.

The bug was how the "s" variable was set in the error handler. It's now set with:

   s = starts + endinpos;

Bugs found by the buildbots:

======================================================================
FAIL: test_invalid_cb_for_3bytes_seq (test.test_unicode.UnicodeTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/python/3.x.langa-ubuntu/build/Lib/test/test_unicode.py", line 1897, in test_invalid_cb_for_3bytes_seq
    'invalid continuation byte')
  File "/opt/python/3.x.langa-ubuntu/build/Lib/test/test_unicode.py", line 1772, in assertCorrectUTF8Decoding
    self.assertEqual(seq.decode('utf-8', 'replace'), res)
AssertionError: '��\x00' != '�\x00'
- ��
? -
+ �

======================================================================
FAIL: test_unquote_with_unicode (test.test_urllib.UnquotingTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/python/3.x.langa-ubuntu/build/Lib/test/test_urllib.py", line 1016, in test_unquote_with_unicode
    "using unquote(): %r != %r" % (expect, result))
AssertionError: '�' != '��'
- �
+ ��
? +
 : using unquote(): '�' != '��'
msg251299 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-09-22 09:36
Ok, here is a patch which optimizes surrogatepass too.

Result of bench_utf8.py.

Common platform:
CFLAGS: -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Bits: int=32, long=64, long long=64, size_t=64, void*=64
Timer: time.perf_counter
Platform: Linux-4.1.6-200.fc22.x86_64-x86_64-with-fedora-22-Twenty_Two
Python unicode implementation: PEP 393

Platform of campaign before:
SCM: hg revision=8317796ca004 tag=tip branch=default date="2015-09-22 10:46 +0200"
Python version: 3.6.0a0 (default:8317796ca004, Sep 22 2015, 11:33:04) [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
Date: 2015-09-22 11:33:12
Timer precision: 73 ns

Platform of campaign after:
SCM: hg revision=8317796ca004+ tag=tip branch=default date="2015-09-22 10:46 +0200"
Python version: 3.6.0a0 (default:8317796ca004+, Sep 22 2015, 11:31:04) [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
Date: 2015-09-22 11:31:58
Timer precision: 54 ns

------------------+-------------+---------------
ignore            |      before |          after
------------------+-------------+---------------
100 x 10**1 bytes | 8.85 us (*) |  857 ns (-90%)
100 x 10**3 bytes |  780 us (*) | 45.5 us (-94%)
100 x 10**2 bytes | 78.7 us (*) | 4.81 us (-94%)
100 x 10**4 bytes | 8.13 ms (*) |  456 us (-94%)
------------------+-------------+---------------
Total             |    9 ms (*) |  507 us (-94%)
------------------+-------------+---------------

------------------+-------------+---------------
replace           |      before |          after
------------------+-------------+---------------
100 x 10**1 bytes |   10 us (*) |  939 ns (-91%)
100 x 10**3 bytes |  898 us (*) | 54.1 us (-94%)
100 x 10**2 bytes | 90.5 us (*) | 5.72 us (-94%)
100 x 10**4 bytes | 9.55 ms (*) |  536 us (-94%)
------------------+-------------+---------------
Total             | 10.6 ms (*) |  597 us (-94%)
------------------+-------------+---------------

------------------+-------------+---------------
surrogateescape   |      before |          after
------------------+-------------+---------------
100 x 10**1 bytes | 10.4 us (*) |  919 ns (-91%)
100 x 10**3 bytes |  930 us (*) | 55.2 us (-94%)
100 x 10**2 bytes | 93.1 us (*) | 5.84 us (-94%)
100 x 10**4 bytes | 9.85 ms (*) |  552 us (-94%)
------------------+-------------+---------------
Total             | 10.9 ms (*) |  614 us (-94%)
------------------+-------------+---------------

------------------+-------------+---------------
surrogatepass     |      before |          after
------------------+-------------+---------------
100 x 10**1 bytes | 4.83 us (*) |  963 ns (-80%)
100 x 10**3 bytes |  353 us (*) | 42.6 us (-88%)
100 x 10**2 bytes | 36.5 us (*) | 4.69 us (-87%)
100 x 10**4 bytes | 4.17 ms (*) |  424 us (-90%)
------------------+-------------+---------------
Total             | 4.56 ms (*) |  472 us (-90%)
------------------+-------------+---------------

------------------+-------------+--------------
backslashreplace  |      before |         after
------------------+-------------+--------------
100 x 10**1 bytes | 10.9 us (*) | 10.1 us (-7%)
100 x 10**3 bytes |  944 us (*) | 853 us (-10%)
100 x 10**2 bytes | 93.8 us (*) | 87.5 us (-7%)
100 x 10**4 bytes | 9.92 ms (*) | 9.04 ms (-9%)
------------------+-------------+--------------
Total             |   11 ms (*) | 9.99 ms (-9%)
------------------+-------------+--------------

-----------------+-------------+---------------
Summary          |      before |          after
-----------------+-------------+---------------
ignore           |    9 ms (*) |  507 us (-94%)
replace          | 10.6 ms (*) |  597 us (-94%)
surrogateescape  | 10.9 ms (*) |  614 us (-94%)
surrogatepass    | 4.56 ms (*) |  472 us (-90%)
backslashreplace |   11 ms (*) |  9.99 ms (-9%)
-----------------+-------------+---------------
Total            |   46 ms (*) | 12.2 ms (-73%)
-----------------+-------------+---------------
msg251517 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-09-24 12:51
I created the issue #25227 to optimize the ASCII and Latin1 *encoders* for surrogateescape.
msg251521 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-09-24 13:11
I worked on UTF-16 and UTF-32 encoders, but now I'm off my developing computer. I'll provide updated patch soon.

I think that only "surrogateescape" and "surrogatepass" error handlers have need in optimization, because they are used to interpolate with other programs, including old Python versions. "strict" stops processing, an optimization is not needed here. All other error handlers lose information and can't be used per se for transcoding bytes as string or string as bytes. They are used together with other slow code (for example for encoding string in XML or HTML you first need to escape '&', '<' and quotes). It is easy to add fast handling for 'ignore' and 'replace', but these error handlers are used largely for produce human-readable output, and adding it can slow down common case (no errors). That is why I limit my patch for "surrogateescape" and "surrogatepass" only.
msg251529 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-09-24 14:26
Serhiy wrote: "All other error handlers lose information and can't be used per se for transcoding bytes as string or string as bytes."

Well, it was very simple to implement replace and ignore in decoders. I believe that the error handlers are commonly used.

"(...) adding it can slow down common case (no errors). That is why I limit my patch for "surrogateescape" and "surrogatepass" only."

We can start with benchmarks and see if modifying Objects/stringlib/ has a real impact on performances, or if modifying the "slower" decoder in Objects/unicodeobject.c is enough. IMHO it's fine to implement many error handlers in Objects/unicodeobject.c: it's the "slow" path when at least one error occurred, so it doesn't impact the path to decode valid UTF-8 strings.
msg252059 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-01 21:24
I just pushed my patch to optimize the UTF-8 encoder with error handlers: see the issue #25267. It's up to 70 times as fast. The patch was based on Serhiy's work: faster_surrogates_hadling.patch attached to this issue.
msg252118 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-02 14:45
I created issue #25301: "Optimize UTF-8 decoder with error handlers".
msg252662 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-09 21:15
Short summary.

Ok, I optimized ASCII, Latin1 and UTF-8 codecs (encoders and decoders) for the most common error handlers.

* ASCII and Latin1 encoders: surrogateescape, replace, ignore, backslashreplace, xmlcharrefreplace
* ASCII decoder: surrogateescape, replace, ignore
* (Latin1 decoder cannot fail)
* UTF-8 encoder: surrogateescape, surrogatepass, replace, ignore, backslashreplace, xmlcharrefreplace
* UTF-8 decoder: surrogateescape, replace, ignore

The code to handle other error handlers in encoders has also be optimized.

Surrogateescape has now an efficent implementation for ASCII, Latin1 and UTF-8 encoders and decoders.
msg252664 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-09 21:23
INADA Naoki: "I want to Python 3.4 and Python 3.5 solve this issue since it's critical problem for some people."

On microbenchmarks, the optimization that I just implemented in Python 3.6 are impressive. The problem is that the implementation is quite complex. If I understood correctly, you are asking to optimize decoders and encoders for ASCII and UTF-8 (modify 4 functions) for the surrogateescape error handler. Is that right? Would UTF-8 be enough or not?

I don't like backporting optimizations which are not well tested right now. To optimize encoders, I wrote a full new _PyBytesWriter API. We cannot backport this new API, even if it's private. So the backport may be more complex than the code in the default branch.
msg252738 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2015-10-10 19:15
UTF-8 and Latin1 are typical encoding for MySQL query.
When inserting BLOB:

# Decode binary data
x = data.decode('ascii', 'surrogateescape')
# %-format query
psql = sql % (escape(x),)  # sql is unicode
# Encode sql to connection encoding (latin1 or utf8)
query_bytes = psql.encode('utf-8', 'surrogateescape')

So decoder speedup is required only for ascii, but encoder speedup is required for utf-8 and latin1.

I'll consider other ways including creating speedup module and register it on PyPI.
msg257688 - (view) Author: Inada Naoki (methane) * (Python committer) Date: 2016-01-07 13:50
FYI, I found a workaround.
https://github.com/PyMySQL/PyMySQL/pull/409

_table = [chr(i) for i in range(128)] + [chr(i) for i in range(0xdc80, 0xdd00)]

def decode_surroundescape(s):
    return s.decode('latin1').translate(_table)

In [15]: data = b'\xff' * 1024 * 1024

In [16]: data.decode('ascii', 'surrogateescape') == decode_surroundescape(data)
Out[16]: True

In [17]: %timeit data.decode('ascii', 'surrogateescape')
1 loops, best of 3: 394 ms per loop

In [18]: %timeit decode_surroundescape(data)
10 loops, best of 3: 40 ms per loop
msg257689 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016-01-07 13:51
> In [18]: %timeit decode_surroundescape(data)
> 10 loops, best of 3: 40 ms per loop

Cool! Good job.
History
Date User Action Args
2022-04-11 14:58:19adminsetgithub: 69058
2016-10-06 07:45:27methanesetstatus: open -> closed
resolution: fixed
2016-01-07 13:51:51vstinnersetmessages: + msg257689
2016-01-07 13:50:12methanesetmessages: + msg257688
2015-10-10 19:15:53methanesetmessages: + msg252738
2015-10-09 21:23:57vstinnersetmessages: + msg252664
2015-10-09 21:15:08vstinnersetmessages: + msg252662
2015-10-02 14:45:19vstinnersetmessages: + msg252118
2015-10-01 21:24:45vstinnersetmessages: + msg252059
2015-09-24 14:26:49vstinnersetmessages: + msg251529
2015-09-24 14:18:10vstinnersettitle: Optimize coding with surrogateescape and surrogatepass error handlers -> Optimize ascii and latin1 decoder with surrogateescape and surrogatepass error handlers
2015-09-24 13:11:05serhiy.storchakasetmessages: + msg251521
2015-09-24 12:51:30vstinnersetmessages: + msg251517
2015-09-22 09:37:17vstinnersetfiles: + bench_utf8.py
2015-09-22 09:37:01vstinnersetfiles: + utf8-3.patch

messages: + msg251299
2015-09-22 09:06:56vstinnersetfiles: + utf8-2.patch

messages: + msg251296
2015-09-22 08:50:15python-devsetmessages: + msg251294
2015-09-21 23:06:41vstinnersetfiles: + utf8.patch

messages: + msg251277
2015-09-21 23:00:08python-devsetmessages: + msg251276
2015-09-21 22:13:03vstinnersetfiles: + bench.py

messages: + msg251270
2015-09-21 22:11:20python-devsetnosy: + python-dev
messages: + msg251269
2015-09-10 19:18:44vstinnersetfiles: + faster-decode-ascii-surrogateescape.patch

messages: + msg250409
2015-08-19 12:55:09r.david.murraysetmessages: + msg248832
2015-08-19 08:12:26methanesetmessages: + msg248820
2015-08-17 06:42:46methanesetfiles: + faster-decode-ascii-surrogateescape.patch

messages: + msg248717
2015-08-15 21:26:49vstinnersetmessages: + msg248660
title: surrogateescape is too slow -> Optimize coding with surrogateescape and surrogatepass error handlers
2015-08-15 21:24:55vstinnersetmessages: + msg248658
title: Optimize coding with surrogateescape and surrogatepass error handlers -> surrogateescape is too slow
2015-08-15 13:06:59serhiy.storchakasetfiles: + faster_surrogates_hadling.patch
keywords: + patch
title: surrogateescape is too slow -> Optimize coding with surrogateescape and surrogatepass error handlers
2015-08-15 12:56:05serhiy.storchakasetversions: - Python 3.4, Python 3.5
nosy: + serhiy.storchaka

messages: + msg248638

assignee: serhiy.storchaka
2015-08-15 12:09:23r.david.murraysetnosy: + r.david.murray

messages: + msg248636
versions: - Python 3.2, Python 3.3
2015-08-15 03:13:43methanesetmessages: + msg248632
2015-08-15 03:08:44methanecreate