
Author serhiy.storchaka
Recipients ezio.melotti, josh.r, pitrou, python-dev, serhiy.storchaka, vstinner
Date 2014-04-05.11:47:41
Message-id <1396698463.65.0.734385875942.issue21118@psf.upfronthosting.co.za>
In-reply-to
Content
fast_translate.patch works only with an ASCII input string and an ASCII 1:1 mapping. Is this actually the typical case?

Here is a patch which uses a different approach. It caches the translated values for ASCII keys. It works with all types of input strings and mappings and can speed up more use cases, including non-ASCII data, deletion, and enlarging.
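
To illustrate the idea (this is not the patch itself, which is written in C), here is a rough pure-Python sketch of caching lookups for ASCII keys. The names `translate_cached` and `_lookup` are hypothetical, and the 128-slot per-call cache is an assumption about how such a cache could be laid out. Each ASCII code point hits the mapping at most once per call; non-ASCII code points go through the mapping every time.

```python
_MISSING = object()  # sentinel: this cache slot has not been filled yet


def _lookup(table, code, ch):
    """Resolve one code point following the documented str.translate rules."""
    try:
        value = table[code]
    except LookupError:
        return ch          # unmapped: keep the character unchanged
    if value is None:
        return ""          # deletion
    if isinstance(value, int):
        return chr(value)  # 1:1 mapping given as an ordinal
    return value           # string replacement, possibly longer than 1 char


def translate_cached(text, table):
    """Hypothetical sketch: translate `text`, caching results for ASCII keys."""
    cache = [_MISSING] * 128  # one slot per ASCII code point
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            repl = cache[code]
            if repl is _MISSING:
                # First time this ASCII key is seen: ask the mapping once.
                repl = cache[code] = _lookup(table, code, ch)
        else:
            # Non-ASCII keys are not cached in this sketch.
            repl = _lookup(table, code, ch)
        out.append(repl)
    return "".join(out)


# Usage: 1:1 mapping, deletion, enlarging, and non-ASCII data in one table.
table = str.maketrans({"a": "A", "b": None, "c": "ccc", "é": "e"})
sample = "abc é abc"
assert translate_cached(sample, table) == sample.translate(table)
```

The cache does not depend on the input being ASCII or on the mapping being 1:1, which is why this approach covers deletion and enlarging as well.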

translate_timing.py results:

                                unpatched           patched
Testing 1-1 translation
str.translate                   4.55125927699919    0.7898181750006188
str.translate from bytes trans  1.8910855210015143  0.779950579000797
Testing deletion
str.translate                   4.481863372000589   0.7718261509999138
Testing enlarging translations
str.translate                   4.421521270000085   0.9290620680003485

translate_script_ascii.py results:

---------------------------+---------------------------+-------------------------------
Tests                      | translate_script_ascii.34 | translate_script_ascii.cached3
---------------------------+---------------------------+-------------------------------
replace none, length=10    |           6.12 us (+176%) |                    2.22 us (*)
replace none, length=10**3 |           448 us (+1293%) |                    32.2 us (*)
replace none, length=10**6 |           474 ms (+1435%) |                    30.9 ms (*)
replace 10%, length=10     |           5.73 us (+133%) |                    2.46 us (*)
replace 10%, length=10**3  |           412 us (+1060%) |                    35.5 us (*)
replace 10%, length=10**6  |           442 ms (+1204%) |                    33.9 ms (*)
replace 50%, length=10     |            4.75 us (+85%) |                    2.57 us (*)
replace 50%, length=10**3  |            311 us (+552%) |                    47.7 us (*)
replace 50%, length=10**6  |            331 ms (+617%) |                    46.2 ms (*)
replace 90%, length=10     |            3.36 us (+29%) |                    2.59 us (*)
replace 90%, length=10**3  |            178 us (+250%) |                    50.8 us (*)
replace 90%, length=10**6  |            192 ms (+291%) |                    49.2 ms (*)
replace all, length=10     |            2.64 us (+28%) |                    2.06 us (*)
replace all, length=10**3  |            146 us (+189%) |                    50.3 us (*)
replace all, length=10**6  |            152 ms (+194%) |                    51.7 ms (*)
---------------------------+---------------------------+-------------------------------
Total                      |          1.59 sec (+651%) |                     212 ms (*)
---------------------------+---------------------------+-------------------------------
History
Date                 User              Action  Args
2014-04-05 11:47:44  serhiy.storchaka  set     recipients: + serhiy.storchaka, pitrou, vstinner, ezio.melotti, python-dev, josh.r
2014-04-05 11:47:43  serhiy.storchaka  set     messageid: <1396698463.65.0.734385875942.issue21118@psf.upfronthosting.co.za>
2014-04-05 11:47:43  serhiy.storchaka  link    issue21118 messages
2014-04-05 11:47:41  serhiy.storchaka  create