
Author vstinner
Recipients ezio.melotti, josh.r, pitrou, python-dev, serhiy.storchaka, vstinner
Date 2014-04-05.13:03:49
Message-id <1396703030.95.0.633790951252.issue21118@psf.upfronthosting.co.za>
In-reply-to
Content
bench_translate.py benchmarks ASCII 1:1 translation, both without and with deletion. The results below compare tip (47b0c076e17d, which includes my latest optimization for deletion; campaign "remove") with 6a347c0ffbfc + translate_cached_2.patch (campaign "cache").
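
bench_translate.py itself is not reproduced in this message. As a rough illustration only, the two cases it measures (ASCII 1:1 replacement and ASCII 1:1 with deletion) could be timed with a minimal sketch like the following. The test string, the mapped characters, and the bench() helper are assumptions made for this example; they are not taken from the real script.

```python
import timeit

# Hypothetical test data: pure-ASCII text of length 1000.
text = "abcdefghij" * 100

# ASCII 1:1 mapping: every mapped character becomes one other ASCII character.
replace_table = str.maketrans("abc", "xyz")

# ASCII 1:1 mapping with deletion: the third argument of str.maketrans()
# lists the characters to remove (here 30% of the input).
remove_table = str.maketrans("", "", "abc")

replaced = text.translate(replace_table)
removed = text.translate(remove_table)


def bench(table, number=10_000):
    """Return the average time in seconds of one text.translate(table) call."""
    return timeit.timeit(lambda: text.translate(table), number=number) / number


if __name__ == "__main__":
    print(f"replace: {bench(replace_table) * 1e9:.0f} ns")
    print(f"remove:  {bench(remove_table) * 1e9:.0f} ns")
```

timeit uses time.perf_counter by default, the same timer as listed in the platform information below. Absolute numbers from this sketch are not comparable with the table, since the input and the methodology differ.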

Common platform:
Python unicode implementation: PEP 393
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
Platform: Linux-3.12.8-300.fc20.x86_64-x86_64-with-fedora-20-Heisenbug
Bits: int=32, long=64, long long=64, size_t=64, void*=64
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Timer: time.perf_counter
Timer precision: 45 ns
Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)

Platform of campaign remove:
SCM: hg revision=47b0c076e17d tag=tip branch=default date="2014-04-05 14:27 +0200"
Python version: 3.5.0a0 (default:47b0c076e17d, Apr 5 2014, 14:50:53) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]
Date: 2014-04-05 14:51:55

Platform of campaign cache:
SCM: hg revision=6a347c0ffbfc+ branch=default date="2014-04-05 11:56 +0200"
Python version: 3.5.0a0 (default:6a347c0ffbfc+, Apr 5 2014, 14:53:02) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]
Date: 2014-04-05 14:53:12

---------------------------+-------------+----------------
Tests                      |      remove |           cache
---------------------------+-------------+----------------
replace none, length=10    |  184 ns (*) |   275 ns (+50%)
replace none, length=10**3 | 1.06 us (*) |          1.1 us
replace none, length=10**6 |  827 us (*) |          792 us
replace 10%, length=10     |  207 ns (*) |   298 ns (+44%)
replace 10%, length=10**3  | 1.08 us (*) |         1.12 us
replace 10%, length=10**6  |  828 us (*) |          793 us
replace 50%, length=10     |  205 ns (*) |   298 ns (+46%)
replace 50%, length=10**3  | 1.08 us (*) |   1.17 us (+7%)
replace 50%, length=10**6  |  827 us (*) |          793 us
replace 90%, length=10     |  208 ns (*) |   298 ns (+44%)
replace 90%, length=10**3  | 1.09 us (*) |         1.13 us
replace 90%, length=10**6  |  850 us (*) |    793 us (-7%)
replace all, length=10     |  145 ns (*) |   226 ns (+56%)
replace all, length=10**3  | 1.03 us (*) |         1.04 us
replace all, length=10**6  |  827 us (*) |          792 us
remove none, length=10     |  184 ns (*) |   274 ns (+49%)
remove none, length=10**3  | 1.07 us (*) |         1.09 us
remove none, length=10**6  |  836 us (*) |    793 us (-5%)
remove 10%, length=10      |  223 ns (*) |   408 ns (+83%)
remove 10%, length=10**3   | 1.45 us (*) | 9.13 us (+531%)
remove 10%, length=10**6   | 1.08 ms (*) | 8.73 ms (+706%)
remove 50%, length=10      |  221 ns (*) |   407 ns (+84%)
remove 50%, length=10**3   | 1.23 us (*) | 8.28 us (+575%)
remove 50%, length=10**6   |  948 us (*) |  7.9 ms (+734%)
remove 90%, length=10      |  230 ns (*) |   375 ns (+63%)
remove 90%, length=10**3   | 1.57 us (*) | 3.86 us (+145%)
remove 90%, length=10**6   | 1.28 ms (*) | 3.49 ms (+173%)
remove all, length=10      |  139 ns (*) |   266 ns (+92%)
remove all, length=10**3   | 1.24 us (*) |  2.46 us (+99%)
remove all, length=10**6   | 1.07 ms (*) | 2.13 ms (+100%)
---------------------------+-------------+----------------
Total                      | 9.38 ms (*) |   27 ms (+188%)
---------------------------+-------------+----------------

Your patch is always slower for the common case (ASCII => ASCII translation).

I implemented the most obvious optimization for the most common case (ASCII 1:1 and ASCII 1:1 with deletion). I consider the current code good enough to close this issue.

@Josh Rosenberg: Thanks for the report. For ASCII 1:1 mapping, the current implementation should be almost as fast as bytes.translate() (see the "60x" factor you mentioned in the title).
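
For readers unfamiliar with the two APIs being compared, here is a small sketch of the same ASCII 1:1 mapping expressed with bytes.translate() and with str.translate(). The input and the mapped characters are arbitrary choices for illustration.

```python
# The same ASCII-only input, as bytes and as str.
data = b"hello world"
text = data.decode("ascii")

# bytes.translate() takes a 256-byte lookup table;
# bytes.maketrans() builds one.
bytes_table = bytes.maketrans(b"lo", b"01")

# str.translate() takes a mapping from code points to replacements;
# str.maketrans() builds a dict.
str_table = str.maketrans("lo", "01")

bytes_result = data.translate(bytes_table)
str_result = text.translate(str_table)

if __name__ == "__main__":
    print(bytes_result, str_result)
```

Both calls produce the same characters. The difference discussed in this issue is only how quickly str.translate() can look up each character in its mapping compared with the fixed 256-byte table used by bytes.translate().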

--

Serhiy: If you are interested in optimizing str.translate() for the general case (larger charset), please open a new issue. It will probably require a more complex "cache". You may take a look at the charmap codec, which has such a cache (with 3 levels); see my message msg215301.

IMO it's not worth investing time in optimizing str.translate(); it's not a commonly used function. It took some years before a user ran a benchmark on it :-)