Message 215602 - Python tracker


Author	vstinner
Recipients	ezio.melotti, josh.r, pitrou, python-dev, serhiy.storchaka, vstinner
Date	2014-04-05.13:03:49
Message-id	<1396703030.95.0.633790951252.issue21118@psf.upfronthosting.co.za>
In-reply-to

Content

bench_translate.py: benchmarks ASCII 1:1 mapping, and also ASCII 1:1 mapping with deletion. Results of the benchmark comparing tip (47b0c076e17d, which includes my latest optimization on deletion) with 6a347c0ffbfc + translate_cached_2.patch.

Common platform:
Python unicode implementation: PEP 393
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
Platform: Linux-3.12.8-300.fc20.x86_64-x86_64-with-fedora-20-Heisenbug
Bits: int=32, long=64, long long=64, size_t=64, void*=64
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Timer: time.perf_counter
Timer precision: 45 ns
Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)

Platform of campaign remove:
SCM: hg revision=47b0c076e17d tag=tip branch=default date="2014-04-05 14:27 +0200"
Python version: 3.5.0a0 (default:47b0c076e17d, Apr 5 2014, 14:50:53) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]
Date: 2014-04-05 14:51:55

Platform of campaign cache:
SCM: hg revision=6a347c0ffbfc+ branch=default date="2014-04-05 11:56 +0200"
Python version: 3.5.0a0 (default:6a347c0ffbfc+, Apr 5 2014, 14:53:02) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]
Date: 2014-04-05 14:53:12

---------------------------+-------------+----------------
Tests                      |      remove |           cache
---------------------------+-------------+----------------
replace none, length=10    |  184 ns (*) |   275 ns (+50%)
replace none, length=10**3 | 1.06 us (*) |          1.1 us
replace none, length=10**6 |  827 us (*) |          792 us
replace 10%, length=10     |  207 ns (*) |   298 ns (+44%)
replace 10%, length=10**3  | 1.08 us (*) |         1.12 us
replace 10%, length=10**6  |  828 us (*) |          793 us
replace 50%, length=10     |  205 ns (*) |   298 ns (+46%)
replace 50%, length=10**3  | 1.08 us (*) |   1.17 us (+7%)
replace 50%, length=10**6  |  827 us (*) |          793 us
replace 90%, length=10     |  208 ns (*) |   298 ns (+44%)
replace 90%, length=10**3  | 1.09 us (*) |         1.13 us
replace 90%, length=10**6  |  850 us (*) |    793 us (-7%)
replace all, length=10     |  145 ns (*) |   226 ns (+56%)
replace all, length=10**3  | 1.03 us (*) |         1.04 us
replace all, length=10**6  |  827 us (*) |          792 us
remove none, length=10     |  184 ns (*) |   274 ns (+49%)
remove none, length=10**3  | 1.07 us (*) |         1.09 us
remove none, length=10**6  |  836 us (*) |    793 us (-5%)
remove 10%, length=10      |  223 ns (*) |   408 ns (+83%)
remove 10%, length=10**3   | 1.45 us (*) | 9.13 us (+531%)
remove 10%, length=10**6   | 1.08 ms (*) | 8.73 ms (+706%)
remove 50%, length=10      |  221 ns (*) |   407 ns (+84%)
remove 50%, length=10**3   | 1.23 us (*) | 8.28 us (+575%)
remove 50%, length=10**6   |  948 us (*) |  7.9 ms (+734%)
remove 90%, length=10      |  230 ns (*) |   375 ns (+63%)
remove 90%, length=10**3   | 1.57 us (*) | 3.86 us (+145%)
remove 90%, length=10**6   | 1.28 ms (*) | 3.49 ms (+173%)
remove all, length=10      |  139 ns (*) |   266 ns (+92%)
remove all, length=10**3   | 1.24 us (*) |  2.46 us (+99%)
remove all, length=10**6   | 1.07 ms (*) | 2.13 ms (+100%)
---------------------------+-------------+----------------
Total                      | 9.38 ms (*) |   27 ms (+188%)
---------------------------+-------------+----------------

Your patch is always slower for the common case (ASCII => ASCII translation).

I implemented the most obvious optimization for the most common case (ASCII 1:1 and ASCII 1:1 with deletion). I consider that the current code is enough to close this issue.
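For readers unfamiliar with the two cases, here is a minimal sketch of what "ASCII 1:1" and "ASCII 1:1 with deletion" look like from Python. This is not the actual bench_translate.py; the helper names `make_text` and `bench`, the choice of characters, and the input text are assumptions made for illustration only, and the timings it prints will differ from the table above.

```python
import timeit

def make_text(length):
    # Hypothetical input: the lowercase ASCII alphabet repeated up to `length` chars.
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    return (alphabet * (length // len(alphabet) + 1))[:length]

# ASCII 1:1 mapping: each of a-e is replaced by exactly one ASCII character.
replace_table = str.maketrans("abcde", "ABCDE")
# ASCII 1:1 with deletion: the third argument lists characters mapped to None.
remove_table = str.maketrans("", "", "abcde")

def bench(text, table, number=1000):
    # Best of 3 runs, reported as seconds per call (illustrative, not rigorous).
    timer = timeit.Timer(lambda: text.translate(table))
    return min(timer.repeat(repeat=3, number=number)) / number

text = make_text(1000)
replaced = text.translate(replace_table)  # same length, a-e uppercased
removed = text.translate(remove_table)    # shorter, a-e dropped

for name, table in (("replace", replace_table), ("remove", remove_table)):
    print(f"{name}: {bench(text, table) * 1e6:.2f} us per call")
```

The "remove" rows in the table correspond to translation tables like `remove_table`, where some characters map to `None` and the output is shorter than the input.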

@Josh Rosenberg: Thanks for the report. The current implementation should be almost as fast as bytes.translate() (the "60x" factor you mentioned in the title) for ASCII 1:1 mapping.
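To make the comparison concrete, the sketch below shows the same ASCII 1:1 translation done with str.translate() and with bytes.translate(), and checks that both give the same result. The sample text and the mapped characters are arbitrary choices for illustration; the snippet only demonstrates equivalence of the results, not relative speed.

```python
text = "hello world" * 3
data = text.encode("ascii")

# str.maketrans builds a dict keyed by code point;
# bytes.maketrans builds a 256-byte lookup table.
str_table = str.maketrans("lo", "01")
bytes_table = bytes.maketrans(b"lo", b"01")

str_result = text.translate(str_table)
bytes_result = data.translate(bytes_table)

# For pure ASCII input and an ASCII 1:1 mapping, both APIs agree.
assert str_result.encode("ascii") == bytes_result

# Deletion: str uses the third maketrans argument,
# bytes takes the characters to delete as a second argument.
str_deleted = text.translate(str.maketrans("", "", "lo"))
bytes_deleted = data.translate(None, b"lo")
assert str_deleted.encode("ascii") == bytes_deleted
```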

--

Serhiy: If you are interested in optimizing str.translate() for the general case (larger charset), please open a new issue. It will probably require a more complex "cache". You may take a look at the charmap codec, which has such a more complex cache (a cache with 3 levels); see my message msg215301.

IMO it's not worth investing time in optimizing str.translate(); it's not a commonly used function. It took some years before a user ran a benchmark on it :-)

History
Date	User	Action	Args
2014-04-05 13:03:51	vstinner	set	recipients: + vstinner, pitrou, ezio.melotti, python-dev, serhiy.storchaka, josh.r
2014-04-05 13:03:50	vstinner	set	messageid: <1396703030.95.0.633790951252.issue21118@psf.upfronthosting.co.za>
2014-04-05 13:03:50	vstinner	link	issue21118 messages
2014-04-05 13:03:49	vstinner	create