Message203322
The attached patch modifies the dict_repr() function to use the _PyUnicodeWriter API instead of building a list of short strings with PyUnicode_AppendAndDel() and calling PyUnicode_Join() at the end to join the list. PyUnicode_Append() is inefficient because it has to allocate a new string instead of reusing the same buffer.
The _PyUnicodeWriter API has a different design. It overallocates a buffer to write Unicode characters into, then shrinks the buffer to its final size at the end. It is faster according to my micro-benchmark.
$ ./python ~/prog/HG/misc/python/benchmark.py compare_to pyaccu writer
Common platform:
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Python unicode implementation: PEP 393
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
Timer precision: 40 ns
Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)
Platform: Linux-3.9.4-200.fc18.x86_64-x86_64-with-fedora-18-Spherical_Cow
Bits: int=32, long=64, long long=64, size_t=64, void*=64
Timer: time.perf_counter
Platform of campaign pyaccu:
Date: 2013-11-18 21:37:44
Python version: 3.4.0a4+ (default:fc7ceb001eec, Nov 18 2013, 21:29:41) [GCC 4.7.2 20121109 (Red Hat 4.7.2-8)]
SCM: hg revision=fc7ceb001eec tag=tip branch=default date="2013-11-18 21:11 +0100"
Platform of campaign writer:
Date: 2013-11-18 22:10:40
Python version: 3.4.0a4+ (default:fc7ceb001eec+, Nov 18 2013, 22:10:12) [GCC 4.7.2 20121109 (Red Hat 4.7.2-8)]
SCM: hg revision=fc7ceb001eec+ tag=tip branch=default date="2013-11-18 21:11 +0100"
--------------------------------------+-------------+--------------
Tests | pyaccu | writer
--------------------------------------+-------------+--------------
{"a": 1} | 603 ns (*) | 496 ns (-18%)
dict(zip("abc", range(3))) | 1.05 us (*) | 904 ns (-14%)
{"%03d":"abc" for k in range(10)} | 631 ns (*) | 501 ns (-21%)
{"%100d":"abc" for k in range(10)} | 660 ns (*) | 484 ns (-27%)
{k:"a" for k in range(10**3)} | 235 us (*) | 166 us (-30%)
{k:"abc" for k in range(10**3)} | 245 us (*) | 177 us (-28%)
{"%100d":"abc" for k in range(10**3)} | 668 ns (*) | 478 ns (-28%)
{k:"a" for k in range(10**6)} | 258 ms (*) | 186 ms (-28%)
{k:"abc" for k in range(10**6)} | 265 ms (*) | 184 ms (-31%)
{"%100d":"abc" for k in range(10**6)} | 652 ns (*) | 489 ns (-25%)
--------------------------------------+-------------+--------------
Total | 523 ms (*) | 369 ms (-29%)
--------------------------------------+-------------+--------------
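To illustrate the design difference described above, here is a toy model in Python. It is only a sketch of the overallocate-then-shrink idea, not the real _PyUnicodeWriter (which is a private C API): the class name, the growth factor and the `reallocations` counter are all assumptions made for the example.

```python
class OverallocatingWriter:
    """Toy model of an overallocating text writer (hypothetical, not CPython code)."""

    def __init__(self):
        self._buf = []           # slots for characters; None marks unused capacity
        self._size = 0           # number of slots actually written
        self.reallocations = 0   # how many times the buffer had to grow

    def write(self, text):
        needed = self._size + len(text)
        if needed > len(self._buf):
            # Grow by more than strictly needed so that the next writes
            # fit without another reallocation (assumed growth factor: 25%).
            new_capacity = needed + needed // 4
            self._buf.extend([None] * (new_capacity - len(self._buf)))
            self.reallocations += 1
        # Overwrite the reserved slots in place, one character per slot.
        self._buf[self._size:needed] = text
        self._size = needed

    def finish(self):
        # Shrink the buffer to the size actually used, then build the result.
        del self._buf[self._size:]
        return "".join(self._buf)


def dict_repr_sketch(d):
    """Build a dict repr by writing every piece into a single buffer."""
    writer = OverallocatingWriter()
    writer.write("{")
    first = True
    for key, value in d.items():
        if not first:
            writer.write(", ")
        first = False
        writer.write(repr(key))
        writer.write(": ")
        writer.write(repr(value))
    writer.write("}")
    return writer.finish()
```

The point of the sketch is that most writes land in already-reserved space, so the number of reallocations grows much more slowly than the number of pieces written.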
Date | User | Action | Args
2013-11-18 21:15:06 | vstinner | set | recipients: + vstinner, ezio.melotti, serhiy.storchaka
2013-11-18 21:15:06 | vstinner | set | messageid: <1384809306.12.0.437955519329.issue19646@psf.upfronthosting.co.za>
2013-11-18 21:15:06 | vstinner | link | issue19646 messages
2013-11-18 21:15:05 | vstinner | create |