Issue25349
Created on 2015-10-09 00:50 by vstinner, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Files | |
---|---|
File name | Uploaded |
bytes_format.patch | vstinner, 2015-10-09 00:50 |
bench_bytes_format.py | vstinner, 2015-10-09 10:20 |
bytes_formatlong.patch | vstinner, 2015-10-09 16:57 |
bench_bytes_int.py | vstinner, 2015-10-09 20:40 |
Messages (9) | |||
---|---|---|---|
msg252577 | Author: STINNER Victor (vstinner) * | Date: 2015-10-09 00:50 | |
The attached patch is a work in progress that uses the new private _PyBytesWriter API in bytes % args. Using the _PyBytesWriter API will allow further optimizations. For example, it avoids creating a temporary bytes object to format b'%f' % 1.2. The _PyBytesWriter API allocates a small 512-byte buffer on the stack to delay the allocation of the final bytes object. This can avoid calling _PyBytes_Resize() completely, or at least reduce the number of calls. See also issue #25318, which added the _PyBytesWriter API. |
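The buffering strategy described in this message can be modeled in a few lines of Python. This is only a conceptual sketch: `ToyBytesWriter`, its method names, the `resizes` counter, and the 25% growth factor are hypothetical illustrations, not the C `_PyBytesWriter` API (which this thread does not show).

```python
class ToyBytesWriter:
    """Conceptual model of a writer that starts with a small fixed buffer.

    Hypothetical illustration only; not the C _PyBytesWriter API.
    """

    SMALL_BUFFER_SIZE = 512  # size quoted in the message above

    def __init__(self) -> None:
        # Stands in for the stack buffer: no growth is needed while the
        # output fits into these 512 bytes.
        self._buf = bytearray(self.SMALL_BUFFER_SIZE)
        self._size = 0
        self.resizes = 0  # how many times the buffer had to grow

    def write(self, data: bytes, overallocate: bool = True) -> None:
        needed = self._size + len(data)
        if needed > len(self._buf):
            # Assumed growth policy: 25% headroom unless the caller says
            # this is the last write, in which case grow to the exact size.
            capacity = needed + needed // 4 if overallocate else needed
            self._buf.extend(bytes(capacity - len(self._buf)))
            self.resizes += 1
        self._buf[self._size:needed] = data
        self._size = needed

    def finish(self) -> bytes:
        # The final bytes object is created only once, at the end.
        return bytes(self._buf[:self._size])


small = ToyBytesWriter()
small.write(b"x=")
small.write(b"1.200000", overallocate=False)

large = ToyBytesWriter()
large.write(b"x" * 1000)
large.write(b"y", overallocate=False)
```

In this model, `small` never grows its buffer, while `large` grows once and then absorbs the final one-byte write inside the headroom left by the first write.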
|||
msg252578 | Author: STINNER Victor (vstinner) * | Date: 2015-10-09 00:51 | |
See also PEP 461, "Adding % formatting to bytes and bytearray". FYI, bytes % args is tested by test_format (good to know for quickly testing changes). |
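For readers unfamiliar with the feature being optimized, here is a short sketch of what bytes % args does on Python 3.5 and later. The format strings are borrowed from the benchmarks in this issue; the variable names are illustrative.

```python
# Interpolation into bytes literals (PEP 461, Python 3.5+).
assert b"hello %s" % b"world" == b"hello world"
assert b"x=%d" % 123 == b"x=123"
assert b"x=%f" % 1.2 == b"x=1.200000"
assert b"%x" % 0xabcdef == b"abcdef"

# Width and alignment flags pad the output, as they do for str.
padded = b"hello %-100s" % b"world"
assert len(padded) == len(b"hello ") + 100
assert padded.startswith(b"hello world ")

# bytearray supports the same operator and returns a bytearray.
result = bytearray(b"x=%d") % 123
assert type(result) is bytearray and result == b"x=123"

# Unlike str formatting, %s in a bytes format rejects str arguments.
try:
    b"%s" % "text"
except TypeError:
    str_rejected = True
else:
    str_rejected = False
assert str_rejected
```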
|||
msg252596 | Author: STINNER Victor (vstinner) * | Date: 2015-10-09 10:20 | |
bench_bytes_format.py: micro-benchmark testing a few formats. Some tests focus on the implementation of _PyBytesWriter to ensure that the optimization is efficient. Except for a single test (which is not really relevant: it takes less than 500 nanoseconds), all tests are faster. The b"xxxxxx %s" % b"y" test confirms that the optimization disabling overallocation for the last write is effective.

Results:

Common platform:
Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)
Python unicode implementation: PEP 393
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Platform: Linux-4.1.6-200.fc22.x86_64-x86_64-with-fedora-22-Twenty_Two
CFLAGS: -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
Timer: time.perf_counter
Bits: int=32, long=64, long long=64, size_t=64, void*=64

Platform of campaign orig:
SCM: hg revision=1aae9b6a6929 tag=tip branch=default date="2015-10-09 01:34 -0400"
Timer precision: 64 ns
Python version: 3.6.0a0 (default:1aae9b6a6929, Oct 9 2015, 11:33:56) [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
Date: 2015-10-09 11:34:11

Platform of campaign writer:
SCM: hg revision=fc2c11a19ae1+ tag=tip branch=default date="2015-10-09 11:48 +0200"
Timer precision: 61 ns
Python version: 3.6.0a0 (default:fc2c11a19ae1+, Oct 9 2015, 12:16:16) [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
Date: 2015-10-09 12:16:31

use smaller buffer | orig | writer |
---|---|---|
b"hello %s" % b"world" | 13 ns (*) | 12 ns (-5%) |
b"hello %-100s" % b"world" | 158 ns (*) | 98 ns (-38%) |
b"x=%d" % 123 | 13 ns (*) | 12 ns |
b"x=%f" % 1.2 | 13 ns (*) | 13 ns |
b"x=%100d" % 123 | 156 ns (*) | 166 ns (+7%) |
Total | 353 ns (*) | 301 ns (-15%) |

"hello %s" % long_string | orig | writer |
---|---|---|
fmt = b"hello %s"; arg = b"x" * 10; fmt % arg | 98 ns (*) | 86 ns (-12%) |
fmt = b"hello %s"; arg = b"x" * 100; fmt % arg | 85 ns (*) | 87 ns |
fmt = b"hello %s"; arg = b"x" * 10**3; fmt % arg | 298 ns (*) | 208 ns (-30%) |
fmt = b"hello %s"; arg = b"x" * 10**5; fmt % arg | 4.8 us (*) | 4.39 us (-9%) |
Total | 5.28 us (*) | 4.77 us (-10%) |

b"xxxxxx %s" % b"y" | orig | writer |
---|---|---|
fmt = b"x" * 10 + b"%s"; fmt % b"y" | 99 ns (*) | 81 ns (-18%) |
fmt = b"x" * 100 + b"%s"; fmt % b"y" | 189 ns (*) | 87 ns (-54%) |
fmt = b"x" * 10**3 + b"%s"; fmt % b"y" | 1.12 us (*) | 209 ns (-81%) |
fmt = b"x" * 10**5 + b"%s"; fmt % b"y" | 88.4 us (*) | 8.49 us (-90%) |
Total | 89.8 us (*) | 8.87 us (-90%) |

%f | orig | writer |
---|---|---|
n = 200; fmt = b"%f" * n; arg = tuple([1.2]*n); fmt % arg | 37.2 us (*) | 29.6 us (-21%) |

%i | orig | writer |
---|---|---|
n = 200; fmt = b"%f" * n; arg = tuple([12345]*n); fmt % arg | 49.4 us (*) | 42.8 us (-13%) |

Summary | orig | writer |
---|---|---|
use smaller buffer | 353 ns (*) | 301 ns (-15%) |
"hello %s" % long_string | 5.28 us (*) | 4.77 us (-10%) |
b"xxxxxx %s" % b"y" | 89.8 us (*) | 8.87 us (-90%) |
%f | 37.2 us (*) | 29.6 us (-21%) |
%i | 49.4 us (*) | 42.8 us (-13%) |
Total | 182 us (*) | 86.3 us (-53%) |
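The attached bench_bytes_format.py is not reproduced on this page. As a rough stand-in, the b"xxxxxx %s" % b"y" group can be timed with the standard `timeit` module. This is a minimal sketch, not the original benchmark: the helper names `best_ns` and `run`, the repeat counts, and the best-of-N policy are assumptions, and absolute numbers will differ by machine and Python version.

```python
import timeit


def best_ns(stmt: str, setup: str, number: int = 2000, repeat: int = 5) -> float:
    """Best-of-`repeat` time for one execution of `stmt`, in nanoseconds."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number * 1e9


# Literal prefixes of growing size followed by a single %s, mirroring the
# b"xxxxxx %s" % b"y" rows of the results.
SIZES = (10, 100, 10**3, 10**5)


def run() -> dict:
    """Map each prefix size to its measured per-call time in nanoseconds."""
    results = {}
    for size in SIZES:
        setup = f'fmt = b"x" * {size} + b"%s"'
        results[size] = best_ns('fmt % b"y"', setup)
    return results


if __name__ == "__main__":
    for size, ns in run().items():
        print(f"prefix of {size:>6} bytes: {ns:10.1f} ns per call")
```

Running the script under two interpreter builds and comparing the output gives a comparison in the spirit of the "orig" and "writer" columns.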
|||
msg252597 | Author: Roundup Robot (python-dev) | Date: 2015-10-09 10:21 | |
New changeset b2f3cbdc0f2d by Victor Stinner in branch 'default': Issue #25349: Optimize bytes % args using the new private _PyBytesWriter API https://hg.python.org/cpython/rev/b2f3cbdc0f2d |
|||
msg252629 | Author: STINNER Victor (vstinner) * | Date: 2015-10-09 16:57 | |
bytes_formatlong.patch: Fast path for b'%d' % int and other integer formatters. It avoids creating a temporary bytes object and writes directly into the writer, as '%d' % int already does for Unicode. |
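The difference between the two strategies can be illustrated in Python. This is a conceptual sketch only: the patch itself is C code that is not shown here, and the function names `format_via_temporary` and `format_direct` are hypothetical.

```python
def format_via_temporary(out: bytearray, value: int) -> None:
    """Model of the slow path: build a temporary object, then copy it."""
    temporary = str(value).encode("ascii")  # intermediate bytes object
    out += temporary


def format_direct(out: bytearray, value: int) -> None:
    """Model of the fast path: emit decimal digits straight into `out`."""
    if value < 0:
        out.append(ord("-"))
        value = -value
    start = len(out)
    while True:
        value, digit = divmod(value, 10)
        out.append(ord("0") + digit)
        if value == 0:
            break
    # Digits were produced least-significant first; reverse them in place.
    out[start:] = out[start:][::-1]


def render(formatter, value: int) -> bytes:
    """Format `value` after an "x=" prefix using the given strategy."""
    out = bytearray(b"x=")
    formatter(out, value)
    return bytes(out)


# Both strategies must agree with the built-in operator.
for sample in (0, 7, -7, 12345, 10**50 - 1):
    expected = b"x=%d" % sample
    assert render(format_via_temporary, sample) == expected
    assert render(format_direct, sample) == expected
```

Both functions produce the same bytes; the second simply never materializes a separate object for the digits before they reach the output buffer.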
|||
msg252650 | Author: STINNER Victor (vstinner) * | Date: 2015-10-09 20:40 | |
I wrote the bench_bytes_int.py micro-benchmark; results are below. Oh, I didn't expect a real difference even for simple code like b'%d' % 12345 (32% faster). So I consider that it's enough to apply the optimization.

Common platform:
Timer: time.perf_counter
Bits: int=32, long=64, long long=64, size_t=64, void*=64
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Platform: Linux-4.1.6-200.fc22.x86_64-x86_64-with-fedora-22-Twenty_Two
Python unicode implementation: PEP 393
Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)
CFLAGS: -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes

Platform of campaign orig:
SCM: hg revision=576128c0d068 tag=tip branch=default date="2015-10-09 10:20 -0400"
Python version: 3.6.0a0 (default:576128c0d068, Oct 9 2015, 22:36:21) [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
Date: 2015-10-09 22:36:36
Timer precision: 62 ns

Platform of campaign writer:
Python version: 3.6.0a0 (default:576128c0d068+, Oct 9 2015, 22:28:09) [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
Date: 2015-10-09 22:34:53
SCM: hg revision=576128c0d068+ tag=tip branch=default date="2015-10-09 10:20 -0400"
Timer precision: 65 ns

%i | orig | writer |
---|---|---|
n = 1; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 155 ns (*) | 105 ns (-32%) |
n = 5; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 546 ns (*) | 306 ns (-44%) |
n = 10; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 1.03 us (*) | 543 ns (-47%) |
n = 25; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 2.49 us (*) | 1.27 us (-49%) |
n = 100; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 10.1 us (*) | 5.25 us (-48%) |
n = 200; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 20.5 us (*) | 10.8 us (-47%) |
n = 500; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 48.8 us (*) | 24.6 us (-50%) |
Total | 83.6 us (*) | 42.9 us (-49%) |

x=%i | orig | writer |
---|---|---|
n = 1; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg | 173 ns (*) | 123 ns (-29%) |
n = 5; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg | 602 ns (*) | 372 ns (-38%) |
n = 10; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg | 1.14 us (*) | 668 ns (-42%) |
n = 25; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg | 2.8 us (*) | 1.56 us (-44%) |
n = 100; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg | 11.1 us (*) | 6.12 us (-45%) |
n = 200; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg | 21.5 us (*) | 12.1 us (-44%) |
n = 500; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg | 53.5 us (*) | 29.8 us (-44%) |
Total | 90.8 us (*) | 50.7 us (-44%) |

%x | orig | writer |
---|---|---|
n = 1; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 155 ns (*) | 105 ns (-32%) |
n = 5; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 545 ns (*) | 306 ns (-44%) |
n = 10; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 1.03 us (*) | 543 ns (-47%) |
n = 25; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 2.49 us (*) | 1.26 us (-49%) |
n = 100; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 9.9 us (*) | 5.07 us (-49%) |
n = 200; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 19.8 us (*) | 10.1 us (-49%) |
n = 500; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 48.9 us (*) | 24.5 us (-50%) |
Total | 82.8 us (*) | 41.9 us (-49%) |

x=%x | orig | writer |
---|---|---|
n = 1; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg | 183 ns (*) | 132 ns (-28%) |
n = 5; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg | 651 ns (*) | 419 ns (-36%) |
n = 10; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg | 1.23 us (*) | 761 ns (-38%) |
n = 25; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg | 2.96 us (*) | 1.79 us (-40%) |
n = 100; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg | 11.9 us (*) | 7.13 us (-40%) |
n = 200; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg | 23.5 us (*) | 14 us (-41%) |
n = 500; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg | 58.3 us (*) | 34.3 us (-41%) |
Total | 98.6 us (*) | 58.5 us (-41%) |

large int: %i | orig | writer |
---|---|---|
fmt = b"%i"; arg = 10 ** 0 - 1; fmt % arg | 115 ns (*) | 74 ns (-36%) |
fmt = b"%i"; arg = 10 ** 50 - 1; fmt % arg | 288 ns (*) | 242 ns (-16%) |
fmt = b"%i"; arg = 10 ** 100 - 1; fmt % arg | 538 ns (*) | 494 ns (-8%) |
fmt = b"%i"; arg = 10 ** 150 - 1; fmt % arg | 865 ns (*) | 812 ns (-6%) |
fmt = b"%i"; arg = 10 ** 200 - 1; fmt % arg | 1.33 us (*) | 1.28 us |
Total | 3.14 us (*) | 2.9 us (-8%) |

large int: x=%i | orig | writer |
---|---|---|
fmt = b"x=%i"; arg = 10 ** 0 - 1; fmt % arg | 140 ns (*) | 100 ns (-28%) |
fmt = b"x=%i"; arg = 10 ** 50 - 1; fmt % arg | 298 ns (*) | 249 ns (-16%) |
fmt = b"x=%i"; arg = 10 ** 100 - 1; fmt % arg | 548 ns (*) | 502 ns (-8%) |
fmt = b"x=%i"; arg = 10 ** 150 - 1; fmt % arg | 874 ns (*) | 822 ns (-6%) |
Total | 1.86 us (*) | 1.67 us (-10%) |

Summary | orig | writer |
---|---|---|
%i | 83.6 us (*) | 42.9 us (-49%) |
x=%i | 90.8 us (*) | 50.7 us (-44%) |
%x | 82.8 us (*) | 41.9 us (-49%) |
x=%x | 98.6 us (*) | 58.5 us (-41%) |
large int: %i | 3.14 us (*) | 2.9 us (-8%) |
large int: x=%i | 1.86 us (*) | 1.67 us (-10%) |
Total | 363 us (*) | 201 us (-45%) |
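The percentages in the "writer" column read as the change relative to the "orig" reference column, which is marked (*). A small sketch of that arithmetic follows. The helper names are hypothetical, and whole-percent rounding is an assumption about how the benchmark tool reports results. Because the displayed timings are themselves rounded, recomputing from them can be off by a point or more for the very small values.

```python
# Conversion factors to nanoseconds for the units used in the results.
UNITS_NS = {"ns": 1.0, "us": 1e3, "ms": 1e6}


def to_ns(text: str) -> float:
    """Parse a timing such as "1.03 us" into nanoseconds."""
    value, unit = text.split()
    return float(value) * UNITS_NS[unit]


def relative_change(orig: str, new: str) -> int:
    """Percentage change of `new` relative to `orig`, rounded to an integer."""
    base = to_ns(orig)
    return round((to_ns(new) - base) / base * 100)


# Rows taken from the results in this message.
assert relative_change("155 ns", "105 ns") == -32    # b"%d", n = 1
assert relative_change("1.03 us", "543 ns") == -47   # b"%d", n = 10
assert relative_change("83.6 us", "42.9 us") == -49  # %i total
assert relative_change("363 us", "201 us") == -45    # overall total
```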
|||
msg252655 | Author: Roundup Robot (python-dev) | Date: 2015-10-09 21:04 | |
New changeset d9a89c9137d2 by Victor Stinner in branch 'default': Issue #25349: Optimize bytes % int https://hg.python.org/cpython/rev/d9a89c9137d2 New changeset 4d46d1588629 by Victor Stinner in branch 'default': Issue #25349: Add fast path for b'%c' % int https://hg.python.org/cpython/rev/4d46d1588629 |
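For reference, the behavior of b'%c' covered by the second changeset can be checked from Python. Per PEP 461, %c inserts a single byte, taken either from an int in range(256) or from a bytes argument of length 1. The helper name `out_of_range` is illustrative.

```python
# An int in range(256) becomes the byte with that value.
assert b"%c" % 65 == b"A"
assert b"%c" % 0 == b"\x00"
assert b"%c" % 255 == b"\xff"

# A bytes argument of length 1 is inserted unchanged.
assert b"%c" % b"A" == b"A"


def out_of_range(value: int) -> bool:
    """Return True if b"%c" rejects `value` with OverflowError."""
    try:
        b"%c" % value
    except OverflowError:
        return True
    return False


# Integers outside range(256) are rejected.
assert out_of_range(256)
assert out_of_range(-1)
assert not out_of_range(128)
```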
|||
msg252657 | Author: STINNER Victor (vstinner) * | Date: 2015-10-09 21:06 | |
OK, I implemented all the optimizations that were already implemented in str % args. I'm closing the issue. |
|||
msg264246 | Author: Roundup Robot (python-dev) | Date: 2016-04-26 10:36 | |
New changeset 090502a0c69c by Victor Stinner in branch 'default': Issue #25349, #26249: Fix memleak in formatfloat() https://hg.python.org/cpython/rev/090502a0c69c |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-04-11 14:58:22 | admin | set | github: 69536 |
2016-04-26 10:36:43 | python-dev | set | messages: + msg264246 |
2015-10-09 21:06:38 | vstinner | set | status: open -> closed resolution: fixed messages: + msg252657 |
2015-10-09 21:04:48 | python-dev | set | messages: + msg252655 |
2015-10-09 20:40:32 | vstinner | set | files: + bench_bytes_int.py messages: + msg252650 |
2015-10-09 16:57:51 | vstinner | set | files: + bytes_formatlong.patch messages: + msg252629 |
2015-10-09 10:21:59 | python-dev | set | nosy: + python-dev messages: + msg252597 |
2015-10-09 10:20:43 | vstinner | set | files: + bench_bytes_format.py messages: + msg252596 |
2015-10-09 00:51:01 | vstinner | set | messages: + msg252578 |
2015-10-09 00:50:25 | vstinner | create |