Optimize bytearray % args #69585

Closed
vstinner opened this issue Oct 14, 2015 · 3 comments
Labels
performance Performance or resource usage

Comments

@vstinner
Member

BPO 25399
Nosy @vstinner, @serhiy-storchaka
Files
  • bytearray_format.patch
  • bench_bytes_format.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2015-10-14.09:05:21.812>
    created_at = <Date 2015-10-14.00:42:28.308>
    labels = ['performance']
    title = 'Optimize bytearray % args'
    updated_at = <Date 2015-10-14.09:05:21.811>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2015-10-14.09:05:21.811>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-10-14.09:05:21.812>
    closer = 'vstinner'
    components = []
    creation = <Date 2015-10-14.00:42:28.308>
    creator = 'vstinner'
    dependencies = []
    files = ['40776', '40778']
    hgrepos = []
    issue_num = 25399
    keywords = ['patch']
    message_count = 3.0
    messages = ['252970', '252977', '252978']
    nosy_count = 3.0
    nosy_names = ['vstinner', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue25399'
    versions = ['Python 3.6']

    @vstinner
    Member Author

    Optimize bytearray % args

    Don't create temporary bytes objects: modify _PyBytes_Format() to work
    directly on bytearray objects.

    • _PyBytesWriter: add use_bytearray attribute to use a bytearray buffer
    • Rename _PyBytes_Format() to _PyBytes_FormatEx(), in case something
      outside CPython uses it
    • _PyBytes_FormatEx() now uses (char*, Py_ssize_t) for the input string, so
      bytearray_format() doesn't need to create a temporary input bytes object
    • Add use_bytearray parameter to _PyBytes_FormatEx() which is passed to
      _PyBytesWriter, to create a bytearray buffer instead of a bytes buffer
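
The change is internal to CPython's C code, so nothing is visible at the Python level apart from speed. As a small illustration (not taken from the patch), the snippet below shows the behaviour that must stay the same: formatting a bytearray with `%` returns a new bytearray. The variable names are made up for this example.

```python
# Illustration only: the Python-level behaviour of bytearray % args.
# The patch changes how CPython builds the result internally, not the result.
fmt = bytearray(b"hello %s")
result = fmt % b"world"

# The result is a new bytearray; the format object is left untouched.
print(type(result).__name__, result)

# Integer formatting and a left-justified width, as in the benchmark cases.
number = bytearray(b"x=%d") % 123
padded = bytearray(b"hello %-100s") % b"world"
print(number, len(padded))
```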

    @vstinner vstinner added the performance Performance or resource usage label Oct 14, 2015
    @vstinner
    Member Author

    Microbenchmark results are below.

    Most operations are now between 2.5 and 5 times faster. %f is just as fast as before, probably because formatting a float is more expensive than copying bytes (rough estimate: 150 ns to format a single floating-point number).
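
For readers who want to try a few of these cases locally, here is a minimal sketch using the standard `timeit` module. It is not the attached bench_bytes_format.py; the helper name `best_ns`, the iteration counts, and the choice to build `fmt` in the setup step (outside the timed statement) are assumptions made for this example.

```python
import timeit

# (setup, statement) pairs modelled on a few rows of the tables below.
# Assumption: the format object is created in setup and is not timed.
CASES = [
    ('fmt = bytearray(b"hello %s")', 'fmt % b"world"'),
    ('fmt = bytearray(b"x=%d")', 'fmt % 123'),
    ('fmt = bytearray(b"x=%f")', 'fmt % 1.2'),
]


def best_ns(setup, stmt, number=10_000, repeat=3):
    """Return the best per-call time in nanoseconds over several runs."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number * 1e9


results = [(setup, stmt, best_ns(setup, stmt)) for setup, stmt in CASES]

for setup, stmt, ns in results:
    print(f"{setup}; {stmt}: {ns:.0f} ns")
```

Absolute numbers will differ from the tables below depending on the machine, compiler flags, and Python version.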

    Common platform:
    Timer: time.perf_counter
    CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
    Bits: int=32, long=64, long long=64, size_t=64, void*=64
    CFLAGS: -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
    Platform: Linux-4.1.6-200.fc22.x86_64-x86_64-with-fedora-22-Twenty_Two
    Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)
    Python unicode implementation: PEP-393

    Platform of campaign orig:
    Python version: 3.6.0a0 (default:af34d0626fb4, Oct 14 2015, 09:51:04) [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
    Timer precision: 64 ns
    Date: 2015-10-14 09:51:20
    SCM: hg revision=af34d0626fb4 branch=default date="2015-10-14 09:47 +0200"

    Platform of campaign no_copy:
    Timer precision: 59 ns
    Python version: 3.6.0a0 (default:2e9d9873d2be, Oct 14 2015, 09:49:28) [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
    Date: 2015-10-14 09:49:44
    SCM: hg revision=2e9d9873d2be tag=tip branch=default date="2015-10-14 09:41 +0200"

    -------------------------------------------------+-------------+--------------
    use smaller buffer | orig | no_copy
    -------------------------------------------------+-------------+--------------

    fmt = bytearray(b"hello %s"); fmt % b"world"     |  656 ns (*) |  93 ns (-86%)
    fmt = bytearray(b"hello %-100s"); fmt % b"world" |  686 ns (*) | 105 ns (-85%)
    fmt = bytearray(b"x=%d"); fmt % 123              |  689 ns (*) | 112 ns (-84%)
    fmt = bytearray(b"x=%f"); fmt % 1.2              |  976 ns (*) | 216 ns (-78%)
    fmt = bytearray(b"x=%100d"); fmt % 123           |  870 ns (*) | 172 ns (-80%)
    -------------------------------------------------+-------------+

    Total | 3.88 us (*) | 698 ns (-82%)
    -------------------------------------------------+-------------+--------------
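
The percentages in the no_copy column appear to be the relative change against the orig timing. A small sketch of that arithmetic, using the rows of the table above (the function name `percent_change` is made up for this example):

```python
def percent_change(orig_ns, new_ns):
    """Relative change of new_ns against orig_ns, as a rounded percentage."""
    return round((new_ns - orig_ns) / orig_ns * 100)


# (orig, no_copy) timings in nanoseconds, copied from the table above.
rows = [(656, 93), (686, 105), (689, 112), (976, 216), (870, 172)]

changes = [percent_change(orig, new) for orig, new in rows]
print(changes)  # [-86, -85, -84, -78, -80]

# Same computation for the Total line: 3.88 us against 698 ns.
total_change = percent_change(3880, 698)
print(total_change)  # -82
```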

    ------------------------------------------------------------+-------------+---------------
    "hello %s" % long_string | orig | no_copy
    ------------------------------------------------------------+-------------+---------------

    fmt = bytearray(b"hello %s"); arg = b"x" * 10; fmt % arg    |  661 ns (*) |   93 ns (-86%)
    fmt = bytearray(b"hello %s"); arg = b"x" * 100; fmt % arg   |  667 ns (*) |   93 ns (-86%)
    fmt = bytearray(b"hello %s"); arg = b"x" * 10**3; fmt % arg |  982 ns (*) |  186 ns (-81%)
    fmt = bytearray(b"hello %s"); arg = b"x" * 10**5; fmt % arg | 10.2 us (*) | 4.42 us (-57%)
    ------------------------------------------------------------+-------------+

    Total | 12.5 us (*) | 4.8 us (-62%)
    ------------------------------------------------------------+-------------+---------------

    --------------------------------------------------+-------------+---------------
    b"xxxxxx %s" % b"y" | orig | no_copy
    --------------------------------------------------+-------------+---------------

    fmt = bytearray(b"x" * 10 + b"%s"); fmt % b"y"    |  653 ns (*) |   88 ns (-86%)
    fmt = bytearray(b"x" * 100 + b"%s"); fmt % b"y"   |  674 ns (*) |   94 ns (-86%)
    fmt = bytearray(b"x" * 10**3 + b"%s"); fmt % b"y" | 1.09 us (*) |  213 ns (-80%)
    fmt = bytearray(b"x" * 10**5 + b"%s"); fmt % b"y" | 21.4 us (*) | 8.47 us (-60%)
    --------------------------------------------------+-------------+

    Total | 23.8 us (*) | 8.87 us (-63%)
    --------------------------------------------------+-------------+---------------

    ---------------------------------------------------------------------+-------------+--------
    %f | orig | no_copy
    ---------------------------------------------------------------------+-------------+--------

    n = 200; fmt = bytearray(b"%f" * n); arg = tuple([1.2]*n); fmt % arg | 32.2 us (*) | 32.3 us
    ---------------------------------------------------------------------+-------------+

    -----------------------------------------------------------------------+-------------+---------------
    %i | orig | no_copy
    -----------------------------------------------------------------------+-------------+---------------

    n = 1; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg   |  678 ns (*) |  105 ns (-85%)
    n = 5; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg   |  884 ns (*) |  296 ns (-66%)
    n = 10; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg  | 1.13 us (*) |  531 ns (-53%)
    n = 25; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg  | 1.85 us (*) | 1.24 us (-33%)
    n = 100; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg | 5.62 us (*) |  4.8 us (-15%)
    n = 200; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg | 10.6 us (*) |        10.8 us
    n = 500; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg | 25.1 us (*) |  26.8 us (+7%)
    -----------------------------------------------------------------------+-------------+

    Total | 45.9 us (*) | 44.6 us
    -----------------------------------------------------------------------+-------------+---------------

    --------------------------------------------------------------------------+-------------+---------------
    x=%i | orig | no_copy
    --------------------------------------------------------------------------+-------------+---------------

    n = 1; fmt = bytearray(b"x=%d " * n); arg = tuple([12345]*n); fmt % arg   |  699 ns (*) |  123 ns (-82%)
    n = 5; fmt = bytearray(b"x=%d " * n); arg = tuple([12345]*n); fmt % arg   |  943 ns (*) |  364 ns (-61%)
    n = 10; fmt = bytearray(b"x=%d " * n); arg = tuple([12345]*n); fmt % arg  | 1.22 us (*) |  655 ns (-47%)
    n = 25; fmt = bytearray(b"x=%d " * n); arg = tuple([12345]*n); fmt % arg  | 2.08 us (*) | 1.52 us (-27%)
    n = 100; fmt = bytearray(b"x=%d " * n); arg = tuple([12345]*n); fmt % arg | 6.86 us (*) |        6.79 us
    n = 200; fmt = bytearray(b"x=%d " * n); arg = tuple([12345]*n); fmt % arg | 12.6 us (*) |  13.3 us (+6%)
    n = 500; fmt = bytearray(b"x=%d " * n); arg = tuple([12345]*n); fmt % arg | 29.7 us (*) |  32.4 us (+9%)
    --------------------------------------------------------------------------+-------------+

    Total | 54.1 us (*) | 55.2 us
    --------------------------------------------------------------------------+-------------+---------------

    -----------------------------------------------------------------------+-------------+---------------
    %x | orig | no_copy
    -----------------------------------------------------------------------+-------------+---------------

    n = 1; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg   |  677 ns (*) |  105 ns (-85%)
    n = 5; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg   |  886 ns (*) |  297 ns (-67%)
    n = 10; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg  | 1.13 us (*) |  530 ns (-53%)
    n = 25; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg  | 1.85 us (*) | 1.24 us (-33%)
    n = 100; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg | 5.64 us (*) | 4.82 us (-15%)
    n = 200; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg | 10.7 us (*) |        10.8 us
    n = 500; fmt = bytearray(b"%d" * n); arg = tuple([12345]*n); fmt % arg | 25.2 us (*) |  26.8 us (+7%)
    -----------------------------------------------------------------------+-------------+

    Total | 46.1 us (*) | 44.6 us
    -----------------------------------------------------------------------+-------------+---------------

    -----------------------------------------------------------------------------+-------------+---------------
    x=%x | orig | no_copy
    -----------------------------------------------------------------------------+-------------+---------------

    n = 1; fmt = bytearray(b"x=%x " * n); arg = tuple([0xabcdef]*n); fmt % arg   |  685 ns (*) |  120 ns (-82%)
    n = 5; fmt = bytearray(b"x=%x " * n); arg = tuple([0xabcdef]*n); fmt % arg   |  916 ns (*) |  342 ns (-63%)
    n = 10; fmt = bytearray(b"x=%x " * n); arg = tuple([0xabcdef]*n); fmt % arg  | 1.19 us (*) |  609 ns (-49%)
    n = 25; fmt = bytearray(b"x=%x " * n); arg = tuple([0xabcdef]*n); fmt % arg  | 1.99 us (*) | 1.41 us (-29%)
    n = 100; fmt = bytearray(b"x=%x " * n); arg = tuple([0xabcdef]*n); fmt % arg | 6.64 us (*) |        6.43 us
    n = 200; fmt = bytearray(b"x=%x " * n); arg = tuple([0xabcdef]*n); fmt % arg | 11.9 us (*) |  12.7 us (+7%)
    n = 500; fmt = bytearray(b"x=%x " * n); arg = tuple([0xabcdef]*n); fmt % arg | 28.2 us (*) |  30.5 us (+8%)
    -----------------------------------------------------------------------------+-------------+

    Total | 51.5 us (*) | 52.1 us
    -----------------------------------------------------------------------------+-------------+---------------

    -------------------------------------------------------+-------------+---------------
    large int: %i | orig | no_copy
    -------------------------------------------------------+-------------+---------------

    fmt = bytearray(b"%i"); arg = 10 ** 0 - 1; fmt % arg   |  651 ns (*) |   75 ns (-89%)
    fmt = bytearray(b"%i"); arg = 10 ** 50 - 1; fmt % arg  |  810 ns (*) |  245 ns (-70%)
    fmt = bytearray(b"%i"); arg = 10 ** 100 - 1; fmt % arg | 1.06 us (*) |  496 ns (-53%)
    fmt = bytearray(b"%i"); arg = 10 ** 150 - 1; fmt % arg | 1.38 us (*) |  819 ns (-41%)
    fmt = bytearray(b"%i"); arg = 10 ** 200 - 1; fmt % arg | 1.87 us (*) | 1.28 us (-32%)
    -------------------------------------------------------+-------------+

    Total | 5.78 us (*) | 2.91 us (-50%)
    -------------------------------------------------------+-------------+---------------

    ---------------------------------------------------------+-------------+---------------
    large int: x=%i | orig | no_copy
    ---------------------------------------------------------+-------------+---------------

    fmt = bytearray(b"x=%i"); arg = 10 ** 0 - 1; fmt % arg   |  674 ns (*) |  103 ns (-85%)
    fmt = bytearray(b"x=%i"); arg = 10 ** 50 - 1; fmt % arg  |  820 ns (*) |  254 ns (-69%)
    fmt = bytearray(b"x=%i"); arg = 10 ** 100 - 1; fmt % arg | 1.07 us (*) |  503 ns (-53%)
    fmt = bytearray(b"x=%i"); arg = 10 ** 150 - 1; fmt % arg |  1.4 us (*) |  824 ns (-41%)
    ---------------------------------------------------------+-------------+

    Total | 3.96 us (*) | 1.68 us (-58%)
    ---------------------------------------------------------+-------------+---------------

    -------------------------+-------------+---------------
    Summary                  |        orig |        no_copy
    -------------------------+-------------+---------------
    use smaller buffer       | 3.88 us (*) |  698 ns (-82%)
    "hello %s" % long_string | 12.5 us (*) |  4.8 us (-62%)
    b"xxxxxx %s" % b"y"      | 23.8 us (*) | 8.87 us (-63%)
    %f                       | 32.2 us (*) |        32.3 us
    %i                       | 45.9 us (*) |        44.6 us
    x=%i                     | 54.1 us (*) |        55.2 us
    %x                       | 46.1 us (*) |        44.6 us
    x=%x                     | 51.5 us (*) |        52.1 us
    large int: %i            | 5.78 us (*) | 2.91 us (-50%)
    large int: x=%i          | 3.96 us (*) | 1.68 us (-58%)
    -------------------------+-------------+---------------
    Total                    |  280 us (*) |  248 us (-11%)
    -------------------------+-------------+---------------

    @python-dev
    Mannequin

    python-dev mannequin commented Oct 14, 2015

    New changeset 03646293f1b3 by Victor Stinner in branch 'default':
    Fix long_format_binary()
    https://hg.python.org/cpython/rev/03646293f1b3

    New changeset 6fe0050a2f52 by Victor Stinner in branch 'default':
    Add use_bytearray attribute to _PyBytesWriter
    https://hg.python.org/cpython/rev/6fe0050a2f52

    New changeset f369b79c0153 by Victor Stinner in branch 'default':
    Optimize bytearray % args
    https://hg.python.org/cpython/rev/f369b79c0153

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022