bytes.hex(sep, bytes_per_sep) is many times slower than manually inserting the separators #84493

Closed
anntzer mannequin opened this issue Apr 17, 2020 · 5 comments
Labels
3.9 (only security fixes), performance (Performance or resource usage), stdlib (Python modules in the Lib dir)

Comments

@anntzer
Mannequin

anntzer mannequin commented Apr 17, 2020

BPO 40313
Nosy @gpshead, @vstinner, @anntzer, @miss-islington, @sweeneyde
PRs
  • bpo-40313: speed up bytes.hex() #19594
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = <Date 2020-04-21.00:32:21.521>
    created_at = <Date 2020-04-17.20:58:29.978>
    labels = ['library', '3.9', 'performance']
    title = 'bytes.hex(sep, bytes_per_sep) is many times slower than manually inserting the separators'
    updated_at = <Date 2020-04-21.00:32:21.518>
    user = 'https://github.com/anntzer'

    bugs.python.org fields:

    activity = <Date 2020-04-21.00:32:21.518>
    actor = 'vstinner'
    assignee = 'gregory.p.smith'
    closed = True
    closed_date = <Date 2020-04-21.00:32:21.521>
    closer = 'vstinner'
    components = ['Library (Lib)']
    creation = <Date 2020-04-17.20:58:29.978>
    creator = 'Antony.Lee'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 40313
    keywords = ['patch']
    message_count = 5.0
    messages = ['366678', '366761', '366770', '366903', '366904']
    nosy_count = 5.0
    nosy_names = ['gregory.p.smith', 'vstinner', 'Antony.Lee', 'miss-islington', 'Dennis Sweeney']
    pr_nums = ['19594']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue40313'
    versions = ['Python 3.9']

    @anntzer
    Mannequin Author

    anntzer mannequin commented Apr 17, 2020

    Consider the following example, which line-wraps 10^4 bytes in hex form to 128 characters per line on Python 3.8.2 (Arch Linux repo package):

    In [1]: import numpy as np, math
    
    In [2]: data = np.random.randint(0, 256, (100, 100), dtype=np.uint8).tobytes()                  
    
    In [3]: %timeit data.hex("\n", -64)
    123 µs ± 5.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    
    In [4]: %timeit h = data.hex(); "\n".join([h[n * 128 : (n+1) * 128] for n in range(math.ceil(len(h) / 128))])
    45.4 µs ± 746 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    
    In [5]: h = data.hex(); "\n".join([h[n * 128 : (n+1) * 128] for n in range(math.ceil(len(h) / 128))]) == data.hex("\n", -64)                                                                       
    Out[5]: True
    

    (The last line checks that both approaches produce the same output.)

    In this measurement (123 µs vs. 45.4 µs), a naive manual wrap is roughly 2.7x faster than the built-in functionality.
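For readers without NumPy or IPython, the comparison above can be sketched with the standard library only. The helper name `wrap_hex_manual` and the deterministic test payload are assumptions made for illustration; the timings it prints will differ from the figures reported in this issue.

```python
import timeit


def wrap_hex_manual(data: bytes, bytes_per_line: int = 64) -> str:
    """Hex-encode data, then insert a newline after every bytes_per_line bytes.

    Hypothetical helper: it mirrors the manual slicing approach from the report.
    """
    h = data.hex()
    width = 2 * bytes_per_line  # two hex digits per byte
    return "\n".join(h[i:i + width] for i in range(0, len(h), width))


# 10^4 bytes of deterministic data (the report used random NumPy data).
data = bytes(i % 256 for i in range(10_000))

# A negative bytes_per_sep counts groups from the left, which matches
# wrapping lines from the start of the string.
assert wrap_hex_manual(data) == data.hex("\n", -64)

if __name__ == "__main__":
    runs = 200
    builtin = timeit.timeit(lambda: data.hex("\n", -64), number=runs)
    manual = timeit.timeit(lambda: wrap_hex_manual(data), number=runs)
    print(f"built-in: {builtin / runs * 1e6:.1f} us per call")
    print(f"manual:   {manual / runs * 1e6:.1f} us per call")
```

On interpreters that include the fix from this issue, the built-in call is expected to come out ahead; the script is only meant to make the comparison easy to rerun.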

    @anntzer anntzer mannequin added the 3.8 (only security fixes) and stdlib (Python modules in the Lib dir) labels Apr 17, 2020
    @sweeneyde
    Member

    I replicated this behavior. This looks like the relevant loop in pystrhex.c:

        for (i=j=0; i < arglen; ++i) {
            assert((j + 1) < resultlen);
            unsigned char c;
            c = (argbuf[i] >> 4) & 0x0f;
            retbuf[j++] = Py_hexdigits[c];
            c = argbuf[i] & 0x0f;
            retbuf[j++] = Py_hexdigits[c];
            if (bytes_per_sep_group && i < arglen - 1) {
                Py_ssize_t anchor;
                anchor = (bytes_per_sep_group > 0) ? (arglen - 1 - i) : (i + 1);
                if (anchor % abs_bytes_per_sep == 0) {
                    retbuf[j++] = sep_char;
                }
            }
        }

    It looks like this can be refactored a bit for a tighter inner loop with fewer if-tests. I can work on a PR.
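To make the refactoring idea concrete, here is a Python model of it. This is not the code from the PR: both function names are hypothetical, and the chunked variant is only one possible way to move the separator test out of the per-byte loop. The first function follows the quoted C loop step by step; the second computes the group boundaries up front and joins whole groups.

```python
def hex_with_sep_reference(data: bytes, sep: str, bytes_per_sep: int) -> str:
    """Model of the quoted loop: decide after every byte whether to emit sep."""
    n = len(data)
    group = abs(bytes_per_sep)
    out = []
    for i, byte in enumerate(data):
        out.append(f"{byte:02x}")
        if bytes_per_sep and i < n - 1:
            # Positive values count groups from the right, negative from the left.
            anchor = (n - 1 - i) if bytes_per_sep > 0 else (i + 1)
            if anchor % group == 0:
                out.append(sep)
    return "".join(out)


def hex_with_sep_chunked(data: bytes, sep: str, bytes_per_sep: int) -> str:
    """Same output, but with no per-byte separator test (assumed approach)."""
    n = len(data)
    group = abs(bytes_per_sep)
    if group == 0 or n == 0:
        return data.hex()
    if bytes_per_sep > 0:
        # Counting from the right leaves any short group at the front.
        first = n % group or group
    else:
        # Counting from the left leaves any short group at the end.
        first = group
    chunks = [data[:first].hex()]
    for start in range(first, n, group):
        chunks.append(data[start:start + group].hex())
    return sep.join(chunks)


sample = b"\xb9\x01\xef"
print(hex_with_sep_chunked(sample, "_", 2))   # groups counted from the right
print(hex_with_sep_chunked(sample, "_", -2))  # groups counted from the left
```

The point of the second version is that the only branch on the sign of `bytes_per_sep` happens once, before the loop, rather than once per input byte.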

    @sweeneyde sweeneyde added the 3.9 (only security fixes) label and removed the 3.8 (only security fixes) label Apr 19, 2020
    @sweeneyde
    Member

    ========== Master ==========

    .\python.bat -m pyperf timeit -s "import random, math; data=random.getrandbits(8*10_000_000).to_bytes(10_000_000, 'big')" "temp = data.hex(); '\n'.join(temp[n:n+128] for n in range(0, len(temp), 128))"

    Mean +- std dev: 74.3 ms +- 1.1 ms

    .\python.bat -m pyperf timeit -s "import random; data=random.getrandbits(8*10_000_000).to_bytes(10_000_000, 'big')" "data.hex('\n', -64)"

    Mean +- std dev: 44.0 ms +- 0.3 ms

    ========== PR 19594 ==========

    .\python.bat -m pyperf timeit -s "import random, math; data=random.getrandbits(8*10_000_000).to_bytes(10_000_000, 'big')" "temp = data.hex(); '\n'.join(temp[n:n+128] for n in range(0, len(temp), 128))"

    Mean +- std dev: 65.2 ms +- 0.6 ms

    .\python.bat -m pyperf timeit -s "import random; data=random.getrandbits(8*10_000_000).to_bytes(10_000_000, 'big')" "data.hex('\n', -64)"

    Mean +- std dev: 18.1 ms +- 0.1 ms

    @sweeneyde sweeneyde added the performance (Performance or resource usage) label Apr 19, 2020
    @miss-islington
    Contributor

    New changeset 6a9e80a by sweeneyde in branch 'master':
    bpo-40313: speed up bytes.hex() (GH-19594)

    @vstinner
    Member

    Thanks Dennis for the optimization!

    FYI I also pushed another optimization recently:

    commit 455df97
    Author: Victor Stinner <vstinner@python.org>
    Date: Wed Apr 15 14:05:24 2020 +0200

    Optimize _Py_strhex_impl() (GH-19535)
    
    Avoid a temporary buffer to create a bytes string: use
    PyBytes_FromStringAndSize() to directly allocate a bytes object.
    

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022