Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Py_MEMCPY: Use memcpy on Windows? #72313

Closed
tiran opened this issue Sep 13, 2016 · 15 comments
Closed

Py_MEMCPY: Use memcpy on Windows? #72313

tiran opened this issue Sep 13, 2016 · 15 comments
Labels
3.7 (EOL) end of life OS-windows performance Performance or resource usage

Comments

@tiran
Copy link
Member

tiran commented Sep 13, 2016

BPO 28126
Nosy @pfmoore, @vstinner, @tiran, @tjguk, @zware, @serhiy-storchaka, @zooba, @Andronuxx27
PRs
  • [Do Not Merge] Convert Misc/NEWS so that it is managed by towncrier #552
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-09-13.18:23:10.765>
    created_at = <Date 2016-09-13.12:32:23.695>
    labels = ['3.7', 'OS-windows', 'performance']
    title = 'Py_MEMCPY: Use memcpy on Windows?'
    updated_at = <Date 2018-04-17.23:39:13.884>
    user = 'https://github.com/tiran'

    bugs.python.org fields:

    activity = <Date 2018-04-17.23:39:13.884>
    actor = 'gitinit.py@gmail.com'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-09-13.18:23:10.765>
    closer = 'christian.heimes'
    components = ['Windows']
    creation = <Date 2016-09-13.12:32:23.695>
    creator = 'christian.heimes'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 28126
    keywords = []
    message_count = 15.0
    messages = ['276262', '276271', '276306', '276309', '276310', '276313', '276314', '276317', '276413', '276426', '276433', '276435', '276436', '276464', '315427']
    nosy_count = 9.0
    nosy_names = ['paul.moore', 'vstinner', 'christian.heimes', 'tim.golden', 'python-dev', 'zach.ware', 'serhiy.storchaka', 'steve.dower', 'gitinit.py@gmail.com']
    pr_nums = ['552']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue28126'
    versions = ['Python 3.6', 'Python 3.7']

    @tiran
    Copy link
    Member Author

    tiran commented Sep 13, 2016

    Py_MEMCPY() has a special case for small blocks on Windows to work around an ancient performance issue in MSVC. Can we safely assume that recent MSVC properly optimize memcpy()? See bpo-28055

    /* Py_MEMCPY can be used instead of memcpy in cases where the copied blocks

    • are often very short. While most platforms have highly optimized code for
    • large transfers, the setup costs for memcpy are often quite high. MEMCPY
    • solves this by doing short copies "in line".
      */
    #if defined(_MSC_VER)
    #define Py_MEMCPY(target, source, length) do {                          \
            size_t i_, n_ = (length);                                       \
            char *t_ = (void*) (target);                                    \
            const char *s_ = (void*) (source);                              \
            if (n_ >= 16)                                                   \
                memcpy(t_, s_, n_);                                         \
            else                                                            \
                for (i_ = 0; i_ < n_; i_++)                                 \
                    t_[i_] = s_[i_];                                        \
        } while (0)
    #else
    #define Py_MEMCPY memcpy
    #endif

    @tiran tiran added 3.7 (EOL) end of life OS-windows performance Performance or resource usage labels Sep 13, 2016
    @zooba
    Copy link
    Member

    zooba commented Sep 13, 2016

    Yes, memcpy (or preferably memcpy_s, which includes the size of the destination) are basically intrinsics that will inline short copies and call out to a range of implementations depending on alignment/overlap/etc.

    @tiran
    Copy link
    Member Author

    tiran commented Sep 13, 2016

    Perfect, thanks! Should we keep the Py_MEMCPY() macro for now and define it to memcpy()? Or should I get rid of it and just define it for b/w compatibility?

    @zooba
    Copy link
    Member

    zooba commented Sep 13, 2016

    I'm okay with removing it completely, if you're willing to change that much code. (Most of my additions already uses memcpy or memcpy_s directly, since I wasn't even aware of this macro :) )

    @tiran
    Copy link
    Member Author

    tiran commented Sep 13, 2016

    Behold the power of Unix! :)

    $ find -name '*.[ch]' | xargs sed -i s/Py_MEMCPY/memcpy/g

    Victor, are you fine with the change? I'm going to keep Py_MEMCPY around.

    @vstinner
    Copy link
    Member

    The macro looks public, you cannot remove it.

    Make it an alias to memcpy(), but explain that it's only kept for backward
    compatibility with a reference to this issue.

    @tiran
    Copy link
    Member Author

    tiran commented Sep 13, 2016

    As I said, I'm going to keep Py_MEMCPY around but replace it everywhere else.

    /* Py_MEMCPY is kept for backwards compatibility,
     * see https://bugs.python.org/issue28126 */
    #define Py_MEMCPY memcpy

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 13, 2016

    New changeset bedce61ae0a0 by Christian Heimes in branch '3.6':
    Issue bpo-28126: Replace Py_MEMCPY with memcpy(). Visual Studio can properly optimize memcpy().
    https://hg.python.org/cpython/rev/bedce61ae0a0

    New changeset f5d32ed0f9c2 by Christian Heimes in branch 'default':
    Issue bpo-28126: Replace Py_MEMCPY with memcpy(). Visual Studio can properly optimize memcpy().
    https://hg.python.org/cpython/rev/f5d32ed0f9c2

    @tiran tiran closed this as completed Sep 13, 2016
    @serhiy-storchaka
    Copy link
    Member

    Shouldn't the NEWS entity be in the C API section?

    @tiran
    Copy link
    Member Author

    tiran commented Sep 14, 2016

    Isn't the C API section reserved for C API changes? I haven't changed the C API but rather optimized the core.

    @serhiy-storchaka
    Copy link
    Member

    The Py_MEMCPY name doesn't start by an underscore, and it is not inside the "#ifndef Py_LIMITED_API" block. But it is not documented either. I don't know what is the status of this macro. If this is a part of public API, this change should be documented in the C API section. If it is inner macro, this change shouldn't be mentioned in Misc/NEWS.

    @vstinner
    Copy link
    Member

    I don't think that it's very useful to discuss the status of the
    macro. Documenting the change doesn't harm ;-)

    @tiran
    Copy link
    Member Author

    tiran commented Sep 14, 2016

    My change doesn't break or change 3rd party application. They might get a tiny bit faster on Windows, if they have used Py_MEMCPY() for up to 16 bytes. That's it.

    @zooba
    Copy link
    Member

    zooba commented Sep 14, 2016

    They might get a tiny bit faster on Windows, if they have used Py_MEMCPY() for up to 16 bytes

    Even that's unlikely as the loop in the macro would have been unrolled in practically every case.

    This is about removing an unnecessary macro. Certainly no API change unless we delete it entirely (which I'm okay with), and IMHO if it has an issue number it should go in NEWS. Maybe build is the right section?

    @Andronuxx27
    Copy link
    Mannequin

    Andronuxx27 mannequin commented Apr 17, 2018

    Py_MEMCPY() has a special case for small blocks on Windows to work around an ancient performance issue in MSVC. Can we safely assume that recent MSVC properly optimize memcpy()? See bpo-28055

    /* Py_MEMCPY can be used instead of memcpy in cases where the copied blocks

    • are often very short. While most platforms have highly optimized code for
    • large transfers, the setup costs for memcpy are often quite high. MEMCPY
    • solves this by doing short copies "in line".
      */
    #if defined(_MSC_VER)
    #define Py_MEMCPY(target, source, length) do {                          \
            size_t i_, n_ = (length);                                       \
            char *t_ = (void*) (target);                                    \
            const char *s_ = (void*) (source);                              \
            if (n_ >= 16)                                                   \
                memcpy(t_, s_, n_);                                         \
            else                                                            \
                for (i_ = 0; i_ < n_; i_++)                                 \
                    t_[i_] = s_[i_];                                        \
        } while (0)
    #else
    #define Py_MEMCPY memcpy
    #endif

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life OS-windows performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants