msg276262 - (view) |
Author: Christian Heimes (christian.heimes) *  |
Date: 2016-09-13 12:32 |
Py_MEMCPY() has a special case for small blocks on Windows to work around an ancient performance issue in MSVC. Can we safely assume that recent MSVC properly optimize memcpy()? See #28055
/* Py_MEMCPY can be used instead of memcpy in cases where the copied blocks
* are often very short. While most platforms have highly optimized code for
* large transfers, the setup costs for memcpy are often quite high. MEMCPY
* solves this by doing short copies "in line".
*/
#if defined(_MSC_VER)
#define Py_MEMCPY(target, source, length) do { \
size_t i_, n_ = (length); \
char *t_ = (void*) (target); \
const char *s_ = (void*) (source); \
if (n_ >= 16) \
memcpy(t_, s_, n_); \
else \
for (i_ = 0; i_ < n_; i_++) \
t_[i_] = s_[i_]; \
} while (0)
#else
#define Py_MEMCPY memcpy
#endif
|
msg276271 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2016-09-13 13:45 |
Yes, memcpy (or preferably memcpy_s, which includes the size of the destination) are basically intrinsics that will inline short copies and call out to a range of implementations depending on alignment/overlap/etc.
|
msg276306 - (view) |
Author: Christian Heimes (christian.heimes) *  |
Date: 2016-09-13 17:19 |
Perfect, thanks! Should we keep the Py_MEMCPY() macro for now and define it to memcpy()? Or should I get rid of it and just define it for b/w compatibility?
|
msg276309 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2016-09-13 17:25 |
I'm okay with removing it completely, if you're willing to change that much code. (Most of my additions already uses memcpy or memcpy_s directly, since I wasn't even aware of this macro :) )
|
msg276310 - (view) |
Author: Christian Heimes (christian.heimes) *  |
Date: 2016-09-13 17:44 |
Behold the power of Unix! :)
$ find -name '*.[ch]' | xargs sed -i s/Py_MEMCPY/memcpy/g
Victor, are you fine with the change? I'm going to keep Py_MEMCPY around.
|
msg276313 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-13 18:11 |
The macro looks public, you cannot remove it.
Make it an alias to memcpy(), but explain that it's only kept for backward
compatibility with a reference to this issue.
|
msg276314 - (view) |
Author: Christian Heimes (christian.heimes) *  |
Date: 2016-09-13 18:15 |
As I said, I'm going to keep Py_MEMCPY around but replace it everywhere else.
/* Py_MEMCPY is kept for backwards compatibility,
* see https://bugs.python.org/issue28126 */
#define Py_MEMCPY memcpy
|
msg276317 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-09-13 18:22 |
New changeset bedce61ae0a0 by Christian Heimes in branch '3.6':
Issue #28126: Replace Py_MEMCPY with memcpy(). Visual Studio can properly optimize memcpy().
https://hg.python.org/cpython/rev/bedce61ae0a0
New changeset f5d32ed0f9c2 by Christian Heimes in branch 'default':
Issue #28126: Replace Py_MEMCPY with memcpy(). Visual Studio can properly optimize memcpy().
https://hg.python.org/cpython/rev/f5d32ed0f9c2
|
msg276413 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-09-14 10:37 |
Shouldn't the NEWS entity be in the C API section?
|
msg276426 - (view) |
Author: Christian Heimes (christian.heimes) *  |
Date: 2016-09-14 12:50 |
Isn't the C API section reserved for C API changes? I haven't changed the C API but rather optimized the core.
|
msg276433 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-09-14 13:07 |
The Py_MEMCPY name doesn't start by an underscore, and it is not inside the "#ifndef Py_LIMITED_API" block. But it is not documented either. I don't know what is the status of this macro. If this is a part of public API, this change should be documented in the C API section. If it is inner macro, this change shouldn't be mentioned in Misc/NEWS.
|
msg276435 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-09-14 13:12 |
I don't think that it's very useful to discuss the status of the
macro. Documenting the change doesn't harm ;-)
|
msg276436 - (view) |
Author: Christian Heimes (christian.heimes) *  |
Date: 2016-09-14 13:15 |
My change doesn't break or change 3rd party application. They might get a tiny bit faster on Windows, if they have used Py_MEMCPY() for up to 16 bytes. That's it.
|
msg276464 - (view) |
Author: Steve Dower (steve.dower) *  |
Date: 2016-09-14 15:19 |
> They might get a tiny bit faster on Windows, if they have used Py_MEMCPY() for up to 16 bytes
Even that's unlikely as the loop in the macro would have been unrolled in practically every case.
This is about removing an unnecessary macro. Certainly no API change unless we delete it entirely (which I'm okay with), and IMHO if it has an issue number it should go in NEWS. Maybe build is the right section?
|
msg315427 - (view) |
Author: Andro Nuxx (gitinit.py@gmail.com) |
Date: 2018-04-17 23:39 |
Py_MEMCPY() has a special case for small blocks on Windows to work around an ancient performance issue in MSVC. Can we safely assume that recent MSVC properly optimize memcpy()? See #28055
/* Py_MEMCPY can be used instead of memcpy in cases where the copied blocks
* are often very short. While most platforms have highly optimized code for
* large transfers, the setup costs for memcpy are often quite high. MEMCPY
* solves this by doing short copies "in line".
*/
#if defined(_MSC_VER)
#define Py_MEMCPY(target, source, length) do { \
size_t i_, n_ = (length); \
char *t_ = (void*) (target); \
const char *s_ = (void*) (source); \
if (n_ >= 16) \
memcpy(t_, s_, n_); \
else \
for (i_ = 0; i_ < n_; i_++) \
t_[i_] = s_[i_]; \
} while (0)
#else
#define Py_MEMCPY memcpy
#endif
|
|
Date |
User |
Action |
Args |
2022-04-11 14:58:36 | admin | set | github: 72313 |
2018-04-17 23:39:13 | gitinit.py@gmail.com | set | nosy:
+ gitinit.py@gmail.com messages:
+ msg315427
|
2017-03-31 16:36:18 | dstufft | set | pull_requests:
+ pull_request924 |
2016-09-14 15:19:37 | steve.dower | set | messages:
+ msg276464 |
2016-09-14 13:15:19 | christian.heimes | set | messages:
+ msg276436 |
2016-09-14 13:12:50 | vstinner | set | messages:
+ msg276435 |
2016-09-14 13:07:57 | serhiy.storchaka | set | messages:
+ msg276433 |
2016-09-14 12:50:16 | christian.heimes | set | messages:
+ msg276426 |
2016-09-14 10:37:18 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka messages:
+ msg276413
|
2016-09-13 18:23:10 | christian.heimes | set | status: open -> closed resolution: fixed stage: needs patch -> resolved |
2016-09-13 18:22:44 | python-dev | set | nosy:
+ python-dev messages:
+ msg276317
|
2016-09-13 18:15:36 | christian.heimes | set | messages:
+ msg276314 |
2016-09-13 18:11:51 | vstinner | set | messages:
+ msg276313 |
2016-09-13 17:44:16 | christian.heimes | set | messages:
+ msg276310 |
2016-09-13 17:25:32 | steve.dower | set | messages:
+ msg276309 |
2016-09-13 17:19:38 | christian.heimes | set | messages:
+ msg276306 |
2016-09-13 13:45:18 | steve.dower | set | messages:
+ msg276271 |
2016-09-13 12:32:23 | christian.heimes | create | |