Message310146
https://gist.github.com/jeethu/19430d802aa08e28d1cb5eb20a47a470
Mean +- std dev: 10.5 us +- 1.4 us => Mean +- std dev: 9.68 us +- 0.89 us
It's 1.08x faster (-7.8%). That's a small gain for a microbenchmark; usually an optimization should make a *microbenchmark* at least 10% faster.
For this optimization, I have no strong opinion.
Using memmove() for large copies is a good idea. The main question is the "if (n <= INS1_MEMMOVE_THRESHOLD)" test: is it slower if we always call memmove()?
Previously, Python had a Py_MEMCPY() macro which also had such a threshold. Basically, it was a workaround for compiler performance issues:
#if defined(_MSC_VER)
#define Py_MEMCPY(target, source, length) do {      \
        size_t i_, n_ = (length);                   \
        char *t_ = (void*) (target);                \
        const char *s_ = (void*) (source);          \
        if (n_ >= 16)                               \
            memcpy(t_, s_, n_);                     \
        else                                        \
            for (i_ = 0; i_ < n_; i_++)             \
                t_[i_] = s_[i_];                    \
    } while (0)
#else
#define Py_MEMCPY memcpy
#endif
Fortunately, the macro has since become just:
/* Py_MEMCPY is kept for backwards compatibility,
* see https://bugs.python.org/issue28126 */
#define Py_MEMCPY memcpy
And it is no longer used.
I recall performance issues with GCC's memcmp() builtin function (which replaces the libc function during compilation): https://bugs.python.org/issue17628#msg186012
See also:
* https://bugs.python.org/issue13134
* https://bugs.python.org/issue29782
2018-01-17 09:45:26 | vstinner | issue32534